dataprep

Efficient and Flexible Data Preprocessing Tools

About

Efficiently and flexibly preprocess data using a set of data filtering, deletion, and interpolation tools. These preprocessing methods are built on the principles of completeness, accuracy, the threshold method, and linear interpolation, and they rely on constraint conditions, time completion and recovery, and fast, efficient calculation and grouping. The key preprocessing steps are deletion of variables and observations, outlier removal, and interpolation of missing values (NA); these steps depend on how incomplete and how dispersed the raw data are. Compared with ordinary methods, they clean data more accurately, keep more samples, and add no outliers after interpolation. Consecutive NA values are identified automatically via run-length based grouping, which is used in observation deletion, outlier removal, and NA interpolation; as a result, interpolation does not generate new outliers. A conditional extremum is proposed to realize point-by-point weighted outlier removal, which saves non-outliers from being removed. In addition, time series interpolation that uses reference values within short periods further ensures reliable interpolation. These methods are based on and improved from the reference: Liang, C.-S., Wu, H., Li, H.-Y., Zhang, Q., Li, Z. & He, K.-B. (2020) doi:10.1016/j.scitotenv.2020.140923.
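
The run-length grouping and short-gap interpolation described above can be illustrated with a small, language-neutral sketch. It is written in Python for illustration only and is not the package's API: the function names na_runs and interpolate_short_gaps and the max_gap parameter are hypothetical, and the rule of filling only runs that have a valid neighbour on both sides is an assumption about what "values to refer to within short periods" might look like in practice.

```python
from itertools import groupby
from math import isnan


def na_runs(values):
    """Return (start, length) for each run of consecutive NaN values.

    Hypothetical helper: groups the series by "is NaN", which is the
    run-length encoding of the missingness pattern.
    """
    runs = []
    index = 0
    for is_na, group in groupby(values, key=isnan):
        length = sum(1 for _ in group)
        if is_na:
            runs.append((index, length))
        index += length
    return runs


def interpolate_short_gaps(values, max_gap=2):
    """Linearly fill NaN runs of at most max_gap points (assumed rule).

    Runs that are longer than max_gap, or that touch either end of the
    series, are left as NaN because they lack nearby reference values.
    """
    result = list(values)
    for start, length in na_runs(values):
        end = start + length  # index of the first valid value after the run
        if length > max_gap or start == 0 or end >= len(values):
            continue
        left, right = values[start - 1], values[end]
        step = (right - left) / (length + 1)
        for offset in range(length):
            result[start + offset] = left + step * (offset + 1)
    return result


nan = float("nan")
series = [1.0, nan, 3.0, nan, nan, nan, 7.0, nan]
filled = interpolate_short_gaps(series, max_gap=2)
print(na_runs(series))
print(filled)
```

Because each filled value lies between its two neighbours, a fill of this kind cannot produce a value more extreme than the observations around it, which is one plausible reading of why interpolation adds no new outliers.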

Key Metrics

Version 0.1.5
R ≥ 3.5.0
Published 2022-01-15
Needs compilation? no
License GPL-2 | GPL-3
CRAN checks dataprep results

Downloads

Yesterday 14 -22%
Last 7 days 92 -3%
Last 30 days 329 +5%
Last 90 days 915 -25%
Last 365 days 3,922 +0%

Maintainer

Chun-Sheng Liang

liangchunsheng@lzu.edu.cn

Authors

Chun-Sheng Liang
Hao Wu
Hai-Yan Li
Qiang Zhang
Zhanqing Li
Ke-Bin He
Lanzhou University
Tsinghua University

Material

Reference manual
Package source

Vignettes

dataprep: data preprocessing and plots

macOS

r-release arm64
r-oldrel arm64
r-release x86_64
r-oldrel x86_64

Windows

r-devel x86_64
r-release x86_64
r-oldrel x86_64

Old Sources

dataprep archive

Depends

R ≥ 3.5.0

Imports

ggplot2
scales
foreach
doParallel
dplyr
reshape2
data.table
zoo

Suggests

knitr
rmarkdown