CRAN/E | datanugget

datanugget

Create, and Refine Data Nuggets

About

Creates and refines data nuggets. Data nuggets reduce a large dataset to a small collection of nuggets of data, each containing a center (location), a weight (importance), and a scale (variability) parameter. Data nugget centers are created by choosing observations in the dataset that are as equally spaced apart as possible. Data nugget weights are created by counting the number of observations closest to a given data nugget’s center. The data nugget is then said to 'contain' these observations, and its center is recalculated as the mean of these observations. Data nugget scales are calculated as the trace of the covariance matrix of the observations contained within a data nugget, divided by the dimension of the dataset. Data nuggets are refined by 'splitting' data nuggets whose scales or shapes (defined as the ratio of the two largest eigenvalues of the covariance matrix of the observations contained within the data nugget) are deemed too large.

Reference papers:

[1] Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing.
[2] Beavers, T., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., Teigler, J. (2023). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure (Submitted for Publication).
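The center, weight, scale, and shape quantities described above can be illustrated with a small sketch. It is written in plain Python rather than R, and it is not the package's implementation or API: the function names (`assign_to_centers`, `summarize_nugget`, `shape_2d`) are hypothetical, the initial centers are assumed to be given rather than chosen by the equal-spacing step, the covariance is assumed to be the sample covariance (divisor n - 1), and the shape helper handles only two-dimensional data, where the eigenvalues have a closed form.

```python
import math
from statistics import fmean


def assign_to_centers(points, centers):
    """Group points by the index of their nearest center (Euclidean distance)."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def summarize_nugget(members):
    """Return center, weight, and scale for one non-empty group of points."""
    n = len(members)
    dim = len(members[0])
    # Center: coordinate-wise mean of the contained observations.
    center = [fmean(p[j] for p in members) for j in range(dim)]
    # Trace of the covariance matrix = sum of the per-coordinate variances.
    # Assumption: sample covariance (n - 1); a single observation gets scale 0.
    if n > 1:
        trace = sum(
            sum((p[j] - center[j]) ** 2 for p in members) / (n - 1)
            for j in range(dim)
        )
    else:
        trace = 0.0
    # Weight: number of contained observations. Scale: trace / dimension.
    return {"center": center, "weight": n, "scale": trace / dim}


def shape_2d(members):
    """Largest eigenvalue over second-largest, for at least two 2-D points."""
    n = len(members)
    mx = fmean(p[0] for p in members)
    my = fmean(p[1] for p in members)
    sxx = sum((p[0] - mx) ** 2 for p in members) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in members) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in members) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    largest, second = half_trace + root, half_trace - root
    return largest / second if second > 0 else math.inf


# Toy data: two well-separated clusters and two assumed starting centers.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 12)]
centers = [(0, 0), (10, 10)]

groups = assign_to_centers(points, centers)
nuggets = [summarize_nugget(g) for g in groups]
shapes = [shape_2d(g) for g in groups]
```

In this sketch a nugget would be a candidate for splitting when its `scale` or its shape ratio exceeds some user-chosen threshold; the thresholds and the splitting rule itself are left out because the description does not specify them.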

Key Metrics

Version 1.2.4
R ≥ 4.0
Published 2023-11-28 (142 days ago)
Needs compilation? no
License GPL-2
CRAN checks datanugget results

Downloads

Yesterday 5 0%
Last 7 days 22 -56%
Last 30 days 191 -8%
Last 90 days 4,262 -90%
Last 365 days 49,152 +1920%

Maintainer

Yajie Duan

yajieritaduan@gmail.com

Authors

Yajie Duan [cre, ctb]
Traymon Beavers [aut]
Javier Cabrera [aut]
Ge Cheng [aut]
Kunting Qi [aut]
Mariusz Lubomirski [aut]

Material

Reference manual
Package source

macOS

r-release (arm64)
r-oldrel (arm64)
r-release (x86_64)
r-oldrel (x86_64)

Windows

r-devel (x86_64)
r-release (x86_64)
r-oldrel (x86_64)

Old Sources

datanugget archive

Depends

R ≥ 4.0
doSNOW ≥ 1.0.16
foreach ≥ 1.5.1
parallel ≥ 4.0.5
Rfast ≥ 2.0.7

Reverse Depends

WCluster