textreuse

Detect Text Reuse and Document Similarity

About

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
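As a rough illustration of two of the ideas named above, shingled n-gram tokenization and a set-based similarity measure, here is a minimal, language-neutral sketch in Python. It is not the package's R API: the function names `word_shingles` and `jaccard` are hypothetical, and the tokenization rule (lowercase runs of letters and digits) is an assumption made for the example.

```python
import re


def word_shingles(text, n=3):
    """Return the overlapping word n-grams ("shingles") of a text.

    Assumption for this sketch: a word is a lowercase run of letters or digits.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Slide a window of n words across the text, one word at a time.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity: size of the intersection over size of the union."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty token sets
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    a = word_shingles("The quick brown fox jumps", n=2)
    b = word_shingles("The quick brown dog jumps", n=2)
    print(a)
    print(jaccard(a, b))
```

Two documents that share many shingles score close to 1, and documents with no shingles in common score 0. Minhash and locality sensitive hashing are ways of approximating this comparison without scoring every pair of documents directly.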

docs.ropensci.org/textreuse
github.com/ropensci/textreuse

Key Metrics

Version: 0.1.5
R version: ≥ 3.1.1
Published: 2020-05-15
Needs compilation: yes
License: MIT, with license file
CRAN checks: textreuse results

Downloads

Yesterday: 27 (0%)
Last 7 days: 137 (-20%)
Last 30 days: 518 (-11%)
Last 90 days: 1,510 (-17%)
Last 365 days: 6,349 (-85%)

Maintainer

Lincoln Mullen

lincoln@lincolnmullen.com

Authors

Lincoln Mullen (aut, cre)

Material

README
NEWS
Reference manual
Package source

In Views

NaturalLanguageProcessing

Vignettes

Text alignment
Introduction to the textreuse packages
Minhash and locality-sensitive hashing
Pairwise comparisons for document similarity

macOS

r-release (arm64)
r-oldrel (arm64)
r-release (x86_64)
r-oldrel (x86_64)

Windows

r-devel (x86_64)
r-release (x86_64)
r-oldrel (x86_64)

Old Sources

textreuse archive

Depends

R ≥ 3.1.1

Imports

assertthat ≥ 0.1
digest ≥ 0.6.8
dplyr ≥ 0.8.0
NLP ≥ 0.1.8
Rcpp ≥ 0.12.0
RcppProgress ≥ 0.1
stringr ≥ 1.0.0
tibble ≥ 3.0.1
tidyr ≥ 0.3.1

Suggests

testthat ≥ 0.11.0
knitr ≥ 1.11
rmarkdown ≥ 0.8
covr

LinkingTo

BH
Rcpp
RcppProgress

Reverse Suggests

textrank