stringdist

Approximate String Matching, Fuzzy Text Search, and String Distance Functions

About

Implements an approximate string matching version of R's native 'match' function. Also offers fuzzy text search based on various string distance measures. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of Soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'openMP'. An API for C or C++ is exposed as well. Reference: MPJ van der Loo (2014) doi:10.32614/RJ-2014-011.
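To make the idea of an approximate 'match' concrete, here is a minimal Python sketch, not the package's code or API. It assumes the plain Levenshtein distance as the measure, and the names `levenshtein`, `approx_match`, and `max_dist` are hypothetical, chosen only for this illustration. The package itself is written for R and is described as far faster and more general.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca by cb
        prev = curr
    return prev[-1]


def approx_match(x: str, table: Sequence[str], max_dist: int) -> Optional[int]:
    """Index of the closest entry in table within max_dist, else None.

    Hypothetical helper: indices are 0-based here, unlike R's 1-based 'match'.
    """
    best_index = None
    best_dist = max_dist + 1
    for idx, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:  # strict comparison keeps the first of tied entries
            best_index, best_dist = idx, d
    return best_index
```

With this sketch, `approx_match("leia", ["uhura", "leela"], 2)` selects the second entry, because "leia" is two edits away from "leela", while a limit of 1 yields no match.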

Citation stringdist citation info
github.com/markvanderloo/stringdist
Bug report File report

Key Metrics

Version 0.9.12
R ≥ 2.15.3
Published 2023-11-28 143 days ago
Needs compilation? yes
License GPL-3
CRAN checks stringdist results

Downloads

Yesterday 2,423 0%
Last 7 days 12,332 -29%
Last 30 days 63,174 -11%
Last 90 days 192,689 +55%
Last 365 days 525,277 +17%

Maintainer

Mark van der Loo

mark.vanderloo@gmail.com

Authors

Mark van der Loo (aut, cre)
Jan van der Laan (ctb)
R Core Team (ctb)
Nick Logan (ctb)
Chris Muir (ctb)
Johannes Gruber (ctb)
Brian Ripley (ctb)

Material

README
NEWS
Reference manual
Package source

In Views

NaturalLanguageProcessing
OfficialStatistics

Vignettes

RJournal 6 111-122 (2014)
stringdist C/C++ API

macOS

r-release arm64
r-oldrel arm64
r-release x86_64
r-oldrel x86_64

Windows

r-devel x86_64
r-release x86_64
r-oldrel x86_64

Old Sources

stringdist archive

Depends

R ≥ 2.15.3

Imports

parallel

Suggests

tinytest

Reverse Depends

AurieLSHGaussian
blink
diathor
VDJgermlines

Reverse Imports

available
bdc
bdlp
bibliometrix
biblioverlap
BiocCheck
bioseq
CAGEr
campfin
ChromSCape
clickR
ClustIRR
clustringr
cols4all
concatipede
Conigrave
countries
CTDquerier
daqapo
deductive
DeepPINCS
discoverableresearch
dupree
fastLink
fcuk
fedmatch
flora
fossilbrush
fuzzyjoin
genBaRcode
geneHapR
GeneStructureTools
GetLattesData
immunarch
immuneSIM
labourR
levitate
LexisNexisTools
lingtypology
LinTInd
LymphoSeq
MEDseq
messy.cats
occupationMeasurement
ogrdbstats
openPrimeR
palaeoverse
PGRdup
qdap
r2dii.match
(List truncated; 67 packages in total.)

Reverse Suggests

embed
epiCleanr
GenProSeq
googleLanguageR
lavaanExtra
rlist
seqtrie
sjmisc
WorldFlora

Reverse LinkingTo

refinr