udpipe

Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

About

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Besides text parsing, the package also lets you train annotation models on 'treebank' data in 'CoNLL-U' format. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at doi:10.18653/v1/K17-3009. The toolkit also provides functions for common manipulations of texts that have been enriched with the parser output: collocations, token co-occurrence, document-term matrix handling, term frequency-inverse document frequency (TF-IDF) calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
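To make the 'CoNLL-U' format mentioned above more concrete, here is a minimal, language-neutral sketch in Python of how such a file is laid out: one token per line, ten tab-separated columns, comment lines starting with "#", and a blank line between sentences. This is not the udpipe API. The function name parse_conllu, the field names in CONLLU_FIELDS and the sample sentence are assumptions made up for illustration, and the sketch does not treat multi-word token ranges or empty nodes specially.

```python
from typing import Dict, List

# The ten tab-separated columns of a CoNLL-U token line.
# The lowercase key names are an illustrative choice, not an official API.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]

# Hypothetical sample: one hand-written annotated sentence.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def parse_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        # Handle input that does not end with a blank line.
        sentences.append(current)
    return sentences


sentences = parse_conllu(SAMPLE)
lemmas = [token["lemma"] for token in sentences[0]]
```

Each token dictionary then carries the surface form, lemma, part-of-speech tag and dependency relation side by side, which is the kind of per-token table that the downstream analyses listed above (co-occurrence, keyword detection and so on) operate on.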

bnosac.github.io/udpipe/en/index.html
github.com/bnosac/udpipe

Key Metrics

Version 0.8.11
R ≥ 2.10
Published 2023-01-06
Needs compilation? yes
License MPL-2.0
CRAN checks udpipe results

Downloads

Yesterday 171 (0%)
Last 7 days 943 (-30%)
Last 30 days 4,804 (+4%)
Last 90 days 15,407 (-11%)
Last 365 days 56,716 (+5%)

Maintainer

Jan Wijffels

jwijffels@bnosac.be

Authors

Jan Wijffels

aut / cre / cph

BNOSAC

cph

Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic

cph

Milan Straka

ctb / cph

Jana Straková

ctb / cph

Material

README
NEWS
Reference manual
Package source

In Views

NaturalLanguageProcessing

Vignettes

UDPipe Natural Language Processing - Annotating text
UDPipe Natural Language Processing - Parallel
UDPipe Natural Language Processing - Model Building
UDPipe Natural Language Processing - Try it out
UDPipe Natural Language Processing - Universe
UDPipe Natural Language Processing - Basic Analytical Use Cases
UDPipe Natural Language Processing - Topic Modelling Use Cases

macOS

r-release arm64
r-oldrel arm64
r-release x86_64
r-oldrel x86_64

Windows

r-devel x86_64
r-release x86_64
r-oldrel x86_64

Old Sources

udpipe archive

Depends

R ≥ 2.10

Imports

Rcpp ≥ 0.11.5
data.table ≥ 1.9.6
Matrix
methods
stats

Suggests

knitr
rmarkdown
topicmodels
lattice
parallel

LinkingTo

Rcpp

Reverse Imports

CEDARS
cleanNLP
corpustools
TextForecast

Reverse Suggests

BTM
crfsuite
doc2vec
nametagger
ruimtehol
text2vec
textplot
textrank
textrecipes
topicmodels.etm
word2vec

Reverse Enhances

NLP