CRAN/E | wordpiece

wordpiece

R Implementation of Wordpiece Tokenization

About

Apply 'Wordpiece' tokenization to input text, given an appropriate vocabulary. The 'BERT' tokenization conventions are used by default.
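
To make the description above concrete, here is a minimal, language-independent sketch of greedy longest-match-first Wordpiece splitting for a single word, written in Python. It is not the package's R code. The function name, the `max_chars` parameter, and the toy vocabulary are assumptions made for illustration; the `##` continuation prefix and the `[UNK]` token follow the usual BERT conventions.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", max_chars=100):
    """Split one word into vocabulary pieces, longest match first.

    Hypothetical helper for illustration only; not the wordpiece package API.
    """
    # Very long words are mapped straight to the unknown token.
    if len(word) > max_chars:
        return [unk_token]

    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry the "##" continuation prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        # If no prefix of the remainder matches, the whole word is unknown.
        if match is None:
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (an assumption; real vocabularies hold tens of thousands of entries).
toy_vocab = {"un", "##aff", "##able", "[UNK]"}
tokens = wordpiece_tokenize("unaffable", toy_vocab)
print(tokens)  # ['un', '##aff', '##able']
```

A full tokenizer would first normalize the text and split it on whitespace and punctuation, then apply this step to each word and map the pieces to their vocabulary indices.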

github.com/macmillancontentscience/wordpiece

Key Metrics

Version: 2.1.3
R: ≥ 3.3.0
Published: 2022-03-03 (795 days ago)
Needs compilation? no
License: Apache License (≥ 2)
CRAN checks: wordpiece results

Downloads

Yesterday: 9 (0%)
Last 7 days: 65 (-22%)
Last 30 days: 234 (+2%)
Last 90 days: 660 (-22%)
Last 365 days: 2,519 (-12%)

Maintainer

Jonathan Bratt <jonathan.bratt@macmillan.com>

Authors

Jonathan Bratt (aut, cre)
Jon Harmon (aut)
Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning (cph)

Material

README
NEWS
Reference manual
Package source

Vignettes

Using wordpiece

macOS

r-release (arm64)
r-oldrel (arm64)
r-release (x86_64)
r-oldrel (x86_64)

Windows

r-devel (x86_64)
r-release (x86_64)
r-oldrel (x86_64)

Old Sources

wordpiece archive

Depends

R ≥ 3.3.0

Imports

dlr ≥ 1.0.0
fastmatch ≥ 1.1
memoise ≥ 2.0.0
piecemaker ≥ 1.0.0
rlang
stringi ≥ 1.0
wordpiece.data ≥ 1.0.2

Suggests

covr
knitr
rmarkdown
testthat ≥ 3.0.0

Reverse Suggests

textrecipes