Issues: huggingface/tokenizers

Training a model from in-memory data
#198 by loicbarrault was closed Nov 28, 2020

Issues list

Serializing k-mer style pre-tokenizer
#1654 opened Oct 15, 2024 by millanp95
Inconsistent behaviour of PreTrainedTokenizerFasts on diacritics marked texts (label: bug)
#1663 opened Oct 11, 2024 by sven-nm
2 of 4 tasks
NormalizedString.clear() broken? (label: bug)
#1636 opened Sep 25, 2024 by lkurlandski
.NET bindings
#1615 opened Aug 16, 2024 by sappho192
RefMutContainer is unsound
#1612 opened Aug 13, 2024 by CheaterCodes
[test-infra] Enable Codecov for tokenizers
#1611 opened Aug 12, 2024 by hvaara