Option to disable cache for FromPretrained and FromFile #1680

Open
daulet opened this issue Nov 12, 2024 · 2 comments

Comments

@daulet

daulet commented Nov 12, 2024

Related to this.

Currently, the cache configuration for BPE tokenizers is set either via BpeBuilder or via the config in tokenizer.json for pretrained tokenizers. As a result, there is no API to disable caching for tokenizers loaded from a file. This is a request to add one.

@ArthurZucker
Collaborator

I am not sure when you'd want to disable caching, but let's see if the community asks for this!

@daulet
Author

daulet commented Nov 15, 2024

Because the cache grows uncontrollably unless one actively calls clear_cache. At the very least, caching should not be the default behavior (it's on by default for configs that don't ever mention the cache). If one wants to load a pretrained tokenizer, there is no option to disable it short of manually modifying the tokenizer config file.
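
To make the concern concrete, here is a minimal toy sketch in Python. It is not the library's implementation: `ToyWordCache`, its `capacity` parameter, and `split_chars` are hypothetical names made up for illustration. It only shows how a word-level memo grows with every distinct input unless it is cleared, and how a capacity of 0 could serve as an "off" switch.

```python
from typing import Callable, Dict, List, Optional


class ToyWordCache:
    """Toy word -> tokens memo. Hypothetical; NOT the tokenizers implementation."""

    def __init__(self, capacity: Optional[int] = None) -> None:
        # capacity=None: unbounded; capacity=0: caching disabled;
        # capacity=N: stop inserting once N entries are stored.
        self.capacity = capacity
        self._entries: Dict[str, List[str]] = {}

    def get_or_compute(
        self, word: str, tokenize: Callable[[str], List[str]]
    ) -> List[str]:
        if self.capacity == 0:
            # Disabled: always recompute, never store anything.
            return tokenize(word)
        hit = self._entries.get(word)
        if hit is not None:
            return hit
        tokens = tokenize(word)
        if self.capacity is None or len(self._entries) < self.capacity:
            self._entries[word] = tokens
        return tokens

    def clear(self) -> None:
        # Plays the role of an explicit clear_cache call.
        self._entries.clear()

    def __len__(self) -> int:
        return len(self._entries)


def split_chars(word: str) -> List[str]:
    # Stand-in for a real BPE merge loop (assumption for the demo).
    return list(word)


def fill(cache: ToyWordCache, n: int) -> int:
    # Feed n distinct words through the cache and report its size.
    for i in range(n):
        cache.get_or_compute(f"word{i}", split_chars)
    return len(cache)
```

With these toy definitions, `fill(ToyWordCache(), n)` stores one entry per distinct word until `clear()` is called, while `fill(ToyWordCache(capacity=0), n)` stores nothing. A loader-level option along these lines is what this issue is asking for; the exact shape of such an option is left open here.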


2 participants