cyhunspell prevents us from upgrading past python 3.10 #983

Open
bhearsum opened this issue Jan 10, 2025 · 3 comments

@bhearsum
Collaborator

bhearsum commented Jan 10, 2025

Over in #700 I tried upgrading our docker images to Ubuntu 24.04, which includes Python 3.12. One of the issues I encountered was that cyhunspell fails to compile with errors such as:

[task 2025-01-10T17:32:25.823Z] hunspell/hunspell.cpp: In function ‘Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject*)’:
[task 2025-01-10T17:32:25.823Z] hunspell/hunspell.cpp:16201:47: error: ‘PyLongObject’ {aka ‘struct _longobject’} has no member named ‘ob_digit’
[task 2025-01-10T17:32:25.823Z] 16201 |     const digit* digits = ((PyLongObject*)b)->ob_digit;
[task 2025-01-10T17:32:25.823Z]       |                                               ^~~~~~~~

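This is because ob_digit is no longer a direct member of PyLongObject in newer Python versions (the integer layout changed in Python 3.12), and the Cython-generated hunspell.cpp still accesses it.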

kenlm had similar issues, which they fixed upstream. The cyhunspell package we're using has not been updated in 3 years and does not have a similar fix.

There doesn't seem to be an obvious drop-in replacement for this. One commenter suggests using spylls, a pure Python version of hunspell. https://github.com/cdhigh/chunspell may have fixed the issue, although it specifically says it has removed caching and batch functionality, which may or may not matter to us. We could also consider forking, of course, and applying the same fix that kenlm did, which may have little downside considering how unmaintained this ecosystem is.
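To try out candidates without committing to one, a small import shim could pick whichever backend is usable in a given image. This is only a sketch under assumptions: `hunspell` is the module name installed by cyhunspell, `spylls.hunspell` is the spylls module, and `first_importable` is a hypothetical helper name, not something in our codebase. It only checks that a module imports; the two libraries expose different APIs, so calling code would still need adapting.

```python
import importlib

# Candidate modules in order of preference (assumed names):
# `hunspell` is installed by cyhunspell, `spylls.hunspell` by spylls.
CANDIDATES = ("hunspell", "spylls.hunspell")


def first_importable(names=CANDIDATES):
    """Return the first module name in `names` that imports cleanly, else None."""
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing package and a compiled extension
            # that fails to load on this Python version.
            continue
        return name
    return None
```

Calling `first_importable()` in a Python 3.12 image would show which backend, if any, is actually available there.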

This is not a highly urgent issue, but eventually we'll need to get off of Python 3.10 when it and/or Ubuntu 22.04 are no longer supported.

@ZJaume
Collaborator

ZJaume commented Jan 13, 2025

We will take a look at this. chunspell will probably be the way to go. We do not use the batch functionality or caching.

@ZJaume
Collaborator

ZJaume commented Jan 13, 2025

It seems that chunspell is slower. Can you check whether this one compiles in your environment? It is the one from PR 38: pip install git+https://github.com/MartinHlavna/cython_hunspell/

@bhearsum
Collaborator Author

I've already abandoned my Ubuntu 24.04 upgrade attempt, so I'm not in a good position to test this at the moment :(.
