NLP Spacy model error #253

Open
jek6899 opened this issue Nov 11, 2024 · 3 comments

Comments

jek6899 commented Nov 11, 2024

Which NLP spaCy model is this? Where do I download it, and where should I put it?

⚠️ Transcription results already exist, skipping transcription step.
⏳ Loading NLP Spacy model: <en_core_web_md> ...
Downloading en_core_web_md model...
If download failed, please check your network and try again.
2024-11-11 15:14:05.316 Uncaught app exception
Traceback (most recent call last):
File "H:\VideoLingo\core\spacy_utils\load_nlp_model.py", line 22, in init_nlp
nlp = spacy.load(model)
File "H:\VideoLingo\venv\lib\site-packages\spacy\__init__.py", line 51, in load
return util.load_model(
File "H:\VideoLingo\venv\lib\site-packages\spacy\util.py", line 472, in load_model
raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_md'. It doesn't seem to be a Python package or a valid path to a data directory.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "H:\VideoLingo\venv\lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "C:\Users\jek\AppData\Local\Programs\Python\Python310\lib\socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11004] getaddrinfo failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen
response = self._make_request(
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connectionpool.py", line 490, in _make_request
raise new_e
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connection.py", line 206, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x0000024C572E9BA0>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "H:\VideoLingo\venv\lib\site-packages\requests\adapters.py", line 667, in send
resp = conn.urlopen(
File "H:\VideoLingo\venv\lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "H:\VideoLingo\venv\lib\site-packages\urllib3\util\retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /explosion/spacy-models/master/compatibility.json (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x0000024C572E9BA0>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "H:\VideoLingo\core\spacy_utils\load_nlp_model.py", line 26, in init_nlp
download(model)
File "H:\VideoLingo\venv\lib\site-packages\spacy\cli\download.py", line 85, in download
compatibility = get_compatibility()
File "H:\VideoLingo\venv\lib\site-packages\spacy\cli\download.py", line 130, in get_compatibility
r = requests.get(about.compatibility)
File "H:\VideoLingo\venv\lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "H:\VideoLingo\venv\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "H:\VideoLingo\venv\lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "H:\VideoLingo\venv\lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "H:\VideoLingo\venv\lib\site-packages\requests\adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /explosion/spacy-models/master/compatibility.json (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x0000024C572E9BA0>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "H:\VideoLingo\venv\lib\site-packages\streamlit\runtime\scriptrunner\exec_code.py", line 88, in exec_func_with_error_handling
result = func()
File "H:\VideoLingo\venv\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 590, in code_to_exec
exec(code, module.__dict__)
File "H:\VideoLingo\st.py", line 117, in <module>
main()
File "H:\VideoLingo\st.py", line 113, in main
text_processing_section()
File "H:\VideoLingo\st.py", line 30, in text_processing_section
process_text()
File "H:\VideoLingo\st.py", line 47, in process_text
step3_1_spacy_split.split_by_spacy()
File "H:\VideoLingo\core\step3_1_spacy_split.py", line 16, in split_by_spacy
nlp = init_nlp()
File "H:\VideoLingo\core\spacy_utils\load_nlp_model.py", line 29, in init_nlp
raise ValueError(f"❌ Failed to load NLP Spacy model: {model}")
ValueError: ❌ Failed to load NLP Spacy model: en_core_web_md
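
Reading the traceback from the bottom up, the chain starts with `spacy.load` not finding `en_core_web_md`, after which the automatic download fails because `raw.githubusercontent.com` cannot be resolved (`getaddrinfo failed`). A minimal sketch for checking name resolution from the same machine is shown here; the helper name `can_resolve` is hypothetical and not part of VideoLingo or spaCy, and it only tests DNS, not whether the download itself would succeed.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the host name resolves to at least one address."""
    try:
        # Same standard-library call that fails at the bottom of the traceback.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Raised when DNS lookup fails, e.g. [Errno 11004] on Windows.
        return False


if __name__ == "__main__":
    host = "raw.githubusercontent.com"
    print(f"{host} resolves: {can_resolve(host)}")
```

If this prints `False`, the automatic download will likely keep failing until DNS, proxy, or network settings are fixed, and installing the model manually is the alternative.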

@ziziran97

Same problem here.

ysxk commented Nov 15, 2024

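Tested and confirmed: download en_core_web_md
and manually copy it into the site-packages folder under the virtual environment's lib folder; that resolves the error.
You can find the virtual environment's exact path with the conda info command.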
https://www.123684.com/s/nApcVv-paZ4H
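
To find the target folder without guessing, a small sketch like the following can be run with the same Python interpreter that VideoLingo uses (in the traceback above that environment lives under `H:\VideoLingo\venv`). The function names are hypothetical helpers, and the check only confirms that the package is importable, not that its version matches the installed spaCy.

```python
import importlib.util
import sysconfig


def site_packages_dir() -> str:
    """Return the site-packages path of the interpreter running this script."""
    return sysconfig.get_paths()["purelib"]


def model_is_installed(name: str = "en_core_web_md") -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Copy the model folder into:", site_packages_dir())
    print("en_core_web_md importable:", model_is_installed())
```

After copying the model, running the script again should report the package as importable; if it does not, the folder probably went into a different environment.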

ysxk commented Nov 15, 2024

When packaging the files, I accidentally included an extra en_core_web_md-3.7.1.zip.
You can delete it.
