
python - Importing StanfordNER Tagger in Google Colab

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:58:46

I am having trouble importing the StanfordNER Tagger to use for NER. Here is my code (parts of it are taken from other posts here):

import os
def install_java():
    !apt-get install -y openjdk-8-jdk-headless -qq > /dev/null
    os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
    !java -version
install_java()

!pip install StanfordCoreNLP
from stanfordcorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('stanford-corenlp', lang='en', memory='4g')

The error I get points at the last line of code and says:

OSError: stanford-corenlp is not a directory.

Any help would be great!

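Edit: here is an alternative that worked for me. Upload the files that StanfordNERTagger takes as arguments (the classifier model and stanford-ner.jar) to Colab and pass their paths. The same approach applies to the original problem above: give it the path to files that actually exist in the Colab runtime.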

from nltk.tag import StanfordNERTagger
from nltk.tokenize import word_tokenize

st = StanfordNERTagger('/content/english.muc.7class.distsim.crf.ser.gz',
                       '/content/stanford-ner.jar',
                       encoding='utf-8')

text = 'While in France, Christine Lagarde discussed short-term stimulus efforts in a recent interview with the Wall Street Journal.'

tokenized_text = word_tokenize(text)
classified_text = st.tag(tokenized_text)

print(classified_text)
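
The OSError in the question and the workaround above both hinge on paths that must exist inside the Colab runtime. As a minimal sketch (the helper name `find_missing` is hypothetical and not part of NLTK or stanfordcorenlp), one could check the paths before constructing the tagger:

```python
import os

def find_missing(paths):
    """Return the subset of paths that do not exist on disk (hypothetical helper)."""
    return [p for p in paths if not os.path.exists(p)]

# Paths copied from the snippet above; adjust them to wherever the files were uploaded.
required = [
    '/content/english.muc.7class.distsim.crf.ser.gz',
    '/content/stanford-ner.jar',
]

missing = find_missing(required)
if missing:
    print('Missing files:', missing)
else:
    print('All files found; the tagger can be constructed.')
```

The same check would flag the relative path 'stanford-corenlp' from the original code if no directory with that name exists in the current working directory.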

Best answer

The following code downloads all the required files and sets up the environment:

from nltk.tag.stanford import StanfordNERTagger
from nltk.tokenize import word_tokenize
import nltk

!wget 'https://nlp.stanford.edu/software/stanford-ner-2018-10-16.zip'
!unzip stanford-ner-2018-10-16.zip

nltk.download('punkt')

st = StanfordNERTagger('/content/stanford-ner-2018-10-16/classifiers/english.all.3class.distsim.crf.ser.gz',
                       '/content/stanford-ner-2018-10-16/stanford-ner.jar',
                       encoding='utf-8')

text = 'While in France, Christine Lagarde discussed short-term stimulus efforts in a recent interview with the Wall Street Journal.'

tokenized_text = word_tokenize(text)
classified_text = st.tag(tokenized_text)

# Display the (token, tag) pairs, as in the question's snippet
print(classified_text)
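
`st.tag` returns a list of (token, tag) pairs, so a multi-word name such as "Christine Lagarde" comes back as two separate tokens. A minimal sketch of merging adjacent tokens with the same tag follows; the helper `group_entities` is hypothetical, and the sample list is hand-written in the same shape rather than real tagger output:

```python
from itertools import groupby

def group_entities(tagged_tokens, outside_tag='O'):
    """Merge runs of adjacent tokens that share the same non-'O' tag."""
    entities = []
    for tag, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag != outside_tag:
            entities.append((' '.join(token for token, _ in run), tag))
    return entities

# Hand-written sample in the (token, tag) shape; not actual tagger output.
sample = [('While', 'O'), ('in', 'O'), ('France', 'LOCATION'), (',', 'O'),
          ('Christine', 'PERSON'), ('Lagarde', 'PERSON'), ('discussed', 'O')]

print(group_entities(sample))
```

One limitation of this approach: two distinct entities of the same type that sit directly next to each other would be merged into one.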

Regarding python - Importing StanfordNER Tagger in Google Colab, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55129698/
