
python - Pandas.apply returns None values on a spaCy Doc column

Reposted. Author: 行者123. Updated: 2023-12-01 08:22:22

I am running the following on my pandas DataFrame 'sp500news3', and it returns None values:

def extract_ticker(title):
    for word in title:
        if word in constituents['Symbol']:
            return word

sp500news3['tickers'] = sp500news3['title'].apply(extract_ticker)

# sp500news3 sample:

   index   date_publish         title                                                  tickers
0  79944   2007-01-29 19:08:35  (MSFT, Vista, corporate, sales, go, very, well)        None
1  181781  2007-12-14 19:39:06  (WMB, No, Anglican, consensus, on, Episcopal, Church)  None
2  213175  2008-01-22 11:17:19  (CSX, quarterly, profit, rises)                        None
3  93554   2008-01-22 18:52:56  (C, says, 30, bln, capital, helps, exceed, target)     None

constituents['Symbol'] sample:

0     TWX
1       C
2    MSFT
3     WMB
...

To reproduce the column of spaCy Docs, run the following:

import pandas as pd
import spacy

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})

sp500news3 = pd.DataFrame({"title": [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]})

nlp = spacy.load('en_core_web_sm')

# Convert each title string into a spaCy Doc
sp500news3['title'] = sp500news3['title'].apply(nlp)

Best answer

You have to use word.text, because iterating over a spacy.tokens.doc.Doc yields Token objects, and Token doesn't implement __eq__ for strings:

for word in title:
    if word.text in constituents['Symbol'].values:
        return word
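
The `.values` in the corrected condition matters as much as `word.text`: the `in` operator on a pandas Series tests the index labels, not the stored values. Here is a minimal sketch of that behaviour, reusing the `constituents` frame from the question. The names `symbols`, `in_index`, `label_in_index`, `in_values` and `in_set` are illustrative only.

```python
import pandas as pd

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
symbols = constituents["Symbol"]

# `in` on a Series checks the index (0, 1, 2, 3 here), not the values.
in_index = "MSFT" in symbols        # False: "MSFT" is not an index label
label_in_index = 2 in symbols       # True: 2 is an index label

# Checking against the underlying array (or a set) tests the values.
in_values = "MSFT" in symbols.values
in_set = "MSFT" in set(symbols)

print(in_index, label_in_index, bool(in_values), in_set)
```

For repeated lookups, a set built once from the column is usually the cheaper choice, since each membership test on the array scans all the symbols.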

With your example:

In [11]: sp500news3['title'].apply(extract_ticker)
Out[11]:
0 MSFT
1 WMB
2 None
3 C
4 TWX
Name: title, dtype: object
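
Putting both fixes together, the following is one possible end-to-end sketch, not the answerer's exact code. It makes three assumptions: it uses `spacy.blank("en")` (a tokenizer-only pipeline that needs no model download) in place of `en_core_web_sm`, it builds a hypothetical `symbols` set for the lookup, and it returns `word.text` so the new column holds plain strings rather than Token objects.

```python
import pandas as pd
import spacy

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
sp500news3 = pd.DataFrame({"title": [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]})

# Assumption: a blank English pipeline (tokenizer only) stands in for
# en_core_web_sm, which is enough for whitespace-level token matching.
nlp = spacy.blank("en")
sp500news3["title"] = sp500news3["title"].apply(nlp)

# Hypothetical helper: a set of ticker strings for fast value lookup.
symbols = set(constituents["Symbol"])

def extract_ticker(title):
    """Return the text of the first token that is a known symbol, else None."""
    for word in title:
        # Compare the token's string form, not the Token object itself.
        if word.text in symbols:
            return word.text
    return None

sp500news3["tickers"] = sp500news3["title"].apply(extract_ticker)
print(sp500news3["tickers"])
```

Returning the string instead of the Token keeps the `tickers` column independent of the spaCy objects, which tends to make later joins against `constituents` simpler.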

Regarding "python - Pandas.apply returns None values on a spaCy Doc column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54541204/
