
python - Pandas DataFrame: check that a string is not already in a column before appending

Reposted. Author: 行者123. Updated: 2023-12-01 09:24:58

I have a pandas DataFrame with a column called artist. I want to append a new row only if the new artist's name is not already in that column.

I tried this, but it did not work:

if (all_data != name.all(axis = 0)):
    all_data = all_data.append({'artist':str(name), 'netWorth':str(worth.strip())}, ignore_index = True)

Here is all of my code:

import bs4
import pandas as pd
import requests

def get_webpage(i, url):
    # Download page number i of the listing and parse it
    URL = url+str(i)
    response = requests.get(URL)
    return bs4.BeautifulSoup(response.text, 'html.parser')

COLUMNS = ['artist', 'netWorth']
all_data = pd.DataFrame(columns = COLUMNS)

def scrape(soup):
    global all_data
    artists = soup.find_all('article', class_ = 'thumb-wrap')
    for ar in artists:
        name = ar.h3.a.text
        worth = ar.div.find('div', class_='bc-networth').text
        # This is the check in question: it is meant to skip artists already stored
        if (all_data['artist'] != name).any():
            all_data = all_data.append({'artist':str(name), 'netWorth':str(worth.strip())}, ignore_index = True)

i = 1
url = 'http://www.therichest.com/celebnetworth-category/celeb/singer/page/'
while (i<=14):
    soup = get_webpage(i, url)
    i = i+1
    data = scrape(soup)
i = 1
url = 'http://www.therichest.com/celebnetworth-category/celeb/musician/page/'
while (i<=7):
    soup = get_webpage(i, url)
    i = i+1
    data = scrape(soup)

Best answer

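I think you only need to check the single artist column, and require that every value in it differs from the new name: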

if (all_data['artist'] != str(name)).all():
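To see why `.all()` rather than the question's `.any()` is the reduction wanted here, it may help to compare the two on a small frame and on an empty one. This is only a sketch: the frames `known` and `empty` and the result variable names are made up for illustration and do not appear in the original post.

```python
import pandas as pd

# Hypothetical data for illustration only.
known = pd.DataFrame({'artist': ['a', 'b'], 'netWorth': [5, 3]})
empty = pd.DataFrame(columns=['artist', 'netWorth'])

name = 'a'  # already present in `known`

# .any() is True as soon as ONE row differs from `name`,
# so it reports "missing" even though 'a' is in the column.
any_says_missing = bool((known['artist'] != name).any())

# .all() is True only when EVERY row differs from `name`,
# i.e. the name really is absent from the column.
all_says_missing = bool((known['artist'] != name).all())

# On an empty column .any() is False and .all() is True, so a check
# based on .any() never appends the first row to an empty frame.
any_on_empty = bool((empty['artist'] != name).any())
all_on_empty = bool((empty['artist'] != name).all())

print(any_says_missing, all_says_missing, any_on_empty, all_on_empty)
```

Under these assumptions the `.any()` version misbehaves in both directions: it would let duplicates through once the frame has other rows, and it would never add anything to a frame that starts out empty, as `all_data` does in the question.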

Example:

all_data = pd.DataFrame({'netWorth':[5,3],
                         'artist':list('ab')})

print (all_data)
   netWorth artist
0         5      a
1         3      b

name = 'a'
b = 10

if (all_data['artist'] != str(name)).all():
    all_data = all_data.append({'artist':str(name), 'netWorth':b }, ignore_index = True)

print (all_data)
   netWorth artist
0         5      a
1         3      b

name = 'd'
b = 10

if (all_data['artist'] != name).all():
    all_data = all_data.append({'artist':str(name), 'netWorth':b }, ignore_index = True)

print (all_data)
   netWorth artist
0         5      a
1         3      b
2        10      d
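Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the example above only runs on older versions. A minimal sketch of the same check using `pd.concat` is shown below. The helper name `append_if_new` is hypothetical and not part of the original answer, and `Series.isin` is used here as one of several equivalent ways to test membership.

```python
import pandas as pd

def append_if_new(df, name, worth):
    """Return df with a row added only if `name` is not in df['artist'].

    Hypothetical helper for illustration; not from the original answer.
    """
    if df['artist'].isin([name]).any():
        return df  # name already present, nothing to add
    new_row = pd.DataFrame([{'artist': name, 'netWorth': worth}])
    # pd.concat replaces the removed DataFrame.append
    return pd.concat([df, new_row], ignore_index=True)

all_data = pd.DataFrame({'netWorth': [5, 3], 'artist': list('ab')})
all_data = append_if_new(all_data, 'a', 10)  # 'a' exists: unchanged
all_data = append_if_new(all_data, 'd', 10)  # 'd' is new: appended
print(all_data)
```

For a scraping loop like the one in the question, it is probably cheaper to collect the rows in a plain list (tracking seen names in a set) and build the DataFrame once at the end, since every `pd.concat` call copies the whole frame.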

Regarding "python - Pandas DataFrame: check that a string is not already in a column before appending", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50507549/
