
Python - Find the top 5 rows in a DataFrame that contain a word

Reposted. Author: 行者123. Updated: 2023-12-01 02:07:11

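I am trying to write a function that takes a product list and, for the products whose names contain a word from a word list, prints the top 5 products with their prices and the bottom 5 products with their prices. This is what I tried: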

def wordlist_top_costs(filename, wordlist):
    xlsfile = pd.ExcelFile(filename)
    dframe = xlsfile.parse('Sheet1')
    dframe['Product'].fillna('', inplace=True)
    dframe['Price'].fillna(0, inplace=True)
    price = {}
    for word in wordlist:
        mask = dframe.Product.str.contains(word, case=False, na=False)
        price[mask] = dframe.loc[mask, 'Price']

    top = sorted(Score.items(), key=operator.itemgetter(1), reverse=True)
    print("Top 10 product prices for: ", wordlist.name)
    for i in range(0, 5):
        print(top[i][0], " | ", t[i][1])

    bottom = sorted(Score.items(), key=operator.itemgetter(1), reverse=False)
    print("Bottom 10 product prices for: ", wordlist.name)
    for i in range(0, 5):
        print(top[i][0], " | ", t[i][1])

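However, the function above throws an error at the line price[mask] = dframe.loc[mask, 'Price'], which says: TypeError: 'Series' objects are mutable, thus they cannot be hashed. Any help correcting or modifying this is appreciated. Thanks!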

Edit: an example. Word list: alu, co, vin

Product | Price

  • Aluminium Crown - 22.20
  • Coca Cola - 1.0
  • Brass Box - 28.75
  • Vincent Kettle - 12.00
  • Vinyl Stickers - 0.50
  • Doritos - 2.0
  • Colin's Hair Oil - 5.0
  • Vincent Chase Sunglasses - 75.40
  • American Tourister - $120.90

Output:

Top 3 product prices:

Vincent Chase Sunglasses - 75.40
Aluminium Crown - 22.20
Vincent Kettle - 12.0

Bottom 3 product prices:

Vinyl Stickers - 0.50
Coca Cola - 1.0
Colin's Hair Oil - 5.0
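The TypeError reported in the question comes from using `mask`, a boolean Series, as a dictionary key: dict keys must be hashable, and a pandas Series is not. As a minimal sketch of one way around that while keeping the per-word loop, the dict can be keyed by the word itself. The sample frame and the name `matches_by_word` are assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Sample data assumed for illustration (a subset of the question's example).
dframe = pd.DataFrame({
    'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box', 'Vincent Kettle'],
    'Price': [22.20, 1.0, 28.75, 12.00],
})
wordlist = ['alu', 'co', 'vin']

# A boolean Series cannot be a dict key: hashing it raises TypeError.
mask = dframe['Product'].str.contains('alu', case=False, na=False)
try:
    {}[mask] = 1
    hash_failed = False
except TypeError:
    hash_failed = True

# Key by the word (a hashable str) and store the matching rows as the value.
matches_by_word = {}
for word in wordlist:
    mask = dframe['Product'].str.contains(word, case=False, na=False)
    matches_by_word[word] = dframe.loc[mask, ['Product', 'Price']]

for word, rows in matches_by_word.items():
    print(word, '->', rows['Product'].tolist())
```

This keeps one result per word. If a single combined ranking across all words is wanted, the matches have to be merged into one frame before sorting.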

Best answer

You can use nlargest and nsmallest:

# remove the literal '$' and convert the Price column to floats
# (regex=False so that '$' is not read as the end-of-string anchor)
dframe['Price'] = dframe['Price'].str.replace('$', '', regex=False).astype(float)

# filter by regex: join all values of the list with |
wordlist = ['alu', 'co', 'vin']
pat = '|'.join(wordlist)
mask = dframe.Product.str.contains(pat, case=False, na=False)
dframe = dframe.loc[mask, ['Product', 'Price']]

top = dframe.nlargest(3, 'Price')
# top = dframe.sort_values('Price', ascending=False).head(3)
print(top)
                    Product  Price
7  Vincent Chase Sunglasses   75.4
0           Aluminium Crown   22.2
3            Vincent Kettle   12.0

bottom = dframe.nsmallest(3, 'Price')
# bottom = dframe.sort_values('Price').head(3)
print(bottom)
            Product  Price
4    Vinyl Stickers    0.5
1         Coca Cola    1.0
6  Colin's Hair Oil    5.0
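To connect the answer back to the function the question asked for, the filtering and ranking steps can be wrapped as below. This is only a sketch: the name `wordlist_top_bottom`, the parameter `n`, and the use of `re.escape` (so that words containing regex metacharacters are matched literally) are assumptions added here, and the sample frame assumes Price is already numeric.

```python
import re
import pandas as pd

def wordlist_top_bottom(dframe, wordlist, n=5):
    """Return (top, bottom): the n highest and n lowest priced matching rows."""
    # Escape each word so characters such as '.' or '+' are matched literally,
    # then join the words with | to match any of them.
    pat = '|'.join(re.escape(word) for word in wordlist)
    mask = dframe['Product'].str.contains(pat, case=False, na=False)
    matched = dframe.loc[mask, ['Product', 'Price']]
    return matched.nlargest(n, 'Price'), matched.nsmallest(n, 'Price')

# Sample data assumed for illustration, with Price already converted to float.
sample = pd.DataFrame({
    'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box', 'Vincent Kettle',
                'Vinyl Stickers', 'Doritos', "Colin's Hair Oil",
                'Vincent Chase Sunglasses', 'American Tourister'],
    'Price': [22.20, 1.0, 28.75, 12.00, 0.50, 2.0, 5.0, 75.40, 120.90],
})

top, bottom = wordlist_top_bottom(sample, ['alu', 'co', 'vin'], n=3)
print(top)
print(bottom)
```

Returning the two frames instead of printing inside the function leaves the caller free to print, export, or post-process them.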

Setup:

import pandas as pd

dframe = pd.DataFrame(
    {'Price': ['22.20', '1.0', '28.75', '12.00', '0.50', '2.0', '5.0',
               '75.40', '$120.90'],
     'Product': ['Aluminium Crown', 'Coca Cola', 'Brass Box', 'Vincent Kettle',
                 'Vinyl Stickers', 'Doritos', "Colin's Hair Oil",
                 'Vincent Chase Sunglasses', 'American Tourister']},
    columns=['Product', 'Price'])
print(dframe)
                    Product    Price
0           Aluminium Crown    22.20
1                 Coca Cola      1.0
2                 Brass Box    28.75
3            Vincent Kettle    12.00
4            Vinyl Stickers     0.50
5                   Doritos      2.0
6          Colin's Hair Oil      5.0
7  Vincent Chase Sunglasses    75.40
8        American Tourister  $120.90
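The setup stores Price as strings, one of them carrying a '$' prefix, which is why the answer converts the column before ranking. If real data might also hold values that cannot be parsed at all, a slightly more forgiving conversion is possible with `pd.to_numeric`. The sketch below is an assumption about such data (the 'n/a' and missing entries are invented for illustration), not something the original question shows.

```python
import pandas as pd

# Hypothetical raw prices: plain numbers, a '$' prefix, and unparseable entries.
raw = pd.Series(['22.20', '1.0', '$120.90', 'n/a', None], name='Price')

# Strip the literal '$' (regex=False), then turn anything unparseable into NaN
# instead of raising, which astype(float) would do on 'n/a'.
cleaned = pd.to_numeric(raw.str.replace('$', '', regex=False), errors='coerce')
print(cleaned)
```

Rows left as NaN are skipped by nlargest and nsmallest when ranking, so they do not need to be filled with 0 first.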

Regarding "Python - Find the top 5 rows in a DataFrame that contain a word", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48943297/
