
python - Creating a new (more detailed) DataFrame based on an index DataFrame using Pandas

Reposted. Author: 行者123. Updated: 2023-12-01 03:27:35

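I apologize for the newbie question, but I'm having a hard time figuring out Pandas DataFrames. I have a DataFrame that looks something like this: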

df_index:
Product Title
100000 Sample main product
200000 Non-consecutive main sample

I have another DataFrame containing a more detailed product list, formatted like this:

df_details:
Product Title
100000 Sample main product
100000-Format-English Sample product details
100000-Format-Spanish Sample product details
100000-Format-French Sample product details
110000 Another sample main product
110000-Format-English Another sample details
110000-Format-Spanish Another sample details
120000 Yet another sample main product
120000-Format-English Yet another sample details
120000-Format-Spanish Yet another sample details
...
200000 Non-consecutive main sample
200000-Format-English Non-consecutive sample details
200000-Format-Spanish Non-consecutive sample details

I want to create a new DataFrame based on df_details, but only for the products that appear in df_index. Ideally, it would look like:

new_df:
Product Title
100000 Sample main product
100000-Format-English Sample product details
100000-Format-Spanish Sample product details
100000-Format-French Sample product details
200000 Non-consecutive main sample
200000-Format-English Non-consecutive sample details
200000-Format-Spanish Non-consecutive sample details

Here is what I have tried so far:

new_df = df_details[df_details['Product'][0:5] == df_index['Product'][0:5]]

This gives me an error:

ValueError: Can only compare identically-labeled Series objects

I have also tried

new_df = pd.merge(df_index, df_details, 
left_on=['Product'[0:5]], right_index=True, how='left')

This does give me a result set, but not the kind I want: it does not include the detail rows with the format information.

Best answer

You should be able to use .isin() like this:

new_df = df_details[df_details['Product'].isin(df_index['Product'])]

This builds a boolean mask that keeps only the rows of df_details whose Product value also appears in df_index.
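As a quick check, here is a minimal sketch of that mask on an abridged copy of the question's sample data. The frames are rebuilt by hand for illustration, and Product is assumed to hold strings in both.

```python
import pandas as pd

# Abridged copies of the question's frames, rebuilt by hand (illustrative only)
df_index = pd.DataFrame({
    "Product": ["100000", "200000"],
    "Title": ["Sample main product", "Non-consecutive main sample"],
})
df_details = pd.DataFrame({
    "Product": ["100000", "100000-Format-English", "110000",
                "200000", "200000-Format-Spanish"],
    "Title": ["Sample main product", "Sample product details",
              "Another sample main product",
              "Non-consecutive main sample", "Non-consecutive sample details"],
})

# True where the whole Product value is one of the values in df_index
mask = df_details["Product"].isin(df_index["Product"])
new_df = df_details[mask]
print(new_df)
```

On this data only the two main-product rows are kept, because isin compares whole values and "100000-Format-English" is not equal to "100000".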

EDIT: This only works if the two columns contain identical strings. To get around that, you can use str.contains():

import re

# create a pattern to look for
pat = '|'.join(map(re.escape, df_index['Product']))

# Create the mask
new_df = df_details[df_details['Product'].str.contains(pat)]

This works as long as the columns are stored as strings.
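The str.contains() approach can be sketched end to end as follows. The frames are abridged from the question, and the row "999999-Bundle-100000" is a hypothetical addition, not part of the original data, included only to show that str.contains() matches the code anywhere in the string. The anchored pattern is likewise an assumption about the desired behaviour (code at the start, followed by "-" or the end of the string), not something the answer specifies. The astype(str) calls are a precaution in case Product was read as integers.

```python
import re
import pandas as pd

df_index = pd.DataFrame({"Product": ["100000", "200000"]})
df_details = pd.DataFrame({
    "Product": [
        "100000", "100000-Format-English",
        "110000", "110000-Format-English",
        "200000", "200000-Format-Spanish",
        "999999-Bundle-100000",  # hypothetical row, not in the question
    ],
})

# Alternation of the escaped product codes, as in the answer
codes = df_index["Product"].astype(str)
pat = "|".join(map(re.escape, codes))
products = df_details["Product"].astype(str)

# The answer's mask: the code may appear anywhere in the string
loose = df_details[products.str.contains(pat)]

# Stricter variant (assumption): the code must start the string and be
# followed by "-" or the end of the string
anchored = rf"^(?:{pat})(?:-|$)"
strict = df_details[products.str.contains(anchored)]

print(loose)
print(strict)
```

If the product codes never occur elsewhere in the Product string, both masks select the same rows. The anchored version only matters when a code can also appear as a suffix or in the middle.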

Regarding "python - Creating a new (more detailed) DataFrame based on an index DataFrame using Pandas", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41288989/
