
python - Replace substrings with substrings from another column in pandas

Reposted. Author: 行者123. Updated: 2023-12-02 02:04:24

I have a dataframe that contains template strings and, for each one, a string of variables to substitute into it. For example, given:

template,variable
"{color} shirt in {size}", "blue,medium"
"{capacity} bottle in {color}", "24oz,teal"
"{megapixel}mp camera", "24.1"

I would like to produce the following:

"blue shirt in medium"
"24oz bottle in teal"
"24.1mp camera"

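The number of template placeholders in the first column is guaranteed to equal the number of variables in the string in the second column. The string format is consistent with the examples above.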

My first idea was to use extractall to create multi-indexed dataframes and then join them:

templates = df['template'].str.extractall(r'(\{\w+\})')
variables = df['variable'].str.extractall(r'(\w+)')
multi_df = templates.join(variables, how='inner')

But I'm not sure where to go from there. Or is there a simpler way?

Best answer

Use string.Formatter to extract the variable names from the template column, then build a dictionary that can be used for the substitution.

>>> df
                       template        value  # I modified your column name
0       {color} shirt in {size}  blue,medium
1  {capacity} bottle in {color}    24oz,teal
2          {megapixel}mp camera         24.1

from string import Formatter

def extract_vars(s):
    return tuple(fn for _, fn, _, _ in Formatter().parse(s) if fn is not None)

df['variable'] = df['template'].apply(extract_vars)
df['value'] = df['value'].str.split(',')
df['combined'] = df.apply(lambda x: dict(zip(x['variable'], x['value'])), axis=1)

At this point, your dataframe looks like this:

                       template           value           variable                               combined
0       {color} shirt in {size}  [blue, medium]      [color, size]    {'color': 'blue', 'size': 'medium'}
1  {capacity} bottle in {color}    [24oz, teal]  [capacity, color]  {'capacity': '24oz', 'color': 'teal'}
2          {megapixel}mp camera          [24.1]        [megapixel]                  {'megapixel': '24.1'}
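
To make the variable and combined columns less opaque, here is a small sketch of what Formatter().parse yields for a single template, without pandas. The sample template and value strings come from the question; the names parsed, fields and mapping are only illustrative.

```python
from string import Formatter

def extract_vars(s):
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    # tuples. field_name is None for trailing literal text with no placeholder,
    # so those entries are filtered out.
    return tuple(fn for _, fn, _, _ in Formatter().parse(s) if fn is not None)

template = "{color} shirt in {size}"

# Raw parse result, kept only for inspection.
parsed = list(Formatter().parse(template))

# Placeholder names in the order they appear in the template.
fields = extract_vars(template)

# Pair each placeholder with the value at the same position.
mapping = dict(zip(fields, "blue,medium".split(",")))

print(parsed)
print(fields)
print(mapping)
```

Because the pairing is purely positional, it relies on the guarantee stated in the question that the values are listed in the same order as the placeholders.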

Finally, evaluate your strings:

>>> df.apply(lambda x: x['template'].format(**x['combined']), axis=1)
0    blue shirt in medium
1     24oz bottle in teal
2           24.1mp camera
dtype: object
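
For reference, the same steps can be collapsed into one self-contained script that skips the intermediate columns. This is a sketch under assumptions rather than the answer's exact code: fill_template is a hypothetical helper name, and the whitespace stripping and length check are additions not present in the original.

```python
from string import Formatter

import pandas as pd

df = pd.DataFrame({
    "template": [
        "{color} shirt in {size}",
        "{capacity} bottle in {color}",
        "{megapixel}mp camera",
    ],
    "value": ["blue,medium", "24oz,teal", "24.1"],
})

def fill_template(template, value):
    # Hypothetical helper: collect placeholder names in order of appearance.
    fields = [fn for _, fn, _, _ in Formatter().parse(template) if fn is not None]
    # Assumption: values are comma-separated; stray whitespace is stripped.
    values = [v.strip() for v in value.split(",")]
    # Assumption: fail loudly if the counts differ instead of silently truncating.
    if len(fields) != len(values):
        raise ValueError(f"{len(fields)} placeholders but {len(values)} values")
    return template.format(**dict(zip(fields, values)))

# Build the result row by row, keeping the original index.
result = pd.Series(
    [fill_template(t, v) for t, v in zip(df["template"], df["value"])],
    index=df.index,
)
print(result)
```

Iterating over the two columns with zip avoids df.apply(axis=1), which builds a Series for every row; for small frames the difference is unlikely to matter.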

Regarding "python - Replace substrings with substrings from another column in pandas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/68642354/
