
Python Pandas: partially matching a list of strings in a DataFrame column

Reposted. Author: 行者123. Updated: 2023-12-02 02:45:49

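Hi everyone, I'm trying to match partial strings in a DataFrame column against a list and return the matched strings (matching is case-sensitive). I don't have a strong programming background and have only just started learning.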

import pandas as pd

# List of state abbreviations
state_abbrv = ["AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA",
"ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ","NM","NY","NC","ND","OH","OK",
"OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY"]

# Create the DataFrame
d = {"Index": [1, 2, 3, 4, 5, 6, 7], "Description": ["ABNY", "MANY", "NYNY", "DO", "nyNY", "CWARD NY", "HOWARD BEACH NY"]}

df = pd.DataFrame(data=d)

Here is df:

Index  Description
1      ABNY
2      MANY
3      NYNY
4      DO
5      nyNY
6      CWARD NY
7      HOWARD BEACH NY

Here is my code:

df = df.assign(State = df["Description"].str.findall(state_abbrv))

Here is the expected result:

Index  Description      State
1      ABNY             NY
2      MANY             MA,NY
3      NYNY             NY,NY
4      DO
5      nyNY             NY
6      CWARD NY         WA,NY
7      HOWARD BEACH NY  WA,AR,NY

Thanks.

Best answer

You can join the abbreviations into a single regex alternation with join, then pass that pattern to str.findall:

statesjoin='|'.join(state_abbrv)
df=df.assign(State = df["Description"].str.findall(statesjoin))

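Output (the last two rows show different sample descriptions, ABALBB and ALCA, than the question's DataFrame):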

df
   Index Description     State
0      1        ABNY      [NY]
1      2        MANY  [MA, NY]
2      3        NYNY  [NY, NY]
3      4          DO        []
4      5        nyNY      [NY]
5      6      ABALBB      [AL]
6      7        ALCA  [AL, CA]
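
As a minimal sketch, the same join-then-findall idea can be run end to end on the question's own DataFrame. Two things here are additions rather than part of the answer: `re.escape` (not needed for plain two-letter codes, but a safe habit when building a pattern from a list) and `.str.join(",")`, which turns each list of matches into a comma-separated string like the one in the expected-result table.

```python
import re

import pandas as pd

state_abbrv = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
]

df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6, 7],
    "Description": ["ABNY", "MANY", "NYNY", "DO", "nyNY",
                    "CWARD NY", "HOWARD BEACH NY"],
})

# Build one case-sensitive alternation: "AL|AK|AZ|...|WY".
pattern = "|".join(re.escape(s) for s in state_abbrv)

# findall returns a list of non-overlapping matches per row;
# .str.join(",") collapses each list into a single string.
df["State"] = df["Description"].str.findall(pattern).str.join(",")

print(df)
```

Because findall only returns non-overlapping matches, "HOWARD BEACH NY" yields WA,NY here: the AR in the expected result overlaps the A of WA and is skipped.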

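For the possible case described by @AkshaySehgal, you can try this instead. It first splits each description into consecutive two-character chunks, so only codes aligned to those chunks are matched: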

import re
df=df.assign(State = df["Description"].apply(lambda x: ','.join(re.findall('..',x))).str.findall(statesjoin))
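
The question's expected result lists AR for "HOWARD BEACH NY", a match that overlaps WA. One possible way to get overlapping matches, offered here as an alternative sketch and not as part of the original answer, is to wrap the alternation in a lookahead with a capture group. The name `overlap_pattern` is made up for this example.

```python
import pandas as pd

state_abbrv = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
]

df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6, 7],
    "Description": ["ABNY", "MANY", "NYNY", "DO", "nyNY",
                    "CWARD NY", "HOWARD BEACH NY"],
})

# A lookahead consumes no characters, so the regex engine tests every
# position in the string; the capture group records the code found there.
overlap_pattern = "(?=(" + "|".join(state_abbrv) + "))"

# With a single capture group, findall returns the captured codes.
df["State"] = df["Description"].str.findall(overlap_pattern).str.join(",")

print(df)
```

With this pattern "HOWARD BEACH NY" gives WA,AR,NY. Note that "CWARD NY" also gives WA,AR,NY, because it contains the same overlapping W-A-R run, whereas the question's expected table lists only WA,NY for that row.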

Regarding "Python Pandas: partially matching a list of strings in a DataFrame column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62824406/
