python - Pandas apply function returning a 4-element list into 4 columns raises KeyError

Reposted. Author: 行者123. Updated: 2023-12-01 01:55:53

I am trying to apply a function to a column in df and add 4 new columns based on the list it returns.

This is the function that returns the list.

import re

def separateReagan(data):
    block = None
    township = None
    section = None
    acres = None

    if 'BLK' in data:
        pattern = r'BLK (\d{1,3})'
        blockList = re.findall(pattern, data)
        if blockList:
            block = blockList[0]
    else:
        pattern = r'B-([0-9]{1,3})'
        blockList = re.findall(pattern, data)
        if blockList:
            block = blockList[0]

    # Similar for others

    return [block, township, section, acres]

This is the code that uses the dataframe.

df = df[['ID','Legal Description']]

# Dataframe looks like this
# ID Legal Description
# 0 1 143560 CLARKSON | ENDEAVOR ENERGY RESO | A- ,B...
# 1 2 143990 CLARKSON ESTATE | ENDEAVOR ENERGY RESO ...
# 2 3 144420 CLARKSON RANCH | ENDEAVOR ENERGY RESO |...

df[['Block','Township','Section','Acres']] = df.apply(lambda x: separateReagan(x['Legal Description']),axis=1)

I get this error:

KeyError: "['Block' 'Township' 'Section' 'Acres'] not in index"

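I tried returning a tuple instead of a list, but that did not work either.

Best answer

I quickly put together a small suggestion that may be what you are looking for. Let me know if it helps.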

from pandas import DataFrame
import re

def separate_reagan(row):
    # row is a single row of the dataframe, which is what gets passed in
    # by df.apply(fcn, axis=1)
    # note: this means that you can also set values on the row

    # set the new fields on the row instead of using local variables if you
    # really want to initialize them. If they were missing they would
    # just become some form of NaN or None, depending on the dtype
    row['township'] = None
    row['section'] = None
    row['acres'] = None
    row['block'] = None

    # grab the legal description here instead of passing it in as the only argument
    data = row['legal_description']
    if 'BLK' in data:
        block_list = re.search(r'BLK (\d{1,3})', data)
        if block_list:
            row['block'] = block_list.group(1)
    else:
        # since you only seem to want the first match,
        # search is probably closer to what you're looking for
        block_list = re.search(r'B-([0-9]{1,3})', data)
        if block_list:
            row['block'] = block_list.group(1)

    # Similar for others

    # return the modified row
    return row

df = DataFrame([
    {'id': 1, 'legal_description': '43560 CLARKSON | ENDEAVOR ENERGY RESO | A- ,B...'},
    {'id': 2, 'legal_description': '4143990 CLARKSON ESTATE | ENDEAVOR ENERGY RESO ...'},
    {'id': 3, 'legal_description': '144420 CLARKSON RANCH | ENDEAVOR ENERGY RESO |...'},
])
df = df[['id','legal_description']]

# df now only has the columns id and legal_description

# In the question, the left-hand side df[['Block', ...]] selects columns from the dataframe,
# but as mentioned in the comment above, those columns are not contained in the dataframe.
# They also aren't returned from the apply call, because separateReagan never sets them

df = df.apply(separate_reagan, axis=1)
# now these columns exist because you set them in the function
print(df[['block','township','section','acres']])
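
If you would rather keep the function returning a plain list, as in the question, another option is to let apply expand the list into columns. This is only a sketch and not part of the original answer: it assumes a pandas version where DataFrame.apply accepts the result_type argument, the sample rows are made up for illustration, and only the block parsing is filled in (the other three values stay None).

```python
import re

import pandas as pd


def separate_reagan(data):
    # Same idea as the function in the question: parse one description
    # string and return [block, township, section, acres].
    block = township = section = acres = None
    if 'BLK' in data:
        match = re.search(r'BLK (\d{1,3})', data)
    else:
        match = re.search(r'B-([0-9]{1,3})', data)
    if match:
        block = match.group(1)
    # township, section and acres are not parsed in this sketch
    return [block, township, section, acres]


# Hypothetical sample data, not the asker's real rows
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Legal Description': [
        'SAMPLE LEASE A | BLK 12 | SEC 4',
        'SAMPLE LEASE B | B-7 | SEC 9',
        'SAMPLE LEASE C | NO BLOCK INFO',
    ],
})

new_cols = ['Block', 'Township', 'Section', 'Acres']

# result_type='expand' turns each returned list into one column per element
parts = df.apply(
    lambda row: separate_reagan(row['Legal Description']),
    axis=1,
    result_type='expand',
)
parts.columns = new_cols

# parts shares df's index, so join lines the new columns up row by row
df = df.join(parts)
print(df)
```

Joining a separately named frame avoids assigning to column labels that do not exist yet, which is where the KeyError in the question comes from.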

Regarding "python - Pandas apply function returning a 4-element list into 4 columns raises KeyError", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50195807/
