python - Pandas DataFrame - conditional column creation

Reposted. Author: 行者123. Last updated: 2023-11-30 22:21:54

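I am trying to create a new column based on conditional logic applied to another column. I have searched, but could not find anything that solves my problem.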

I imported a CSV into a pandas DataFrame with the structure shown below. I edited some of the descriptions for this post, but everything else is unchanged:

import pandas as pd

#code used to load dataframe:
df = pd.read_csv(r"C:\filepath\filename.csv")

#output from print(type(df)):
#class 'pandas.core.frame.DataFrame'

#output from print(df.columns.values):
#['Type' 'Trans Date' 'Post Date' 'Description' 'Amount']

#output from print(df.columns):
Index(['Type', 'Trans Date', 'Post Date', 'Description', 'Amount'], dtype='object')
#output from print

   Type  Trans Date   Post Date           Description  Amount
0  Sale  01/25/2018  01/25/2018                 DESC1  -13.95
1  Sale  01/25/2018  01/26/2018  AMAZON MKTPLACE PMTS   -6.99
2  Sale  01/24/2018  01/25/2018         SUMMIT BISTRO   -5.85
3  Sale  01/24/2018  01/25/2018                 DESC3   -9.13
4  Sale  01/24/2018  01/26/2018   DYNAMIC VENDING INC   -1.60

I then wrote the following code:

def criteria(row):
    if row.Description.find('SUMMIT BISTRO')>0:
        return 'Lunch'
    elif row.Description.find('AMAZON MKTPLACE PMTS')>0:
        return 'Amazon'
    elif row.Description.find('Aldi')>0:
        return 'Groceries'
    else:
        return 'NotWorking'

df['Category'] = df.apply(criteria, axis=0)

Error:

Traceback (most recent call last):
  File "C:\Users\Test_BankReconcile2.py", line 44, in <module>
    df['Category'] = df.apply(criteria, axis=0)
  File "C:\Users\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4262, in apply
    ignore_failures=ignore_failures)
  File "C:\Users\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4358, in _apply_standard
    results[i] = func(v)
  File "C:\Users\OneDrive\Documents\finance\Test_BankReconcile2.py", line 35, in criteria
    if row.Description.find('SUMMIT BISTRO')>0:
  File "C:\Users\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3081, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: ("'Series' object has no attribute 'Description'", 'occurred at index Type')

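I was able to run the same kind of command successfully on a very similar CSV file from a different bank (this example comes from my credit card), so I don't know what is going on. Perhaps I need to define the DataFrame in some way that I am not doing? Or maybe there is something very obvious that I am not seeing? Thanks in advance for helping me sort this out.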

Best answer

Yes, your problem is that you need to pass axis=1 to .apply:

In [52]: df
Out[52]:
   Type  Trans Date   Post Date           Description  Amount
0  Sale  01/25/2018  01/25/2018                 DESC1  -13.95
1  Sale  01/25/2018  01/26/2018  AMAZON MKTPLACE PMTS   -6.99
2  Sale  01/24/2018  01/25/2018         SUMMIT BISTRO   -5.85
3  Sale  01/24/2018  01/25/2018                 DESC3   -9.13
4  Sale  01/24/2018  01/26/2018   DYNAMIC VENDING INC   -1.60

In [53]: def criteria(row):
    ...:     if row.Description.find('SUMMIT BISTRO')>0:
    ...:         return 'Lunch'
    ...:     elif row.Description.find('AMAZON MKTPLACE PMTS')>0:
    ...:         return 'Amazon'
    ...:     elif row.Description.find('Aldi')>0:
    ...:         return 'Groceries'
    ...:     else:
    ...:         return 'NotWorking'
    ...:

In [54]: df.apply(criteria, axis=1)
Out[54]:
0 NotWorking
1 NotWorking
2 NotWorking
3 NotWorking
4 NotWorking
dtype: object
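
To make the axis point concrete, here is a minimal sketch (not part of the original answer) showing what .apply hands to the function in each case. The two-row frame is a cut-down copy of the sample data, and the lambdas are only there for illustration. With axis=0 the function receives one column at a time, which is why the traceback reports 'occurred at index Type': the first column was passed in and it has no 'Description' entry.

```python
import pandas as pd

# Two rows taken from the sample data in the question (subset of columns).
df = pd.DataFrame({
    "Type": ["Sale", "Sale"],
    "Description": ["AMAZON MKTPLACE PMTS", "SUMMIT BISTRO"],
    "Amount": [-6.99, -5.85],
})

# axis=0: the function is called once per COLUMN.
# Each Series is indexed by row label (0, 1), so row.Description does not exist.
per_column = df.apply(lambda col: col.name, axis=0)

# axis=1: the function is called once per ROW.
# Each Series is indexed by column name, so row.Description works.
per_row = df.apply(lambda row: row.Description, axis=1)

print(list(per_column))  # ['Type', 'Description', 'Amount']
print(list(per_row))     # ['AMAZON MKTPLACE PMTS', 'SUMMIT BISTRO']
```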

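The second problem is a logic error. str.find returns 0 when the match starts at the beginning of the string and -1 when there is no match, so every row above still comes back as 'NotWorking'. Instead of .find(x) > 0 you want .find(x) >= 0 or, better, some_string in some_other_string.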
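
Putting both fixes together, one possible version of the function looks like this. It is a sketch under the assumption that the Description column contains only strings (a missing value would make the `in` test raise a TypeError); the DataFrame is rebuilt inline from the sample rows rather than read from the CSV.

```python
import pandas as pd

# str.find returns the position of the match, or -1 when nothing matches.
print("SUMMIT BISTRO".find("SUMMIT BISTRO"))  # 0, so "> 0" is False
print("DESC1".find("SUMMIT BISTRO"))          # -1

# Sample rows from the question (only the columns needed here).
df = pd.DataFrame({
    "Description": ["DESC1", "AMAZON MKTPLACE PMTS", "SUMMIT BISTRO",
                    "DESC3", "DYNAMIC VENDING INC"],
    "Amount": [-13.95, -6.99, -5.85, -9.13, -1.60],
})

def criteria(row):
    # Substring membership avoids the off-by-one trap of .find(x) > 0.
    if 'SUMMIT BISTRO' in row.Description:
        return 'Lunch'
    elif 'AMAZON MKTPLACE PMTS' in row.Description:
        return 'Amazon'
    elif 'Aldi' in row.Description:
        return 'Groceries'
    else:
        return 'NotWorking'

# axis=1 passes each row to criteria, so row.Description is a single string.
df['Category'] = df.apply(criteria, axis=1)
print(df['Category'].tolist())
# ['NotWorking', 'Amazon', 'Lunch', 'NotWorking', 'NotWorking']
```

Rows whose description matches none of the patterns still fall through to 'NotWorking', which is the intended default here rather than a sign of failure.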

Regarding python - Pandas DataFrame - conditional column creation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48491190/
