
python-3.x - How to eliminate a KeyError when using the pandas get_dummies function

Reposted. Author: 行者123. Updated: 2023-12-05 03:01:45

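When I run the pandas get_dummies() function, it raises a KeyError saying that none of my columns exist. The code below uses copyrighted data, which I am citing here: UCI Machine Learning Repository's adult dataset, Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.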

I am not sure what to try.

age, workclass, fnlwgt, education, education-num, marital-status, occupation, forces, relationship, race, sex, capital-gain, capital-loss, hours-per-week, native-country,
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K
37, Private, 284582, Masters, 14, Married-civ-spouse, Exec-managerial, Wife, White, Female, 0, 0, 40, United-States, <=50K
49, Private, 160187, 9th, 5, Married-spouse-absent, Other-service, Not-in-family, Black, Female, 0, 0, 16, Jamaica, <=50K
52, Self-emp-not-inc, 209642, HS-grad, 9, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K

#import modules
import pandas as pd

#define functions
def open_infile():
    d = pd.read_csv('adult.data.txt', sep = ',')
    return d

def onehot_encode(data):
    data = pd.get_dummies(data, columns = ['workclass', 'education', 'marital-status', 'occupation', 'forces',
                                           'relationship', 'race', 'sex', 'native-country'])
    return data
##########gather data##########
#open infile
data = open_infile()
print(len(data))

##########process data##########
#one-hot encode categorical columns
onehot_encode(data)
print(data.head())

Traceback (most recent call last):
  File "C:/Users/Hezekiah/PycharmProjects/Artificial Intelligence 0/Chapter 1 Application Adult.py", line 20, in <module>
    onehot_encode(data)
  File "C:/Users/Hezekiah/PycharmProjects/Artificial Intelligence 0/Chapter 1 Application Adult.py", line 11, in onehot_encode
    'relationship', 'race', 'sex', 'native-country'])
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\reshape\reshape.py", line 812, in get_dummies
    data_to_encode = data[columns]
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\frame.py", line 2934, in __getitem__
    raise_missing=True)
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexing.py", line 1354, in _convert_to_indexer
    return self._get_listlike_indexer(obj, axis, **kwargs)[1]
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexing.py", line 1246, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['workclass', 'education', 'marital-status', 'occupation', 'forces',\n 'relationship', 'race', 'sex', 'native-country'],\n dtype='object')] are in the [columns]"

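I expected the pandas get_dummies() function to convert all of the categorical attributes into numeric ones, but PyCharm reports a KeyError telling me that none of my columns exist, when in fact they do.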

Best answer

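The problem is whitespace around the column names: in the sample file every comma is followed by a space, and that space becomes part of each name. The solution is to use str.strip: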

data.columns = data.columns.str.strip()

Or use a list comprehension with strip:

data.columns = [x.strip() for x in data.columns]
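
To see why the lookup fails and how stripping fixes it, here is a minimal sketch. It assumes a small, made-up three-row sample (the `SAMPLE` string and the variable names `raw`, `fixed`, `clean`, and `encoded` are hypothetical, not from the question) laid out like the question's file, with a space after every comma. It also shows `skipinitialspace=True`, a `read_csv` option that could be used instead of stripping afterwards, and it assigns the result of `get_dummies`, since that function returns a new DataFrame rather than modifying its input.

```python
import io

import pandas as pd

# Hypothetical sample in the same "comma + space" layout as the question's file.
SAMPLE = (
    "age, workclass, sex\n"
    "39, State-gov, Male\n"
    "50, Self-emp-not-inc, Male\n"
    "28, Private, Female\n"
)

# Reading as in the question: the space after each comma stays in the name,
# so 'workclass' is not found but ' workclass' is.
raw = pd.read_csv(io.StringIO(SAMPLE), sep=",")
print(raw.columns.tolist())

# Option 1 (the answer above): strip the column names after loading.
# This cleans the names only; the cell values keep their leading space.
fixed = raw.copy()
fixed.columns = fixed.columns.str.strip()

# Option 2 (an alternative): let read_csv skip the spaces that follow each
# delimiter, which cleans both the names and the values.
clean = pd.read_csv(io.StringIO(SAMPLE), sep=",", skipinitialspace=True)

# get_dummies returns a new DataFrame, so keep the returned value.
encoded = pd.get_dummies(clean, columns=["workclass", "sex"])
print(encoded.columns.tolist())
```

Printing `data.columns.tolist()` on the real file is a quick way to confirm whether the same stray spaces are present before applying either option.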

Regarding "python-3.x - How to eliminate a KeyError when using the pandas get_dummies function", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55546321/
