gpt4 book ai didi

python - Pandas read_csv 无法正确加载逗号分隔的 CSV

转载 作者:太空宇宙 更新时间:2023-11-04 00:30:55 26 4
gpt4 key购买 nike

现在,我分析一下Kaggel的泰坦尼克号挑战。我的代码是这样的: code

但我理想的输出是: ideal output

所以,在我最后的代码中是

df["Age"].fillna(df.Age.median(), inplace=True)

错误发生

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
2133 try:
-> 2134 return self._engine.get_loc(key)
2135 except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4433)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13742)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13696)()

KeyError: 'Age'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
<ipython-input-4-9763f0a9951c> in <module>()
----> 1 df["Age"].fillna(df.Age.median(), inplace=True)

/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
2057 return self._getitem_multilevel(key)
2058 else:
-> 2059 return self._getitem_column(key)
2060
2061 def _getitem_column(self, key):

/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
2064 # get column
2065 if self.columns.is_unique:
-> 2066 return self._get_item_cache(key)
2067
2068 # duplicate columns & possible reduce dimensionality

/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
1384 res = cache.get(item)
1385 if res is None:
-> 1386 values = self._data.get(item)
1387 res = self._box_item_values(item, values)
1388 cache[item] = res

/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
3541
3542 if not isnull(item):
-> 3543 loc = self.items.get_loc(item)
3544 else:
3545 indexer = np.arange(len(self.items))[isnull(self.items)]

/Users/XXXi/anaconda/envs/py36/lib/python3.6/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
2134 return self._engine.get_loc(key)
2135 except KeyError:
-> 2136 return self._engine.get_loc(self._maybe_cast_indexer(key))
2137
2138 indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4433)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13742)()

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13696)()

KeyError: 'Age'

我使用 sep=',' 所以我真的不明白为什么这段代码不能用逗号分隔。我该如何解决这个问题?

我跟着一个答案,但是错误发生了(我不知道为什么) error

我的数据是data

最佳答案

注意!

主要问题是下载数据。如果您在加载和处理 Kaggle Titanic 数据集时遇到问题,您可以从 here 重新下载 CSV。并重新运行您的程序。


可以通过delimiter=',':

df = pd.read_csv("Desktop/data/train.csv", delimiter=',')
print(df.head())

PassengerId Survived Pclass \
0 1 0 3
1 2 1 1
2 3 1 3
3 4 1 1
4 5 0 3

Name Sex Age SibSp \
0 Braund, Mr. Owen Harris male 22.0 1
1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1
2 Heikkinen, Miss. Laina female 26.0 0
3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1
4 Allen, Mr. William Henry male 35.0 0

Parch Ticket Fare Cabin Embarked
0 0 A/5 21171 7.2500 NaN S
1 0 PC 17599 71.2833 C85 C
2 0 STON/O2. 3101282 7.9250 NaN S
3 0 113803 53.1000 C123 S
4 0 373450 8.0500 NaN S


print(df.columns)

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
dtype='object')

接下来,您可以创建各种映射:

mapping = {'male' : 0, 'female' : 1}

然后您将调用 pd.Series.replace:

df.Sex = df.Sex.replace(mapping)
print(df.Sex)

0 0
1 1
2 1
3 1
4 0
Name: Sex, dtype: int64

关于python - Pandas read_csv 无法正确加载逗号分隔的 CSV,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/45893390/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com