
python - my_dataframe.new_column = value?

Reposted. Author: 行者123. Updated: 2023-11-28 22:55:47

I just ran into some odd pandas behavior. Say I do:

import random
import string

import numpy as np
import pandas as pd

m_size = (4, 3)
# Random integers from 0 to 10 inclusive (randint's upper bound is exclusive)
num_mat = np.random.randint(0, 11, m_size)
my_cols = [random.choice(string.ascii_uppercase) for x in range(num_mat.shape[1])]
mydf = pd.DataFrame(num_mat, columns=['A', 'B', 'C'])

print(mydf)

   A   B   C
0  6   6   7
1  9  10   4
2  0  10   7
3  1   3  10

If I now do:

mydf.D = 4

I would expect this to create a column D filled with the value 4, but the entries of mydf have not changed:

print(mydf)

   A   B   C
0  6   6   7
1  9  10   4
2  0  10   7
3  1   3  10

Why? I get no warning or error, so what did mydf.D = 4 actually do?

This is all with the latest stable pandas (0.11.0).

Best answer

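Although pandas lets you read a column with df.Col, this is apparently just shorthand for df['Col'], and the shorthand does not work for creating new columns. You need to do mydf['D'] = 4.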
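The difference between the two assignment forms can be checked directly. The snippet below is a minimal sketch, assuming a reasonably recent pandas; the frame `df` and the two flag variables are made up for illustration and reuse the numbers printed in the question.

```python
import pandas as pd

# Illustrative frame with the same values as the question's output
df = pd.DataFrame({"A": [6, 9, 0, 1], "B": [6, 10, 10, 3], "C": [7, 4, 7, 10]})

# Attribute assignment with a new name: the column index is left unchanged
df.D = 4
attr_made_column = "D" in df.columns

# Item assignment: creates a real column, broadcasting the scalar to every row
df["D"] = 4
item_made_column = "D" in df.columns

print(attr_made_column, item_made_column)
print(df["D"].tolist())
```

Checking `df.columns` after each step is a quick way to confirm whether an assignment really produced a column.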

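I find this unfortunate, because I often try to do exactly what you did. The insidious part is that it actually creates a plain Python attribute named D on the DataFrame object; it is not added as a column. So you have to make sure you delete that attribute, or it will shadow the column even if you later add it correctly: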

>>> import numpy as np
>>> import pandas
>>> d = pandas.DataFrame(np.random.randn(3, 2), columns=["A", "B"])
>>> d
          A         B
0 -0.931675  1.029137
1 -0.363033 -0.227672
2  0.058903 -0.362436
>>> d.Col = 8
>>> d.Col  # The attribute is there
8
>>> d['Col']  # But it is not a column, just a plain attribute
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    d['Col']
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\frame.py", line 1906, in __getitem__
    return self._get_item_cache(key)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\generic.py", line 570, in _get_item_cache
    values = self._data.get(item)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1383, in get
    _, block = self._find_block(item)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1525, in _find_block
    self._check_have(item)
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1532, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: u'no item named Col'
>>> d['Col'] = 100  # Create a real column
>>> d.Col  # The attribute blocks access to the column
8
>>> d['Col']  # The column is available via item access
0    100
1    100
2    100
Name: Col, dtype: int64
>>> del d.Col  # Delete the attribute
>>> d.Col  # The column is now available as an attribute (!)
0    100
1    100
2    100
Name: Col, dtype: int64
>>> d['Col']  # And still as an item
0    100
1    100
2    100
Name: Col, dtype: int64

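You may be a little surprised to see that d.Col "only works after you delete it": after you do del d.Col, a subsequent read of d.Col actually gives you the column. This is just a consequence of how Python's __getattr__ works (it is consulted only when normal attribute lookup fails), but it is still somewhat unintuitive in this case.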
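The shadowing can be reproduced without pandas. The following is a toy sketch, not pandas' actual implementation: `Frame` is a hypothetical stand-in class whose `__getattr__` falls back to a dictionary of columns, which is enough to show why an instance attribute wins until it is deleted.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame; not pandas code."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)

    def __getitem__(self, key):
        return self._columns[key]

    def __setitem__(self, key, value):
        self._columns[key] = value


f = Frame({"A": [1, 2, 3]})
f.Col = 8                  # plain instance attribute, stored in f.__dict__
f["Col"] = [100, 100, 100]  # a "real column" in the toy class

shadowed = f.Col           # normal lookup finds the attribute first
del f.Col                  # removes only the instance attribute
unshadowed = f.Col         # lookup now fails, so __getattr__ supplies the column

print(shadowed, unshadowed)
```

Because the instance attribute lives in the object's own `__dict__`, it is found before `__getattr__` is ever consulted; deleting it is what lets the fallback run.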

Regarding "python - my_dataframe.new_column = value?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/16406343/
