
python - Adding a column to a pandas DataFrame fills it with NA

Reposted. Author: 行者123. Updated: 2023-12-01 04:55:00

I have this pandas DataFrame:

          SourceDomain                           1  2         3
0  www.theguardian.com     profile.theguardian.com  1  Directed
1  www.theguardian.com  membership.theguardian.com  2  Directed
2  www.theguardian.com   subscribe.theguardian.com  3  Directed
3  www.theguardian.com            www.google.co.uk  4  Directed
4  www.theguardian.com        jobs.theguardian.com  5  Directed

I want to add a new column from a pandas Series that is created like this:

Weights = Weights.value_counts()

However, when I try to add the new column with edgesFile[4] = Weights, it is filled with NaN instead of the values:

          SourceDomain                           1  2         3    4
0  www.theguardian.com     profile.theguardian.com  1  Directed  NaN
1  www.theguardian.com  membership.theguardian.com  2  Directed  NaN
2  www.theguardian.com   subscribe.theguardian.com  3  Directed  NaN
3  www.theguardian.com            www.google.co.uk  4  Directed  NaN
4  www.theguardian.com        jobs.theguardian.com  5  Directed  NaN

How can I add a new column that keeps the values? Thanks.

Danny

Best answer

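You get NaN because the index of Weights does not match the index of edgesFile. When you assign a Series to a column, Pandas aligns it on index labels, and any row whose label is missing from Weights.index becomes NaN. If you want Pandas to ignore Weights.index and simply paste the values in order, pass the underlying NumPy array instead: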

edgesFile[4] = Weights.values
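
To make the cause concrete, here is a minimal sketch with made-up data. The names `edges`, `targets`, and `weights` and all values are assumptions for illustration, not the asker's real data. It shows that `value_counts()` returns a Series indexed by the counted values, why label alignment then produces only NaN, and that pasting `.values` follows the count order rather than the row order. The final `Series.map` step is an alternative that is not part of the answer above; it may fit better if each row should receive the count for its own domain.

```python
import pandas as pd

# Hypothetical stand-in for the question's edgesFile (data is made up).
edges = pd.DataFrame({
    "SourceDomain": ["www.theguardian.com"] * 3,
    1: ["profile.theguardian.com", "jobs.theguardian.com", "www.google.co.uk"],
})

# Hypothetical raw data; value_counts() indexes its result by the counted values
# and sorts by count, largest first.
targets = pd.Series([
    "profile.theguardian.com", "profile.theguardian.com",
    "jobs.theguardian.com",
    "www.google.co.uk", "www.google.co.uk", "www.google.co.uk",
])
weights = targets.value_counts()

# Label-aligned assignment: edges.index is 0..2 while weights.index holds
# domain names, so no label matches and every row becomes NaN.
aligned = edges.copy()
aligned[4] = weights

# Positional assignment: labels are ignored, so the values land in
# value_counts() order, which need not match the row order of edges.
positional = edges.copy()
positional[4] = weights.values

# Assumed alternative: look up each row's own count by its domain name.
mapped = edges[1].map(weights)
```

Positional pasting also requires the array to have exactly as many elements as the DataFrame has rows, which a `value_counts()` result only satisfies when every row value is unique.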

Here is an example that demonstrates the difference:

In [14]: df = pd.DataFrame(np.arange(4)*10, index=list('ABCD'))

In [15]: df
Out[15]:
    0
A   0
B  10
C  20
D  30

In [16]: s = pd.Series(np.arange(4), index=list('CDEF'))

In [17]: s
Out[17]:
C    0
D    1
E    2
F    3
dtype: int64

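Here we see Pandas aligning on the index. Only the labels C and D exist in both df and s, so A and B get NaN, and the values for E and F are dropped: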

In [18]: df[4] = s

In [19]: df
Out[19]:
    0    4
A   0  NaN
B  10  NaN
C  20    0
D  30    1

Here, Pandas simply pastes the values from s into the column in order, ignoring the labels of s:

In [20]: df[4] = s.values

In [21]: df
Out[21]:
    0  4
A   0  0
B  10  1
C  20  2
D  30  3
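
The same comparison can be sketched as a standalone script. This assumes a reasonably recent pandas, where `Series.to_numpy()` is available and does the same job as `.values` here; the variable names `aligned`, `pasted`, `short`, and `raised` are introduced only for this illustration. It also shows the one constraint of positional assignment: the array length must equal the number of rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(4) * 10, index=list("ABCD"))
s = pd.Series(np.arange(4), index=list("CDEF"))

# Matched by label: only C and D appear in both indexes, so A and B are NaN.
aligned = df.copy()
aligned[4] = s

# Matched by position: the labels of s are ignored entirely.
pasted = df.copy()
pasted[4] = s.to_numpy()

# Positional assignment needs one value per row; a shorter array is rejected.
short = df.copy()
try:
    short[4] = s.to_numpy()[:3]
    raised = False
except ValueError:
    raised = True
```

If the two objects are known to be in the same row order, positional assignment is the simplest fix; if they are not, aligning on a shared key is usually safer than pasting values.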

For "python - Adding a column to a pandas DataFrame fills it with NA", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27635767/
