python - pandas DataFrame: converting codes or labels to categorical

Reposted by: 行者123 · Updated: 2023-12-01 06:25:21

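Given an existing code/label mapping for categorical data, I want to convert DataFrame series to categoricals. I'm struggling to convert (a) a series containing labels into a categorical, and (b) a series containing codes into a categorical.

In case (b), the series data contains codes (rather than category labels, unlike many of the examples I found).

Here is what I have so far: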

# this is the code-label mapping that I'd like to apply for the
# (a) label -> cat conversion (`df1`)
# (b) code -> cat conversion (`df2`)

>>> import pandas as pd
>>> cat = pd.Categorical.from_codes([-1, 1, 2, 3], ['-', 'a', 'b', 'c'])
>>> cat.codes
array([-1,  1,  2,  3], dtype=int8)
>>> cat
[NaN, a, b, c]
Categories (4, object): [-, a, b, c]
>>> cat.__array__
<bound method Categorical.__array__ of [NaN, a, b, c]
Categories (4, object): [-, a, b, c]>


>>> df1
   x
0  a
1  a
2  c
3  b
4  b
>>> df2
   y
0  nan
1  1
2  3
3  2
4  2

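How would I convert x so that it uses cat as its type? I think my problem is that I don't quite understand what pd.Categorical actually is or how it is meant to be used. Is it a dtype (it doesn't seem so)? Is it an actual series (it doesn't seem so either, since it would then allow duplicates)? It seems to hold only the actual code-label mapping, but I'm not sure how to use it (i.e. how to apply it to an already existing series).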

Best answer

If I understand you correctly, you can convert df1.x to the categories of cat by calling .astype with the dtype attribute of cat:

df1.x.astype(cat.dtype)

Out[950]:
0    a
1    a
2    c
3    b
4    b
Name: x, dtype: category
Categories (4, object): [-, a, b, c]
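
The answer above covers case (a), labels. For case (b), codes, one possible approach is sketched below. It is not part of the original answer, and it assumes that df2.y holds numeric codes with NaN marking missing values (the question shows only the printed frame, so the sample data here is reconstructed). pd.Categorical.from_codes expects integer codes and treats -1 as missing, so NaN is mapped to -1 first; passing dtype=cat.dtype reuses the categories from the existing mapping.

```python
import numpy as np
import pandas as pd

# Mapping from the question: code 0 -> '-', 1 -> 'a', 2 -> 'b', 3 -> 'c'
cat = pd.Categorical.from_codes([-1, 1, 2, 3], ['-', 'a', 'b', 'c'])

# Assumption: df2.y holds numeric codes, with NaN for missing values
df2 = pd.DataFrame({'y': [np.nan, 1, 3, 2, 2]})

# from_codes needs integers and uses -1 as the "missing" code
codes = df2['y'].fillna(-1).astype('int64').to_numpy()

# cat.dtype is a CategoricalDtype carrying the categories of cat;
# reuse it so the new column shares the same code-label mapping
df2['y'] = pd.Categorical.from_codes(codes, dtype=cat.dtype)

print(df2['y'])
```

Here cat itself is an array of values, while cat.dtype is the piece that describes the set of categories, which is why the answer passes cat.dtype to .astype rather than cat.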

A similar question on this topic (python - pandas DataFrame: converting codes or labels to categorical) can be found on Stack Overflow: https://stackoverflow.com/questions/60179027/
