python - Pandas - transpose lists of unequal length in DataFrame values

Reposted by: 太空宇宙 | Updated: 2023-11-03 11:40:36

This question is a follow-up to "Pandas: split list in column into multiple rows". This time more DataFrames are involved, and I can't get the approach to work with more than 2 dfs.

I have this DataFrame:

  Index     Job positions   Job types   Locations
      0          [5]         [6]        [3, 4, 5]
      1          [1]         [2, 6]     [3, NaN] 
      2          [1,3]       [9, 43]    [1]

I want every combination of the numbers within each row, so the final result is:

index   Job position  Job type  Location
    0   5             6         3
    0   5             6         4
    0   5             6         5
    1   1             2         3
    1   1             2         NaN
    1   1             6         3
    1   1             6         NaN
    2   1             9         1
    2   1             43        1
    2   3             9         1
    2   3             43        1

What I did was convert each column into a Series-like long table:

positions = df['Job positions'].apply(pd.Series).reset_index().melt(id_vars='index').dropna()[['index', 'value']].set_index('index')
types = df['Job types'].apply(pd.Series).reset_index().melt(id_vars='index').dropna()[['index', 'value']].set_index('index')
locations = df['Locations'].apply(pd.Series).reset_index().melt(id_vars='index').dropna()[['index', 'value']].set_index('index')

dfs = [positions, types, locations]

Then I tried to merge them like this:

from functools import reduce

df_final = reduce(lambda left, right: pd.merge(left, right, left_index=True, right_index=True, how="left"), dfs)

But it seems to skip the fields with NaN. How can I prevent this from happening?
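
A likely reason for the missing NaN rows is the dropna() call in the conversion step. Widening a column of lists pads the shorter rows with NaN, and after melt() that padding cannot be told apart from a NaN that was really in the list, so dropna() removes both. The sketch below illustrates this on a hypothetical stand-in for the Locations column. It widens with pd.DataFrame(...) rather than .apply(pd.Series), on the assumption that both pad in the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'Locations' column from the question.
locations = pd.Series([[3, 4, 5], [3, np.nan], [1]], name="Locations")

# Widening the lists pads shorter rows with NaN (assumed equivalent to .apply(pd.Series)).
wide = pd.DataFrame(locations.tolist(), index=locations.index)

# After melting, padding NaN and genuine NaN look identical, so dropna() drops both.
long = wide.reset_index().melt(id_vars="index").dropna()

n_original = sum(len(items) for items in locations)  # items actually present in the lists
n_kept = len(long)                                   # items that survive dropna()
print(n_original, n_kept)
```

If n_kept comes out smaller than n_original, the genuine NaN entry was lost together with the padding.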

Best answer

A one-liner:

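import itertools
import pandas as pd

# Assumes 'index' is df's first column; each row expands to the product of its lists.
dfres = pd.DataFrame(
    [(i[0],) + j for i in df.values for j in itertools.product(*i[1:])],
    columns=df.columns,
).set_index('index')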


       Job positions  Job types  Locations
index                                     
0                  5          6        3
0                  5          6        4
0                  5          6        5
1                  1          2        3
1                  1          2        NaN
1                  1          6        3
1                  1          6        NaN
2                  1          9        1
2                  1         43        1
2                  3          9        1
2                  3         43        1
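
To try the approach end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question, with index kept as an ordinary column (an assumption implied by the set_index('index') call above). The explode-based variant at the end is not part of the original answer and assumes pandas 0.25 or newer.

```python
import itertools

import numpy as np
import pandas as pd

# Sample frame from the question; 'index' is kept as an ordinary column (assumption).
df = pd.DataFrame({
    "index": [0, 1, 2],
    "Job positions": [[5], [1], [1, 3]],
    "Job types": [[6], [2, 6], [9, 43]],
    "Locations": [[3, 4, 5], [3, np.nan], [1]],
})

# The answer's approach: per-row Cartesian product of the list-valued cells.
rows = [(r[0],) + combo for r in df.values for combo in itertools.product(*r[1:])]
dfres = pd.DataFrame(rows, columns=df.columns).set_index("index")

# Alternative (not from the answer, assumes pandas >= 0.25):
# explode one list column at a time; NaN items inside the lists are kept.
exploded = df.set_index("index")
for col in exploded.columns:
    exploded = exploded.explode(col)

print(dfres)
```

Neither variant calls dropna(), so a NaN stored inside a list stays in the result as its own row.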

Regarding python - Pandas - transpose lists of unequal length in DataFrame values, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50306919/
