python - Pivot/inverted table in Pandas (but not quite)

Reposted by: 太空狗 · Updated: 2023-10-30 02:35:54

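I have a problem for which I managed to write some working code, but I'd like to see whether anyone here can offer a simpler / more organized / less ugly / more built-in solution. Sorry for the very vague title, but I couldn't summarize the problem in one sentence.

The problem

Basically, I have a DataFrame that looks like this: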

  id  foo_col  A  B  C  D
0  x  nothing  2  0  1  1
1  y       to  0  0  3  2
2  z      see  1  3  2  2

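Now I want to convert the columns ['A', 'B', 'C', 'D'] into ['W1', 'W2', 'W3'], which hold the names of the top 3 columns (per row), ranked by the numbers in each row.

So the row with id x has A (with 2), C (with 1), D (with 1), B (with 0), which gives 'W1' = 'A', 'W2' = 'C', 'W3' = 'D'.

The target DataFrame would look like this: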

  id  foo_col W1 W2    W3
0  x  nothing  A  C     D
1  y       to  C  D  None
2  z      see  B  C     D

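Rules

1. Ties can be broken alphabetically (row x);
2. If there are fewer than 3 non-zero Ws, the missing ones get None (row y);
3. If there are more than 3 non-zero Ws, the extra ones do not make it into the final DataFrame (row z).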
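
As a quick illustration of the three rules on a single row, here is a minimal plain-Python sketch. The helper name `top_labels` and the `rows` dictionary are assumptions made for this example only; they are not part of the original question.

```python
def top_labels(row, n=3):
    """Return the n labels with the largest non-zero values, padded with None."""
    # Tie-break rule: sort by value descending, then by label alphabetically
    ranked = sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))
    # Drop zero entries, then truncate to n (the "more than 3" rule)
    labels = [label for label, value in ranked if value != 0][:n]
    # Padding rule: fill up with None when fewer than n labels remain
    return labels + [None] * (n - len(labels))

# The three example rows from the question, keyed by id
rows = {
    'x': {'A': 2, 'B': 0, 'C': 1, 'D': 1},
    'y': {'A': 0, 'B': 0, 'C': 3, 'D': 2},
    'z': {'A': 1, 'B': 3, 'C': 2, 'D': 2},
}
result = {key: top_labels(row) for key, row in rows.items()}
print(result)
```

Sorting on the pair `(-value, label)` makes the alphabetical tie-break explicit instead of relying on the original column order.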

Solution

import pandas as pd
import operator
import more_itertools as mit

# Define starting DataFrame
df = pd.DataFrame(data={'id': ['x', 'y', 'z'],
                        'foo_col': ['nothing', 'to', 'see'],
                        'A': [2, 0, 1],
                        'B': [0, 0, 3],
                        'C': [1, 3, 2],
                        'D': [1, 2, 2]})

print('Original DataFrame')
print(df.to_string())
print()

# Define 'source' and 'target' columns
w_columns = ['A', 'B', 'C', 'D']
w_labels = ['W1', 'W2', 'W3']

# Define function to do this pivoting
def pivot_w(row, columns=w_columns, labels=w_labels):
    # Convert relevant columns of DF to dictionary
    row_dict = row[columns].to_dict()
    # Convert dictionary to list of tuples
    row_tuples = [tuple(d) for d in row_dict.items()]
    # Sort list of tuples based on the second item (the value in the cell)
    row_tuples.sort(key=operator.itemgetter(1), reverse=True)
    # Get the sorted 'column' labels
    row_list = [x[0] for x in row_tuples if x[1] != 0]
    # Enforce rules 2 and 3
    if len(row_list) < 3:
        row_list = list(mit.take(3, mit.padnone(row_list)))
    else:
        row_list = row_list[:3]

    # Create a dictionary using the W labels
    output = {i: j for i, j in zip(labels, row_list)}

    return output

# Get DataFrame with W columns and index
df_w = pd.DataFrame(list(df.apply(pivot_w, axis=1)))
# Merge DataFrames on index
df = df.merge(df_w, how='inner', left_index=True, right_index=True)
# Drop A, B, C, D columns
df.drop(columns=w_columns, inplace=True)

print('Final DataFrame')
print(df.to_string())
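
For reference, the pad-or-truncate step inside `pivot_w` can probably also be written with the standard library alone. The sketch below assumes a hypothetical helper name, `pad_or_truncate`, and is meant only to show the idea behind `mit.take(3, mit.padnone(...))`.

```python
from itertools import chain, islice, repeat

def pad_or_truncate(items, n=3, fill=None):
    """Return exactly n items: truncate long inputs, pad short ones with fill."""
    # chain(...) appends an endless stream of fill values; islice cuts it at n
    return list(islice(chain(items, repeat(fill)), n))

print(pad_or_truncate(['C', 'D']))            # ['C', 'D', None]
print(pad_or_truncate(['B', 'C', 'D', 'A']))  # ['B', 'C', 'D']
```

Because `islice` stops after `n` items, the same expression covers both the short and the long case, so the `if`/`else` branch is no longer needed.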

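Other than perhaps reusing the same variable to store intermediate results in the function, is there anything smarter I could do?

P.S. If any of you have an idea for a better/clearer title, please feel free to edit!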

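Best answer

You can use argsort to get the names of the top 3 columns, but then the positions that correspond to 0 values have to be replaced with None, using sorting and np.where: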

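import numpy as np

w_columns = ['A', 'B', 'C', 'D']
w_labels = ['W1', 'W2', 'W3']

# Values of the W columns as a plain NumPy array
a = df[w_columns].to_numpy()

# Sort column names by value, descending; zeros end up last because they are the minimum.
# kind='stable' keeps tied values in column (alphabetical) order.
arr = np.array(w_columns)[np.argsort(-a, axis=1, kind='stable')]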
print (arr)
[['A' 'C' 'D' 'B']
 ['C' 'D' 'A' 'B']
 ['B' 'C' 'D' 'A']]

# Sort values in descending order so zeros come last, then flag the zero positions
mask = -np.sort(-df[w_columns].to_numpy(), axis=1) == 0
print (mask)
[[False False False  True]
 [False False  True  True]
 [False False False False]]

# Keep the first 3 names per row, replacing masked (zero) positions with None
out = np.where(mask[:, :3], None, arr[:, :3])
print (out)
[['A' 'C' 'D']
 ['C' 'D' None]
 ['B' 'C' 'D']]

df1 = pd.DataFrame(out, columns=w_labels, index=df.index)
print (df1)
  W1 W2    W3
0  A  C     D
1  C  D  None
2  B  C     D

df = df.drop(columns=w_columns).join(df1)
print (df)
  id  foo_col W1 W2    W3
0  x  nothing  A  C     D
1  y       to  C  D  None
2  z      see  B  C     D
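
Putting the steps above together, a self-contained version might look like the sketch below. The function name `top_n_columns` is hypothetical, and the sketch assumes the selected columns hold non-negative numbers, so that zeros are always the smallest values. It also masks zeros through the same sort order (`np.take_along_axis`) rather than a second `np.sort`, which is a small variation on the answer's approach.

```python
import numpy as np
import pandas as pd

def top_n_columns(df, columns, labels):
    """Replace `columns` with `labels` naming the largest non-zero columns per row."""
    values = df[columns].to_numpy()
    # Stable sort on negated values: descending order, ties keep column order
    order = np.argsort(-values, axis=1, kind='stable')
    names = np.array(columns, dtype=object)[order]
    # Reorder the values the same way so zeros can be masked position by position
    sorted_values = np.take_along_axis(values, order, axis=1)
    n = len(labels)
    out = np.where(sorted_values[:, :n] == 0, None, names[:, :n])
    ranked = pd.DataFrame(out, columns=labels, index=df.index)
    return df.drop(columns=columns).join(ranked)

# Same data as in the question
df = pd.DataFrame({'id': ['x', 'y', 'z'],
                   'foo_col': ['nothing', 'to', 'see'],
                   'A': [2, 0, 1],
                   'B': [0, 0, 3],
                   'C': [1, 3, 2],
                   'D': [1, 2, 2]})

result = top_n_columns(df, ['A', 'B', 'C', 'D'], ['W1', 'W2', 'W3'])
print(result.to_string())
```

Passing `kind='stable'` matters for the tie-break rule: NumPy's default sort does not guarantee that equal values keep their original order.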

If the value to exclude is not necessarily the minimum of all the selected values, you can replace it with NaN and test with np.isnan instead:

a = np.where(df[w_columns] != 0, df[w_columns], np.nan)
print (a)
[[ 2. nan  1.  1.]
 [nan nan  3.  2.]
 [ 1.  3.  2.  2.]]

arr = np.array(w_columns)[np.argsort(-a, axis=1, kind='stable')]
mask = np.isnan(np.sort(a, axis=1))

out = np.where(mask[:, :3], None, arr[:, :3])
print (out)

[['A' 'C' 'D']
 ['C' 'D' None]
 ['B' 'C' 'D']]
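
To show why the NaN variant helps, here is a small sketch in which the value to ignore is not the minimum. The sentinel `99` and the sample array are invented for this example; only the technique (replace with NaN, then test with `np.isnan`) comes from the answer.

```python
import numpy as np

columns = ['A', 'B', 'C', 'D']
SENTINEL = 99  # hypothetical marker meaning "ignore this cell"
values = np.array([[2, 99, 1, 1],
                   [99, 99, 3, 2]])

# Replace the sentinel with NaN; the array becomes float
a = np.where(values != SENTINEL, values, np.nan)
# NaN sorts to the end, so ignored cells land in the last positions
order = np.argsort(-a, axis=1, kind='stable')
names = np.array(columns, dtype=object)[order]
# An ascending sort also puts NaN last, so the mask lines up with `names`
mask = np.isnan(np.sort(a, axis=1))
out = np.where(mask[:, :3], None, names[:, :3])
print(out)
```

Comparing against the sentinel directly after a descending sort would not work here, because `99` would sort first rather than last; converting it to NaN moves it to the end regardless of its magnitude.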

Regarding python - Pivot/inverted table in Pandas (but not quite), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57524881/
