Python - Pandas - value_counts for all columns in a grouped DataFrame

Reposted by: 行者123. Last updated: 2023-12-01 03:01:07

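I have a survey dataset in which every question is answered on a 7-point scale. I want to get the value_counts of the values shared by all columns, with the DataFrame grouped by two columns. Here is a sample dataset and the result I am aiming for.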

| col1          | col2          | col3          | Building      | Levels_Name            |
|---------------|---------------|---------------|---------------|------------------------|
| Not Satisfied | Not Satisfied | Not Satisfied | San Francisco | Individual Contributor |
| Satisfied     | Satisfied     | NA            | Basingstoke   | Individual Contributor |
| Not Satisfied | Satisfied     | Not Satisfied | San Francisco | Middle Management      |
| Not Satisfied | Satisfied     | Not Satisfied | Miami         | Senior Leadership      |
| Not Satisfied | Not Satisfied | Not Satisfied | Foster City   | Senior Leadership      |
| NA            | NA            | NA            | Foster City   | Other                  |
| Not Satisfied | Not Satisfied | NA            | Foster City   | Senior Leadership      |
| Not Satisfied | Satisfied     | Not Satisfied | Austin        | Middle Management      |
| Satisfied     | Satisfied     | Satisfied     | San Francisco | Senior Leadership      |
| Not Satisfied | Not Satisfied | Not Satisfied | Foster City   | Individual Contributor |
| Satisfied     | Satisfied     | NA            | Miami         | Middle Management      |
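
To make the example reproducible, the table above could be built as a DataFrame roughly like this. This is only a sketch: it assumes the "NA" cells are real missing values (`np.nan`) rather than the literal string "NA", which the question does not state, and the short names `S`, `NS`, and `NA` are introduced here just for readability.

```python
import numpy as np
import pandas as pd

# Assumption: "NA" in the table means a missing value, not the string "NA".
S, NS, NA = "Satisfied", "Not Satisfied", np.nan

rows = [
    (NS, NS, NS, "San Francisco", "Individual Contributor"),
    (S,  S,  NA, "Basingstoke",   "Individual Contributor"),
    (NS, S,  NS, "San Francisco", "Middle Management"),
    (NS, S,  NS, "Miami",         "Senior Leadership"),
    (NS, NS, NS, "Foster City",   "Senior Leadership"),
    (NA, NA, NA, "Foster City",   "Other"),
    (NS, NS, NA, "Foster City",   "Senior Leadership"),
    (NS, S,  NS, "Austin",        "Middle Management"),
    (S,  S,  S,  "San Francisco", "Senior Leadership"),
    (NS, NS, NS, "Foster City",   "Individual Contributor"),
    (S,  S,  NA, "Miami",         "Middle Management"),
]

# Column order matches the table in the question.
df = pd.DataFrame(
    rows, columns=["col1", "col2", "col3", "Building", "Levels_Name"]
)
```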

Now I want to group this dataset by "Building" and "Levels_Name", add a further grouping level for "Satisfied", "Not Satisfied", and "NA", and get the count of each value in every column.

So the result should look like this:

| Building      | Levels_Name            | Sentiment     | col1 | col2 | col3 |
|---------------|------------------------|---------------|------|------|------|
| Foster City   | Individual Contributor | Not Satisfied | 1    | 1    | 1    |
| Foster City   | Individual Contributor | NA            | 0    | 0    | 0    |
| Foster City   | Individual Contributor | Satisfied     | 0    | 0    | 0    |
| Foster City   | Senior Leadership      | Not Satisfied | 2    | 2    | 1    |
| Foster City   | Senior Leadership      | NA            | 0    | 0    | 1    |
| Foster City   | Senior Leadership      | Satisfied     | 0    | 0    | 0    |
| San Francisco | Individual Contributor | Not Satisfied | 1    | 1    | 1    |
| San Francisco | Individual Contributor | NA            | 0    | 0    | 0    |
| San Francisco | Individual Contributor | Satisfied     | 0    | 0    | 0    |

Thanks!

Best answer

First melt the DataFrame into long format, then group it and count the rows in each group:

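import numpy as np
import pandas as pd

# Melt col1..col3 into long format; label missing answers 'NaN' so they are counted.
d1 = pd.melt(
    df, ['Building', 'Levels_Name'], value_name='Sentiment'
).replace(np.nan, 'NaN')

# Count every combination, then pivot the original column names back into columns.
d1.groupby(
    d1.columns.tolist()
).size().unstack('variable', fill_value=0).reset_index()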

variable       Building             Levels_Name      Sentiment  col1  col2  col3
0                Austin       Middle Management  Not Satisfied     1     0     1
1                Austin       Middle Management      Satisfied     0     1     0
2           Basingstoke  Individual Contributor            NaN     0     0     1
3           Basingstoke  Individual Contributor      Satisfied     1     1     0
4           Foster City  Individual Contributor  Not Satisfied     1     1     1
5           Foster City                   Other            NaN     1     1     1
6           Foster City       Senior Leadership            NaN     0     0     1
7           Foster City       Senior Leadership  Not Satisfied     2     2     1
8                 Miami       Middle Management            NaN     0     0     1
9                 Miami       Middle Management      Satisfied     1     1     0
10                Miami       Senior Leadership  Not Satisfied     1     0     1
11                Miami       Senior Leadership      Satisfied     0     1     0
12        San Francisco  Individual Contributor  Not Satisfied     1     1     1
13        San Francisco       Middle Management  Not Satisfied     1     0     1
14        San Francisco       Middle Management      Satisfied     0     1     0
15        San Francisco       Senior Leadership      Satisfied     1     1     1
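
The output above lists only the sentiments that actually occur in each group, whereas the layout requested in the question also shows zero-count rows for every sentiment. One possible way to get those rows, sketched here on a small made-up sample, is to reindex the counts against every (Building, Levels_Name, Sentiment) combination. The names `keys`, `sentiments`, `long_df`, and `full_index` are illustrative and not part of the original answer, and missing answers are labelled "NA" to match the question's table.

```python
import numpy as np
import pandas as pd

# Small illustrative sample (not the full dataset from the question).
df = pd.DataFrame({
    "col1": ["Not Satisfied", "Not Satisfied", np.nan],
    "col2": ["Not Satisfied", "Satisfied", np.nan],
    "col3": ["Not Satisfied", np.nan, np.nan],
    "Building": ["Foster City", "Foster City", "Miami"],
    "Levels_Name": ["Senior Leadership", "Senior Leadership", "Other"],
})

keys = ["Building", "Levels_Name"]
sentiments = ["Not Satisfied", "NA", "Satisfied"]

# Long format: one row per (respondent, question), missing answers labelled "NA".
long_df = df.melt(id_vars=keys, value_name="Sentiment")
long_df["Sentiment"] = long_df["Sentiment"].fillna("NA")

# Count occurrences, with the original question columns as output columns.
counts = (
    long_df.groupby(keys + ["Sentiment", "variable"])
    .size()
    .unstack("variable", fill_value=0)
)

# Every sentiment for every group that occurs in the data.
groups = df[keys].drop_duplicates()
full_index = pd.MultiIndex.from_tuples(
    [(b, l, s)
     for b, l in groups.itertuples(index=False, name=None)
     for s in sentiments],
    names=keys + ["Sentiment"],
)

# Combinations that never occur are filled with 0 instead of being dropped.
result = counts.reindex(full_index, fill_value=0).reset_index()
result.columns.name = None
print(result)
```

Reindexing keeps the melt-and-group approach from the answer unchanged and only adds the missing rows afterwards, so it can be dropped if the zero rows are not needed.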

For more on "Python - Pandas - value_counts for all columns in a grouped DataFrame", see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/43829096/
