
python - Seaborn countplot with normalized y axis per group

Reposted. Author: IT老高. Updated: 2023-10-28 21:13:24

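I was wondering whether it is possible to create a Seaborn count plot that, instead of the actual counts, shows on the y axis the relative frequency (percentage) within each group, where the group is the one specified with the hue parameter.

I solved this with the following approach, but I can't imagine it is the simplest way: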

# Plot percentage of occupation per income class
# (df is the Adult DataFrame whose creation is shown further down)
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

grouped = df.groupby(['income'], sort=False)
occupation_counts = grouped['occupation'].value_counts(normalize=True, sort=False)

# Flatten the (income, occupation) -> fraction mapping into one record per bar
occupation_data = [
    {'occupation': occupation, 'income': income, 'percentage': percentage * 100}
    for (income, occupation), percentage in dict(occupation_counts).items()
]

df_occupation = pd.DataFrame(occupation_data)

p = sns.barplot(x="occupation", y="percentage", hue="income", data=df_occupation)
_ = plt.setp(p.get_xticklabels(), rotation=90)  # Rotate labels

Result:

[Figure: percentage plot with seaborn]

I am using the well-known Adult dataset from the UCI machine learning repository. The pandas DataFrame is created like this:

# Read the adult dataset
df = pd.read_csv(
    "data/adult.data",
    engine='c',
    lineterminator='\n',
    names=['age', 'workclass', 'fnlwgt', 'education', 'education_num',
           'marital_status', 'occupation', 'relationship', 'race', 'sex',
           'capital_gain', 'capital_loss', 'hours_per_week',
           'native_country', 'income'],
    header=None,
    skipinitialspace=True,
    na_values="?"
)

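This question is somewhat related, but it does not use the hue parameter. In my case I cannot simply change the labels on the y axis, because the height of each bar must depend on its group.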
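To make that distinction concrete, here is a minimal sketch on hypothetical toy data (the `toy` frame and its values are made up for illustration and are not taken from the Adult dataset). It contrasts normalizing over the whole table with normalizing within each income group, using `pd.crosstab`:

```python
import pandas as pd

# Hypothetical toy data standing in for the 'income' and 'occupation' columns.
toy = pd.DataFrame({
    "income": ["<=50K", "<=50K", "<=50K", "<=50K", ">50K", ">50K"],
    "occupation": ["Sales", "Sales", "Sales", "Tech", "Sales", "Tech"],
})

# Normalizing over the whole table: all cells together sum to 100.
overall = pd.crosstab(toy["income"], toy["occupation"], normalize="all") * 100

# Normalizing within each income group: every row sums to 100.
per_group = pd.crosstab(toy["income"], toy["occupation"], normalize="index") * 100

print(overall.round(1))
print(per_group.round(1))
```

Relabeling the y axis of a plain count plot can only reproduce the first table. The bar heights asked for here correspond to the second one, where each income group is scaled by its own size.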

Best answer

With newer versions of seaborn you can do the following:

import pandas as pd
import seaborn as sns
sns.set(color_codes=True)

df = sns.load_dataset('titanic')
df.head()

x, y = 'class', 'survived'

# Percentage of each 'survived' value within each 'class',
# reshaped to a tidy frame and passed to catplot as its 'data' argument
(df
 .groupby(x)[y]
 .value_counts(normalize=True)
 .mul(100)
 .rename('percent')
 .reset_index()
 .pipe((sns.catplot, 'data'), x=x, y='percent', hue=y, kind='bar'))


Output:

[Figure: grouped bar chart of percent by class, colored by survived]

Update: also show the percentages on top of the bars

If you also want the percentage labels, you can do the following:

import pandas as pd
import seaborn as sns

df = sns.load_dataset('titanic')
df.head()

x, y = 'class', 'survived'

df1 = df.groupby(x)[y].value_counts(normalize=True)
df1 = df1.mul(100)
df1 = df1.rename('percent').reset_index()

g = sns.catplot(x=x, y='percent', hue=y, kind='bar', data=df1)
g.ax.set_ylim(0, 100)

# Write each bar's height as a percentage label at the top-left corner of the bar
for p in g.ax.patches:
    txt = f"{p.get_height():.2f}%"
    txt_x = p.get_x()
    txt_y = p.get_height()
    g.ax.text(txt_x, txt_y, txt)

[Figure: the same bar chart with a percentage label above each bar]
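If matplotlib 3.4 or later is available, `Axes.bar_label` may be a tidier way to add these labels, since it centers the text over each bar. The sketch below is an alternative to the manual loop, not the answer's method. It uses a hypothetical tidy frame in the same shape as `df1` (the `toy` name and the percentages are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import pandas as pd
import seaborn as sns

# Hypothetical tidy frame: one row per (class, survived) pair, made-up percentages.
toy = pd.DataFrame({
    "class": ["First", "First", "Third", "Third"],
    "survived": [0, 1, 0, 1],
    "percent": [40.0, 60.0, 75.0, 25.0],
})

ax = sns.barplot(data=toy, x="class", y="percent", hue="survived")
ax.set_ylim(0, 100)

# seaborn groups the bars of each hue level into a container;
# bar_label annotates every bar of a container with its height.
labels = []
for container in ax.containers:
    labels.extend(ax.bar_label(container, fmt="%.1f%%"))

print([t.get_text() for t in labels])
```

With `catplot`, the same loop can be run on `g.ax.containers`.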

Regarding python - Seaborn countplot with normalized y axis per group, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34615854/
