
python - How to convert a groupby result to a DataFrame

Reposted. Author: 行者123. Updated: 2023-12-01 02:59:09

I have the following DataFrame:

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'category': ['ctr', 'ctr', 'ctr', 'ctr', 'ctr', 'ctr'],
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'sample_id': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

This produces:

In [2]: df
Out[2]:
  category  expected_count gene_symbol sample_id
0      ctr           100.0           a        S1
1      ctr           100.0           b        S1
2      ctr           112.0           c        S1
3      ctr             1.3           a        S2
4      ctr            14.0           b        S2
5      ctr           125.0           c        S2

I can group it by gene symbol:

In [4]: gdf = df.groupby(by = 'gene_symbol')['expected_count'].mean()
...: gdf
...:
Out[4]:
gene_symbol
a     50.65
b     57.00
c    118.50
Name: expected_count, dtype: float64

In [5]: str(gdf)
Out[5]: 'gene_symbol\na 50.65\nb 57.00\nc 118.50\nName: expected_count, dtype: float64'

Note that gdf is a string. How can I convert it to a DataFrame?

Best answer

You need either as_index=False or reset_index:

gdf = df.groupby('gene_symbol', as_index=False)['expected_count'].mean()
print(gdf)
  gene_symbol  expected_count
0           a           50.65
1           b           57.00
2           c          118.50

Or:

gdf = df.groupby('gene_symbol')['expected_count'].mean().reset_index()
print(gdf)
  gene_symbol  expected_count
0           a           50.65
1           b           57.00
2           c          118.50
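
As a quick check, the two approaches can be compared side by side. This is only a minimal sketch that rebuilds the question's data; the names via_as_index, via_reset and same are made up for illustration and do not appear in the original answer.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'category': ['ctr'] * 6,
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'sample_id': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

# Approach 1: keep the grouping key as a regular column while grouping.
via_as_index = df.groupby('gene_symbol', as_index=False)['expected_count'].mean()

# Approach 2: aggregate to a Series first, then move the index into a column.
via_reset = df.groupby('gene_symbol')['expected_count'].mean().reset_index()

# Both results are DataFrames with the same columns and values.
same = via_as_index.equals(via_reset)
print(type(via_as_index), type(via_reset), same)
```

Which one to use is mostly a matter of taste: as_index=False states the intent up front, while reset_index is convenient when the Series already exists.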

The output is not a string but a Series:

print (type(df.groupby('gene_symbol')['expected_count'].mean()))
<class 'pandas.core.series.Series'>
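
The confusion in the question most likely comes from str(gdf), which only builds a printable rendering and leaves gdf itself untouched. The sketch below illustrates that, and also shows Series.to_frame as a further option when keeping gene_symbol as the index is acceptable. The variable names (result, as_text, frame_indexed, frame_flat) are hypothetical.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'category': ['ctr'] * 6,
    'expected_count': [100, 100, 112, 1.3, 14, 125],
    'sample_id': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
    'gene_symbol': ['a', 'b', 'c', 'a', 'b', 'c'],
})

result = df.groupby('gene_symbol')['expected_count'].mean()

# The aggregation returns a Series, not a string.
is_series = isinstance(result, pd.Series)

# str() produces a separate text rendering; result stays a Series.
as_text = str(result)

# to_frame() wraps the Series in a one-column DataFrame,
# with gene_symbol kept as the index.
frame_indexed = result.to_frame()

# reset_index() turns gene_symbol into an ordinary column instead.
frame_flat = result.reset_index()

print(is_series, type(as_text).__name__)
print(frame_indexed.columns.tolist(), frame_flat.columns.tolist())
```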

Regarding python - How to convert a groupby result to a DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43973537/
