python - Getting the average of 2 numbers in a CSV field with Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:47:50

I am trying to clean up a dataset (CSV) in Python (pandas).

In the "Projected investment" column, some entries contain two numbers, for example 30-35. How can I get their average, so that the field contains 32.5?

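Best answer

I think it is best to create a float column rather than mixing numbers with strings.

First replace 'missing' with NaN, then split on '-', convert to float, and finally take the mean: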

import numpy as np
import pandas as pd

df = pd.DataFrame({'Projected investment': ['missing', '30-35', '77']})
print(df)
  Projected investment
0              missing
1                30-35
2                   77

# Replace the placeholder, split on '-', convert the pieces to float, then average per row.
df['Projected investment'] = (
    df['Projected investment']
    .replace('missing', np.nan)
    .str.split('-', expand=True)
    .astype(float)
    .mean(axis=1)
)
print(df)
   Projected investment
0                   NaN
1                  32.5
2                  77.0

print(df['Projected investment'].dtype)
float64
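
The chain above only handles the literal placeholder 'missing'. As a minimal sketch, assuming the column may contain other non-numeric tokens as well (the 'n/a' entry below is a made-up example, not from the original data), `pd.to_numeric` with `errors='coerce'` turns anything unparseable into NaN without listing each placeholder:

```python
import pandas as pd

# 'n/a' is a hypothetical extra placeholder added for illustration.
df = pd.DataFrame({'Projected investment': ['missing', '30-35', '77', 'n/a']})

# Split ranges such as '30-35' into separate columns.
parts = df['Projected investment'].str.split('-', expand=True)

# Convert each column to numbers; unparseable or empty cells become NaN.
numeric = parts.apply(pd.to_numeric, errors='coerce')

# Average across each row; NaN cells are skipped, all-NaN rows stay NaN.
df['Projected investment'] = numeric.mean(axis=1)
print(df)
```

Note that splitting on '-' would misread negative numbers such as '-5', so this sketch assumes all values are non-negative.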

If you need to keep 'missing' as a string:

def parse_number(x):
    # Average the '-'-separated numbers; leave non-numeric values (e.g. 'missing') unchanged.
    try:
        return np.mean(np.array(str(x).split('-')).astype(float))
    except ValueError:
        return x

df['Projected investment'] = df['Projected investment'].map(parse_number)
print(df)
  Projected investment
0              missing
1                 32.5
2                   77

print(df['Projected investment'].apply(type))
0              <class 'str'>
1    <class 'numpy.float64'>
2    <class 'numpy.float64'>
Name: Projected investment, dtype: object
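
Because the result is an object column that mixes strings and floats, numeric operations on it need a conversion step first. The following is a rough sketch, using a hand-built Series that stands in for the mixed column above, of recovering a float column later with `pd.to_numeric`:

```python
import pandas as pd

# Stand-in for the mixed column produced by parse_number (strings and floats together).
mixed = pd.Series(['missing', 32.5, 77.0], name='Projected investment')
print(mixed.dtype)  # object

# Coerce non-numeric entries such as 'missing' to NaN to get a float column back.
as_float = pd.to_numeric(mixed, errors='coerce')
print(as_float.dtype)   # float64
print(as_float.mean())  # 54.75, the NaN entry is skipped
```

This is one reason the float column from the first approach is usually easier to work with.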

Regarding "python - Getting the average of 2 numbers in a CSV field with Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46113083/
