python - Remove extraneous values from a dictionary in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:10:42

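Thanks for your solution. However, I am having trouble applying it to my data in such a way that the column headers are left untouched when the extraneous values are searched for and replaced. Here is my DataFrame. Please assist.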

import pandas as pd

df = pd.DataFrame({
    'Date_sampled': ['8/31/2018 0:00', '9/31/2018 12:00:00 AM', '2/31/2018 12:00:00 AM',
                     '2/31/2018 12:00:00 AM', '12/31/2018 0:00', '12/31/2018 0:00',
                     '12/31/2018 0:00', '6/31/2018 12:00:00 AM', '2/31/2018 12:00:00 AM',
                     '2/31/2018 12:00:00 AM', '12/31/2018 0:00', '12/31/2018 0:00'],
    'apple18:apple1': ['15.8', '27.84883300816733\\U', '27.68303400840678\\O', '???', '?????',
                       '67.61', '27.33', '37.73069872941176\\M', '37.98761171079137\\F',
                       '10.2\\I', '10.1\\Y', '67.61'],
    'Orange:ripe': ['89.59', '44.64197389840307\\Y', '39.93121897299962\\W', '7.2\\K', '6.0\\Y',
                    '9.19', '18.62', '???', '???', '7.2\\T', '7.0\\D', '79.1'],
    'Banana': ['51.36', '?????', '???', '23.77814972104277\\T', '27.80709611086276\\N', '13.3\\T',
               '31.27', '?????', '???', '17.3\\H', '16.4\\E', '11.36'],
    'Egg24:Eg17 (Toasted:Scrammed)': ['17.98', '13.3\\T', '9.4\\J', '2396,7', 'nan', '14', 'None',
                                      'None', '14.8', '44.64197349440307\\Y',
                                      '39.93151497599965\\W', '-'],
    'Bread(white)': ['23.24', '6.1\\Q', '7.2\\K', 'None', 'None', '20', 'None', 'None', '20.4',
                     '3473,3', '1606,3', '47,7'],
    'Potato:24': ['-', '-', '-', '-', 'nan', 'nan', 'nan', '343.859844\\OP', '56.06332588\\RS',
                  '75.1973942\\ZTO', 'nan', '-'],
})

Best answer

I believe you need Series.str.replace to turn the decimal commas into points, followed by Series.str.extract to pull out the numeric values:

import pandas as pd

d = {'apple': ['15.8', '356,2', '51.36', '17986,8', '6.0\\tY', 'Null'],
     'banana': ['27.84883300816733\\U', 'Z44.64197389840307\\Y', '?????', '13.3\\T', 'p17.6', '6.1\\Q'],
     'cheese': ['27.68303400840678\\O', '39.93121897299962\\W', '???', '9.4\\J', '7.2\\K', '6.0\\Y'],
     'egg': ['???', '7.2\\K', '66.0\\p', '23.77814972104277\\T', '2396,7', 'None']}

df = pd.DataFrame(d)
print (df)
     apple                banana               cheese                  egg
0     15.8   27.84883300816733\U  27.68303400840678\O                  ???
1    356,2  Z44.64197389840307\Y  39.93121897299962\W                7.2\K
2    51.36                 ?????                  ???               66.0\p
3  17986,8                13.3\T                9.4\J  23.77814972104277\T
4   6.0\tY                 p17.6                7.2\K               2396,7
5     Null                 6.1\Q                6.0\Y                 None

#https://stackoverflow.com/a/28832504/2901002
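# one or more digits, optionally followed by a decimal point and more digits
pat = r"(\d+\.*\d*)"
# replace decimal commas with points, then keep the first numeric match per cell
df = df.apply(lambda x: x.str.replace(',','.').str.extract(pat, expand=False))
print (df)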

     apple             banana             cheese                egg
0     15.8  27.84883300816733  27.68303400840678                NaN
1    356.2  44.64197389840307  39.93121897299962                7.2
2    51.36                NaN                NaN               66.0
3  17986.8               13.3                9.4  23.77814972104277
4      6.0               17.6                7.2             2396.7
5      NaN                6.1                6.0                NaN
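
To see why some cells end up as NaN, it can help to run the same two steps on a single column. This is only a minimal sketch: the sample values are taken from the data above, the names `s`, `with_points` and `cleaned` are illustrative, and `regex=False` is an addition that makes the comma a literal string regardless of the pandas version.

```python
import pandas as pd

pat = r"(\d+\.*\d*)"  # same pattern as in the answer

s = pd.Series(['356,2', 'Z44.64197389840307\\Y', '?????', '6.0\\tY', 'Null'])

# Step 1: turn the decimal comma into a decimal point.
with_points = s.str.replace(',', '.', regex=False)

# Step 2: keep only the first run of digits (with an optional fractional part).
# Cells that contain no digits at all have no match and become missing values.
cleaned = with_points.str.extract(pat, expand=False)

print(cleaned)
```

Leading letters such as the `Z` in the second value and trailing suffixes such as `\Y` are dropped because the pattern captures only the first numeric run in each string.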

Finally, you can convert to float:

df = df.apply(lambda x: x.str.replace(',','.').str.extract(pat, expand=False)).astype(float)
print (df)
      apple     banana     cheese         egg
0     15.80  27.848833  27.683034         NaN
1    356.20  44.641974  39.931219     7.20000
2     51.36        NaN        NaN    66.00000
3  17986.80  13.300000   9.400000    23.77815
4      6.00  17.600000   7.200000  2396.70000
5       NaN   6.100000   6.000000         NaN
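
The question asks that the column headers stay untouched, and its DataFrame also has a Date_sampled column that presumably should not be run through the numeric pattern. One possible way to handle that, sketched here under those assumptions, is to clean only the value columns and assign each result back. The shortened sample data and the names `value_cols` and `clean` are illustrative, and `pd.to_numeric(..., errors='coerce')` is used here as an alternative to `astype(float)`, not as part of the original answer.

```python
import pandas as pd

# Shortened sample in the shape of the question's data (illustrative only).
df = pd.DataFrame({
    'Date_sampled': ['8/31/2018 0:00', '12/31/2018 0:00', '12/31/2018 0:00'],
    'apple18:apple1': ['15.8', '27.84883300816733\\U', '???'],
    'Bread(white)': ['23.24', '3473,3', 'None'],
    'Potato:24': ['-', '343.859844\\OP', 'nan'],
})

pat = r"(\d+\.*\d*)"

# Every column except the date column. The headers are never modified,
# because the .str methods act on the cell values only.
value_cols = [c for c in df.columns if c != 'Date_sampled']

def clean(col):
    # Decimal comma -> decimal point, then keep the first numeric run.
    extracted = col.str.replace(',', '.', regex=False).str.extract(pat, expand=False)
    # Cells without a number become NaN instead of raising an error.
    return pd.to_numeric(extracted, errors='coerce')

for col in value_cols:
    df[col] = clean(df[col])

print(df)
```

The column names, including those with colons and parentheses, come out unchanged, and Date_sampled keeps its original strings.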

Regarding python - Remove extraneous values from a dictionary in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55086179/
