
python - How do I subset a DataFrame using a dictionary?

Repost. Author: 太空狗. Updated: 2023-10-29 20:51:14

Suppose I am given a DataFrame in which most of the columns hold categorical data.

> data.head()
   age risk     sex smoking
0   28   no    male      no
1   58   no  female      no
2   27   no    male     yes
3   26   no    male      no
4   29  yes  female     yes

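I want to subset this data using a dictionary of key-value pairs, where each key is one of these categorical columns and each value is the category to match.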

tmp = {'risk':'no', 'smoking':'yes', 'sex':'female'}

So I want the following subset.

data[ (data.risk == 'no') & (data.smoking == 'yes') & (data.sex == 'female')]

What I would like to write is:

data[tmp]

What is the most Pythonic/pandas way to do this?


Minimal example:

import numpy as np
import pandas as pd
from pandas import Series, DataFrame

# randint lives in numpy.random; rename_categories maps the codes 0/1 to labels
x = Series(np.random.randint(0, 2, 50), dtype='category').cat.rename_categories(['no', 'yes'])

y = Series(np.random.randint(0, 2, 50), dtype='category').cat.rename_categories(['no', 'yes'])

z = Series(np.random.randint(0, 2, 50), dtype='category').cat.rename_categories(['male', 'female'])

a = Series(np.random.randint(20, 60, 50), dtype='category')

data = DataFrame({'risk': x, 'smoking': y, 'sex': z, 'age': a})

tmp = {'risk': 'no', 'smoking': 'yes', 'sex': 'female'}

Best answer

I would use the .query() method for this task:

qry = ' and '.join(["{} == '{}'".format(k,v) for k,v in tmp.items()])    

data.query(qry)

Output:

    age risk     sex smoking
7    24   no  female     yes
22   43   no  female     yes
23   42   no  female     yes
25   24   no  female     yes
32   29   no  female     yes
40   34   no  female     yes
43   35   no  female     yes
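
For comparison, the same kind of subset can also be taken without building a string at all. The sketch below is not part of the original answer: it compares the columns named in the dictionary against a Series built from that dictionary and keeps the rows where every comparison holds. The small hand-written frame and the filter values are illustrative assumptions, chosen so the result is easy to check by eye.

```python
import pandas as pd

# Hand-made frame mirroring the question's columns (values are illustrative).
data = pd.DataFrame({
    'age': [28, 58, 27, 26, 29],
    'risk': ['no', 'no', 'no', 'no', 'yes'],
    'sex': ['male', 'female', 'male', 'male', 'female'],
    'smoking': ['no', 'no', 'yes', 'no', 'yes'],
})

# Illustrative filter; keys are column names, values are the wanted entries.
tmp = {'risk': 'no', 'smoking': 'no', 'sex': 'male'}

# Select only the columns named in the dict, in the dict's order, so the
# column labels line up with the index of pd.Series(tmp).
# The comparison is done column by column; all(axis=1) requires every
# column in a row to match.
mask = (data[list(tmp)] == pd.Series(tmp)).all(axis=1)
subset = data[mask]
print(subset)
```

Because no expression is parsed, this variant should not care whether the values contain quotes or whether the column names are valid identifiers.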

Query string:

print(qry)
risk == 'no' and smoking == 'yes' and sex == 'female'

Regarding "python - How do I subset a DataFrame using a dictionary?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40111730/
