python - How to read columns one by one in python pandas?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:58:22

I have read a file from a URL as follows:

    import pandas as pd

    # UCI iris data; the file has no header row, so column names are passed in
    url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
    names = ['sepal length', 'sepal width', 'petal length', 'petal width', 'class']
    data = pd.read_csv(url, names=names)

    print(data.shape)
    print(data)

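Now I want to read one column and do some processing on it (perhaps min, max, standard deviation, r-score, etc.), then read the next column and process it in the same way.

Is there a way to do this in scikit-learn/pandas/Python?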

Best answer

You can use describe, which summarizes every numeric column in one call:

    data.describe()

Output:

           sepal length  sepal width  petal length  petal width
    count    150.000000   150.000000    150.000000   150.000000
    mean       5.843333     3.054000      3.758667     1.198667
    std        0.828066     0.433594      1.764420     0.763161
    min        4.300000     2.000000      1.000000     0.100000
    25%        5.100000     2.800000      1.600000     0.300000
    50%        5.800000     3.000000      4.350000     1.300000
    75%        6.400000     3.300000      5.100000     1.800000
    max        7.900000     4.400000      6.900000     2.500000
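
If only a few of these statistics are needed, one option is to request them by name with agg. The following is a minimal sketch rather than part of the original answer; it uses a small hand-typed frame (a hypothetical stand-in for the iris data, so it runs without a network connection) and the variable names numeric and summary are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the iris frame: four hand-typed rows
data = pd.DataFrame({
    "sepal length": [5.1, 4.9, 6.3, 5.8],
    "petal length": [1.4, 1.4, 6.0, 5.1],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

# Drop the text column, then compute only the requested statistics
numeric = data.select_dtypes(include="number")
summary = numeric.agg(["min", "max", "std"])
print(summary)
```

The result has one row per statistic and one column per numeric column, so a single value can be read with summary.loc["min", "sepal length"].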

Or for a single column:

    data['petal length'].describe()

Output:

    count    150.000000
    mean       3.758667
    std        1.764420
    min        1.000000
    25%        1.600000
    50%        4.350000
    75%        5.100000
    max        6.900000
    Name: petal length, dtype: float64
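
To visit the columns strictly one after another, as the question describes, a plain loop over DataFrame.items() is one possibility. This is a sketch under assumptions, not the original answer's code: the sample frame is hand-typed, the results dictionary is a hypothetical container, and the choice of statistics per column type is only an illustration.

```python
import pandas as pd

# Hypothetical stand-in for the iris frame: four hand-typed rows
data = pd.DataFrame({
    "sepal length": [5.1, 4.9, 6.3, 5.8],
    "petal length": [1.4, 1.4, 6.0, 5.1],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

results = {}
# items() yields (column name, column as a Series), one column at a time
for name, column in data.items():
    if pd.api.types.is_numeric_dtype(column):
        results[name] = {
            "min": column.min(),
            "max": column.max(),
            "std": column.std(),
        }
    else:
        # Text columns have no min/max/std in the numeric sense
        results[name] = {"unique": column.nunique()}

for name, stats in results.items():
    print(name, stats)
```

A loop like this is usually slower than describe or agg, but it makes it easy to run arbitrary per-column logic.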

Or you can use apply with a lambda to run custom processing on each column:

    data.apply(lambda x: x.describe())

Output:

            sepal length  sepal width  petal length  petal width        class
    25%         5.100000     2.800000      1.600000     0.300000          NaN
    50%         5.800000     3.000000      4.350000     1.300000          NaN
    75%         6.400000     3.300000      5.100000     1.800000          NaN
    count     150.000000   150.000000    150.000000   150.000000          150
    freq             NaN          NaN           NaN          NaN           50
    max         7.900000     4.400000      6.900000     2.500000          NaN
    mean        5.843333     3.054000      3.758667     1.198667          NaN
    min         4.300000     2.000000      1.000000     0.100000          NaN
    std         0.828066     0.433594      1.764420     0.763161          NaN
    top              NaN          NaN           NaN          NaN  Iris-setosa
    unique           NaN          NaN           NaN          NaN            3
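
The lambda can be replaced by any function that takes a column and returns a value or a Series. As a rough illustration (not from the original answer), the sketch below passes a hypothetical helper named summarize, which computes the value range and the coefficient of variation, to apply on the numeric columns of a small hand-typed frame.

```python
import pandas as pd


def summarize(column: pd.Series) -> pd.Series:
    """Hypothetical custom summary: value range and coefficient of variation."""
    return pd.Series({
        "range": column.max() - column.min(),
        "cv": column.std() / column.mean(),
    })


# Hypothetical stand-in for the iris frame: four hand-typed rows
data = pd.DataFrame({
    "sepal length": [5.1, 4.9, 6.3, 5.8],
    "petal length": [1.4, 1.4, 6.0, 5.1],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

# apply calls summarize once per column; text columns are filtered out first
custom = data.select_dtypes(include="number").apply(summarize)
print(custom)
```

Filtering with select_dtypes first avoids calling arithmetic on the class column, which would otherwise raise an error inside summarize.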

Regarding "python - How to read columns one by one in python pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44897075/