gpt4 book ai didi

python - 使用多种颜色的 seaborn.pairplot() 绘制数据框?

转载 作者:太空宇宙 更新时间:2023-11-03 15:36:50 24 4
gpt4 key购买 nike

我想创建一个类似于此图像的绘图,以便比较我的数据集的多个 dims。数据集没有预设。我设法以一种颜色正确显示数据,但我想要 y=0 的一种颜色和 y=1 的一种颜色来比较点。就像在 iris 数据集的图像中一样。只要我将 hue='y' 包含在 sns.pairplot 方法中,代码将不会编译到最后。

我也不明白控制台输出。有什么问题?

enter image description here 将 seaborn 导入为 sns; sns.set(style="ticks", color_codes=True) 将 pandas 导入为 pd

dataframe = pd.DataFrame(dict(F1=X[:, 0], F2=X[:, 1], F3=X[:, 2], F4=X[:, 3], y=y))

print(dataframe)

g = sns.pairplot(dataframe, hue='y')

这是dataframe 的输出。我觉得还不错:

            F1        F2        F3        F4    y
0 3.173182 2.849991 2.497907 2.851715 0.0
1 2.468625 -0.216985 0.275206 1.232518 1.0
2 2.398419 2.258931 2.255533 4.895872 0.0
3 1.379937 1.041677 1.165911 1.992650 1.0
4 2.489665 2.269068 4.129961 2.218203 0.0
5 4.140160 2.809088 2.973027 3.553128 0.0
6 2.997969 1.701299 2.978875 1.946793 0.0
7 3.864436 3.554276 3.568455 2.839489 0.0
8 -0.000605 1.376971 1.128350 1.293777 1.0
9 2.398057 1.180861 2.400801 2.264726 1.0
10 0.997385 -0.560205 0.954628 2.788858 1.0

... ... ... ... ... ...

3990 3.334553 4.576306 2.470476 3.032781 0.0
3991 1.465784 2.304793 1.267303 -0.030802 1.0
3992 0.505905 -0.280769 -1.223464 1.077305 1.0
3993 2.581596 3.924394 3.878303 2.579366 0.0
3994 4.362067 2.247818 2.948595 1.906314 0.0
3995 2.310546 0.006672 2.382227 1.940343 1.0
3996 -0.944635 1.387136 0.604135 2.421478 1.0
3997 1.290999 1.485965 0.262792 0.899340 1.0
3998 0.864532 1.759607 1.118346 1.038935 1.0
3999 1.819110 2.218838 3.927945 2.593009 0.0

[4000 rows x 5 columns]

但最终我收到了这个错误:

Traceback (most recent call last):
File "/Users//PycharmProjects//V3_multiTops/vergleich.py", line 131, in <module>
g = sns.pairplot(dataframe, hue='y')
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/axisgrid.py", line 2111, in pairplot
grid.map_diag(kdeplot, **diag_kws)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/axisgrid.py", line 1399, in map_diag
func(data_k, label=label_k, color=color, **kwargs)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 691, in kdeplot
cumulative=cumulative, **kwargs)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 294, in _univariate_kdeplot
x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 366, in _scipy_univariate_kde
kde = stats.gaussian_kde(data, bw_method=bw)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 172, in __init__
self.set_bandwidth(bw_method=bw_method)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 499, in set_bandwidth
self._compute_covariance()
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 510, in _compute_covariance
self._data_inv_cov = linalg.inv(self._data_covariance)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/linalg/basic.py", line 975, in inv
raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix

我想我对 sns.pairplot() 做错了什么,我还不明白。你能给我解释一下吗?

最佳答案

问题似乎是 "y" 列本身是数字。因此它将作为列/行包含在 pairgrid 中。无论如何,这似乎是不受欢迎的。要选择应参与网格的变量,请使用 pairplotvars 关键字。

 sns.pairplot(df, vars=df.columns[:-1], hue="y")

iris 数据集在不指定 vars 的情况下工作的原因是 hue 列不是数字。非数字列不包含在网格中。

完整示例:

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(300, 4), columns=[f"F{i+1}" for i in range(4)])
df["y"] = np.random.choice([1., 0.], size=len(df))

sns.pairplot(df, vars=df.columns[:-1], hue="y")
plt.show()

enter image description here

关于python - 使用多种颜色的 seaborn.pairplot() 绘制数据框?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/54317168/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com