
python - Assigning custom colors to clusters using numpy

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:47:26

Is there a way to use my own preferred colors (8 to 10, or more) for the different clusters plotted by the following code?

import numpy as np

# np.float was removed in NumPy 1.24; the builtin float is equivalent
existing_df_2d.plot(
    kind='scatter',
    x='PC2', y='PC1',
    c=existing_df_2d.cluster.astype(float),
    figsize=(16, 8))

The code comes from here: https://www.codementor.io/python/tutorial/data-science-python-pandas-r-dimensionality-reduction

Thanks.

I tried the following, without success:

LABEL_COLOR_MAP = {0: 'red',
                   1: 'blue',
                   2: 'green',
                   3: 'purple'}

label_color = [LABEL_COLOR_MAP[l] for l in range(len(np.unique(existing_df_2d.cluster)))]

existing_df_2d.plot(
    kind='scatter',
    x='PC2', y='PC1',
    c=label_color,
    figsize=(16, 8))

Best answer

You need to add a color for the fifth cluster label, 4, and map the cluster column to colors through the LABEL_COLOR_MAP dictionary:

LABEL_COLOR_MAP = {0: 'red',
                   1: 'blue',
                   2: 'green',
                   3: 'purple',
                   4: 'yellow'}

existing_df_2d.plot(
    kind='scatter',
    x='PC2', y='PC1',
    c=existing_df_2d.cluster.map(LABEL_COLOR_MAP),
    figsize=(16, 8))

This is because the cluster column contains five distinct labels:

print(np.unique(existing_df_2d.cluster))
[0 1 2 3 4]
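
To illustrate why the fifth entry matters, here is a minimal sketch on a made-up Series (the `labels` values are hypothetical and only stand in for `existing_df_2d.cluster`). It assumes nothing beyond pandas: `Series.map` with a dict returns one color per row and leaves NaN wherever a label has no entry, while a list built per unique label, as in the question, has one entry per cluster rather than one per row.

```python
import pandas as pd

# Hypothetical cluster labels standing in for existing_df_2d.cluster
labels = pd.Series([2, 3, 3, 0, 4, 1], name="cluster")

four_colors = {0: "red", 1: "blue", 2: "green", 3: "purple"}
five_colors = {**four_colors, 4: "yellow"}

# Label 4 has no entry in four_colors, so its row becomes NaN
incomplete = labels.map(four_colors)

# With all five labels covered, every row gets a color name
complete = labels.map(five_colors)

# A list built per unique label has one entry per cluster, not per row
per_cluster = [five_colors[l] for l in range(labels.nunique())]

print(incomplete.isna().sum())         # 1
print(complete.tolist())               # ['green', 'purple', 'purple', 'red', 'yellow', 'blue']
print(len(per_cluster), len(labels))   # 5 6
```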

Full code:

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

tb_existing_url_csv = 'https://docs.google.com/spreadsheets/d/1X5Jp7Q8pTs3KLJ5JBWKhncVACGsg5v4xu6badNs4C7I/pub?gid=0&output=csv'

# Load the data: one row per country, one column per year
existing_df = pd.read_csv(
    tb_existing_url_csv,
    index_col=0,
    thousands=',')
existing_df.index.names = ['country']
existing_df.columns.names = ['year']

# Project the data onto its first two principal components
pca = PCA(n_components=2)
pca.fit(existing_df)
existing_2d = pca.transform(existing_df)

existing_df_2d = pd.DataFrame(existing_2d)
existing_df_2d.index = existing_df.index
existing_df_2d.columns = ['PC1', 'PC2']
existing_df_2d.head()

# Cluster the original data into 5 groups and store the labels
kmeans = KMeans(n_clusters=5)
clusters = kmeans.fit(existing_df)
existing_df_2d['cluster'] = pd.Series(clusters.labels_, index=existing_df_2d.index)
print(existing_df_2d.head())

                       PC1         PC2  cluster
country
Afghanistan    -732.215864  203.381494        2
Albania         613.296510    4.715978        3
Algeria         569.303713  -36.837051        3
American Samoa  717.082766    5.464696        3
Andorra         661.802241   11.037736        3

LABEL_COLOR_MAP = {0: 'red',
                   1: 'blue',
                   2: 'green',
                   3: 'purple',
                   4: 'yellow'}

existing_df_2d.plot(
    kind='scatter',
    x='PC2', y='PC1',
    c=existing_df_2d.cluster.map(LABEL_COLOR_MAP),
    figsize=(16, 8))

(Figure: the resulting scatter plot of PC1 against PC2, with points colored by cluster.)
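
Since the question asks for 8 to 10 colors or more, the dictionary can also be built from a list instead of being typed out by hand. The following is only a sketch under assumptions: `PALETTE` and `build_color_map` are hypothetical names that do not appear in the original answer, the sample labels are made up, and the color names are ordinary named colors chosen for illustration.

```python
import pandas as pd

# Hypothetical palette; extend it with any further named colors as needed
PALETTE = ["red", "blue", "green", "purple", "yellow",
           "orange", "cyan", "magenta", "brown", "gray"]


def build_color_map(cluster_labels, palette=PALETTE):
    """Assign a palette entry to each distinct label, in sorted label order."""
    unique_labels = sorted(pd.Series(cluster_labels).unique())
    if len(unique_labels) > len(palette):
        raise ValueError("palette has fewer colors than there are clusters")
    return dict(zip(unique_labels, palette))


# Made-up labels for eight clusters, standing in for existing_df_2d.cluster
labels = pd.Series([0, 3, 1, 2, 4, 4, 7, 5, 6])
color_map = build_color_map(labels)

# One color name per row, suitable for the c= argument of the scatter plot
colors = labels.map(color_map)
print(colors.tolist())
```

With the real data, the same idea would be `c=existing_df_2d.cluster.map(build_color_map(existing_df_2d.cluster))` in the plot call. Raising an error when the palette is too short is a design choice here, so that a missing color is noticed before plotting instead of showing up as NaN.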

Test:

The top 10 rows by the PC2 column:

print(existing_df_2d.loc[existing_df_2d['PC2'].nlargest(10).index, :])
                          PC1         PC2  cluster
country
Kiribati         -2234.809790  864.494075        2
Djibouti         -3798.447446  578.975277        4
Bhutan           -1742.709249  569.448954        2
Solomon Islands   -809.277671  530.292939        1
Nepal             -986.570652  525.624757        1
Korea, Dem. Rep. -2146.623299  438.945977        2
Timor-Leste      -1618.364795  428.244340        2
Tuvalu           -1075.316806  366.666171        1
Mongolia          -686.839037  363.722971        1
India            -1146.809345  363.270389        1
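
As a side note, the same selection can be written more directly with `DataFrame.nlargest`. A small sketch on made-up numbers (the frame `df` and its values are hypothetical, not the tutorial data):

```python
import pandas as pd

# Hypothetical stand-in for existing_df_2d
df = pd.DataFrame(
    {"PC1": [-10.0, 20.0, 30.0, 40.0], "PC2": [8.0, 1.0, -3.0, 2.0]},
    index=pd.Index(["A", "B", "C", "D"], name="country"),
)

# Form used in the answer: take the index of the largest PC2 values, then look the rows up
via_index = df.loc[df["PC2"].nlargest(2).index, :]

# Shorter form: ask the DataFrame for the rows with the largest PC2 directly
direct = df.nlargest(2, "PC2")

print(direct)
```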

This post about "python - Assigning custom colors to clusters using numpy" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/36180477/
