python - Plotting a graph in Seaborn with multiple "hue"-like attributes

Reposted by: 太空狗. Updated: 2023-10-29 18:08:01


I have the following sample dataset called df, where each stage time is the number of days taken to reach that stage:

id stage1_time stage_1_to_2_time stage_2_time stage_2_to_3_time stage3_time
a  10          30                40           30                70
b  30               
c  15          30                45     
d

I wrote the following script to get a scatter plot of stage1_time against its CDF:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

# Use explicit ids (the builtin `id` is a function) and avoid shadowing the builtin `dict`
data = {'id': ['a', 'b', 'c', 'd'], 'stage_1_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'stage_2_time': [40, None, 45, None], 'stage_2_to_3_time': [30, None, None, None], 'stage_3_time': [70, None, None, None]}
df = pd.DataFrame(data)

#create eCDF function
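def ecdf(data):
    # Flatten to 1-D first so the sort runs over all values, not within each row
    x = np.sort(np.asarray(data, dtype=float).ravel())
    n = len(x)
    y = np.arange(1.0, n+1) / n
    return x, y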

def generate_scatter_plot(df):

    x, y = ecdf(df)

    plt.plot(x, y, marker='.', linestyle='none') 
    plt.axvline(x.mean(), color='gray', linestyle='dashed', linewidth=2) #Add mean

    x_m = int(x.mean())
    # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
    y_m = stats.percentileofscore(df.to_numpy().ravel(), x.mean())/100.0

    plt.annotate('(%s,%s)' % (x_m,int(y_m*100)) , xy=(x_m,y_m), xytext=(10,-5), textcoords='offset points')

    percentiles= np.array([0,25,50,75,100])
    x_p = np.percentile(df, percentiles)
    y_p = percentiles/100.0

    plt.plot(x_p, y_p, marker='D', color='red', linestyle='none') # Overlay quartiles

    for x,y in zip(x_p, y_p):                                        
        plt.annotate('%s' % int(x), xy=(x,y), xytext=(10,-5), textcoords='offset points')

#Data to plot
stage1_time = df['stage_1_time'].dropna().sort_values()

#Scatter Plot
stage1_time_scatter = generate_scatter_plot(pd.DataFrame({"df" : stage1_time.to_numpy()}))
plt.title('Scatter Plot of Days to Stage1')
plt.xlabel('Days to Stage1')
plt.ylabel('Cumulative Probability')
plt.legend(('Days to Stage1', "Mean", 'Quartiles'), loc='lower right')
plt.margins(0.02)

plt.show()

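Output:

Currently I plot the number of days taken by everyone who reached stage1 against its cumulative probability. What I want is a scatter plot with three colors: those who reached stage1 and stayed there, those who moved on to stage2, and those who moved on to stage3. I would also like the plot to show the counts: # in stage1, # in stage2, and # in stage3.

Could anyone help me get there?

For reference, the aim is to build on this so that I can also create a chart for stage2_time in which those who reached stage_3 are highlighted in a different color.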

Best answer

You can create a new column that stores the last stage each row reached, then use this new column to color your plot.
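As a minimal sketch of the new-column idea (an illustration, not the answerer's code), the last stage can be derived by counting the non-null stage columns per row. This assumes stages are always reached in order, so a row with two recorded stage times is in stage 2. The names `stage_cols`, `stage`, and `stage_counts` are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question's table (NaN = stage never reached).
df = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "stage_1_time": [10, 30, 15, np.nan],
    "stage_2_time": [40, np.nan, 45, np.nan],
    "stage_3_time": [70, np.nan, np.nan, np.nan],
})

# Assumption: these are the columns holding the arrival time for each stage.
stage_cols = ["stage_1_time", "stage_2_time", "stage_3_time"]

# Assumption: stages are reached in order, so the number of recorded
# stage times in a row equals the last stage reached (0 = none).
df["stage"] = df[stage_cols].notna().sum(axis=1)

# Number of rows whose last stage is 1, 2, 3 (rows that reached none are skipped).
stage_counts = df.loc[df["stage"] > 0, "stage"].value_counts().sort_index()
print(df[["id", "stage"]])
print(stage_counts)
```

The full script that follows applies the same idea with a loop over column positions.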

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

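# Column names must match the 'stage_1_time' passed to generate_scatter_plot below;
# stage columns sit at positions 1, 3, 5, which the loop in generate_scatter_plot relies on
data = {'id': ['a', 'b', 'c', 'd'], 'stage_1_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'stage_2_time': [40, None, 45, None], 'stage_2_to_3_time': [30, None, None, None], 'stage_3_time': [70, None, None, None]}
df = pd.DataFrame(data)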

#create eCDF function
def ecdf(df, serie):
    n = len(df)
    df['x'] = np.sort(df[serie])
    df['y'] = np.arange(1.0, n+1) / n
    return df

def generate_scatter_plot(df,serie,nb_stage):
    df=df.dropna(subset=[serie]).sort_values(by=[serie])
    st=1
    for i in range(1,nb_stage*2,2):
        df.loc[df.iloc[:,i].notnull(),'stage']=st
        st=st+1

    df= ecdf(df, serie)
    plt.plot(df.loc[df['stage'] == 1, 'x'], df.loc[df['stage'] == 1, 'y'], marker='.', linestyle='none',c='blue') 
    plt.plot(df.loc[df['stage'] == 2, 'x'], df.loc[df['stage'] == 2, 'y'], marker='.', linestyle='none',c='red') 
    plt.plot(df.loc[df['stage'] == 3, 'x'], df.loc[df['stage'] == 3, 'y'], marker='.', linestyle='none',c='green') 
    plt.axvline(df['x'].mean(), color='gray', linestyle='dashed', linewidth=2) #Add mean


    x_m = int(df['x'].mean())
    y_m = stats.percentileofscore(df[serie], df['x'].mean())/100.0

    plt.annotate('(%s,%s)' % (x_m,int(y_m*100)) , xy=(x_m,y_m), xytext=(10,-5), textcoords='offset points')

    percentiles= np.array([0,25,50,75,100])
    x_p = np.percentile(df[serie], percentiles)
    y_p = percentiles/100.0

    plt.plot(x_p, y_p, marker='D', color='red', linestyle='none') # Overlay quartiles

    for x,y in zip(x_p, y_p):                                        
        plt.annotate('%s' % int(x), xy=(x,y), xytext=(10,-5), textcoords='offset points')

#Scatter Plot
stage1_time_scatter = generate_scatter_plot(df,'stage_1_time',3)
plt.title('Scatter Plot of Days to Stage1')
plt.xlabel('Days to Stage1')
plt.ylabel('Cumulative Probability')
# Five artists are drawn (three stage series, the mean line, the quartiles), so five labels
plt.legend(('Progressive','Active','Engaged', "Mean", 'Quartiles'), loc='lower right')
plt.margins(0.02)

plt.show()
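The script above colors the points by stage but does not show the per-stage counts the question asked for. One possible way to add them, sketched here under assumptions rather than taken from the answer, is to plot one series per stage with `groupby` and put the group size in each legend label. The function name `plot_ecdf_by_stage`, the precomputed `stage` column, and the `names` mapping are all hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_ecdf_by_stage(frame, time_col, stage_col, stage_names, ax=None):
    """Scatter the eCDF of time_col, one series per stage, with counts in the legend."""
    if ax is None:
        _, ax = plt.subplots()
    # Keep only rows that reached this stage, ordered by time.
    data = frame.dropna(subset=[time_col]).sort_values(time_col).copy()
    n = len(data)
    data["ecdf"] = np.arange(1, n + 1) / n
    # One plot call per stage so each gets its own color and legend entry.
    for stage, group in data.groupby(stage_col):
        label = f"{stage_names.get(stage, stage)} (n={len(group)})"
        ax.plot(group[time_col], group["ecdf"], marker=".", linestyle="none", label=label)
    ax.axvline(data[time_col].mean(), color="gray", linestyle="dashed", label="Mean")
    ax.set_xlabel(f"Days ({time_col})")
    ax.set_ylabel("Cumulative Probability")
    ax.legend(loc="lower right")
    return ax


# Assumption: 'stage' already holds the last stage reached by each row.
df = pd.DataFrame({
    "stage_1_time": [10, 30, 15, np.nan],
    "stage": [3, 1, 2, 0],
})
names = {1: "Progressive", 2: "Active", 3: "Engaged"}
ax = plot_ecdf_by_stage(df, "stage_1_time", "stage", names)
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

Calling `plt.show()` afterwards displays the figure. Passing `"stage_2_time"` as `time_col` would give the stage2 chart mentioned in the question, with the rows that reached stage 3 drawn as their own series.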

Regarding python - Plotting a graph in Seaborn with multiple "hue"-like attributes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50611171/
