
R plotly grouped boxplot: highlighting a specific value for each category

Reposted. Author: 行者123. Last updated: 2023-12-05 05:43:33

I have the following code, which runs fine on my end:

# Load the packages used below (dplyr also provides tibble() and %>%)
library(dplyr)
library(plotly)

# Seed the pseudo-random number generator for reproducible results
set.seed(1234)
# Create three variables
income <- round(rnorm(500, # 500 random data point values
mean = 10000, # mean of 10000
sd = 1000), # standard deviation of 1000
digits = 2) # round the random values to two decimal places
stage <- sample(c("Early",
"Mid",
"Late"), # sample space of the stage variable
500, # 500 random data point values
replace = TRUE) # sample with replacement
country <- sample(c("USA",
"Canada"), # sample space of the country variable
500, # 500 random data point values
replace = TRUE) # sample with replacement
# Create tibble
df1 <- tibble(Income = income, # create an Income variable for the income data point values
Stage = stage, # create a Stage variable for the stage data point values
Country = country) # create a Country variable for the country data point values

df1 <- as.data.frame(df1)
df1$HIGHLIGHT <- 'NO'
df1$TMP = paste0(df1$Country,"_",df1$Stage)
idx <- duplicated(df1$TMP)
df1$HIGHLIGHT[!idx] = 'YES'


plot_ly(df1,
x = ~Country,
y = ~Income,
color = ~Stage,
type = "box") %>%
layout(boxmode = "group",
title = "Income by career stage",
xaxis = list(title = "Country",
zeroline = FALSE),
yaxis = list(title = "Income",
zeroline = FALSE))

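However, I would like to add a red dot on each individual boxplot showing the latest value, that is, the value in the row where the "HIGHLIGHT" column is "YES". This would let users see not only the distribution behind each boxplot but also where the latest value falls. I can't find a way to add these red dots. Any suggestions? Thanks.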

Best answer

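I couldn't find what I would call an easy or intuitive way to do this, but I did find a way that works.

I used the domain to align the points on the x-axis and the income on the y-axis. Because annotations in plotly require text, I used an asterisk. I did start with a period, but the points didn't line up correctly, because a period sits at the bottom of the text "space".

Let me know if this is what you were looking for.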

# first find the values needed 
df1 %>% filter(HIGHLIGHT == "YES") %>%
group_by(Country, Stage) %>%
summarise(Income = Income)
# # A tibble: 6 × 3
# # Groups: Country [2]
# Country Stage Income
# <chr> <chr> <dbl>
# 1 Canada Early 7654.
# 2 Canada Late 9002.
# 3 Canada Mid 8793.
# 4 USA Early 11084.
# 5 USA Late 9110.
# 6 USA Mid 10277.

Then extract the values needed for the plot. Note the order here as well: it is the same order that plotly uses in the plot.
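Because the annotation coordinates are matched to the boxes purely by position, it can help to make that ordering explicit rather than rely on how `group_by()` happens to sort. The following is a minimal base R sketch on a small made-up data frame (the `toy` data and its values are hypothetical, not the question's data). It assumes, as observed above, that plotly lays out character categories alphabetically.

```r
# Hypothetical toy data, deliberately out of order
toy <- data.frame(
  Country   = c("USA", "Canada", "USA", "Canada", "USA", "Canada"),
  Stage     = c("Mid", "Late", "Early", "Early", "Late", "Mid"),
  Income    = c(6, 2, 4, 1, 5, 3),
  HIGHLIGHT = "YES",
  stringsAsFactors = FALSE
)

# Keep the highlighted rows, then sort by Country and Stage
# so the values line up with the box order in the plot
hl <- toy[toy$HIGHLIGHT == "YES", ]
hl <- hl[order(hl$Country, hl$Stage), ]

hl$Income
```

Sorting explicitly means the y values still line up with the x positions even if the filtering step changes later.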

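Knowing that Canada is centered at x = 0 in the domain and USA is centered at x = 1, I tried a few values by trial and error until I found ones that worked.

The centers of the boxplots in the x domain are -.235, 0, .235, .765, 1, and 1.235.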
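The six centers follow a regular pattern: each category tick (0 for Canada, 1 for USA) plus an offset of -.235, 0, or .235 for the three stages. A minimal sketch of generating them is below. `box_centers` is a hypothetical helper name, and the default `step` of 0.235 is simply the value found by trial and error above, not a documented plotly constant.

```r
# Hypothetical helper: centers of n_groups boxes around each of n_categories ticks.
# `step` is the spacing between neighbouring boxes within one category;
# 0.235 is the trial-and-error value from the answer (an assumption, not an API value).
box_centers <- function(n_categories, n_groups, step = 0.235) {
  # Offsets symmetric around zero, e.g. -step, 0, step for three groups
  offsets <- (seq_len(n_groups) - (n_groups + 1) / 2) * step
  # Category ticks sit at 0, 1, 2, ... on a categorical axis
  ticks <- seq_len(n_categories) - 1
  rep(ticks, each = n_groups) + rep(offsets, times = n_categories)
}

x <- box_centers(n_categories = 2, n_groups = 3)
x
```

A different number of groups, or non-default box gap settings, would likely need a different `step`, so the value should be checked visually against the plot.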

Next, I created the x and y values for the annotations.

newY = df1 %>% filter(HIGHLIGHT == "YES") %>% 
group_by(Country, Stage) %>%
summarise(Income = Income) %>%
ungroup() %>%
select(Income) %>% as.data.frame() %>%
unlist()

x = c(-.235, 0, .235, .765, 1, 1.235)

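Then I put it all together. In your plot code most of the variable names are capitalized, but they aren't in the data. I simply changed them in the data.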

(plt = plot_ly(df1,
x = ~Country,
y = ~Income,
color = ~Stage,
type = "box") %>%
layout(boxmode = "group",
title = "Income by career stage",
xaxis = list(title = "Country",
zeroline = FALSE),
yaxis = list(title = "Income",
zeroline = FALSE),
annotations = list(x = x,
y = newY,
text = "*",
hovertext = newY,
font = list(size = 20,
color = "red"),
showarrow = FALSE,
valign = "middle",
xanchor = "center",
yanchor = "middle" )
) # end layout
) # end print
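plotly also accepts annotations as a list with one element per annotation, which can be easier to inspect or adjust point by point. The sketch below builds that structure in base R. The y values are the incomes from the summary output above rounded to whole numbers, and `make_star` is a hypothetical helper name.

```r
x <- c(-.235, 0, .235, .765, 1, 1.235)
y <- c(7654, 9002, 8793, 11084, 9110, 10277)  # rounded values from the output above

# Hypothetical helper: one red asterisk annotation at (xi, yi)
make_star <- function(xi, yi) {
  list(x = xi, y = yi, text = "*", hovertext = as.character(yi),
       showarrow = FALSE, font = list(size = 20, color = "red"))
}

# One list element per highlighted point
ann <- unname(Map(make_star, x, y))
length(ann)

# With plotly loaded and `plt` built without annotations,
# this could be attached via: layout(plt, annotations = ann)
```

This form makes it straightforward to give a single point a different color or symbol by editing one element of `ann`.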


Regarding "R plotly grouped boxplot: highlighting a specific value for each category", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/71802909/
