
r - Labeling outliers with ggplot

Reposted. Author: 行者123. Updated: 2023-12-04 12:00:42

I am trying to label outliers with ggplot. I have two questions about my code:

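  • Why doesn't it label the outliers below 1.5*IQR?
  • Why doesn't it label the outliers according to the group they belong to, and instead apparently refers to the overall mean of the data? I would like to label the outliers of each boxplot separately, i.e. the outliers for Country A in wave 1 (of the survey), and so on.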

My code example:
    PERCENT <- rnorm(50, sd = 3)
    WAVE <- sample(6, 50, replace = TRUE)
    AGE_GROUP <- rep(c("21-30", "31-40", "41-50", "51-60", "61-70"), 10)
    COUNTRY <- rep(c("Country A", "Country B"), 25)
    N <- rnorm(50, mean = 200, sd = 2)

    df <- data.frame(PERCENT, WAVE, AGE_GROUP, COUNTRY, N)

    ggplot(df, aes(x = factor(WAVE), y = PERCENT, fill = factor(COUNTRY))) +
      geom_boxplot(alpha = 0.3) +
      geom_point(aes(color = AGE_GROUP, group = factor(COUNTRY)), position = position_dodge(width = 0.75)) +
      geom_text(aes(label = ifelse(PERCENT > 1.5*IQR(PERCENT) | PERCENT < -1.5*IQR(PERCENT), paste(AGE_GROUP, ",", round(PERCENT, 1), "%, n =", round(N, 0)), '')), hjust = -.3, size = 3)

The plot I have so far:

[Image: Outlier Label]

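Thanks for your help!

Best answer

If you want the IQR to be computed per country, you need to group the data. You can probably do this either globally (i.e. before sending the data to ggplot) or locally, within the layer.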

    library(dplyr)    # provides between()
    library(ggplot2)

    ggplot(df, aes(x = as.factor(WAVE), y = PERCENT, fill = COUNTRY)) +
      geom_boxplot(alpha = 0.3) +
      geom_point(aes(color = AGE_GROUP, group = COUNTRY),
                 position = position_dodge(width = 0.75)) +
      # group = COUNTRY and the same dodge width keep each label next to its point
      geom_text(aes(group = COUNTRY,
                    label = ifelse(!between(PERCENT, -1.3 * IQR(PERCENT), 1.3 * IQR(PERCENT)),
                                   paste(" ", COUNTRY, ",", AGE_GROUP, ",", round(PERCENT, 1), "%, n =", round(N, 0)),
                                   "")),
                position = position_dodge(width = 0.75),
                hjust = "left", size = 3)
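
For the global route, a minimal sketch might look like the following. It assumes that "outlier" means the usual boxplot rule (a value below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR), which as far as I know is also what geom_boxplot applies by default, and that the rule should be evaluated separately for each combination of WAVE and COUNTRY. The helper is_outlier(), the columns OUTLIER and LABEL, and the set.seed() call are introduced here purely for illustration and are not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Same simulated data as in the question; the seed is an added assumption
# so that the example is reproducible.
set.seed(42)
df <- data.frame(
  PERCENT   = rnorm(50, sd = 3),
  WAVE      = sample(6, 50, replace = TRUE),
  AGE_GROUP = rep(c("21-30", "31-40", "41-50", "51-60", "61-70"), 10),
  COUNTRY   = rep(c("Country A", "Country B"), 25),
  N         = rnorm(50, mean = 200, sd = 2)
)

# Hypothetical helper: TRUE where x lies outside the quartile-based fences
# Q1 - k * IQR and Q3 + k * IQR.
is_outlier <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[[2]] - q[[1]])
  x < q[[1]] - fence | x > q[[2]] + fence
}

# Global approach: flag outliers within each WAVE x COUNTRY group before plotting,
# and build the label text only for the flagged rows.
df_flagged <- df %>%
  group_by(WAVE, COUNTRY) %>%
  mutate(
    OUTLIER = is_outlier(PERCENT),
    LABEL   = ifelse(OUTLIER,
                     paste0(" ", AGE_GROUP, ", ", round(PERCENT, 1),
                            "%, n = ", round(N, 0)),
                     "")
  ) %>%
  ungroup()

p <- ggplot(df_flagged, aes(x = factor(WAVE), y = PERCENT, fill = COUNTRY)) +
  geom_boxplot(alpha = 0.3) +
  geom_point(aes(color = AGE_GROUP, group = COUNTRY),
             position = position_dodge(width = 0.75)) +
  # All rows are kept (with empty labels for non-outliers) so that the text
  # layer is dodged exactly like the point layer.
  geom_text(aes(group = COUNTRY, label = LABEL),
            position = position_dodge(width = 0.75),
            hjust = "left", size = 3)

print(p)
```

Because the fences are built from each group's own quartiles rather than placed symmetrically around zero, low and high outliers are treated the same way. With only about four observations per box in this simulated data, few or no points may end up flagged.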

For "r - Labeling outliers with ggplot", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47842646/
