r - Quadratic discriminant analysis (QDA) plot in R

Reposted by: 行者123 Updated: 2023-12-03 23:26:36

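I am trying to plot the results of a quadratic discriminant analysis (QDA) of the Iris dataset using the MASS and ggplot2 packages. The first part of the script below does this for linear discriminant analysis (LDA), but I don't know how to continue for QDA. Objects of class "qda" differ somewhat from objects of class "lda". For example, I cannot find the proportion of trace (the percentage of between-group variance explained by each discriminant component) to add to the plot axes. Any help or ideas on how to code this plot with ggplot2?
Code: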

library(MASS)
library(ggplot2)
library(scales)

# Fit LDA and compute the proportion of trace for each discriminant
irislda <- lda(Species ~ ., iris)
prop.lda <- irislda$svd^2 / sum(irislda$svd^2)
plda <- predict(irislda, iris)

# Plot the discriminant scores; axis labels carry the proportion of trace
datasetLDA <- data.frame(species = iris[, "Species"], irislda = plda$x)
ggplot(datasetLDA) +
  geom_point(aes(irislda.LD1, irislda.LD2, colour = species, shape = species), size = 2.5) +
  labs(x = paste("LD1 (", percent(prop.lda[1]), ")", sep = ""),
       y = paste("LD2 (", percent(prop.lda[2]), ")", sep = ""))

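# Fit QDA; predict() gives class labels and posterior probabilities
irisqda <- qda(Species ~ ., iris)
pqda <- predict(irisqda, iris)
datasetQDA <- data.frame(species = iris[, "Species"], irisqda = pqda$posterior)
# ??? marks the x and y aesthetics this question is asking about
ggplot(datasetQDA) + geom_point(aes(???, ???, colour = species, shape = species), size = 2.5)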
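
The difference between the two classes described in the question can be checked directly. The following is a minimal sketch that only inspects the fitted objects and the lists returned by predict(); it assumes the MASS package is installed and reuses the object names from the script above.

```r
library(MASS)

irislda <- lda(Species ~ ., data = iris)
irisqda <- qda(Species ~ ., data = iris)

# An lda prediction carries discriminant scores in $x; a qda prediction does not
names(predict(irislda, iris))
names(predict(irisqda, iris))

# The singular values used for the proportion of trace exist only on the lda fit
is.null(irislda$svd)
is.null(irisqda$svd)

# What qda does provide: one posterior probability per observation and class
pqda <- predict(irisqda, iris)
dim(pqda$posterior)
```

Because a qda prediction has no discriminant scores, there are no LD-style axes to label with a percentage, which is why the answer below plots decision regions in the space of the original variables instead.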

Best answer

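Following Ducks's comment: if you only have 2 dimensions, we can use the decisionplot function provided in the link to visualize the decision regions. For more variables, it would have to be changed slightly.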

library(MASS)
model <- qda(Species ~ Sepal.Length + Sepal.Width, iris)
decisionplot(model, iris, class = "Species")
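
For a model with more than two predictors, one possible adaptation (an assumption of this sketch, not something taken from the linked function) is to draw a two-dimensional slice: build the grid over two variables and hold the remaining predictors at fixed values such as their means. The names full_model and slice_grid are illustrative only.

```r
library(MASS)

# QDA on all four iris measurements
full_model <- qda(Species ~ ., data = iris)

# Grid over the two variables to be plotted
slice_grid <- expand.grid(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = 50),
  Sepal.Width  = seq(min(iris$Sepal.Width),  max(iris$Sepal.Width),  length.out = 50)
)

# Assumption: the other predictors are held at their sample means,
# so the result is one slice through the 4-dimensional decision regions
slice_grid$Petal.Length <- mean(iris$Petal.Length)
slice_grid$Petal.Width  <- mean(iris$Petal.Width)

# Predicted class at every grid point of the slice
slice_grid$pred <- predict(full_model, slice_grid)$class
table(slice_grid$pred)
```

The regions seen in such a slice depend on the values chosen for the held-out variables, so a different choice (for example, group medians) can give a noticeably different picture.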

The decisionplot function is shown below.

decisionplot <- function(model, data, class = NULL, predict_type = "class",
  resolution = 100, showgrid = TRUE, ...) {

  if(!is.null(class)) cl <- data[,class] else cl <- 1
  data <- data[,1:2]
  k <- length(unique(cl))

  plot(data, col = as.integer(cl)+1L, pch = as.integer(cl)+1L, ...)

  # make grid
  r <- sapply(data, range, na.rm = TRUE)
  xs <- seq(r[1,1], r[2,1], length.out = resolution)
  ys <- seq(r[1,2], r[2,2], length.out = resolution)
  g <- cbind(rep(xs, each = resolution), rep(ys, times = resolution))
  colnames(g) <- colnames(r)
  g <- as.data.frame(g)

  ### guess how to get class labels from predict
  ### (unfortunately not very consistent between models)
  p <- predict(model, g, type = predict_type)
  if(is.list(p)) p <- p$class
  p <- as.factor(p)

  if(showgrid) points(g, col = as.integer(p)+1L, pch = ".")

  z <- matrix(as.integer(p), nrow = resolution, byrow = TRUE)
  contour(xs, ys, z, add = TRUE, drawlabels = FALSE,
    lwd = 2, levels = (1:(k-1))+.5)

  invisible(z)
}
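
The grid and the byrow = TRUE reshaping in the function are easy to get backwards, so here is a small sketch of that step on a toy grid. The numeric stand-in v replaces the predicted classes and is purely illustrative.

```r
resolution <- 3
xs <- seq(0, 2, length.out = resolution)    # 0, 1, 2
ys <- seq(10, 30, length.out = resolution)  # 10, 20, 30

# Same construction as in decisionplot: x changes slowest, y changes fastest
g <- data.frame(x = rep(xs, each = resolution),
                y = rep(ys, times = resolution))

# Stand-in for the predicted class: one value per grid row
v <- g$x + g$y

# byrow = TRUE places each block of constant x in one row, so z[i, j]
# belongs to (xs[i], ys[j]), which is the layout contour(xs, ys, z) expects
z <- matrix(v, nrow = resolution, byrow = TRUE)

# expand.grid() varies its first argument fastest, so listing y first
# reproduces the same row order as the rep() construction
g2 <- expand.grid(y = ys, x = xs)[, c("x", "y")]
```

If the matrix were filled column-wise instead, the contour lines would be drawn for the transposed surface and would not line up with the plotted points.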

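If we want to recreate this with ggplot2, we simply change the function to use ggplot2 functions instead of base graphics. This requires converting the data to a data.frame and building up the plot along the way.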

decisionplot_ggplot <- function(model, data, class = NULL, predict_type = "class",
                         resolution = 100, showgrid = TRUE, ...) {
  
  if(!is.null(class)) cl <- data[,class] else cl <- 1
  data <- data[,1:2]
  cn <- colnames(data)
  
  k <- length(unique(cl))
  
  data$pch <- data$col <- as.integer(cl) + 1L
  gg <- ggplot(aes_string(cn[1], cn[2]), data = data) + 
    geom_point(aes_string(col = 'as.factor(col)', shape = 'as.factor(col)'), size = 3)
  
  # make grid
  r <- sapply(data[, 1:2], range, na.rm = TRUE)
  xs <- seq(r[1, 1], r[2, 1], length.out = resolution)
  ys <- seq(r[1, 2], r[2, 2], length.out = resolution)
  
  g <- cbind(rep(xs, each = resolution), 
             rep(ys, times = resolution))
  colnames(g) <- colnames(r)
  
  g <- as.data.frame(g)
  
  ### guess how to get class labels from predict
  ### (unfortunately not very consistent between models)
  p <- predict(model, g, type = predict_type)
  if(is.list(p)) p <- p$class
  g$col <- g$pch <- as.integer(as.factor(p)) + 1L
  
  if(showgrid) 
    gg <- gg + geom_point(aes_string(x = cn[1], y = cn[2], col = 'as.factor(col)'), data = g, shape = 20, size = 1)
  
  gg + geom_contour(aes_string(x = cn[1], y = cn[2], z = 'col'), data = g, inherit.aes = FALSE)
}

Usage:

decisionplot_ggplot(model, iris, class = "Species")

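Note that it now returns the ggplot object itself, so the standard functions can be used to change the title, theme, and so on. Also, this is just a direct translation. Using geom_polygon with a suitable alpha might look nicer, and alternative geom_* choices could produce similar, better-looking contours.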
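
As one illustration of the "alternative geom_*" idea, the sketch below shades the predicted regions with geom_raster and a low alpha, then adds a title and theme with the usual ggplot2 functions. Note that this swaps in geom_raster rather than the geom_polygon mentioned above, and it builds the grid with expand.grid instead of calling decisionplot_ggplot; the names grid and p are illustrative.

```r
library(MASS)
library(ggplot2)

model <- qda(Species ~ Sepal.Length + Sepal.Width, data = iris)

# Evenly spaced prediction grid over the observed range of both predictors
grid <- expand.grid(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = 100),
  Sepal.Width  = seq(min(iris$Sepal.Width),  max(iris$Sepal.Width),  length.out = 100)
)
grid$Species <- predict(model, grid)$class

# Shaded decision regions underneath, observed points on top
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_raster(data = grid, aes(fill = Species), alpha = 0.3) +
  geom_point(aes(colour = Species, shape = Species), size = 2.5) +
  labs(title = "QDA decision regions (iris)") +
  theme_minimal()

print(p)
```

Mapping Species to both fill and colour keeps the region shading and the point colours on the same scale ordering, so a region and the points of its class share a hue under the default palettes.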

Regarding r - Quadratic discriminant analysis (QDA) plot in R, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63782598/
