When I run the following code from the section titled "Replication Requirements" (https://uc-r.github.io/iml-pkg), I get the error below:
#classification data
df <- rsample::attrition %>%
  mutate_if(is.ordered, factor, ordered = FALSE) %>%
  mutate(Attrition = recode(Attrition, "Yes" = "1", "No" = "0") %>% factor(levels = c("1", "0")))
> Error: 'attrition' is not an exported object from 'namespace:rsample'
I fixed that with the following code:
#data
library(modeldata)
data("attrition", package = "modeldata")
#classification data
df <- attrition %>%
  mutate_if(is.ordered, factor, ordered = FALSE) %>%
  mutate(Attrition = recode(Attrition, "Yes" = "1", "No" = "0") %>% factor(levels = c("1", "0")))
Unfortunately, I then hit another error when running the following code from the section titled "Global interpretation / Feature importance" (https://uc-r.github.io/iml-pkg):
#compute feature importance with specified loss metric
imp.glm <- FeatureImp$new(predictor.glm, loss = "mse")
imp.rf <- FeatureImp$new(predictor.rf, loss = "mse")
imp.gbm <- FeatureImp$new(predictor.gbm, loss = "mse")
> Error in `[.data.frame`(prediction, , self$class, drop = FALSE) : undefined columns selected
> Error in `[.data.frame`(prediction, , self$class, drop = FALSE) : undefined columns selected
> Error in `[.data.frame`(prediction, , self$class, drop = FALSE) : undefined columns selected
I am using R 4.2.0 on Windows 10.
Best answer
The argument shown in the tutorial needs a small change: instead of class = "classification", use class = 2 (per the docs). With that change, the example works as expected:
library(rsample) # data splitting
library(ggplot2) # allows extension of visualizations
library(dplyr) # basic data transformation
library(h2o) # machine learning modeling
#install.packages("iml")
library(iml) # ML interpretation
#install.packages("modeldata")
library(modeldata)
library(R6)
h2o.no_progress()
h2o.init()
#> Connection successful!
#>
#> R is connected to the H2O cluster:
#> H2O cluster uptime: 9 minutes 18 seconds
#> H2O cluster timezone: Australia/Melbourne
#> H2O data parsing timezone: UTC
#> H2O cluster version: 3.36.0.1
#> H2O cluster version age: 6 months and 28 days !!!
#> H2O cluster name: H2O_started_from_R_jared_mpb432
#> H2O cluster total nodes: 1
#> H2O cluster total memory: 1.58 GB
#> H2O cluster total cores: 4
#> H2O cluster allowed cores: 4
#> H2O cluster healthy: TRUE
#> H2O Connection ip: localhost
#> H2O Connection port: 54321
#> H2O Connection proxy: NA
#> H2O Internal Security: FALSE
#> H2O API Extensions: Amazon S3, XGBoost, Algos, Infogram, AutoML, Core V3, TargetEncoder, Core V4
#> R Version: R version 4.1.3 (2022-03-10)
df <- modeldata::attrition %>%
  mutate_if(is.ordered, factor, ordered = FALSE) %>%
  mutate(Attrition = recode(Attrition, "Yes" = "1", "No" = "0") %>%
           factor(levels = c("1", "0")))
# convert to h2o object
df.h2o <- as.h2o(df)
# create train, validation, and test splits
set.seed(123)
splits <- h2o.splitFrame(df.h2o, ratios = c(.7, .15), destination_frames = c("train","valid","test"))
names(splits) <- c("train","valid","test")
# variable names for response & features
y <- "Attrition"
x <- setdiff(names(df), y)
# elastic net model
glm <- h2o.glm(
  x = x,
  y = y,
  training_frame = splits$train,
  validation_frame = splits$valid,
  family = "binomial",
  seed = 123
)
# random forest model
rf <- h2o.randomForest(
  x = x,
  y = y,
  training_frame = splits$train,
  validation_frame = splits$valid,
  ntrees = 1000,
  stopping_metric = "AUC",
  stopping_rounds = 10,
  stopping_tolerance = 0.005,
  seed = 123
)
#> Warning in .h2o.processResponseWarnings(res): early stopping is enabled but neither score_tree_interval or score_each_iteration are defined. Early stopping will not be reproducible!.
# gradient boosting machine model
gbm <- h2o.gbm(
  x = x,
  y = y,
  training_frame = splits$train,
  validation_frame = splits$valid,
  ntrees = 1000,
  stopping_metric = "AUC",
  stopping_rounds = 10,
  stopping_tolerance = 0.005,
  seed = 123
)
#> Warning in .h2o.processResponseWarnings(res): early stopping is enabled but neither score_tree_interval or score_each_iteration are defined. Early stopping will not be reproducible!.
# model performance
h2o.auc(glm, valid = TRUE)
#> [1] 0.7870935
h2o.auc(rf, valid = TRUE)
#> [1] 0.7681021
h2o.auc(gbm, valid = TRUE)
#> [1] 0.7468242
# 1. Create a data frame with just the features
features <- as.data.frame(splits$valid) %>% select(-Attrition)
# 2. Create a vector with the actual responses
response <- as.vector(as.numeric(splits$valid$Attrition))
# 3. Create custom predict function that returns the predicted values as a
# vector (the probability of attrition in this example)
pred <- function(model, newdata) {
  results <- as.data.frame(h2o.predict(model, as.h2o(newdata)))
  return(results[[3L]])
}
# example of prediction output
pred(glm, features) %>% head()
#> [1] 0.12243347 0.12887908 0.09674399 0.26008143 0.00672000 0.13741387
predictor.glm <- Predictor$new(
  model = glm,
  data = features,
  y = response,
  predict.fun = pred,
  class = "classification"
)
predictor.glm$predict(features[1:10,])
#> Error in `[.data.frame`(prediction, , self$class, drop = FALSE): undefined columns selected
# class = "classification" doesn't make sense; from the docs:
### The class column to be returned in case of multiclass output.
### You can either use numbers, e.g. class=2 would take the 2nd column
### from the predictions, or the column name of the predicted class,
### e.g. class="dog".
# so, in this case, 'class = 2' should work as expected
predictor.glm <- Predictor$new(
  model = glm,
  data = features,
  y = response,
  predict.function = pred,
  class = 2
)
predictor.glm$predict(features[1:10,])
#> p1
#> 1 0.12243347
#> 2 0.12887908
#> 3 0.09674399
#> 4 0.26008143
#> 5 0.00672000
#> 6 0.13741387
#> 7 0.47917917
#> 8 0.11775822
#> 9 0.11316964
#> 10 0.22963757
predictor.rf <- Predictor$new(
  model = rf,
  data = features,
  y = response,
  predict.fun = pred,
  class = 2
)
predictor.gbm <- Predictor$new(
  model = gbm,
  data = features,
  y = response,
  predict.fun = pred,
  class = 2
)
imp.glm <- FeatureImp$new(predictor.glm, loss = "mse")
imp.rf <- FeatureImp$new(predictor.rf, loss = "mse")
imp.gbm <- FeatureImp$new(predictor.gbm, loss = "mse")
p1 <- plot(imp.glm) + ggtitle("GLM")
p2 <- plot(imp.rf) + ggtitle("RF")
p3 <- plot(imp.gbm) + ggtitle("GBM")
#gridExtra::grid.arrange(p1, p2, p3, nrow = 1)
p1
p2
p3
Created on 2022-07-28 by the reprex package (v2.0.1)
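To see why the string fails while the index works, here is a minimal base-R sketch of the column selection named in the error message, `[.data.frame`(prediction, , self$class, drop = FALSE). It does not use iml or h2o. The data frame, its column names p0/p1, and the values are assumptions made up for illustration, not output from the models above.

```r
# Hypothetical stand-in for the prediction data frame; the column names
# p0/p1 and the values are assumptions for illustration only.
prediction <- data.frame(
  p0 = c(0.88, 0.87, 0.90),
  p1 = c(0.12, 0.13, 0.10)
)

# class = "classification": no column has that name, so the selection
# fails. tryCatch captures the error message instead of stopping.
by_name <- tryCatch(
  prediction[, "classification", drop = FALSE],
  error = function(e) conditionMessage(e)
)
print(by_name)

# class = 2: selects the second column by position and keeps it as a
# one-column data frame because drop = FALSE.
by_index <- prediction[, 2, drop = FALSE]
print(names(by_index))
```

In an English locale the first print should show the same "undefined columns selected" message as the question, and the second should show the name of the second column, which is consistent with the p1 column in the working output above.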
Regarding "r - Errors in a tutorial (interpreting machine learning models with the iml package)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72930868/