
java - How to use sdf_predict() with a model provided by ml_pca() from library(sparklyr)

Reposted. Author: 行者123. Updated: 2023-11-30 10:35:32

I fit a PCA model:

> library(sparklyr)
> library(dplyr)
> sc <- spark_connect("local", version="2.0.0")
> iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE)
The following columns have been renamed:
- 'Sepal.Length' => 'Sepal_Length' (#1)
- 'Sepal.Width' => 'Sepal_Width' (#2)
- 'Petal.Length' => 'Petal_Length' (#3)
- 'Petal.Width' => 'Petal_Width' (#4)
> pca_model <- tbl(sc, "iris") %>%
+ select(-Species) %>%
+ ml_pca()
> print(pca_model)
Explained variance:

PC1 PC2 PC3 PC4
0.924618723 0.053066483 0.017102610 0.005212184

Rotation:
PC1 PC2 PC3 PC4
Sepal_Length -0.36138659 -0.65658877 0.58202985 0.3154872
Sepal_Width 0.08452251 -0.73016143 -0.59791083 -0.3197231
Petal_Length -0.85667061 0.17337266 -0.07623608 -0.4798390
Petal_Width -0.35828920 0.07548102 -0.54583143 0.7536574

But I cannot use the resulting model for prediction.

sdf_predict(pca_model)

Source: query [?? x 6]
Database: spark connection master=local[4] app=sparklyr local=TRUE

This ends with an error:

java.lang.IllegalArgumentException: requirement failed: 
The columns of A don't match the number of elements of x. A: 4, x: 0

Passing the data in explicitly does not help:

sdf_predict(pca_model, tbl(sc, "iris") %>% select(-Species))

Source: query [?? x 5]
Database: spark connection master=local[4] app=sparklyr local=TRUE

This ends with the same error:

java.lang.IllegalArgumentException: requirement failed: 
The columns of A don't match the number of elements of x. A: 4, x: 0

Is it possible, in general, to use PCA for prediction in Spark?

Best answer

Use sdf_project instead of sdf_predict:

> pca_projected <- sdf_project(pca_model, tbl(sc, "iris") %>% select(-Species), 
+ features=rownames(pca_model$components))
> pca_projected %>% collect %>% head
# A tibble: 6 x 8
Sepal_Length Sepal_Width Petal_Length Petal_Width PC1 PC2 PC3 PC4
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 5.10 3.50 1.40 0.200 -2.82 -5.65 0.660 -0.0311
2 4.90 3.00 1.40 0.200 -2.79 -5.15 0.842 0.0657
3 4.70 3.20 1.30 0.200 -2.61 -5.18 0.614 -0.0134
4 4.60 3.10 1.50 0.200 -2.76 -5.01 0.600 -0.109
5 5.00 3.60 1.40 0.200 -2.77 -5.65 0.542 -0.0946
6 5.40 3.90 1.70 0.400 -3.22 -6.07 0.463 -0.0576
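
To see what the projected columns contain, the first row of the output can be reproduced in plain R without a Spark connection. This is a minimal sketch, not sparklyr's implementation: the rotation matrix is copied by hand from the print(pca_model) output in the question, and the variable names `rotation`, `x` and `scores` are only illustrative. It assumes the scores are the raw (uncentered) feature values multiplied by the rotation matrix, which is what the printed numbers suggest.

```r
# Rotation matrix copied from the print(pca_model) output in the question.
rotation <- matrix(
  c(-0.36138659, -0.65658877,  0.58202985,  0.3154872,
     0.08452251, -0.73016143, -0.59791083, -0.3197231,
    -0.85667061,  0.17337266, -0.07623608, -0.4798390,
    -0.35828920,  0.07548102, -0.54583143,  0.7536574),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width"),
    paste0("PC", 1:4)
  )
)

# First row of the iris feature columns, as shown in the projected tibble.
x <- c(Sepal_Length = 5.1, Sepal_Width = 3.5,
       Petal_Length = 1.4, Petal_Width = 0.2)

# Assumption: projection = raw feature vector times rotation matrix (no centering).
scores <- x %*% rotation
print(round(scores, 2))
```

The rounded values agree with PC1 to PC4 in the first row of the tibble above (about -2.82, -5.65, 0.66, -0.03), so under this assumption the projection is a plain matrix product of the feature columns with the rotation matrix.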

For "java - How to use sdf_predict() with a model provided by ml_pca() from library(sparklyr)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41087821/
