r - Error in the knn function


Reposted by: 行者123 Updated: 2023-12-04 00:06:00


I am trying to run this line:

knn(mydades.training[,-7],mydades.test[,-7],mydades.training[,7],k=5)

But I always get this error:

Error in knn(mydades.training[, -7], mydades.test[, -7], mydades.training[,  : 
  NA/NaN/Inf in foreign function call (arg 6)
In addition: Warning messages:
1: In knn(mydades.training[, -7], mydades.test[, -7], mydades.training[,  :
  NAs introduced by coercion
2: In knn(mydades.training[, -7], mydades.test[, -7], mydades.training[,  :
  NAs introduced by coercion

Any ideas?

PS: mydades.training and mydades.test are defined as follows:

N <- nrow(mydades) 
permut <- sample(c(1:N),N,replace=FALSE)
ord <- order(permut)
mydades.shuffled <- mydades[ord,]
prop.train <- 1/3
NOMBRE <- round(prop.train*N)
mydades.training <- mydades.shuffled[1:NOMBRE,]
mydades.test <- mydades.shuffled[(NOMBRE+1):N,]

Best answer

I suspect your problem is that 'mydades' contains non-numeric data fields. The error line:

NA/NaN/Inf in foreign function call (arg 6)

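makes me suspect that the call to the C implementation of knn is what fails. Many functions in R call an underlying, more efficient C implementation instead of implementing the algorithm purely in R. If you type just 'knn' in the R console, you can inspect the R implementation of 'knn'. It contains the following line: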

 Z <- .C(VR_knn, as.integer(k), as.integer(l), as.integer(ntr), 
        as.integer(nte), as.integer(p), as.double(train), as.integer(unclass(clf)), 
        as.double(test), res = integer(nte), pr = double(nte), 
        integer(nc + 1), as.integer(nc), as.integer(FALSE), as.integer(use.all))

where .C means that we are calling a C function named 'VR_knn' with the supplied arguments. Since you get two warnings of the form

NAs introduced by coercion

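I think two of the as.double/as.integer calls fail and introduce NA values. If we count the arguments passed to the C function (not counting the name 'VR_knn' itself), the 6th argument is: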

as.double(train)

which can fail in cases such as these:

# as.double can not translate text fields to doubles, they are coerced to NA-values:
> as.double("sometext")
[1] NA
Warning message:
NAs introduced by coercion
# while the following text is cast to double without an error:
> as.double("1.23")
[1] 1.23
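The same coercion applies to a whole table at once. The following is a minimal sketch, using a small hypothetical data frame (`df`, `x`, and `grp` are illustrative names, not part of the original data), of how a single text column can turn every cell into a string once the data is converted to a matrix:

```r
# Hypothetical data frame: one numeric column, one text column
df <- data.frame(x = c(1.5, 2, 3), grp = c("a", "b", "c"))

# A matrix holds a single type, so one non-numeric column
# makes the whole matrix character
m <- as.matrix(df)
typeof(m)

# Numeric-looking strings convert back; the text cells become NA
# (the warning is suppressed here only to keep the output quiet)
vals <- suppressWarnings(as.double(m))
sum(is.na(vals))
```

Here only the three text cells end up as NA, which is enough for the C routine to reject the input.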

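You get two coercion warnings, which are probably produced by 'as.double(train)' and 'as.double(test)'. Since you did not give us the exact details of 'mydades', here are some of my best guesses (using artificial multivariate normal data):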

library(MASS)
mydades <- mvrnorm(100, mu=c(1:6), Sigma=matrix(1:36, ncol=6))
mydades <- cbind(mydades, sample(LETTERS[1:5], 100, replace=TRUE))

# This breaks knn
mydades[3,4] <- Inf
# This breaks knn
mydades[4,3] <- -Inf
# The two Inf assignments above, however, do not trigger the "NAs introduced by coercion" warning

# This breaks knn and gives the same error; just some raw text
mydades[1,2] <- mydades[50,1] <- "foo"
mydades[100,3] <- "bar"

# ... or perhaps wrongly formatted exponential numbers?
mydades[1,1] <- "2.34EXP-05"

# ... or wrong decimal symbol?
mydades[3,3] <- "1,23" 
# should be 1.23, as R uses '.' as decimal symbol and not ','

# ... or most likely a whole column is non-numeric, since the error is given twice (as.double problem both in training AND test set)
mydades[,1] <- sample(letters[1:5],100,replace=TRUE)
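If the cause turns out to be decimal commas, one possible repair is to swap the comma for a dot before converting. This is only a sketch on hypothetical values (`raw` and `fixed` are illustrative names); when the data comes from a file, the `dec = ","` argument of `read.table()` handles this at import time instead.

```r
# Hypothetical values that were read with ',' as the decimal symbol
raw <- c("1,23", "4,5", "2")

# Replace the first literal comma in each string with a dot, then convert
fixed <- as.numeric(sub(",", ".", raw, fixed = TRUE))
fixed
```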

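I would not keep the numeric data and the class labels in a single matrix, because a matrix can hold only one type and the labels force everything to text. Perhaps you could split the data into: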

mydadesnumeric <- mydades[,1:6] # 6 first columns
mydadesclasses <- mydades[,7]
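As a sketch of that split, assuming the data lives in a data frame with six numeric feature columns followed by one label column (`mydades_df`, `features`, and `classes` are hypothetical names, and the data is simulated):

```r
set.seed(1)

# Hypothetical stand-in for 'mydades': 6 numeric features plus a label column
mydades_df <- data.frame(matrix(rnorm(60), ncol = 6),
                         label = sample(LETTERS[1:3], 10, replace = TRUE))

features <- as.matrix(mydades_df[, 1:6])  # stays numeric: no text column included
classes  <- factor(mydades_df[, 7])       # class labels kept as a factor

# Cheap sanity checks before handing the data to knn
stopifnot(is.numeric(features), all(is.finite(features)))
```

Keeping the labels out of the feature matrix means a failed check points directly at a bad feature value rather than at the label column.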

Calling

str(mydades); summary(mydades)

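can also help you/us find the problematic data entries, so that they can be corrected to numeric values or the non-numeric fields can be dropped.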
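To pinpoint individual cells rather than whole columns, a small helper along these lines may be useful. It is a hypothetical function (`find_bad_cells` is not part of any package) that reports the row and column of every entry that is not a finite number after coercion; note that it also flags genuinely missing values.

```r
# Hypothetical helper: locate cells that do not coerce to a finite number
find_bad_cells <- function(x) {
  m <- as.matrix(x)
  num <- suppressWarnings(as.numeric(m))  # text becomes NA, "Inf" becomes Inf
  bad <- !is.finite(num)                  # TRUE for NA, NaN, Inf and -Inf
  which(matrix(bad, nrow = nrow(m)), arr.ind = TRUE)
}

# Small character matrix with two deliberately broken entries
demo <- matrix(as.character(1:12), nrow = 4)
demo[1, 2] <- "foo"
demo[3, 1] <- "Inf"
bad_cells <- find_bad_cells(demo)
bad_cells
```

Each row of the result gives the `row` and `col` index of one offending entry.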

The rest of the code you provided, run after corrupting the data:

N <- nrow(mydades) 
permut <- sample(c(1:N),N,replace=FALSE)
ord <- order(permut)
mydades.shuffled <- mydades[ord,]
prop.train <- 1/3
NOMBRE <- round(prop.train*N)
mydades.training <- mydades.shuffled[1:NOMBRE,]
mydades.test <- mydades.shuffled[(NOMBRE+1):N,]

# 7th column seems to be the class labels
knn(train=mydades.training[,-7],test=mydades.test[,-7],mydades.training[,7],k=5)
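For comparison, here is a minimal sketch of a run that should go through once the features are purely numeric. It assumes `knn()` comes from the `class` package (normally installed with R as a recommended package) and uses simulated data, so the object names and values are illustrative only.

```r
library(class)  # assumed source of knn()

set.seed(42)
N <- 90

# Hypothetical stand-in for 'mydades': numeric features, labels kept separately
features <- matrix(rnorm(N * 6), ncol = 6)
labels   <- factor(sample(LETTERS[1:3], N, replace = TRUE))

# Shuffle the row indices and use one third for training, as in the question
idx       <- sample(N)
n_train   <- round(N / 3)
train_idx <- idx[1:n_train]
test_idx  <- idx[(n_train + 1):N]

pred <- knn(train = features[train_idx, ],
            test  = features[test_idx, ],
            cl    = labels[train_idx],
            k     = 5)

# Cross-tabulate predictions against the held-out labels
table(predicted = pred, actual = labels[test_idx])
```

Because the labels here are random, the predictions carry no signal; the point is only that the call completes without the coercion warnings.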

This post on "r - Error in the knn function" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/16874038/
