
r - Error in enc2utf8(x) : argumemt is not a character vector

Reposted. Author: 行者123. Updated: 2023-12-01 21:51:45

"Error in enc2utf8(x) : argumemt is not a character vector" is the error I get when I try to run the code below in R 3.1.2. Can someone help me understand what I am missing here?

The operating system is Windows.

#Text Cleaning: tm Code
clean<-function(text){
library(NLP)
library(tm)
sample<- Corpus(VectorSource(text),readerControl=list(language="english"))
sample<- tm_map(sample, function(x) iconv(enc2utf8(x), sub = "bytes"))
sample<-tm_map(sample,removePunctuation)
sample <- tm_map(sample, stripWhitespace)
sample<-tm_map(sample,removeNumbers)
sample<-tm_map(sample,removeWords,stopwords('smart'))
sample <- tm_map(sample, stripWhitespace)
sample <- tm_map(sample, stripWhitespace)
dtm <- DocumentTermMatrix(sample)
return(list(sample,dtm))
}
fileName <- 'input.txt'
test = readChar(fileName, file.info(fileName)$size)
clean (test)

Best answer

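You have to reference the content of each document in the corpus, i.e. the character vector in x$content, rather than the document object itself: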

tm_map(sample, function(x) iconv(enc2utf8(x$content), sub = "bytes"))

Here, I replaced enc2utf8(x) with enc2utf8(x$content).
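
To see why the change helps, here is a minimal base R sketch that does not need tm. It assumes a tm document behaves like a list with a content element. The doc object below is a hypothetical stand-in built by hand, not something created by tm itself.

```r
# Hypothetical stand-in for a single tm document: a list whose 'content'
# element holds the text. This mock is an assumption for illustration only.
doc <- list(content = c("first line", "second line"), meta = list())

# Passing the whole object fails, because a list is not a character vector.
# The error is caught here so the script keeps running.
whole <- tryCatch(enc2utf8(doc), error = function(e) "error")

# Passing only the character vector works.
fixed <- iconv(enc2utf8(doc$content), sub = "bytes")

print(whole)
print(fixed)
```

In tm versions that provide content_transformer(), wrapping the function, as in content_transformer(function(x) iconv(enc2utf8(x), sub = "bytes")), may be another way to apply it to the content while keeping the document objects intact. That variant is not part of the original answer.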

Regarding "r - Error in enc2utf8(x) : argumemt is not a character vector", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27478161/
