r - Problem with the Dictionary() function in R


Reposted by: 行者123. Updated: 2023-12-03 14:58:01


I have been following a Bayesian classifier example from Lantz's book "Machine Learning with R". The example is a spam classifier that uses the data from the following link:

http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/

In the code, I ran into a problem with this part:

sms_train<-DocumentTermMatrix(sms_corpus_train,list(dictionary=sms_dict))
sms_test<-DocumentTermMatrix(sms_corpus_test,list(dictionary=sms_dict))

because the book says I should first run the following instruction:

sms_dict <- Dictionary(findFreqTerms(sms_dtm_train, 5))

The problem is that newer versions of tm have deprecated the Dictionary() function. What should I do to accomplish what the book describes:

A dictionary is a data structure allowing us to specify which words should appear in a document term matrix. To limit our training and test matrixes to only the words in the preceding dictionary, use the following command

I did the following:

sms_dict<-findFreqTerms(sms_dtm_train,5)
sms_train<-DocumentTermMatrix(sms_corpus_train,list(dictionary=sms_dict))
sms_test<-DocumentTermMatrix(sms_corpus_test,list(dictionary=sms_dict))

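But I am fairly sure I am not restricting the training and test matrices the way the book describes. The code runs, but it does not give me the correct results. What can I change in this case?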
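One way to check whether a plain character vector really acts as the dictionary is to try it on a tiny corpus. The sketch below is only an illustration: the toy messages and the names `docs_train`, `docs_test`, `freq_terms`, `dtm_train_small`, and `dtm_test_small` are made up for this example and are not part of the original code. It assumes a tm version in which `DocumentTermMatrix()` accepts a `dictionary` entry in its `control` list, and it uses `VCorpus()` rather than `Corpus()` to keep the behavior predictable.

```r
library(tm)

# Hypothetical toy messages standing in for the SMS data
docs_train <- c("free prize call now", "call me later",
                "free entry prize", "see you later")
docs_test  <- c("free call tonight", "meeting later today")

corpus_train <- VCorpus(VectorSource(docs_train))
corpus_test  <- VCorpus(VectorSource(docs_test))

# Full training matrix, used only to find the frequent terms
dtm_train <- DocumentTermMatrix(corpus_train)

# Terms whose total count in the training documents is at least 2;
# this character vector plays the role that Dictionary() used to play
freq_terms <- findFreqTerms(dtm_train, 2)

# Rebuild both matrices, keeping only dictionary terms
dtm_train_small <- DocumentTermMatrix(corpus_train,
                                      control = list(dictionary = freq_terms))
dtm_test_small  <- DocumentTermMatrix(corpus_test,
                                      control = list(dictionary = freq_terms))

# No term outside the dictionary should show up in either matrix
print(freq_terms)
print(Terms(dtm_test_small))
print(all(Terms(dtm_test_small) %in% freq_terms))
```

If the last line prints TRUE, the test matrix contains no columns outside the dictionary, which suggests the restriction itself is working and the wrong results come from a later step.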

The complete code I am following is below:

sms_raw<-read.csv("sms_spam.csv",stringsAsFactors=FALSE)
install.packages("tm")
library(tm)
sms_corpus<-Corpus(VectorSource(sms_raw$text))
corpus_clean<-tm_map(sms_corpus,content_transformer(tolower))
corpus_clean<-tm_map(corpus_clean,removeNumbers)
corpus_clean<-tm_map(corpus_clean,removeWords,stopwords())
corpus_clean<-tm_map(corpus_clean,stripWhitespace)
sms_dtm<-DocumentTermMatrix(corpus_clean)
sms_raw_train<-sms_raw[1:4169,]
sms_raw_test<-sms_raw[4170:5559,]
sms_dtm_train<-sms_dtm[1:4169,]
sms_dtm_test<-sms_dtm[4170:5559,]
sms_corpus_train<-corpus_clean[1:4169]
sms_corpus_test<-corpus_clean[4170:5559]
sms_dict<-findFreqTerms(sms_dtm_train,5)
sms_train<-DocumentTermMatrix(sms_corpus_train,list(dictionary=sms_dict))
sms_test<-DocumentTermMatrix(sms_corpus_test,list(dictionary=sms_dict))
convert_counts<-function(x){
  x<-ifelse(x>0,1,0)
  x<-factor(x,levels=c(0,1),labels=c("No","Yes"))
  return(x)
}
sms_train<-apply(sms_train,MARGIN=2,convert_counts)
sms_test<-apply(sms_test,MARGIN=2,convert_counts)
library(e1071)
sms_classifier<-naiveBayes(sms_train,sms_raw_train$type)
sms_test_pred<-predict(sms_classifier,sms_test)
install.packages("gmodels")
library(gmodels)
CrossTable(sms_test_pred,sms_raw_test$type,prop.chisq=FALSE,prop.t=FALSE,dnn=c('predicted','actual'))

Thanks

Best answer

I ran into the same problem and solved it by doing the following:

CrossTable(sms_test_pred[["class"]], sms_raw_test$Type, prop.chisq = FALSE, prop.t = FALSE, dnn = c('predicted','actual'))
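The call above refers to the label column as `Type`, while the code in the question uses `type`. Which spelling is right depends on the header of the CSV file, which is not shown here. The following base R sketch shows one way to check. The data frame `sms_raw_test_demo`, the predictions `pred_demo`, and the helper name `label_col` are hypothetical stand-ins, not the real data.

```r
# Hypothetical stand-in for sms_raw_test; the real column names
# come from the header row of sms_spam.csv
sms_raw_test_demo <- data.frame(
  type = c("ham", "spam", "ham", "spam"),
  text = c("see you later", "free prize", "call me", "win cash now"),
  stringsAsFactors = FALSE
)

# Hypothetical predictions, one per test message
pred_demo <- factor(c("ham", "spam", "spam", "spam"),
                    levels = c("ham", "spam"))

# Inspect the exact spelling of the column names
print(names(sms_raw_test_demo))

# `$` is case-sensitive: asking for a column that does not exist
# returns NULL instead of raising an error
print(is.null(sms_raw_test_demo$Type))

# Pick whichever spelling is actually present
label_col <- intersect(c("type", "Type"), names(sms_raw_test_demo))[1]
actual <- factor(sms_raw_test_demo[[label_col]], levels = c("ham", "spam"))

# Base R cross-tabulation of predicted against actual labels
confusion <- table(predicted = pred_demo, actual = actual)
print(confusion)
```

Because a misspelled column silently yields NULL, checking `names()` before building the table is a cheap safeguard. `table()` is used here only to keep the sketch free of extra packages; `CrossTable()` from gmodels can be applied to the same two vectors.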

Regarding "r - Problem with the Dictionary() function in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47141436/
