r - How can I create a table by restructuring a MALLET output file?

Reposted by: 行者123  Updated: 2023-12-04 10:57:52


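I'm using MALLET for topic analysis. It writes its results to a text file ("topics.txt") of several thousand rows and a hundred or so columns, where each row consists of tab-separated variables like this: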

Num1 text1 topic1 proportion1 topic2 proportion2 topic3 proportion3,  etc.
Num2 text2 topic1 proportion1 topic2 proportion2 topic3 proportion3,  etc.
Num3 text3 topic1 proportion1 topic2 proportion2 topic3 proportion3,  etc.

Here is a snippet of the actual data:

> dat[1:5,1:10]

  V1       V2 V3        V4 V5        V6 V7        V8 V9        V10
1  0   10.txt 27 0.4560785 23 0.3040853 20 0.1315621 21 0.03632624
2  1 1001.txt 20 0.2660085 12 0.2099153  8 0.1699586 13 0.16922928
3  2 1002.txt 16 0.3341721  2 0.1747023 10 0.1360454 12 0.07507119
4  3 1003.txt 12 0.5366148  8 0.2255179 18 0.1388561  0 0.01867091
5  4 1005.txt 16 0.2363206  0 0.2214441 24 0.1914769  7 0.17760521

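I'm trying to use R to convert this output into a data table where the topics are the column headers, and each topic column holds the value of the "proportion" variable that sits immediately to the right of the corresponding "topic" variable, for each value of "text". Like this: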

      topic1       topic2       topic3
text1 proportion1  proportion2  proportion3
text2 proportion1  proportion2  proportion3

Or, using the data snippet above, like this:

           0         2         7         8         10        12        13        16        18       20        21         23        24         27
10.txt     0         0         0         0         0         0         0         0         0        0.1315621 0.03632624 0.3040853 0          0.4560785        
1001.txt   0         0         0         0.1699586 0         0.2099153 0.1692292 0         0        0.2660085 0          0         0          0
1002.txt   0         0.1747023 0         0         0.1360454 0.0750711 0         0.3341721 0        0         0          0         0          0
1003.txt   0.0186709 0         0         0.2255179 0         0.5366148 0         0         0.138856 0         0          0         0          0
1005.txt   0.2214441 0         0.1776052 0         0         0         0         0.2363206 0        0         0          0         0.1914769  0

Here is the R code I have for this job, sent to me by a friend, but it doesn't work for me (and I don't know enough about it to fix it myself):

##########################################
dat<-read.table("topics.txt", header=F, sep="\t")
datnames<-subset(dat, select=2)
dat2<-subset(dat, select=3:length(dat))
y <- data.frame(topic=character(0),proportion=character(0),text=character(0))
for(i in seq(1, length(dat2), 2)){ 
z<-i+1
x<-dat2[,i:z]
x<-cbind(x, datnames)
colnames(x)<-c("topic","proportion", "text")
y<-rbind(y, x)
}

# Right at this step at the end of the block 
# I get this message that may indicate the problem:
# Error in c("topic", "proportion", "text") : unused argument(s) ("text")

y[is.na(y)] <- 0 
xdat<-xtabs(proportion ~ text+topic, data=y)  
write.table(xdat, file="topicMatrix.txt", sep="\t", eol = "\n", quote=TRUE, col.names=TRUE, row.names=TRUE)
##########################################

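I would be grateful for any suggestions on how to get this code working. My problem may be related to this one, and possibly to this one, but I don't yet have the skills to make immediate use of the answers to those questions.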

Best answer

Here is one way to solve your problem:

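 # Build a small example in the same layout as the MALLET output
 dat <- read.table(as.is = TRUE, header = FALSE, textConnection(
  "Num1 text1 topic1 proportion1 topic2 proportion2 topic3 proportion3
   Num2 text2 topic1 proportion1 topic2 proportion2 topic3 proportion3
   Num3 text3 topic1 proportion1 topic2 proportion2 topic3 proportion3"))

 # Number of topic/proportion pairs per row
 NTOPICS <- 3
 # Column names: num, text, topic1, proportion1, topic2, proportion2, ...
 nam <- c('num', 'text',
   paste(c('topic', 'proportion'), rep(1:NTOPICS, each = 2), sep = ""))

 # Stack the topic/proportion pairs into long format, one pair per row
 dat_l <- reshape(setNames(dat, nam), varying = 3:length(nam), direction = 'long',
   sep = "")
 # Spread back to wide format with one column per topic (needs the reshape2 package;
 # current versions spell the argument value.var, not value_var)
 reshape2::dcast(dat_l, num + text ~ topic, value.var = 'proportion')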

   num  text      topic1      topic2      topic3
1 Num1 text1 proportion1 proportion2 proportion3
2 Num2 text2 proportion1 proportion2 proportion3
3 Num3 text3 proportion1 proportion2 proportion3

Edit: this works whether the proportions are text or numbers. You can also change NTOPICS to match the number of topics you have.
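
As a rough illustration of how the same reshaping might look on the numeric snippet from the question, here is a minimal sketch. It makes a few assumptions that are not in the original answer: the sample rows are pasted inline instead of being read from "topics.txt", the number of topic/proportion pairs is derived from the column count (the name `ntopics` is illustrative), and base R's `xtabs` (already used in the question's code) stands in for `reshape2::dcast` so that topics absent from a text come out as 0.

```r
# Sample rows copied from the question's snippet (first four topic/proportion pairs).
# For the real file, something like read.table("topics.txt", sep = "\t") would be used instead.
dat <- read.table(header = FALSE, as.is = TRUE, text = "
0 10.txt   27 0.4560785 23 0.3040853 20 0.1315621 21 0.03632624
1 1001.txt 20 0.2660085 12 0.2099153  8 0.1699586 13 0.16922928
2 1002.txt 16 0.3341721  2 0.1747023 10 0.1360454 12 0.07507119
3 1003.txt 12 0.5366148  8 0.2255179 18 0.1388561  0 0.01867091
4 1005.txt 16 0.2363206  0 0.2214441 24 0.1914769  7 0.17760521
")

# Assumption: the first two columns are the row number and the file name,
# and every remaining column belongs to a topic/proportion pair.
ntopics <- (ncol(dat) - 2) / 2
nam <- c("num", "text",
         paste0(c("topic", "proportion"), rep(seq_len(ntopics), each = 2)))

# Long format: one row per (text, topic, proportion) triple.
dat_l <- reshape(setNames(dat, nam), varying = 3:length(nam),
                 direction = "long", sep = "")

# Cross-tabulate: rows are texts, columns are topic ids, cells are proportions.
# Text/topic combinations that never occur are filled with 0.
wide <- xtabs(proportion ~ text + topic, data = dat_l)

# Convert to an ordinary data frame if a table object is inconvenient.
wide_df <- as.data.frame.matrix(wide)
print(round(wide_df, 4))
```

For this snippet the result should have one row per text file and one column per distinct topic id, matching the layout requested in the question. `write.table(wide_df, file = "topicMatrix.txt", sep = "\t")` could then save it, as in the question's last step.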

Regarding "r - How can I create a table by restructuring a MALLET output file?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8058402/
