
json - Converting R output from the elastic package (nested list?) to a data.frame or JSON

Reposted by: 行者123 · Updated: 2023-12-02 22:51:18

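I am using R and the package "elastic" to query an Elasticsearch database that contains Twitter data in JSON format. The query works fine and I get the expected result in the object out.

> class(out)
[1] "list"

The hits are returned in out$hits$hits: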
> out$hits$hits
[[1]]
[[1]]$`_index`
[1] "twitter_all_geo-2014-11-01"

[[1]]$`_type`
[1] "ctweet"

[[1]]$`_id`
[1] "ubicity-twitter-160f0964-6fc7-43ef-af2a-0e1b8c8184c7"

[[1]]$`_version`
[1] 1

[[1]]$`_score`
[1] 2.10757

[[1]]$`_source`
[[1]]$`_source`$id
[1] "528330489049120770"

[[1]]$`_source`$created_at
[1] "2014-10-31T23:39:39+0000"

[[1]]$`_source`$user
[[1]]$`_source`$user$name
[1] "afterlifetemis"


[[1]]$`_source`$place
[[1]]$`_source`$place$geo_point
[[1]]$`_source`$place$geo_point[[1]]
[1] 30.4529

[[1]]$`_source`$place$geo_point[[2]]
[1] 50.61104


[[1]]$`_source`$place$city
[1] "Ukraine"

[[1]]$`_source`$place$country
[1] "Ukraine"

[[1]]$`_source`$place$country_code
[1] "UA"

[[1]]$`_source`$msg
[[1]]$`_source`$msg$text
[1] "u had one job artemis\none"

[[1]]$`_source`$msg$lang
[1] "EN"

[[1]]$`_source`$msg$hash_tags
list()

[[2]]
[[2]]$`_index`
[1] "twitter_all_geo-2014-11-01"

[[2]]$`_type`
[1] "ctweet"
...
...

Basically I want to save the data as a .csv file, so I typed:
> write.csv(out$hits$hits,'out.csv')
Error in data.frame(text = "u had one job artemis\none", lang = "EN", : arguments imply differing number of rows: 1, 0

I figured it has to be converted to a data.frame first, so I tried:
> df <- ldply (out, data.frame)

Error in data.frame(text = "u had one job artemis\none", lang = "EN", : arguments imply differing number of rows: 1, 0
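The "1, 0" in that message suggests that one element has zero length, which matches the empty hash_tags list shown in the output above. The following is a minimal sketch of that reading, using a hand-built stand-in for one hit's msg element (the object names msg, res, msg_ok and df_ok are illustrative, not part of the original data):

```r
# Stand-in for the msg element of one hit; hash_tags is an empty list,
# as in the printed output above.
msg <- list(text = "u had one job artemis\none", lang = "EN", hash_tags = list())

# Converting it directly fails, because the empty list contributes 0 rows
# while the other elements contribute 1 row each.
res <- tryCatch(do.call(data.frame, msg), error = function(e) e)
print(inherits(res, "error"))

# Dropping zero-length elements first lets the conversion go through.
msg_ok <- msg[lengths(msg) > 0]
df_ok <- do.call(data.frame, c(msg_ok, stringsAsFactors = FALSE))
print(df_ok)
```

Dropping the empty element loses the hash_tags column for that hit, so this only explains the error; it is not a complete conversion.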

(I also made a few other optimistic attempts:)
> t(sapply(out$hits$hits, '[', 1:max(sapply(out$hits$hits, length))))
_index _type _id _version _score _source
[1,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-160f0964-6fc7-43ef-af2a-0e1b8c8184c7" 1 2.10757 List,5
[2,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-ba071fff-cafb-4d3f-947d-13c934905c1b" 1 2.10757 List,5
[3,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-dd64af32-4d59-4008-a3db-74471ad269d1" 1 2.10757 List,5
[4,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-4ba0d3d0-642d-4f9f-aaf9-c55929c35dc4" 1 2.10757 List,5
[5,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-d7b8cbbc-87b3-44b5-8c9c-91c7b62f1458" 1 2.10757 List,5
[6,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-76353a7c-44c9-4863-a59d-adb16716ca18" 1 2.10757 List,5
[7,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-2aec0798-9918-4b66-9b2a-ef5a4d1f3711" 1 2.10757 List,5
[8,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-c9e7637d-358a-40ee-a06c-85af04c22191" 1 2.10757 List,5
[9,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-8928c1ef-f46a-4682-99c4-4dbc55270b03" 1 2.10757 List,5
[10,] "twitter_all_geo-2014-11-01" "ctweet" "ubicity-twitter-d6b19975-b310-46c4-af11-af56971b7c4b" 1 2.10757 List,5

This looks good at first, but the actual tweet messages are no longer in the matrix: the _source column only shows "List,5".
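The message text sits two levels below _source, so one way to keep it is to pull out the needed leaf fields explicitly. A minimal sketch, assuming every hit has `_id`, `_source$msg$text` and `_source$msg$lang`; the list hits below is a hand-built stand-in for out$hits$hits and the id values are made up:

```r
# Hand-built stand-in for out$hits$hits (two hits, only the fields used here).
hits <- list(
  list(`_id` = "id-1",
       `_source` = list(msg = list(text = "u had one job artemis\none", lang = "EN"))),
  list(`_id` = "id-2",
       `_source` = list(msg = list(text = "second tweet", lang = "EN")))
)

# Extract one value per hit for each column; vapply() fails loudly
# if a field is missing or is not a single string.
tweets <- data.frame(
  id   = vapply(hits, function(h) h$`_id`, character(1)),
  text = vapply(hits, function(h) h$`_source`$msg$text, character(1)),
  lang = vapply(hits, function(h) h$`_source`$msg$lang, character(1)),
  stringsAsFactors = FALSE
)
print(tweets)
```

This trades generality for control: only the listed fields end up in the table, but each column has a known meaning and type.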

I was also optimistic that I could first convert it (back) to JSON (using rjson):

> toJSON(out)
Error in toJSON(out) : unable to escape string. String is not utf8
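That message points at strings that are not valid UTF-8. A possible workaround is to re-encode every character string in the nested list before serializing. The sketch below uses only base R; to_utf8 is a hypothetical helper, and the assumption that the offending strings are Latin-1 is a guess that has to be checked against the real data:

```r
# Hypothetical helper: re-encode all character strings in a nested list.
# Assumption: the input strings are Latin-1; change `from` if they are not.
to_utf8 <- function(x, from = "latin1") {
  rapply(x, function(s) iconv(s, from = from, to = "UTF-8"),
         classes = "character", how = "replace")
}

# Stand-in for one hit, containing a Latin-1 byte (0xE9) that is not valid UTF-8.
hit <- list(`_source` = list(msg = list(text = "caf\xe9", lang = "EN"), id = 1))

clean <- to_utf8(hit)

# validUTF8() reports whether a string is valid UTF-8, before and after.
print(validUTF8(hit$`_source`$msg$text))
print(validUTF8(clean$`_source`$msg$text))
```

Because rapply() is called with how = "replace", the list keeps its nesting and non-character leaves are left untouched, so the cleaned list can be handed to toJSON() in place of the original.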



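In the end I have a list that I cannot save and cannot convert to JSON, a data.frame, or a data.table (because it is not uniform). Can anyone give me a hint on how to convert it to JSON, how to save the list to a .csv file, or how to put it into a data.frame?

Thanks a lot, I guess I just don't get it.

-Tobias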

Best answer

I think unlist() and matrix() can do the job.

An example that converts the result returned by Search() (out) to a data frame:

library(elastic)

# open the connection to Elasticsearch (defaults to localhost:9200);
# note: newer versions of elastic expect the connection object as the first
# argument, i.e. conn <- connect(); Search(conn, index = "shakespeare", size = 3)
connect()

# get the first 3 hits from the Elasticsearch store
out <- Search(index = "shakespeare", size = 3)
hits <- out$hits$hits

# count the number of columns: unlist() flattens the nested lists
# of a hit into one named vector; use the first hit as proxy
nColumns <- length(unlist(hits[[1]]))

# (optional) verify that all hits expand to the same length
# (should be true for data intended to be in a table format)
stopifnot(all(sapply(hits, function(x) length(unlist(x)) == nColumns)))

# fetch the column names, again from the first hit
nNames <- names(unlist(hits[[1]]))

# unlist all hits and convert to a matrix with nColumns columns;
# don't forget byrow = TRUE! (all values end up as character strings)
df <- data.frame(matrix(unlist(hits), ncol = nColumns, byrow = TRUE),
                 stringsAsFactors = FALSE)

# set the column names
names(df) <- nNames

# do whatever you want with df, e.g. write.csv(df, "out.csv", row.names = FALSE)
print(df)
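The approach above relies on every hit flattening to the same length, which the Twitter data in the question may not satisfy (an empty hash_tags list disappears under unlist()). A possible variation, sketched here in base R, flattens each hit separately and fills missing fields with NA. The function hits_to_df and the sample hits are illustrative and not part of the elastic package:

```r
# Hypothetical helper: flatten each hit on its own, then align on the
# union of all flattened names; fields a hit lacks become NA.
hits_to_df <- function(hits) {
  rows <- lapply(hits, function(h) {
    v <- unlist(h)
    setNames(as.character(v), names(v))
  })
  cols <- unique(unlist(lapply(rows, names)))
  filled <- lapply(rows, function(r) unname(r[cols]))  # unknown names give NA
  df <- as.data.frame(do.call(rbind, filled), stringsAsFactors = FALSE)
  names(df) <- cols
  df
}

# Hand-built stand-in for out$hits$hits with non-uniform hits:
# the first has no hash tags, the second has one.
hits <- list(
  list(`_id` = "a", `_score` = 2.1,
       `_source` = list(msg = list(text = "first", lang = "EN", hash_tags = list()))),
  list(`_id` = "b", `_score` = 1.7,
       `_source` = list(msg = list(text = "second", lang = "EN", hash_tags = list("r"))))
)

df <- hits_to_df(hits)
print(df)

# Save as CSV (written to a temporary directory in this sketch).
write.csv(df, file.path(tempdir(), "out.csv"), row.names = FALSE)
```

As with the matrix() version, all columns come out as character strings, and if a flattened name occurs more than once within a hit only its first value is kept.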

Cheers!

Regarding "json - Converting R output from the elastic package (nested list?) to a data.frame or JSON", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29243639/
