"took": 13, #> "timed_out": false, #> "_shards": { #> "tot-6ren">

json - How to remove \" when writing an R character object to JSON

Reposted. Author: 行者123. Updated: 2023-11-29 02:52:23

I have a data.table like this:

library(data.table)
test <- data.table(city = c("Berlin", "Berlin", "Berlin", "Amsterdam", "Amsterdam"),
                   key1 = c("A", "A", "A", "B", "B"),
                   value1 = c(1, 2, 3, 4, 5),
                   value2 = c(0.1, 0.2, 0.3, 0.4, 0.5),
                   kpi = c(10, 15, 20, 25, 30))

I want to upload this data to Elasticsearch, but with a specific structure:

library(RJSONIO)
res <- test[, .(factors = toJSON(.SD)),
            by = .(city, key1),
            .SDcols = c("value1", "kpi")]

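This code creates a separate JSON string for each group in the factors column. Since I want to get rid of the \n sequences the library introduces, I can strip them in the assignment: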

res <- test[, .(factors = gsub("\n", "", toJSON(.SD))),
            by = .(city, key1),
            .SDcols = c("value1", "kpi")]
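
As an alternative to stripping the newlines with gsub, here is a minimal sketch that assumes jsonlite is an acceptable substitute for RJSONIO (the question itself uses RJSONIO). jsonlite::toJSON emits compact, single-line JSON unless pretty = TRUE is requested; the variable names are illustrative.

```r
library(jsonlite)

# Illustrative data for one group (the Berlin / A rows of the example table)
group <- list(value1 = c(1, 2, 3), kpi = c(10, 15, 20))

# Default output is minified: no newlines to remove afterwards
compact <- toJSON(group)

# Newlines only appear when pretty-printing is requested explicitly
pretty_json <- toJSON(group, pretty = TRUE)

cat(compact, "\n")
cat(pretty_json, "\n")
```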

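The problem arises when I want to upload this object to Elasticsearch (I am using the elastic package). Since R uses a backslash to escape double quotes inside strings, when I write the object with: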

docs_bulk(res, "index")

it writes \" instead of " in the string fields created by the inner toJSON (value1, kpi). This can also be checked by writing the object to a file:

write(toJSON(res), "~/output.json")

{
"city": [ "Berlin", "Amsterdam" ],
"key1": [ "A", "B" ],
"factors": [ "{ \"value1\": [1, 2, 3 ],\"kpi\": [10, 15, 20 ] }", "{ \"value1\": [ 4, 5 ],\"kpi\": [25, 30 ] }" ]
}

Because the names value1 and kpi begin and end with \", Elasticsearch does not parse these fields as separate arrays. What I want is something like this:

{
"city": [ "Berlin", "Amsterdam" ],
"key1": [ "A", "B" ],
"factors": [ { "value1": [1, 2, 3 ],"kpi": [10, 15, 20 ] }, { "value1": [4, 5 ],"kpi": [25, 30 ] } ]
}
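
The difference between the two outputs above can be reproduced in isolation. The sketch below uses jsonlite rather than RJSONIO (an assumption made for illustration) and hypothetical variable names. It shows that a JSON fragment held as a plain character string is serialised again as a quoted string with escaped inner quotes, whereas the parsed R list is serialised as a nested object.

```r
library(jsonlite)

# A JSON fragment stored as an ordinary R character string
inner <- as.character(toJSON(list(value1 = c(1, 2, 3), kpi = c(10, 15, 20))))

# Serialising the string a second time quotes it and escapes its inner quotes
escaped <- toJSON(list(factors = inner))

# Serialising the parsed list instead keeps factors as a nested JSON object
nested <- toJSON(list(factors = fromJSON(inner)))

cat(escaped, "\n")
cat(nested, "\n")
```

The backslashes are therefore not something gsub can remove from the string beforehand; they are added at serialisation time because the value being written is a string rather than a list.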

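I have tried several different gsub regex combinations, but I cannot stop R from writing the backslashes. My last resort is to write the object to a file and post-process it manually with sed, but I think there should be a simpler way. Any help would be greatly appreciated.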

Best answer

OK, I think this should work. There is probably a shorter way to get to the final res object for bulk loading, but anyway:

library(elastic)
library(data.table)
library(jsonlite)

test <- data.table(city = c("Berlin", "Berlin", "Berlin", "Amsterdam", "Amsterdam"),
key1 = c("A", "A", "A", "B", "B"),
value1 = c(1, 2, 3, 4, 5),
value2 = c(0.1, 0.2, 0.3, 0.4, 0.5),
kpi = c(10, 15, 20, 25, 30))

res <- test[, .(factors = jsonlite::toJSON(.SD, dataframe = "columns")),
by = .(city, key1),
.SDcols = c("value1", "kpi")]

res <- lapply(apply(res, 1, as.list), function(z) {
  tt <- z[!names(z) %in% "factors"]
  tt$factors <- fromJSON(z$factors)
  tt
})

docs_bulk(res, "mycoolindex")

curl 'http://localhost:9200/mycoolindex/_search?size=1' | jq .
#> {
#> "took": 13,
#> "timed_out": false,
#> "_shards": {
#> "total": 5,
#> "successful": 5,
#> "failed": 0
#> },
#> "hits": {
#> "total": 2,
#> "max_score": 1,
#> "hits": [
#> {
#> "_index": "mycoolindex",
#> "_type": "mycoolindex",
#> "_id": "AVeay0KnlE0U0vVWYXkb",
#> "_score": 1,
#> "_source": {
#> "city": [
#> "Amsterdam"
#> ],
#> "key1": [
#> "B"
#> ],
#> "factors": {
#> "value1": [
#> 4,
#> 5
#> ],
#> "kpi": [
#> 25,
#> 30
#> ]
#> }
#> }
#> }
#> ]
#> }
#> }
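
The answer notes that there may be a shorter route to the final list. One possible sketch, assuming data.table's split() method with its by argument, builds the nested documents directly and skips the JSON round trip. The name docs and the choice to keep city and key1 as length-one vectors are illustrative and not part of the original answer.

```r
library(data.table)

test <- data.table(city = c("Berlin", "Berlin", "Berlin", "Amsterdam", "Amsterdam"),
                   key1 = c("A", "A", "A", "B", "B"),
                   value1 = c(1, 2, 3, 4, 5),
                   value2 = c(0.1, 0.2, 0.3, 0.4, 0.5),
                   kpi = c(10, 15, 20, 25, 30))

# One sub-table per (city, key1) group, in order of first appearance
groups <- split(test, by = c("city", "key1"))

# Turn each group into a nested list: scalar keys plus a factors sub-list
docs <- unname(lapply(groups, function(g) {
  list(city = g$city[1],
       key1 = g$key1[1],
       factors = list(value1 = g$value1, kpi = g$kpi))
}))

str(docs)
```

The resulting docs list has the same nested shape as the res list passed to docs_bulk in the answer, so it could presumably be handed to docs_bulk(docs, "mycoolindex") in the same way. That call is left out here because it needs a running Elasticsearch instance.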

About json - How to remove \" when writing an R character object to JSON: we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39897177/
