
json - Fetch all tweets that contain a specific word and store them in a SINGLE BAG

Reposted. Author: 行者123. Updated: 2023-12-02 22:03:57

I am trying to process sample tweets and store the tweets that meet a filter criterion.

For example,

Sample tweet:

{"created_time": "18:47:31 ", "text": "RT @Joey7Barton: ..give a word about whether the americans wins a Ryder cup. I mean surely he has slightly more important matters. #fami ...", "user_id": 450990391, "id": 252479809098223616, "created_date": "Sun Sep 30 2012"}

twitter = LOAD 'Tweet.json' USING JsonLoader('created_time:chararray, text:chararray, user_id:chararray, id:chararray, created_date:chararray');
grouped = GROUP twitter BY (text,id);
filtered = FOREACH grouped { row = FILTER $1 BY (text MATCHES '.*word.*'); GENERATE FLATTEN(row); }

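This returns the complete tweets that match the word.

But I need output of the following form:
(word)(all tweets that contain that word)

How can I achieve this?

Any help is appreciated.

Mohan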

Best answer

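After filtering, add the word as a field to the filtered relation, then group by that field. This gives you the word together with a bag of the tweets that contain it.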

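-- Load each JSON record using the declared schema.
twitter = LOAD 'Tweet.json' USING JsonLoader('created_time:chararray, text:chararray, user_id:chararray, id:chararray, created_date:chararray');
-- Filter the loaded relation directly; no GROUP is needed before this step.
filtered = FILTER twitter BY (text MATCHES '.*word.*');
-- Add the matched word as a constant field next to each tweet's text.
newfiltered = FOREACH filtered GENERATE 'word' AS pattern, text;
-- Group by that field: one record per word, holding a bag of its tweets.
final = GROUP newfiltered BY pattern;
DUMP final;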
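
After the final GROUP, every tuple in the bag still carries the pattern field alongside the tweet text, so the word appears both as the group key and inside the bag. One possible refinement, sketched here under the same file name and schema as above, projects the bag down to the text field so each record has the shape (word, {tweets}). The aliases tagged, by_word and result and the extra count field n are illustrative names, not part of the original answer.

```pig
-- Sketch only: tagged, by_word, result and n are illustrative names.
twitter = LOAD 'Tweet.json' USING JsonLoader('created_time:chararray, text:chararray, user_id:chararray, id:chararray, created_date:chararray');

-- Keep only the tweets whose text contains the literal substring "word".
filtered = FILTER twitter BY text MATCHES '.*word.*';

-- Tag each remaining tweet with the word it matched.
tagged = FOREACH filtered GENERATE 'word' AS pattern, text;

-- One record per pattern; the bag is named after the grouped relation (tagged).
by_word = GROUP tagged BY pattern;

-- Keep the group key, project the bag to its text field, and count the tuples.
result = FOREACH by_word GENERATE group AS word, tagged.text AS tweets, COUNT(tagged) AS n;

-- Print the resulting schema, then the data.
DESCRIBE result;
DUMP result;
```

MATCHES takes a Java regular expression and must match the whole string, which is why the pattern is wrapped in '.*' on both sides. The match is case-sensitive, so tweets containing "Word" would not be included by this sketch.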

Regarding json - Fetch all tweets that contain a specific word and store them in a SINGLE BAG, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39244479/
