java - Querying inside a MongoDB Map Reduce function

Reposted. Author: 行者123. Updated: 2023-11-29 09:44:34

I have streamed about 250,000 tweets and saved them in MongoDB. As you can see, I retrieve them based on a word or keyword that appears in the tweet.

Mongo mongo = new Mongo("localhost", 27017);
DB db = mongo.getDB("TwitterData");
DBCollection collection = db.getCollection("publicTweets");
BasicDBObject fields = new BasicDBObject().append("tweet", 1).append("_id", 0);
BasicDBObject query = new BasicDBObject("tweet", new BasicDBObject("$regex", "autobiography"));
DBCursor cur=collection.find(query,fields);
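
One detail worth noting about the filter above: `$regex` matching is case-sensitive unless the `i` option is supplied, so a tweet that starts with "Autobiography" would not be returned. A minimal sketch in plain Java, using `java.util.regex` as a stand-in for the server-side matching (an assumption made for illustration; the class name and the sample tweet are hypothetical):

```java
import java.util.regex.Pattern;

class RegexCaseDemo {
    // Case-sensitive, comparable to {"$regex": "autobiography"} with no options.
    static boolean matchesCaseSensitive(String tweet) {
        return Pattern.compile("autobiography").matcher(tweet).find();
    }

    // Case-insensitive, comparable to adding "$options": "i" to the filter.
    static boolean matchesIgnoreCase(String tweet) {
        return Pattern.compile("autobiography", Pattern.CASE_INSENSITIVE)
                .matcher(tweet).find();
    }

    public static void main(String[] args) {
        String tweet = "Autobiography of a coder"; // hypothetical tweet
        System.out.println(matchesCaseSensitive(tweet)); // false
        System.out.println(matchesIgnoreCase(tweet));    // true
    }
}
```

With the driver classes from the question, the case-insensitive filter would presumably be written as `new BasicDBObject("$regex", "autobiography").append("$options", "i")`.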

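What I want to do is use Map-Reduce to classify the tweets by keyword and pass them to a reduce function that counts the number of tweets in each category, somewhat like what you see here. In that example, he is counting pages, since that is a simple number. I would like to do something similar: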

"if (this.tweet.contains("kword1")) "+
"category = 'kword1 tweets'; " +
"else if (this.tweet.contains("kword2")) " +
"category = 'kword2 tweets';

Then use the reduce function to get the counts, just as in the example program.

I know the syntax is incorrect, but this is what I am trying to do. Is there any way to achieve it? Thanks!

PS: Oh, I am writing the code in Java, so Java syntax would be highly appreciated. Thanks!

The output of the posted code looks like this:

{ "tweet" : "An autobiography is a book that reveals nothing bad about its writer except his memory."}
{ "tweet" : "I refuse to read anything that's not real the only thing I've read since biff books is Jordan's autobiography #lol"}
{ "tweet" : "well we've had the 2012 publication of Ashley's Good Books, I predict 2013 will be seeing an autobiography ;)"}

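Of course, this covers all tweets containing the word "autobiography". What I want is to use this in the map function to classify such tweets as "autobiography tweets" (and likewise for other keywords), then send them to the reduce function to count everything and return the number of tweets containing each word.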

Something like:

{"_id" : "Autobiography Tweets" , "value" : { "publicTweets" : 3.0}}
{"_id" : "Biography Tweets" , "value" : { "publicTweets" : 15.0}}

Best answer

You may want to try the following:

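// Legacy MongoDB Java driver classes used below.
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;

// The statements below belong inside a method; `collection` is the
// DBCollection from the question.

// Map: emit (category, 1) per tweet; the first matching keyword wins.
String map = "function() { " +
        "  var regex1 = new RegExp('autobiography', 'i'); " +
        "  var regex2 = new RegExp('book', 'i'); " +
        "  if (regex1.test(this.tweet)) " +
        "    emit('Autobiography Tweet', 1); " +
        "  else if (regex2.test(this.tweet)) " +
        "    emit('Book Tweet', 1); " +
        "  else " +
        "    emit('Uncategorized Tweet', 1); " +
        "}";

// Reduce: add up the 1s emitted for each category.
String reduce = "function(key, values) { " +
        "  return Array.sum(values); " +
        "}";

// INLINE output returns the results directly instead of writing a collection.
MapReduceCommand cmd = new MapReduceCommand(collection, map, reduce,
        null, MapReduceCommand.OutputType.INLINE, null);
MapReduceOutput out = collection.mapReduce(cmd);

try {
    // Each result document holds the category in _id and the count in value.
    for (DBObject o : out.results()) {
        System.out.println(o.toString());
    }
} catch (Exception e) {
    e.printStackTrace();
}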
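
To check the categorize-then-count logic without a running database, the same idea can be mirrored in plain Java. This is only a sketch under stated assumptions: the class and method names are hypothetical, two of the sample tweets are made up, and `java.util.regex` stands in for the JavaScript `RegExp` used on the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class TweetCategoryCount {
    private static final Pattern AUTOBIOGRAPHY =
            Pattern.compile("autobiography", Pattern.CASE_INSENSITIVE);
    private static final Pattern BOOK =
            Pattern.compile("book", Pattern.CASE_INSENSITIVE);

    // Mirrors the map function: the first matching keyword decides the category.
    static String map(String tweet) {
        if (AUTOBIOGRAPHY.matcher(tweet).find()) return "Autobiography Tweet";
        if (BOOK.matcher(tweet).find()) return "Book Tweet";
        return "Uncategorized Tweet";
    }

    // Mirrors the reduce function: sums the values emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    // Groups the emitted (category, 1) pairs by key, then reduces each group.
    static Map<String, Integer> mapReduce(List<String> tweets) {
        Map<String, List<Integer>> emitted = new TreeMap<>();
        for (String tweet : tweets) {
            emitted.computeIfAbsent(map(tweet), k -> new ArrayList<>()).add(1);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : emitted.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList(
                "An autobiography is a book that reveals nothing bad about its writer except his memory.",
                "Reading a good book tonight", // hypothetical tweet
                "Nothing to see here");        // hypothetical tweet
        // The first tweet contains both keywords but counts only as an autobiography tweet.
        System.out.println(mapReduce(tweets));
    }
}
```

Two design points carry over to the real map-reduce job. Because of the `else if` chain, a tweet that matches several keywords is counted only under the first one. And since MongoDB may call reduce more than once for the same key, feeding earlier results back in, the reduce function has to return the same kind of value it receives; a plain sum satisfies that.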

Regarding "java - Querying inside a MongoDB Map Reduce function", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/13732735/
