mongodb - How to design a MongoDB schema for a Twitter article aggregator

Reposted by: 可可西里. Updated: 2023-11-01 09:37:51

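I am new to MongoDB and, as an exercise, I am building an application that extracts links from tweets. The idea is to find the most-tweeted articles for a given topic. I am having a hard time designing a schema for this application.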

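The application collects tweets and saves them
Links are parsed out of the tweets
Each link is saved together with additional information (title, excerpt, etc.)
A tweet can contain multiple links
A link can be referenced by many tweets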
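
The link-parsing step described above could be sketched roughly as follows. This is only an illustration: `extractLinks` is a hypothetical helper, and the regular expression is a deliberately simple assumption, not the parser Twitter itself uses.

```javascript
// Hypothetical helper: pull http(s) URLs out of a tweet's text.
// The pattern is a simplifying assumption and will miss unusual URLs.
function extractLinks(text) {
  const matches = text.match(/https?:\/\/[^\s]+/g) || [];
  // Strip trailing punctuation that is usually not part of the URL.
  const cleaned = matches.map((url) => url.replace(/[.,;:!?)\]]+$/, ''));
  // De-duplicate while keeping first-seen order.
  return [...new Set(cleaned)];
}

const exampleLinks = extractLinks('a tweet with a http://lin.k and an http://u.rl');
// exampleLinks is ['http://lin.k', 'http://u.rl']
```

The resulting array maps directly onto a per-tweet list of links, and its length gives the number of links in the tweet.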

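How do I:

Store these collections? As embedded documents?
Get the top ten links sorted by number of tweets?
Get the most-tweeted link for a specific date?
Get the tweets for a link?
Get the ten latest tweets?

I would love to get some input on this.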

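Best answer

Two general tips: 1.) Don't be afraid to duplicate. It is often a good idea to store the same data in different shapes in different collections.

2.) If you want to sort and sum things up, it helps to keep count fields everywhere. MongoDB's atomic update operators, together with upserts, make it easy to increment counters and add fields to existing documents.

The following is certainly flawed, since it was typed off the top of my head. But I think a bad example is better than no example ;)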

collection tweets:

{
  tweetid: 123,
  timeTweeted: 123123234,  //exact time of the tweet in milliseconds
  dayInMillis: 123412343,  //midnight (00:00:00) of the day of the tweet, in milliseconds
  text: 'a tweet with a http://lin.k and an http://u.rl',
  links: [
     'http://lin.k',
     'http://u.rl' 
  ],
  linkCount: 2
}
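
As a rough sketch of how such a tweet document might be assembled, the snippet below derives the day field from the exact timestamp. The helper names `startOfDayMillis` and `buildTweetDoc` are hypothetical, and truncating to UTC midnight is an assumption; the answer does not say which time zone defines "the day".

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: truncate a millisecond timestamp to 00:00:00 UTC of the same day.
// Assumes a non-negative timestamp and UTC day boundaries.
function startOfDayMillis(timeTweeted) {
  return timeTweeted - (timeTweeted % MS_PER_DAY);
}

// Hypothetical helper: build a document shaped like the tweets example above.
function buildTweetDoc(tweetid, timeTweeted, text, links) {
  return {
    tweetid: tweetid,
    timeTweeted: timeTweeted,
    dayInMillis: startOfDayMillis(timeTweeted),
    text: text,
    links: links,
    linkCount: links.length,
  };
}
```

Storing the truncated day next to the exact time is the kind of duplication the first tip recommends: it makes per-day lookups a plain equality match.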

collection links: 

{
   url: 'http://lin.k',
   totalCount: 17,
   daycounts: {
      1232345543354: 5, //key: midnight (00:00:00) of the day of the tweet, in milliseconds
      1234123423442: 2,
      1234354534535: 10
   }
}

Adding a new tweet:

db.x.tweets.insert({...}) //simply insert new document with all fields

//for each found link:
var toFind =  { url: '...'};
var updateObj = {'$inc': {'totalCount': 1, 'daycounts.12342342': 1 } }; //12342342 is the day of the tweet
db.x.links.update(toFind, updateObj, { upsert: true }); //upsert: create the link document if it does not exist yet
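
In practice the day key is not a literal, so the update document has to be built dynamically. A minimal sketch, assuming a hypothetical helper `buildLinkUpsert` that only constructs the arguments and leaves the actual database call to the caller:

```javascript
// Hypothetical helper: build the filter, update and options for counting one
// occurrence of a link on a given day. Nothing here talks to the database.
function buildLinkUpsert(url, dayInMillis) {
  // Dotted path into the daycounts sub-document, e.g. 'daycounts.12342342'.
  const dayField = 'daycounts.' + dayInMillis;
  return {
    filter: { url: url },
    // $inc creates missing fields starting from 0, so new days need no setup.
    update: { $inc: { totalCount: 1, [dayField]: 1 } },
    options: { upsert: true },
  };
}

const exampleOp = buildLinkUpsert('http://lin.k', 12342342);
```

The three parts can then be passed to the shell as `db.x.links.update(exampleOp.filter, exampleOp.update, exampleOp.options)`, once per link found in the tweet.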

Get the top ten links sorted by number of tweets?

db.x.links.find().sort({'totalCount': -1}).limit(10);

Get the most-tweeted link for a specific date?

db.x.links.find({'daycounts.123413453': {'$gt': 0}}).sort({'daycounts.123413453': -1}).limit(1); //123413453 is the day you're after
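
Since the day is normally a variable rather than a hard-coded number, the filter and sort documents for this query can be generated. The helper name `buildTopLinkForDayQuery` is hypothetical, and the sketch assumes the per-day counters live under `daycounts` as in the links example above.

```javascript
// Hypothetical helper: build the pieces of the "most-tweeted link on a day" query.
function buildTopLinkForDayQuery(dayInMillis) {
  const dayField = 'daycounts.' + dayInMillis;
  return {
    filter: { [dayField]: { $gt: 0 } }, // only links tweeted at least once that day
    sort: { [dayField]: -1 },           // highest per-day count first
    limit: 1,
  };
}

const exampleQuery = buildTopLinkForDayQuery(123413453);
```

In the shell this would be used as `db.x.links.find(exampleQuery.filter).sort(exampleQuery.sort).limit(exampleQuery.limit)`.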

Get the tweets for a link?

db.x.tweets.find({'links': 'http://lin.k'});

Get the ten latest tweets?

db.x.tweets.find().sort({'timeTweeted': -1}).limit(10);

Regarding mongodb - How to design a MongoDB schema for a Twitter article aggregator, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/6881974/
