
java - Poor MongoDB read performance

Reposted. Author: 可可西里. Updated: 2023-11-01 10:00:54

I have a sharded collection containing flight information. The schema looks like this:

{

"_id" : ObjectId("537ef1bb5516dd401b5b109a"),
"departureAirport" : "HAJ",
"arrivalAirport" : "AYT",
"departureDate" : NumberLong("1412553600000"),
"operatingAirlineCode" : "DE",
"operatingFlightNumber" : "1808",
"flightClass" : "P",
"fareType" : "EX",
"availability" : "*"
}

Here are the stats for my collection:

{

"sharded" : true,
"systemFlags" : 1,
"userFlags" : 1,
"ns" : "flights.flight",
"count" : 2809822,
"numExtents" : 30,
"size" : 674357280,
"storageSize" : 921788416,
"totalIndexSize" : 287746144,
"indexSizes" : {
"_id_" : 103499984,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 184246160
},
"avgObjSize" : 240,
"nindexes" : 2,
"nchunks" : 869,
"shards" : {
"shard0000" : {
"ns" : "flights.flight",
"count" : 1396165,
"size" : 335079600,
"avgObjSize" : 240,
"storageSize" : 460894208,
"numExtents" : 15,
"nindexes" : 2,
"lastExtentSize" : 124993536,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 144633440,
"indexSizes" : {
"_id_" : 53094944,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 91538496
},
"ok" : 1
},
"shard0001" : {
"ns" : "flights.flight",
"count" : 1413657,
"size" : 339277680,
"avgObjSize" : 240,
"storageSize" : 460894208,
"numExtents" : 15,
"nindexes" : 2,
"lastExtentSize" : 124993536,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 143112704,
"indexSizes" : {
"_id_" : 50405040,"departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1" : 92707664
},
"ok" : 1
}
},
"ok" : 1
}

I now run queries from Java that look like this:

{
"departureAirport" : "BSL",
"arrivalAirport" : "SMF",
"departureDate" : {
"$gte" : 1402617600000,
"$lte" : 1403136000000
},
"flightClass" : "C",
"$or" : [
{ "availability" : { "$gte" : "3"}},
{ "availability" : "*"}
] ,
"fareType" : "OW"
}

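The departure date should fall within a one-week range, and the availability should be greater than or equal to the requested number, or "*".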

My question is what I can do to improve performance. When I query the database with 50 connections per host, I only get about 1,000 ops/sec, but I need around 3,000 to 5,000 ops/sec.
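As a rough back-of-the-envelope check on those numbers, Little's law (concurrency = throughput × latency) gives the average time each operation is taking. This is only a sketch: it assumes a single client host, that all 50 connections are continuously busy, and that each connection runs one operation at a time. The class name `ThroughputEstimate` is made up for illustration.

```java
final class ThroughputEstimate {
    // Little's law rearranged: latency = concurrency / throughput.
    // Assumes every connection is busy the whole time (an assumption, not a measurement).
    static double avgLatencyMillis(int busyConnections, double opsPerSecond) {
        return busyConnections * 1000.0 / opsPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(avgLatencyMillis(50, 1000)); // 50.0 ms per op today
        System.out.println(avgLatencyMillis(50, 3000)); // roughly 16.7 ms needed
        System.out.println(avgLatencyMillis(50, 5000)); // 10.0 ms needed
    }
}
```

Under those assumptions, each query currently takes about 50 ms on average and would have to drop to roughly 10 to 17 ms to hit the target with the same number of connections.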

When I run the query in the shell, the cursor looks fine:

"cursor" : "BtreeCursor departureAirport_1_arrivalAirport_1_departureDate_1_flightClass_1_availability_1_fareType_1"

If I forgot anything, please let me know. Thanks in advance.

Best answer

The fact that a BtreeCursor is used does not by itself mean the query is fine. The output of explain would help identify the problem.

I guess a key problem is the order of your query parameters:

// equality, good
"departureAirport" : "BSL",
// equality, good
"arrivalAirport" : "SMF",
// range, bad because index based range queries should be near the end
// of contiguous index-based equality checks
"departureDate" : {
"$gte" : 1402617600000,
"$lte" : 1403136000000
},
// what is this, and how many possible values does it have? Seems to be
// a low selectivity index -> remove from index and move to end
"flightClass" : "C",
// costly $or, one op. is a range query, the other one equality...
// Simply set 'availability' to a magic number instead. That's
// ugly, but optimizations are ugly and it's unlikely we see planes with
// over e.g. 900,000 seats in the next couple of decades...
"$or" : [
{ "availability" : { "$gte" : "3"}},
{ "availability" : "*"}
] ,
// again, looks like low selectivity to me. Since it's already at the end,
// that's ok. I'd try to remove it from the index, however.
"fareType" : "OW"

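To illustrate the "magic number" idea from the comments above, here is a minimal Java sketch. It assumes availability is stored as a string, as the sample document and query suggest, in which case `"$gte" : "3"` would be a lexicographic comparison and `"10"` would sort before `"3"`. The class `AvailabilityEncoding`, its methods, and the sentinel value 900,000 are hypothetical, chosen only to mirror the comment in the query.

```java
final class AvailabilityEncoding {
    // Hypothetical sentinel standing in for the "*" marker (not from the original schema).
    static final int UNLIMITED = 900_000;

    // Convert the stored string form ("*", "3", "10", ...) into a number.
    static int encode(String availability) {
        return "*".equals(availability) ? UNLIMITED : Integer.parseInt(availability);
    }

    // With numeric values, the $or collapses into a single range test.
    static boolean hasSeats(int encodedAvailability, int requested) {
        return encodedAvailability >= requested;
    }

    public static void main(String[] args) {
        // Strings compare character by character, so "10" sorts before "3".
        System.out.println("10".compareTo("3") < 0);   // true
        System.out.println(hasSeats(encode("10"), 3)); // true
        System.out.println(hasSeats(encode("*"), 3));  // true
        System.out.println(hasSeats(encode("2"), 3));  // false
    }
}
```

Stored this way, the query condition would become a single `"availability" : { "$gte" : 3 }` on a numeric field, with no `$or` needed.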
You might want to change the index to something like

"departureAirport_1_arrivalAirport_1_departureDate_1_availability_1"

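and query in exactly that order. Append everything else after it, so that the scan only has to run over documents that already match all the other criteria in the index.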
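The index name quoted above is simply the ordered key list written out: MongoDB's default name joins each field and its direction with underscores. The following sketch, which uses a hypothetical helper `defaultIndexName` and assumes all keys are ascending, shows how the suggested four-field key order maps to that name.

```java
final class IndexNaming {
    // Build the default-style name for ascending keys: "<field>_1" joined by "_".
    // The order of the arguments is the order of the keys in the compound index.
    static String defaultIndexName(String... fields) {
        StringBuilder name = new StringBuilder();
        for (String field : fields) {
            if (name.length() > 0) {
                name.append('_');
            }
            name.append(field).append("_1");
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // Prints departureAirport_1_arrivalAirport_1_departureDate_1_availability_1
        System.out.println(defaultIndexName(
                "departureAirport", "arrivalAirport", "departureDate", "availability"));
    }
}
```

The point of the exercise is that key order is part of the index definition, so the fields left out here (flightClass, fareType) are filtered only after the index has narrowed the candidates.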

I'm assuming that flightClass and fareType have low selectivity. If that isn't the case, this won't be the best solution.

Regarding java - Poor MongoDB read performance, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23823386/
