
I am getting a performance issue with this aggregation pipeline: it takes almost 20 seconds to return ~30,000 records




The aggregation pipeline takes almost 20 to 25 seconds to execute and return a response. Despite creating indexes, it still takes 20 to 22 seconds. I suspect these lookups take most of the time, but why, and how can I solve this issue?
Note: it fetches roughly 30,400 records. My MongoDB version is 5.0.20…


I have created the following indexes:



createdAt: 1
businessUnitId: 1
commodityId: 1
commodityVariantId: 1
createdBy: 1
isDeleted: 1
isSLCMQcInspection: 1
commodityDetail.CIDNumber_text
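For reference, a minimal sketch of how these indexes might be declared with Mongoose (the schema variable name QcInspectionSchema is assumed; the last entry is a text index, matching the _text suffix above):

// Sketch only: single-field indexes matching the list above, declared on a
// hypothetical Mongoose schema. Adjust the schema variable to your own.
QcInspectionSchema.index({ createdAt: 1 });
QcInspectionSchema.index({ businessUnitId: 1 });
QcInspectionSchema.index({ commodityId: 1 });
QcInspectionSchema.index({ commodityVariantId: 1 });
QcInspectionSchema.index({ createdBy: 1 });
QcInspectionSchema.index({ isDeleted: 1 });
QcInspectionSchema.index({ isSLCMQcInspection: 1 });
QcInspectionSchema.index({ 'commodityDetail.CIDNumber': 'text' });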

Aggregation pipeline:



async getQCResultSLcm(
  filters: GetAllQcInspectionWithFilterSlcmDto,
  businessUnitId: string,
) {
  // Each filter starts as an empty object so it is a no-op inside $and.
  let startDateQuery = {};
  let endDateQuery = {};
  let commoditySearchQuery = {};
  let variantSearchQuery = {};
  const statusQuery = {};
  let cidNumberSearchQuery = {};
  let lotNoSearchQuery = {};
  const businessUnitFilterQuery = {};
  let generalSearchQuery = {};

  // Lower bound on createdAt, parsed from an MM-DD-YYYY string in IST.
  if (filters.startDate) {
    startDateQuery = {
      $expr: {
        $gte: [
          '$createdAt',
          {
            $dateFromString: {
              dateString: filters.startDate,
              timezone: '+05:30',
              format: '%m-%d-%Y',
            },
          },
        ],
      },
    };
  }

  // Exclusive upper bound on createdAt: the end date plus one day.
  if (filters.endDate) {
    endDateQuery = {
      $expr: {
        $lt: [
          '$createdAt',
          {
            $dateAdd: {
              startDate: {
                $dateFromString: {
                  dateString: filters.endDate,
                  timezone: '+05:30',
                  format: '%m-%d-%Y',
                },
              },
              unit: 'day',
              amount: 1,
            },
          },
        ],
      },
    };
  }

  // if (filters.startDate) {
  //   startDateQuery = { createdAt: { $gte: DateTime.fromFormat(filters.startDate, "MM-dd-yyyy").setZone("+05:30").toJSDate() } }
  // }
  // if (filters.endDate) {
  //   endDateQuery = { createdAt: { $lt: DateTime.fromFormat(filters.startDate, "MM-dd-yyyy").plus({ days: 1 }).setZone("+05:30").toJSDate() } }
  // }

  // Case-insensitive substring searches; several of these target fields
  // that only exist after the $lookup stages below.
  if (filters.searchByCommodity) {
    commoditySearchQuery = {
      'commodityData.name': {
        $regex: `${filters.searchByCommodity}`,
        $options: 'i',
      },
    };
  }
  if (filters.searchByVariant) {
    variantSearchQuery = {
      'commodityVariantData.name': {
        $regex: `${filters.searchByVariant}`,
        $options: 'i',
      },
    };
  }

  if (filters.searchByStatus) {
    statusQuery['status'] = filters.searchByStatus;
  }

  if (filters.searchByCIDNumber) {
    cidNumberSearchQuery = {
      $or: [
        {
          'commodityDetail.CIDNumber': {
            $regex: `${filters.searchByCIDNumber}`,
            $options: 'i',
          },
        },
        // { 'businessUnitData.name': { $regex: `${filters.searchByCIDNumber}`, $options: 'i' } },
        // { 'commodityDetail.LOTNumber': { $regex: `${filters.searchByLotNo}`, $options: 'i' } },
        // { 'qcId': { $regex: `${filters.searchByCIDNumber}`, $options: 'i' } },
      ],
    };
  }
  if (filters.searchByLotNo) {
    lotNoSearchQuery = {
      $or: [
        {
          'commodityDetail.LOTNumber': {
            $regex: `${filters.searchByLotNo}`,
            $options: 'i',
          },
        },
      ],
    };
  }
  if (filters.searchByGeneralSearch) {
    generalSearchQuery = {
      $or: [
        {
          qcId: {
            $regex: `${filters.searchByGeneralSearch}`,
            $options: 'i',
          },
        },
        {
          'businessUnitData.name': {
            $regex: `${filters.searchByGeneralSearch}`,
            $options: 'i',
          },
        },
        {
          'userData.name': {
            $regex: `${filters.searchByGeneralSearch}`,
            $options: 'i',
          },
        },
      ],
    };
  }

  if (businessUnitId) {
    businessUnitFilterQuery['businessUnitId'] = new mongoose.Types.ObjectId(
      businessUnitId,
    );
  }
  // const startTime = Date.now();
  const result = await this.qcInspectionModel.aggregate([
    // Initial filter on the base collection.
    {
      $match: {
        $and: [
          startDateQuery,
          endDateQuery,
          statusQuery,
          businessUnitFilterQuery,
          { isDeleted: false },
          { isSLCMQcInspection: true },
        ],
      },
    },
    // Join the commodity name.
    {
      $lookup: {
        from: 'mastercommodities',
        localField: 'commodityId',
        pipeline: [{ $project: { name: 1 } }],
        foreignField: '_id',
        as: 'commodityData',
      },
    },
    { $unwind: '$commodityData' },
    // Join the commodity variant name.
    {
      $lookup: {
        from: 'commodityvariants',
        localField: 'commodityVariantId',
        pipeline: [{ $project: { name: 1 } }],
        foreignField: '_id',
        as: 'commodityVariantData',
      },
    },
    { $unwind: '$commodityVariantData' },
    // Join the business unit, with a nested lookup for the client name.
    {
      $lookup: {
        from: 'businessunits',
        localField: 'businessUnitId',
        pipeline: [
          {
            $lookup: {
              from: 'businesses',
              localField: 'businessId',
              foreignField: '_id',
              as: 'businessClientName',
            },
          },
          { $unwind: '$businessClientName' },
          {
            $project: {
              name: 1,
              businessClientName: '$businessClientName.displayName',
            },
          },
        ],
        foreignField: '_id',
        as: 'businessUnitData',
      },
    },
    {
      $unwind: {
        path: '$businessUnitData',
        preserveNullAndEmptyArrays: true,
      },
    },
    // Join the creating user and build a display name.
    {
      $lookup: {
        from: 'users',
        localField: 'createdBy',
        foreignField: '_id',
        as: 'userData',
        pipeline: [
          {
            $project: {
              firstName: 1,
              lastName: 1,
              _id: 0,
              name: { $concat: ['$firstName', ' ', '$lastName'] },
            },
          },
        ],
      },
    },
    {
      $unwind: {
        path: '$userData',
        preserveNullAndEmptyArrays: true,
      },
    },
    // Second filter: these predicates reference looked-up fields, so they
    // can only run after the $lookup stages.
    {
      $match: {
        $and: [
          commoditySearchQuery,
          variantSearchQuery,
          generalSearchQuery,
          cidNumberSearchQuery,
          lotNoSearchQuery,
        ],
      },
    },
    {
      $sort: {
        createdAt:
          filters.sortOrder && filters.sortOrder != SortOrder.Ascending
            ? SortOrder.Descending
            : SortOrder.Ascending,
      },
    },
    {
      $project: {
        _id: 1,
        status: 1,
        commodityData: 1,
        commodityDetail: 1,
        commodityVariantData: 1,
        createdAt: 1,
        qcId: 1,
        sampleName: 1,
        businessUnitData: 1,
        userData: 1,
        location: 1,
        middlewareStatus: 1,
      },
    },
    // Paginate and count in a single pass.
    {
      $facet: {
        records: [
          { $skip: (filters.pageNumber - 1) * filters.count },
          { $limit: filters.count * 1 }, // * 1 coerces a string count to a number
        ],
        total: [{ $count: 'count' }],
      },
    },
  ]);
  // const endTime = Date.now();
  // const executionTimeMs = endTime - startTime;
  // console.log('Execution time:', executionTimeMs, 'ms');
  return result;
}


explain() result:


{
  explainVersion: '1',
  stages: [
    {
      '$cursor': [Object],
      nReturned: 30104,
      executionTimeMillisEstimate: 4925
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 8556
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 11726
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 18457
    },
    {
      '$lookup': [Object],
      totalDocsExamined: 30103,
      totalKeysExamined: 30103,
      collectionScans: 0,
      indexesUsed: [Array],
      nReturned: 30103,
      executionTimeMillisEstimate: 23009
    },
    {
      '$sort': [Object],
      totalDataSizeSortedBytesEstimate: 55157785,
      usedDisk: false,
      nReturned: 30103,
      executionTimeMillisEstimate: 23014
    },
    {
      '$project': [Object],
      nReturned: 30103,
      executionTimeMillisEstimate: 23154
    },
    {
      '$facet': [Object],
      nReturned: 1,
      executionTimeMillisEstimate: 23232
    }
  ],
  serverInfo: {
    host: '********',
    port: ****,
    version: '5.0.20',
    gitVersion: '2cd626d8148120319d7dca5824e760fe220cb0de'
  },
  serverParameters: {
    internalQueryFacetBufferSizeBytes: 104857600,
    internalQueryFacetMaxOutputDocSizeBytes: 104857600,
    internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
    internalDocumentSourceGroupMaxMemoryBytes: 104857600,
    internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
    internalQueryProhibitBlockingMergeOnMongoS: 0,
    internalQueryMaxAddToSetBytes: 104857600,
    internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600
  },
  command: {
    aggregate: 'qcinspections',
    pipeline: [
      [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object]
    ],
    cursor: {},
    '$db': '*******-test'
  },
  ok: 1,
  '$clusterTime': {
    clusterTime: new Timestamp({ t: 1694197055, i: 1 }),
    signature: {}
  },
  operationTime: new Timestamp({ t: 1694197055, i: 1 })
}
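(For anyone reproducing this: the stage-by-stage output above is what Mongoose returns from something like the sketch below, where pipeline holds the same stages as in getQCResultSLcm; 'executionStats' is one of the standard explain verbosity levels.)

// Sketch: obtaining per-stage execution estimates for the pipeline above.
const explainOutput = await this.qcInspectionModel
  .aggregate(pipeline) // same stages as in getQCResultSLcm
  .explain('executionStats');
console.log(JSON.stringify(explainOutput, null, 2));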


Any suggestion or solution is welcome... please share. Thank you.


Comments

If you are talking about 30,400 records, even with indexes, this kind of aggregation is going to take a long time... 20-25 seconds is normal, since it has to read through all of those documents. You should try to reduce the number of records that need to be searched in your first $match.
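One concrete way to do that: plain range predicates on createdAt are straightforwardly index-eligible, whereas the $expr + $dateFromString form can be harder for the planner to optimize. The commented-out luxon code in the question is essentially this already; a hedged version follows (note the original commented-out end-date line reads filters.startDate where it presumably should read filters.endDate, fixed below; the zone option mirrors the '+05:30' intent and is an assumption):

import { DateTime } from 'luxon';

// Sketch: compute the date bounds in application code so the $match becomes
// a plain { createdAt: { $gte, $lt } } range that can use the createdAt index.
if (filters.startDate) {
  startDateQuery = {
    createdAt: {
      $gte: DateTime.fromFormat(filters.startDate, 'MM-dd-yyyy', {
        zone: 'UTC+5:30', // parse in IST, mirroring timezone: '+05:30'
      }).toJSDate(),
    },
  };
}
if (filters.endDate) {
  endDateQuery = {
    createdAt: {
      $lt: DateTime.fromFormat(filters.endDate, 'MM-dd-yyyy', {
        zone: 'UTC+5:30',
      })
        .plus({ days: 1 }) // exclusive upper bound: end date + 1 day
        .toJSDate(),
    },
  };
}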

No, this is not normal; it impacts the user experience...

Look at your explain output: you are doing a lot of work on 30k records. Look at each $lookup and its executionTimeMillisEstimate (the estimates are cumulative across stages). The first ~5 seconds come from your $match; the third lookup adds roughly another 6-7 seconds and the next one about 5. There are simply too many records to examine. You need to narrow down your search before hitting these expensive lookups.
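A hedged sketch of that idea: when none of the active filters reference looked-up fields (commodityData.name, userData.name, ...), the page boundary can be applied right after the first $match, so the four $lookups run over at most one page of documents instead of ~30k. Stage contents are abbreviated and the fast-path condition is illustrative:

// Sketch: paginate before the expensive lookups when no search term targets
// a field produced by a $lookup. The post-lookup $match is then unnecessary.
const pipeline = [
  {
    $match: {
      $and: [
        startDateQuery, endDateQuery, statusQuery, businessUnitFilterQuery,
        { isDeleted: false }, { isSLCMQcInspection: true },
      ],
    },
  },
  { $sort: { createdAt: -1 } }, // or 1, per filters.sortOrder
  { $skip: (filters.pageNumber - 1) * filters.count },
  { $limit: filters.count },
  // ...the same $lookup / $unwind / $project stages as before, now running
  // over at most filters.count documents rather than the full result set.
  // (The total count would then come from a separate countDocuments query.)
];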

Agree with what @SomeoneSpecial said: this many lookups does not look like good schema design for a NoSQL database. You may want to look into denormalizing your schema, since some of your lookups apparently just fetch a single field from another collection. This could be a starting point for learning more.
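For illustration, a sketch of the shape denormalization could take here (all field names are assumptions): the handful of display values that the pipeline currently assembles through four $lookups are copied onto the inspection document at write time, so the read path needs no joins at all.

import { Types } from 'mongoose';

// Sketch: a denormalized qcinspection document. Each *Name field is written
// when the inspection is created (and refreshed if the source ever changes).
interface QcInspectionDenormalized {
  commodityId: Types.ObjectId;
  commodityName: string;          // was commodityData.name via $lookup
  commodityVariantId: Types.ObjectId;
  commodityVariantName: string;   // was commodityVariantData.name
  businessUnitId: Types.ObjectId;
  businessUnitName: string;       // was businessUnitData.name
  businessClientName: string;     // was businessClientName.displayName
  createdBy: Types.ObjectId;
  createdByName: string;          // was userData.name ($concat of first/last)
  createdAt: Date;
  // ...remaining fields (status, qcId, commodityDetail, ...) unchanged
}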

Yes, you may be correct, but a schema redesign for an ongoing or completed project can be very time-consuming. In a large-scale project, those collections may already be used separately in many places. So, is there any other way? I am facing the same issue.
