
javascript - Deduplicating an array of objects with similar datetimes

Reposted. Author: 行者123. Updated: 2023-12-05 04:27:18

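I'm trying to deduplicate an array of JSON objects whose content and author are identical but whose timestamps differ slightly (i.e. within 1 second of each other). I'd like to record how many duplicates were collapsed in a new field called duplicates. For example, in the following array, entries 2, 3, and 5 are the same message and should be collapsed into one: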

myObject = [
{content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2'},
{content: 'content2', date: '1980-08-01 12:12:41.100', author: 'Person2'},
{content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:41.400', author: 'Person2'},
{content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2'},
]

should be converted to:

deduped = [
{content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1', duplicates: 0},
{content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2', duplicates: 2},
{content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1', duplicates: 0},
{content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2', duplicates: 0},
]

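The part I'm having trouble with is the datetime. Sorting by datetime and then reducing is error-prone when a non-duplicate message falls between the duplicates. Comparing the datetime strings is also error-prone, because two messages can be very close together yet appear to be 1 second apart depending on which side of a second boundary each one falls.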
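
To make the second-boundary problem concrete, here is a minimal sketch using two timestamps from the sample data. The helper name toMs is hypothetical (it is not part of the question), and it assumes every date string has the fixed form 'YYYY-MM-DD hh:mm:ss.xxx' and can be treated as UTC.

```javascript
// Hypothetical helper: parse 'YYYY-MM-DD hh:mm:ss.xxx' as UTC milliseconds.
const toMs = s => {
  const [Y, M, D, h, m, sec, ms] = s.split(/[-.\s:]/).map(Number)
  return Date.UTC(Y, M - 1, D, h, m, sec, ms) // Date.UTC months are 0-based
}

const a = '1980-08-01 12:12:40.900'
const b = '1980-08-01 12:12:41.100'

// Truncated to whole seconds, the two strings look like different moments...
const sameSecond = a.slice(0, 19) === b.slice(0, 19)
// ...although the numeric difference is only a fraction of a second.
const deltaMs = Math.abs(toMs(b) - toMs(a))

console.log(sameSecond, deltaMs) // false 200
```

Comparing numeric millisecond values instead of strings sidesteps the boundary effect.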

Using lodash _.uniqWith, I can deduplicate on the combination of an actual time delta with identical content and author, but I'm missing the duplicates field...

const dedupedButNoCount = _.uniqWith(myObject, (item1, item2) =>
  item1.content == item2.content &&
  item1.author == item2.author &&
  (new Date(item1.date).getTime() - new Date(item2.date).getTime()) < 500
)
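
One thing worth noting about the comparator above: the time difference is signed, so whenever the first timestamp is earlier than the second, the subtraction is negative and passes the `< 500` check no matter how far apart the two are. The sketch below isolates that effect with two hypothetical helper names (withinSigned, withinAbs) that do not appear in the question; it does not depend on lodash.

```javascript
// Signed difference, as in the comparator above.
const withinSigned = (t1, t2, windowMs) => (t1 - t2) < windowMs
// Absolute difference, which is symmetric in its two arguments.
const withinAbs = (t1, t2, windowMs) => Math.abs(t1 - t2) < windowMs

const early = Date.UTC(1980, 7, 1, 12, 12, 40, 0)
const late  = Date.UTC(1980, 7, 1, 12, 13, 40, 0) // one minute later

console.log(withinSigned(early, late, 500)) // true, because -60000 < 500
console.log(withinAbs(early, late, 500))    // false
```

Wrapping the difference in Math.abs makes the check independent of argument order.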

Any pointers on how to deduplicate an array of objects whose datetimes are similar but not identical?

Best answer

I got this working, but I used a kind of...

const
  getTimeMs = YMDhmsx => // parse 'YYYY-MM-DD hh:mm:ss.xxx' as UTC (time zone offset 0)
    {
    const [Y,M,D,h,m,s,x] = YMDhmsx.split(/[-.\s:]/).map(Number)
    return Date.UTC(Y, M - 1, D, h, m, s, x) // UTC time value in ms; Date.UTC months are 0-based
    }
, myObject = [
{content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2'},
{content: 'content2', date: '1980-08-01 12:12:41.100', author: 'Person2'},
{content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1'},
{content: 'content2', date: '1980-08-01 12:12:41.400', author: 'Person2'},
{content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2'},
]

let result =
  myObject
  .sort( (a,b) =>                      // sorts myObject in place: content, then author, then date
    a.content.localeCompare(b.content) ||
    a.author.localeCompare(b.author) ||
    a.date.localeCompare(b.date)       // fixed-width date strings sort chronologically
    )
  .reduce( (r,el,i,{[i-1]:prev}) =>    // prev = previous element of the sorted array
    {
    let msTime = getTimeMs(el.date)

    if (el.content === prev?.content
    && el.author === prev?.author
    && (msTime - r.msTime) <= 1000 )   // at most 1 second after the previous entry
      r.current.duplicates++;
    else
      {
      r.current = {...el, duplicates:0 }
      r.result.push( r.current )
      }
    r.msTime = msTime
    return r
    }
  , {msTime:0, current:null, result:[] })
  .result;

console.log ( 'result:\n' + JSON.stringify( result ).replaceAll('},{','}\n,{') )
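
The answer above sorts by content first, so the output comes back in content order instead of the original message order. As a possible alternative (not the answerer's method), here is a single-pass sketch that keeps the input order by tracking the most recent message per content/author pair in a Map. The names toMs, dedupeKeepingOrder, and windowMs are hypothetical, and the sketch assumes that messages sharing a key arrive in roughly chronological order and that each message is compared with the most recent one of the same key.

```javascript
// Hypothetical helper: parse 'YYYY-MM-DD hh:mm:ss.xxx' as UTC milliseconds.
const toMs = s => {
  const [Y, M, D, h, m, sec, ms] = s.split(/[-.\s:]/).map(Number)
  return Date.UTC(Y, M - 1, D, h, m, sec, ms)
}

// Collapse messages with the same content and author whose timestamp is
// within windowMs of the most recent message seen for that pair.
function dedupeKeepingOrder(messages, windowMs = 1000) {
  const lastByKey = new Map() // key -> { entry, ms }
  const out = []
  for (const msg of messages) {
    const key = JSON.stringify([msg.content, msg.author])
    const ms = toMs(msg.date)
    const last = lastByKey.get(key)
    if (last && Math.abs(ms - last.ms) <= windowMs) {
      last.entry.duplicates++ // count it against the kept entry
      last.ms = ms            // slide the window forward
    } else {
      const entry = { ...msg, duplicates: 0 }
      out.push(entry)
      lastByKey.set(key, { entry, ms })
    }
  }
  return out
}

const sample = [
  {content: 'content1', date: '1980-08-01 12:12:40.000', author: 'Person1'},
  {content: 'content2', date: '1980-08-01 12:12:40.900', author: 'Person2'},
  {content: 'content2', date: '1980-08-01 12:12:41.100', author: 'Person2'},
  {content: 'content3', date: '1980-08-01 12:12:41.000', author: 'Person1'},
  {content: 'content2', date: '1980-08-01 12:12:41.400', author: 'Person2'},
  {content: 'content4', date: '1980-08-01 12:12:45.100', author: 'Person2'},
]
const dedupedInOrder = dedupeKeepingOrder(sample)
console.log(dedupedInOrder.map(e => `${e.content}:${e.duplicates}`).join(' '))
```

This version does not mutate the input array, and a non-duplicate message that falls between two duplicates does not break the grouping, because each content/author pair is tracked separately.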

For more on javascript - deduplicating an array of objects with similar datetimes, see the similar question on Stack Overflow: https://stackoverflow.com/questions/72859905/
