
ruby - FasterCSV + trying to find unique items

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:35:28

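I have a CSV file, and for every set of rows that share the same value in column 1, I'm trying to find all the unique values in columns 2 onward and merge them into a single row of a new CSV file. I know that sounds confusing, so here is an example: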

Sample of the original file, foo.csv:

"Boom Lifts","Model Number","Manufacturer","Platform Height","Horizontal Outreach","Lift Capacity"
"Boom Lifts","Model Number","Platform Height","Horizontal Outreach","Up & Over Height","Platform Capacity"
"Boom Lifts","Model Number","Platform Height","Horizontal Outreach","Up & Over Height"
"Pusharound Lifts","Model Number","Manufacturer","Platform Height","Stowed Height"
"Scissor Lifts","Model Number","Manufacturer","Platform Height","Stowed Height","Overall Dimensions","Platform Extension"
"Scissor Lifts","Overall Dimensions","Platform Size","Platform Extension","Lift Capacity"

Desired result, bar.csv:

"Boom Lifts","Model Number","Manufacturer","Platform Height","Horizontal Outreach","Lift Capacity","Up & Over Height","Platform Capacity",,,
"Pusharound Lifts","Model Number","Manufacturer","Platform Height","Stowed Height"
"Scissor Lifts","Model Number","Manufacturer","Platform Height","Stowed Height","Overall Dimensions","Platform Size","Platform Extension","Lift Capacity"

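Every row has a different length, and it is a very large file (over 5k lines), so I'm completely stumped on how to do the matching and string manipulation. And yes, some of those rows have trailing commas where there are "empty cells". I have been using FasterCSV, so if there is a way to do this with it, that would be great.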

Any pointers? Preferably something that won't bring my MBP to a grinding halt?

Best answer

Assuming you can use FasterCSV to read the file into a two-dimensional array:

a = [
  ["Boom Lifts","Model Number","Manufacturer","Platform Height","Horizontal Outreach","Lift Capacity"],
  ["Boom Lifts","Model Number","Platform Height","Horizontal Outreach","Up & Over Height","Platform Capacity"],
  ["Boom Lifts","Model Number","Platform Height","Horizontal Outreach","Up & Over Height"],
  ["Pusharound Lifts","Model Number","Manufacturer","Platform Height","Stowed Height"],
  ["Scissor Lifts","Model Number","Manufacturer","Platform Height","Stowed Height","Overall Dimensions","Platform Extension"],
  ["Scissor Lifts","Overall Dimensions","Platform Size","Platform Extension","Lift Capacity"]
]

a.group_by {|e| e[0]}.map {|e| e.flatten.uniq}

gives you:

[
  ["Boom Lifts", "Model Number", "Manufacturer", "Platform Height", "Horizontal Outreach", "Lift Capacity", "Up & Over Height", "Platform Capacity"],
  ["Pusharound Lifts", "Model Number", "Manufacturer", "Platform Height", "Stowed Height"],
  ["Scissor Lifts", "Model Number", "Manufacturer", "Platform Height", "Stowed Height", "Overall Dimensions", "Platform Extension", "Platform Size", "Lift Capacity"]
]

It won't be instantaneous, but it shouldn't bring your MBP down.
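For completeness, here is a rough end-to-end sketch of the same grouping idea, from reading foo.csv to writing bar.csv. It assumes a Ruby version where FasterCSV ships as the standard csv library (1.9 and later), and the helper names merge_rows and merge_csv are made up for illustration. Unlike the one-liner above, it also drops the empty cells left by trailing commas, which is an assumption about what the output should look like.

```ruby
require "csv"

# Hypothetical helper: merge rows that share the same first column,
# keeping each later value once, in first-seen order.
def merge_rows(rows)
  rows.reject(&:empty?).group_by { |row| row[0] }.map do |key, group|
    # Drop the key column, flatten the group, discard empty cells, dedupe.
    values = group.flat_map { |row| row.drop(1) }
                  .reject { |cell| cell.nil? || cell.strip.empty? }
                  .uniq
    [key, *values]
  end
end

# Hypothetical wrapper: read the input file, merge, and write the result.
# force_quotes keeps every field quoted, as in the sample files.
def merge_csv(input_path, output_path)
  rows = CSV.read(input_path)
  CSV.open(output_path, "w", force_quotes: true) do |out|
    merge_rows(rows).each { |row| out << row }
  end
end
```

Calling merge_csv("foo.csv", "bar.csv") would then produce one row per distinct first-column value. CSV.read loads the whole file into memory, which should be fine for a file of a few thousand lines.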

For more on ruby - FasterCSV + trying to find unique items, see the similar question on Stack Overflow: https://stackoverflow.com/questions/8394958/
