
ruby - Algorithm for generating a heatmap in Ruby

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 06:12:32

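I want to build a heatmap (a table, similar to this) for a ticketing system. I receive all ticket details from the database as JSON. Below is an example. The real data has 1000+ records.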

{"ticketCount": 6, 
"tickets":
[
{"creationTimeMs": 1506061704724,
"expirationTimeMs": 1506083304724,
"queue": "low"},
{"creationTimeMs": 1506127874782,
"expirationTimeMs": 1506149474782,
"queue": "low"},
{"creationTimeMs": 1506283760321,
"expirationTimeMs": 1506283760322,
"queue": "high"},
{"creationTimeMs": 1506236363281,
"expirationTimeMs": 1506257963281,
"queue": "high"},
{"creationTimeMs": 1506283655948,
"expirationTimeMs": 1506283667938,
"queue": "low"},
{"creationTimeMs": 1506283781894,
"expirationTimeMs": 1506284781894,
"queue": "medium"}
]
}

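I want a table with the queue names (not fixed) as rows and the time remaining (currentTime - expirationTime) as columns. I want 5 columns, for tickets expiring in <10 minutes, 10-30 minutes, 30 minutes-1 hour, 1-5 hours, and >5 hours.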

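I know how to brute-force this by iterating over the JSON again and again. I would like to know whether there is a better algorithm, and what Ruby offers to make this simple.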

Best answer

Code

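require 'json'

# Counts tickets per [queue, bucket index] pair in a single pass.
# range_mins holds the lower bound of each bucket, in milliseconds.
def cross_tab(json, range_mins)
  JSON.parse(json)["tickets"].each_with_object(Hash.new(0)) do |g, h|
    diff = g["etime"] - g["ctime"]
    # rindex finds the last (largest) lower bound that is <= diff
    h[[g["queue"], range_mins.rindex { |mn| mn <= diff }]] += 1
  end
end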

Example

json = '{"ticketCount": 6, 
"tickets": [
{"ctime": 1506061704724, "etime": 1506083304724, "queue": "low"},
{"ctime": 1506127874782, "etime": 1506149474782, "queue": "low"},
{"ctime": 1506283760321, "etime": 1506283760322, "queue": "high"},
{"ctime": 1506236363281, "etime": 1506257963281, "queue": "high"},
{"ctime": 1506283655948, "etime": 1506283667938, "queue": "low"},
{"ctime": 1506283781894, "etime": 1506284781894, "queue": "medium"}
]
}'

range_mins = [0, 10, 30, 60, 300].map { |n| 60000 * n }
#=> [0, 600_000, 1_800_000, 3_600_000, 18_000_000]

h = cross_tab(json, range_mins)
#=> {["low", 4]=>2, ["high", 0]=>1, ["high", 4]=>1, ["low", 0]=>1, ["medium", 1]=>1}

h[["high", 4]]
#=> 1
h[["low", 3]]
#=> 0

The second result is obtained because h has a default value of 0 and has no key ["low", 3].

We can now construct the contents of the cross-tab (or contingency table) as follows.

row_map = { 0=>"low", 1=>"medium", 2=>"high" }

tbl = Array.new(row_map.size) { |i|
  Array.new(range_mins.size) { |j| h[[row_map[i], j]] } }
#=> [[1, 0, 0, 0, 2],
#    [0, 1, 0, 0, 0],
#    [1, 0, 0, 0, 1]]

The row labels are taken from row_map and the column labels from range_mins.
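
As a minimal sketch of how those labels could be attached, the following pairs each row of tbl with its queue name and formats a plain-text table. The col_labels array and the format_table helper are illustrative names that do not appear in the answer; tbl and row_map are repeated so that the snippet runs on its own.

```ruby
# Values repeated from the example above so the snippet is self-contained.
row_map = { 0 => "low", 1 => "medium", 2 => "high" }
tbl = [[1, 0, 0, 0, 2],
       [0, 1, 0, 0, 0],
       [1, 0, 0, 0, 1]]

# Hypothetical column labels, one per entry of range_mins.
col_labels = ["<10m", "10-30m", "30m-1h", "1-5h", ">5h"]

# Hypothetical helper: returns the table as an array of formatted lines,
# a header line followed by one line per queue.
def format_table(tbl, row_map, col_labels)
  header = "queue".ljust(8) + col_labels.map { |c| c.rjust(8) }.join
  rows = tbl.each_with_index.map do |counts, i|
    row_map[i].ljust(8) + counts.map { |n| n.to_s.rjust(8) }.join
  end
  [header, *rows]
end

lines = format_table(tbl, row_map, col_labels)
puts lines
```

Because the row order comes from row_map, reordering or trimming row_map (and building tbl from it) changes which queues appear and in what order.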

We could also compute row_map from json:

JSON.parse(json)["tickets"].map { |h| h["queue"] }.uniq.
  map.with_index { |queue, i| [i, queue] }.to_h
#=> {0=>"low", 1=>"high", 2=>"medium"}

However, this would not let us specify the order of the table's rows, or produce a table containing only certain values of "queue".

Explanation

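The method uses the form of the class method Hash::new that takes an argument (here 0), which becomes the hash's default value. All this means is that if h = Hash.new(0) and h does not have a key k, h[k] returns the default value. (The hash is not changed.)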
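
A small sketch of that behaviour, using an arbitrary key :missing chosen only for illustration:

```ruby
h = Hash.new(0)            # 0 is the default value for missing keys

value  = h[:missing]       # reading a missing key returns the default...
stored = h.key?(:missing)  # ...but does not add the key to the hash

puts value     # 0
puts stored    # false
puts h.empty?  # true
```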

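A hash defined this way is sometimes called a counting hash. It is commonly used (as it is here) with the expression h[k] += 1. When Ruby sees this, the first thing she does is expand it to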

h[k] = h[k] + 1

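If h does not have a key k, h[k] on the right side of the equality (the method Hash#[]) returns the default value 0. Each subsequent time this expression is executed for the same key k, h[k] on the right returns the current value for k (that is, the default value does not apply). (Note that h[k] on the left of the equals sign is the method Hash#[]=, which has nothing to do with default values.)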
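
To illustrate, here is a short sketch that counts the queue names from the sample data with the expanded form. The queues array is written out by hand for this illustration.

```ruby
# Queue names of the six sample tickets, in order.
queues = ["low", "low", "high", "high", "low", "medium"]

counts = Hash.new(0)
queues.each do |q|
  # Same as counts[q] += 1: Hash#[] supplies 0 the first time q is seen,
  # then Hash#[]= stores the incremented value under q.
  counts[q] = counts[q] + 1
end

p counts
```

After the loop, counts["low"] is 3, counts["high"] is 2 and counts["medium"] is 1.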

The steps are as follows.

h = JSON.parse(json)
#=> {"ticketCount"=>6,
# "tickets"=>[
# {"ctime"=>1506061704724, "etime"=>1506083304724, "queue"=>"low"},
# {"ctime"=>1506127874782, "etime"=>1506149474782, "queue"=>"low"},
# {"ctime"=>1506283760321, "etime"=>1506283760322, "queue"=>"high"},
# {"ctime"=>1506236363281, "etime"=>1506257963281, "queue"=>"high"},
# {"ctime"=>1506283655948, "etime"=>1506283667938, "queue"=>"low"},
# {"ctime"=>1506283781894, "etime"=>1506284781894, "queue"=>"medium"}
# ]
# }
a = h["tickets"]
#=> [{"ctime"=>1506061704724, "etime"=>1506083304724, "queue"=>"low"},
# {"ctime"=>1506127874782, "etime"=>1506149474782, "queue"=>"low"},
# {"ctime"=>1506283760321, "etime"=>1506283760322, "queue"=>"high"},
# {"ctime"=>1506236363281, "etime"=>1506257963281, "queue"=>"high"},
# {"ctime"=>1506283655948, "etime"=>1506283667938, "queue"=>"low"},
# {"ctime"=>1506283781894, "etime"=>1506284781894, "queue"=>"medium"}]
e = a.each_with_object(Hash.new(0))
#=> #<Enumerator: [
# {"ctime"=>1506061704724, "etime"=>1506083304724, "queue"=>"low"},
# {"ctime"=>1506127874782, "etime"=>1506149474782, "queue"=>"low"},
# ...
# {"ctime"=>1506283781894, "etime"=>1506284781894, "queue"=>"medium"}
# ]:each_with_object({})>

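The first element is generated by the enumerator and passed to the block; the block variables are set equal to that value and the block calculation is performed.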

g, h = e.next
# => [{"ctime"=>1506061704724, "etime"=>1506083304724, "queue"=>"low"}, {}]
g #=> {"ctime"=>1506061704724, "etime"=>1506083304724, "queue"=>"low"}
h #=> {}
f = g["queue"]
#=> "low"
diff = g["etime"]-g["ctime"]
#=> 1506083304724 - 1506061704724 => 21600000
j = range_mins.rindex { |mn| mn <= diff }
#=> 4

This shows that range_mins[4] #=> 18_000_000 is the largest value of range_mins that is less than or equal to diff (21_600_000). Continuing,
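
The bucket lookup can be tried in isolation. The sketch below wraps it in a helper named bucket_index, a hypothetical name introduced only for this illustration. It also shows that Array#rindex returns nil when no element satisfies the block, so a negative diff would be counted under the key [queue, nil] by cross_tab.

```ruby
range_mins = [0, 10, 30, 60, 300].map { |n| 60_000 * n }

# Hypothetical helper wrapping the lookup used in cross_tab:
# index of the last lower bound that is <= diff, or nil if there is none.
def bucket_index(range_mins, diff)
  range_mins.rindex { |mn| mn <= diff }
end

bucket_index(range_mins, 1)           #=> 0   (under 10 minutes)
bucket_index(range_mins, 600_000)     #=> 1   (exactly 10 minutes)
bucket_index(range_mins, 1_000_000)   #=> 1
bucket_index(range_mins, 21_600_000)  #=> 4   (6 hours)
bucket_index(range_mins, -5)          #=> nil (no lower bound matches)
```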

k = [f, j]
#=> ["low", 4]
h[k] += 1
#=> 1
h #=> {["low", 4]=>1}

The next value is then passed to the block by the enumerator e.

g, h = e.next
#=> [{"ctime"=>1506127874782, "etime"=>1506149474782, "queue"=>"low"},
# {["low", 4]=>1}]
g #=> {"ctime"=>1506127874782, "etime"=>1506149474782, "queue"=>"low"}
h #=> {["low", 4]=>1}
f = g["queue"]
#=> "low"
diff = g["etime"]-g["ctime"]
#=> 1506149474782 - 1506127874782 => 21600000
j = range_mins.rindex { |mn| mn <= diff }
#=> 4
k = [f, j]
#=> ["low", 4]
h[k] += 1
#=> 2
h #=> {["low", 4]=>2}

The remaining steps are similar.
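
Note that cross_tab buckets each ticket by etime - ctime, that is, by its total lifetime. If the columns should instead reflect the time left until expiry, as the question describes, one possible adaptation is sketched below. The method name remaining_cross_tab, the now_ms parameter and the decision to skip already-expired tickets are all assumptions made for this sketch; it reads the original key name expirationTimeMs from the question's data.

```ruby
require 'json'

# Hypothetical variant of cross_tab: buckets by time left until expiry,
# measured from now_ms (milliseconds since the epoch).
def remaining_cross_tab(json, range_mins, now_ms)
  JSON.parse(json)["tickets"].each_with_object(Hash.new(0)) do |g, h|
    remaining = g["expirationTimeMs"] - now_ms
    next if remaining < 0   # assumption: ignore tickets that already expired
    h[[g["queue"], range_mins.rindex { |mn| mn <= remaining }]] += 1
  end
end

# Two of the tickets from the question's sample data.
json = '{"tickets": [
  {"creationTimeMs": 1506061704724, "expirationTimeMs": 1506083304724, "queue": "low"},
  {"creationTimeMs": 1506283781894, "expirationTimeMs": 1506284781894, "queue": "medium"}
]}'

range_mins = [0, 10, 30, 60, 300].map { |n| 60_000 * n }

# Fixed "current time" so the result is reproducible; in real use this
# could be (Time.now.to_f * 1000).to_i.
now_ms = 1506283781894

h = remaining_cross_tab(json, range_mins, now_ms)
h[["medium", 1]]  #=> 1 (expires in 1_000_000 ms, i.e. the 10-30 minute bucket)
h[["low", 4]]     #=> 0 (that ticket had already expired at now_ms)
```

The table itself can then be built from h exactly as before.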

A similar question about this topic (ruby - Algorithm for generating a heatmap in Ruby) can be found on Stack Overflow: https://stackoverflow.com/questions/46516924/
