
ruby - Parsing a log file in Ruby

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:29:53

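I need your help. I wrote a Ruby script that parses a log file, but I can't come up with a simple regular expression for logs like these. Please help. Here are sample lines from the log: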

2014-01-09T06:16:53.766841+00:00 heroku[router]: at=info method=POST path=/logs/save_personal_data host=services.pocketplaylab.com fwd="5.13.87.91" dyno=web.10 connect=1ms service=42ms status=200 bytes=16
2014-01-09T06:16:53.772938+00:00 heroku[router]: at=info method=POST path=/api/users/100002844291023 host=services.pocketplaylab.com fwd="46.195.178.244" dyno=web.6 connect=2ms service=43ms status=200 bytes=52
2014-01-09T06:16:53.765430+00:00 heroku[router]: at=info method=GET path=/api/users/100005936523817/get_friends_progress host=services.pocketplaylab.com fwd="5.13.87.91" dyno=web.11 connect=1ms service=47ms status=200 bytes=7498
2014-01-09T06:16:53.760472+00:00 heroku[router]: at=info method=POST path=/api/users/1770684197 host=services.pocketplaylab.com fwd="74.139.217.81" dyno=web.5 connect=1ms service=17ms status=200 bytes=681
2014-01-09T06:15:15.893505+00:00 heroku[router]: at=info method=GET path=/api/users/1686318645/get_friends_progress host=services.pocketplaylab.com fwd="1.125.42.139" dyno=web.3 connect=8ms service=90ms status=200 bytes=7534
2014-01-09T06:16:53.768188+00:00 heroku[router]: at=info method=GET path=/api/users/100005936523817/get_friends_score host=services.pocketplaylab.com fwd="5.13.87.91" dyno=web.13 connect=2ms service=46ms status=200 bytes=9355
2014-01-09T06:15:17.858874+00:00 heroku[router]: at=info method=POST path=/api/users/1145906359 host=services.pocketplaylab.com fwd="107.220.72.53" dyno=web.14 connect=2ms service=362ms status=200 bytes=52
2014-01-09T06:16:53.797975+00:00 heroku[router]: at=info method=GET path=/api/users/100000622081059/count_pending_messages host=services.pocketplaylab.com fwd="174.239.6.42" dyno=web.12 connect=1ms service=20ms status=200 bytes=33
2014-01-09T06:16:53.796869+00:00 heroku[router]: at=info method=GET path=/api/users/100004683190675/get_friends_score host=services.pocketplaylab.com fwd="99.138.1.64" dyno=web.12 connect=2ms service=55ms status=200 bytes=16881
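  • I need to get the following from the file:
    • the URL (for example: /api/users/1686318645/get_friends_progress, /api/users/1145906359);
    • the connect time and the service time (for example: connect=2ms service=55ms);
    • the dyno (for example: dyno=web.12, dyno=web.14).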

My code (updated):

#!/usr/bin/env ruby
require 'csv'

sample_logs = File.readlines "/home/railsroger/Playlab_test/sample.log"

# The output CSV path is taken from the last command-line argument.
file_name = ARGV.last
CSV.open(file_name, "wb") do |csv_line|
  csv_line << ['URL', 'Dyno', 'Connect', 'Service']
  sample_logs.each do |sample_log|
    # Each scan returns [[captured_value]], hence .first.first.
    path = sample_log.scan(/path=([^\s]+)/).first.first
    dyno = sample_log.scan(/dyno=([^\s]+)/).first.first
    connect = sample_log.scan(/connect=([^\s]+)/).first.first
    service = sample_log.scan(/service=([^\s]+)/).first.first

    csv_line << [path, dyno, connect, service]
  end
end

Thanks.

Best answer

OK, to write your regular expression you need to find all of these some_variable=some_data pairs.

Here is how you can do that:

/\S*=\S*/
\S* # match any non-whitespace-character, 0-n times
= # match the equal sign
\S* # match any non-whitespace-character, 0-n times
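
As a minimal sketch of what this pattern does on its own, the snippet below runs it over the first sample line from the question. The variable names `line` and `pairs` are only illustrative. Without capture groups, `String#scan` returns each whole match as a plain string.

```ruby
# First sample line from the question's log excerpt.
line = '2014-01-09T06:16:53.766841+00:00 heroku[router]: at=info method=POST path=/logs/save_personal_data host=services.pocketplaylab.com fwd="5.13.87.91" dyno=web.10 connect=1ms service=42ms status=200 bytes=16'

# No capture groups: scan yields one string per key=value token.
pairs = line.scan(/\S*=\S*/)

puts pairs.first(3).inspect
# ["at=info", "method=POST", "path=/logs/save_personal_data"]
```

The timestamp and the `heroku[router]:` prefix contain no equal sign, so they are skipped.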

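This matches the pairs. To extract the data you can use capture groups: wrap the parts you want to extract, here the variable name and the value, in parentheses (xxx).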

/(\S*)=(\S*)/  
(\S*) # capture the name
(\S*) # capture the value

So for each line of the log you can do:

line_of_log.scan(/(\S*)=(\S*)\s/)

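To see what is happening while you build a regular expression, I recommend always trying it in a tool such as https://regex101.com/; it really helps to understand what is going on.

The scan call returns an array like the following: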

[["at", "info"],
["method", "POST"],
["path", "/api/online/platforms/facebook_canvas/users/100002266342173/add_ticket"],
["host", "services.pocketplaylab.com"],
["fwd", "\"94.66.255.106\""],
["dyno", "web.12"],
["connect", "12ms"],
["service", "21ms"],
["status", "200"],
["bytes", "78"]]

Now you can iterate over the array and build some kind of object or hash to work with.

scanresult.inject({}) do |obj, pair|
  obj[pair[0].to_sym] = pair[1]
  obj
end
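
Putting the pieces together, a rough sketch of the whole flow might look like the following. The helper name `parse_line` is hypothetical, the pattern `/(\S+)=(\S+)/` is a variant chosen here so that lines without a trailing newline also work, and `CSV.generate` is used only to keep the example in memory; the question's `CSV.open(file_name, "wb")` accepts the same kind of block.

```ruby
require 'csv'

# Hypothetical helper: turns one router log line into a Hash keyed by symbol,
# using the same inject approach as the answer.
def parse_line(line)
  line.scan(/(\S+)=(\S+)/).inject({}) do |obj, pair|
    obj[pair[0].to_sym] = pair[1]
    obj
  end
end

# Two lines copied from the question's sample log.
sample_logs = [
  '2014-01-09T06:16:53.766841+00:00 heroku[router]: at=info method=POST path=/logs/save_personal_data host=services.pocketplaylab.com fwd="5.13.87.91" dyno=web.10 connect=1ms service=42ms status=200 bytes=16',
  '2014-01-09T06:15:17.858874+00:00 heroku[router]: at=info method=POST path=/api/users/1145906359 host=services.pocketplaylab.com fwd="107.220.72.53" dyno=web.14 connect=2ms service=362ms status=200 bytes=52'
]

# Build the CSV in memory; only the four requested fields are written.
csv_text = CSV.generate do |csv|
  csv << ['URL', 'Dyno', 'Connect', 'Service']
  sample_logs.each do |line|
    fields = parse_line(line)
    next if fields.empty? # skip lines that contain no key=value pairs
    csv << fields.values_at(:path, :dyno, :connect, :service)
  end
end

puts csv_text
```

If numeric times are needed, `fields[:connect].to_i` should give the millisecond count, since `String#to_i` stops reading at the first non-digit character.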

Regarding "ruby - Parsing a log file in Ruby", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50114590/
