python - Coursera class - Introduction to Data Science in Python, Assignment 1

Reposted. Author: 行者123. Updated: 2023-12-05 01:33:44

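I am taking this class on Coursera and ran into some problems with the first assignment. The task is basically to use regular expressions to extract certain values from a given file. The function should then output dictionaries containing those values: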

example_dict = {"host":"146.204.224.152",
                "user_name":"feest6811",
                "time":"21/Jun/2019:15:45:24 -0700",
                "request":"POST /incentivize HTTP/1.1"}

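This is just an excerpt of the file. For some reason the link does not work unless it is opened directly from Coursera. I apologize in advance for the formatting. One thing I must point out is that in some cases, as you can see in the first example, there is no user name; a "-" is used instead.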

159.253.153.40 - - [21/Jun/2019:15:46:10 -0700] "POST /e-business HTTP/1.0" 504 19845
136.195.158.6 - feeney9464 [21/Jun/2019:15:46:11 -0700] "HEAD /open-source/markets HTTP/2.0" 204 21149

This is what I have so far. However, the output is None. I think something is wrong with my pattern.

import re
def logs():
    with open("assets/logdata.txt", "r") as file:
        logdata = file.read()
    # YOUR CODE HERE
    pattern = """
    (?P<host>\w*)
    (\d+\.\d+.\d+.\d+\ )
    (?P<user_name>\w*)
    (\ -\ [a-z]+[0-9]+\ )
    (?P<time>\w*)
    (\[(.*?)\])
    (?P<request>\w*)
    (".*")
    """
    for item in re.finditer(pattern,logdata,re.VERBOSE):
        print(item.groupdict())

Best answer

You can use the following expression:

(?P<host>\d+(?:\.\d+){3}) # 1+ digits, then 3 occurrences of . followed by 1+ digits
\s+\S+\s+ # 1+ whitespaces, 1+ non-whitespaces, 1+ whitespaces
(?P<user_name>\S+)\s+\[ # Group "user_name": 1+ non-whitespaces, then 1+ whitespaces and [
(?P<time>[^\]\[]*)\]\s+ # Group "time": 0+ chars other than [ and ], then ] and 1+ whitespaces
"(?P<request>[^"]*)" # ", Group "request": 0+ non-" chars, "

See the regex demo. See the Python demo:

import re
logdata = r"""159.253.153.40 - - [21/Jun/2019:15:46:10 -0700] "POST /e-business HTTP/1.0" 504 19845
136.195.158.6 - feeney9464 [21/Jun/2019:15:46:11 -0700] "HEAD /open-source/markets HTTP/2.0" 204 21149"""
pattern = r'''
(?P<host>\d+(?:\.\d+){3}) # 1+ digits, then 3 occurrences of . followed by 1+ digits
\s+\S+\s+ # 1+ whitespaces, 1+ non-whitespaces, 1+ whitespaces
(?P<user_name>\S+)\s+\[ # Group "user_name": 1+ non-whitespaces, then 1+ whitespaces and [
(?P<time>[^\]\[]*)\]\s+ # Group "time": 0+ chars other than [ and ], then ] and 1+ whitespaces
"(?P<request>[^"]*)" # ", Group "request": 0+ non-" chars, "
'''
for item in re.finditer(pattern,logdata,re.VERBOSE):
    print(item.groupdict())

Output:

{'host': '159.253.153.40', 'user_name': '-', 'time': '21/Jun/2019:15:46:10 -0700', 'request': 'POST /e-business HTTP/1.0'}
{'host': '136.195.158.6', 'user_name': 'feeney9464', 'time': '21/Jun/2019:15:46:11 -0700', 'request': 'HEAD /open-source/markets HTTP/2.0'}
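
Since the assignment asks for a function that outputs the dictionaries, and the attempt in the question only prints them (so the function itself returns None), the pattern could be wrapped roughly as follows. This is only a sketch: the helper name parse_logs and the path parameter are hypothetical, the file path is the one from the question, and returning a list of dictionaries is an assumption about what the grader expects.

```python
import re

# Same pattern as in the answer, compiled once with re.VERBOSE.
LOG_PATTERN = re.compile(r'''
    (?P<host>\d+(?:\.\d+){3})      # dotted numeric host
    \s+\S+\s+                      # skip the second field
    (?P<user_name>\S+)\s+\[        # user name, or - when it is missing
    (?P<time>[^\]\[]*)\]\s+        # timestamp between [ and ]
    "(?P<request>[^"]*)"           # request between double quotes
''', re.VERBOSE)


def parse_logs(text):
    """Hypothetical helper: return one dict per matching log line."""
    return [match.groupdict() for match in LOG_PATTERN.finditer(text)]


def logs(path="assets/logdata.txt"):
    # The default path is taken from the question; adjust it if needed.
    with open(path, "r") as file:
        return parse_logs(file.read())
```

Keeping the parsing in parse_logs separate from the file reading makes it possible to try the pattern on a few pasted lines before running it on the whole file.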

Regarding python - Coursera class - Introduction to Data Science in Python, Assignment 1, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/64425164/
