python - Determining the variance of lengths

Reposted. Author: 行者123. Last updated: 2023-11-30 23:33:18

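I am trying to determine the variance of the lengths of parameter values, and to print that variance after each group of matching parameter/value combinations.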

For example, for date=2007-04-14 and date=2007-08-19, the variance for date would be 0. For id_eve=479989, id_eve=47 and id_eve=479, the value for id_eve would be 2.88.
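As a quick check of those two figures, here is a minimal Python 3 sketch. It assumes the question means population variance (dividing by n rather than n - 1), which is what reproduces the numbers quoted; the variable names are only illustrative.

```python
from statistics import pvariance

# Example values taken from the question.
date_values = ["2007-04-14", "2007-08-19"]
id_eve_values = ["479989", "47", "479"]

# Population variance of the value lengths (divides by n, not n - 1).
date_var = pvariance([len(v) for v in date_values])      # lengths 10 and 10
id_eve_var = pvariance([len(v) for v in id_eve_values])  # lengths 6, 2 and 3

print(date_var)
print(round(id_eve_var, 2))
```

The id_eve lengths have mean 11/3 and population variance 26/9, roughly 2.89, so the 2.88 in the question looks like the same value truncated rather than rounded.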

Following on from "Group values with common domain and page values", we have a set of URLs that are parsed to give the parameters/values for each group of URLs.

Sample data set:

www.domain.com/page?id_eve=479989&adm=no
www.domain.com/page?id_eve=47&adm=yes
www.domain.com/page?id_eve=479
domain.com/cal?view=month
domain.com/cal?view=day
ww2.domain.com/cal?date=2007-04-14
ww2.domain.com/cal?date=2007-08-19
www.domain.edu/some/folder/image.php?l=adm&y=5&id=2&page=http%3A//support.domain.com/downloads/index.asp&unique=12345
blog.news.org/news/calendar.php?view=day&date=2011-12-10
www.domain.edu/some/folder/image.php?l=adm&y=5&id=2&page=http%3A//.domain.com/downloads/index.asp&unique=12345
blog.news.org/news/calendar.php?view=month&date=2011-12-10

It is parsed by the following Python 2 code:

from collections import defaultdict
from urllib import quote
from urlparse import parse_qsl, urlparse

urls = defaultdict(list)
with open('links.txt') as f:
    for url in f:
        parsed_url = urlparse(url.strip())
        params = parse_qsl(parsed_url.query, keep_blank_values=True)
        for key, value in params:
            urls[parsed_url.path].append("%s=%s" % (key, quote(value)))

# printing results
for url, params in urls.iteritems():
    print url
    for param in params:
        print param

which gives:

ww2.domain.com/cal
date=2007-04-14
date=2007-08-19
www.domain.edu/some/folder/image.php
l=adm
y=5
id=2
page=http%3A//support.domain.com/downloads/index.asp
unique=12345
l=adm
y=5
id=2
page=http%3A//.domain.com/downloads/index.asp
unique=12345
domain.com/cal
view=month
view=day
www.domain.com/page
id_eve=479989
adm=no
id_eve=47
adm=yes
id_eve=479
blog.news.org/news/calendar.php
view=day
date=2011-12-10
view=month
date=2011-12-10

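The additional part I need is to print, for each parameter/value group, the variance of the parameter value lengths, so that parameters can be matched up across the similar URLs grouped in the output above (I hope that reads clearly).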

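  • Group the parameters within each group of URLs
  • Calculate the length of each parameter value
  • Determine the variance of those lengths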

So the desired output is:

ww2.domain.com/cal
date=2007-04-14
date=2007-08-19
0
www.domain.edu/some/folder/image.php
l=adm
l=adm
0
y=5
y=5
0
id=2
id=2
0
page=http%3A//support.domain.com/downloads/index.asp
0
unique=12345
0
page=http%3A//.domain.com/downloads/index.asp
unique=12345
0
domain.com/cal
0
view=month
view=day
1
www.domain.com/page
id_eve=479989
id_eve=47
id_eve=479
2.88
adm=no
adm=yes
0.25
blog.news.org/news/calendar.php
view=day
view=month
1
date=2011-12-10
date=2011-12-10
0

Best answer

from collections import defaultdict
from urllib import quote
from urlparse import parse_qsl, urlparse

We need to be able to calculate the variance:

def variance(values):
    mean = sum(values) / float(len(values))
    return sum((elem - mean)**2 for elem in values) / float(len(values))

We want to group by key, so we add another layer to the defaultdict instead of storing "%s=%s" strings:

urls = defaultdict(lambda: defaultdict(list))
with open('links.txt') as f:
    for url in f:
        parsed_url = urlparse(url.strip())
        params = parse_qsl(parsed_url.query, keep_blank_values=True)
        for key, value in params:
            urls[parsed_url.path][key].append(quote(value))
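For readers on Python 3, the same grouping step might look like the sketch below. It is not part of the original answer: it assumes the `urllib.parse` equivalents of the Python 2 imports, inlines a few of the sample URLs instead of reading links.txt, and the names `sample_urls` and `group_params` are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, quote, urlparse

# A few of the sample URLs from the question, inlined instead of links.txt.
sample_urls = [
    "www.domain.com/page?id_eve=479989&adm=no",
    "www.domain.com/page?id_eve=47&adm=yes",
    "www.domain.com/page?id_eve=479",
    "domain.com/cal?view=month",
    "domain.com/cal?view=day",
]


def group_params(urls):
    """Return {path: {key: [quoted values, in input order]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for url in urls:
        parsed = urlparse(url.strip())
        # Without a scheme, urlparse leaves the host in .path, which is
        # why the outer keys look like "www.domain.com/page".
        for key, value in parse_qsl(parsed.query, keep_blank_values=True):
            grouped[parsed.path][key].append(quote(value))
    return grouped


grouped = group_params(sample_urls)
print(dict(grouped["www.domain.com/page"]))
```

The inner defaultdict(list) means each path collects one list of values per parameter name, so the lengths of all id_eve values for a page end up side by side.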

Then we can go through and print everything:

for domain, keys in urls.items():
    print domain
    for key, values in keys.items():
        for value in values:
            print "%s=%s" % (key, value)

        if len(values) > 1:
            print variance(map(len, values))
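One caveat when porting this loop to Python 3: map() returns an iterator there, so passing map(len, values) to a function that calls len() on its argument raises a TypeError. The sketch below is one possible Python 3 version, not the answer's own code. It builds a list of lengths first and uses statistics.pvariance in place of the hand-written function; `length_variance`, `report` and `example` are hypothetical names, and the input dictionary is hand-built for illustration.

```python
from statistics import pvariance


def length_variance(values):
    """Population variance of the string lengths in `values`."""
    # Build a real list: in Python 3, map() returns an iterator with no len().
    return pvariance([len(v) for v in values])


def report(grouped):
    """Return the output lines for a {path: {key: [values]}} mapping."""
    lines = []
    for path, keys in grouped.items():
        lines.append(path)
        for key, values in keys.items():
            lines.extend("%s=%s" % (key, v) for v in values)
            if len(values) > 1:  # same rule as the answer: skip single values
                lines.append(str(length_variance(values)))
    return lines


# Hand-built input standing in for the grouped dictionary.
example = {
    "domain.com/cal": {"view": ["month", "day"]},
    "www.domain.com/page": {"adm": ["no", "yes"]},
}
print("\n".join(report(example)))
```

Returning the lines instead of printing them directly keeps the logic easy to test; as in the answer, parameters with a single value get no variance line.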

Regarding "python - Determining the variance of lengths", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/18934978/
