python - Processing data into one row per user with Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:10:06

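I am collecting user data from csv files generated by Google Analytics custom reports, and some user attributes have multiple values per user. My goal is to take the user data and put everything related to one user into a single row.

Example: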

users = [
    {'id': 1, 'name': 'user1'},
    {'id': 2, 'name': 'user2'},
    {'id': 3, 'name': 'user3'}
]

products_purchased = [
    {'id': 1, 'product_purchased': 'sardines'},
    {'id': 1, 'product_purchased': 'shoes'},
    {'id': 2, 'product_purchased': 'fish'},
    {'id': 2, 'product_purchased': 'chicken'},
    {'id': 3, 'product_purchased': 'eggs'},
    {'id': 3, 'product_purchased': 'chicken'},
]

I am trying to rearrange the data for machine learning, as follows:

users = [
    {'id': 1, 'name': 'user1', 'product_purchased-1': 'sardines',
     'product_purchased-2': 'shoes'},
    {'id': 2, 'name': 'user2', 'product_purchased-1': 'fish',
     'product_purchased-2': 'chicken'},
    {'id': 3, 'name': 'user3', 'product_purchased-1': 'eggs',
     'product_purchased-2': 'chicken'}
]
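
A minimal sketch of this rearrangement in plain Python, using the sample data above. The helper name `widen` is hypothetical, and the sketch assumes purchases should be numbered in the order they appear:

```python
from collections import defaultdict

users = [
    {'id': 1, 'name': 'user1'},
    {'id': 2, 'name': 'user2'},
    {'id': 3, 'name': 'user3'},
]

products_purchased = [
    {'id': 1, 'product_purchased': 'sardines'},
    {'id': 1, 'product_purchased': 'shoes'},
    {'id': 2, 'product_purchased': 'fish'},
    {'id': 2, 'product_purchased': 'chicken'},
    {'id': 3, 'product_purchased': 'eggs'},
    {'id': 3, 'product_purchased': 'chicken'},
]


def widen(users, purchases):
    """Return one dict per user with numbered product_purchased-N keys."""
    # Group the purchased products by user id, keeping their original order.
    by_user = defaultdict(list)
    for purchase in purchases:
        by_user[purchase['id']].append(purchase['product_purchased'])

    rows = []
    for user in users:
        row = dict(user)  # copy, so the input dicts are left untouched
        for n, product in enumerate(by_user.get(user['id'], []), start=1):
            row['product_purchased-%d' % n] = product
        rows.append(row)
    return rows


wide_users = widen(users, products_purchased)
```

Grouping first and numbering afterwards keeps the counter local to each user, so it does not depend on how the rows of different users are interleaved.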

Here is my Python code:

import csv


processed = []
columns = ['id', 'username', 'country','city','region','event-1','event-2','event-3',
'event-4', 'event-5','event-6','event-7','event-8','event-9','event-10','product-1',
'product-2','product-3','product-4','product-5','product-6','product-7','product-8',
'product-9','product-10','page-1','page-2','page-3','page-4','page-5','Sessions with Event',
'Total Events','Adding a product on to the cart (Goal 4 Conversion Rate)',
'Adding a product on to the cart (Goal 4 Completions)']
i = 0
#columns =[] 'ID', 'WP Username' , country, city, region, event action 1 (10 actions), products (multiple), 'Sessions with Event',Total Events, Adding a product on to the cart (Goal 4 Conversion Rate),
# Adding a product on to the cart (Goal 4 Completions)

# Completed the main dimensions of the GA data
# getting details per unique user
with open('users.csv') as users_data:
    user_dict = csv.DictReader(users_data)
    users = list(user_dict)

for user in users:
    processed.append({
        'id': user['ID'],
        'username': user['WordPress_Username'],
        'country': user['Country'],
        'city': user['City'],
        'region': user['Region']
    })

with open('events.csv') as events_data:
    events_dict = csv.DictReader(events_data)
    events = list(events_dict)

for p in processed:
    for event in events:
        i += 1
        if p['id'] == event['ID']:
            p['event-' + str(i)] = event['Event Action']
        else:
            i = 0

with open('products.csv') as products_data:
    products_dict = csv.DictReader(products_data)
    products = list(products_dict)

for p in processed:
    for product in products:
        i += 1
        if p['id'] == product['ID']:
            p['product-' + str(i)] = product['Product ID']
        else:
            i = 0

with open('pages.csv') as page_visited:
    pages_dict = csv.DictReader(page_visited)
    pages = list(pages_dict)

for p in processed:
    for page in pages:
        i += 1
        if p['id'] == page['ID']:
            p['page-' + str(i)] = page['Page']
        else:
            i = 0


for p in processed:
    for user in users:
        p['Sessions with Event'] = user['Sessions with Event']
        p['Total Events'] = user['Total Events']
        p['Adding a product on to the cart (Goal 4 Conversion Rate)'] = user['Adding a product on to the cart (Goal 4 Conversion Rate)']
        p['Adding a product on to the cart (Goal 4 Completions)'] = user['Adding a product on to the cart (Goal 4 Completions)']
    for event in events:
        if p['id'] == event['ID']:
            p['Sessions'] = event['Sessions']
    for page in pages:
        if p['id'] == page['ID']:
            p['id'] = page['Pages / Session']



try:
    with open('data.csv', 'w') as data:
        writer = csv.DictWriter(data, fieldnames=columns)
        writer.writeheader()
    for p in processed:
        writer.writerow(p)
except IOError:
    print("I/O error")

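I would like to know what is wrong with my code, or whether there is an alternative that achieves what I am looking for. I tried Google Data Studio earlier, and it seems there are options in GA that would let me do this.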

I plan to use scikit-learn clustering on the data later, which is why I am formatting the data to create a pandas DataFrame.

Out of curiosity: can a pivot table be used to create the data frame? Is this format suitable for scikit-learn?
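
As a rough illustration of the pivot idea, the sketch below uses pandas on the sample purchases from the example earlier. The helper column `n` and the variable names are assumptions made for illustration. Because scikit-learn's clustering estimators work on numeric arrays, a wide table of strings would usually need encoding first; the `crosstab` count table at the end is one possible numeric layout, not something taken from the question.

```python
import pandas as pd

purchases = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'product_purchased': ['sardines', 'shoes', 'fish',
                          'chicken', 'eggs', 'chicken'],
})

# Number each user's purchases 1, 2, ... in the order they appear.
purchases['n'] = purchases.groupby('id').cumcount() + 1

# One row per user, one column per purchase slot.
wide = purchases.pivot(index='id', columns='n', values='product_purchased')
wide.columns = ['product_purchased-%d' % n for n in wide.columns]

# Numeric alternative: how many times each user bought each product.
counts = pd.crosstab(purchases['id'], purchases['product_purchased'])
```

Here `wide` has the one-row-per-user shape described in the question, while `counts` holds integers only.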

Update: I fixed the bracket issue on the event['Sessions'] line. But now I get the following error:

  File "data_processing.py", line 87, in <module>
    writer.writerow(p)
  File "/usr/lib/python2.7/csv.py", line 152, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
ValueError: I/O operation on closed file
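
This error is what Python raises when a csv writer is used after its file has been closed, for example when the write loop runs after the `with` block has already ended. A minimal reproduction, assuming Python 3 and a temporary file (the path and the sample row are placeholders):

```python
import csv
import os
import tempfile

rows = [{'id': '1', 'username': 'user1'}]
path = os.path.join(tempfile.mkdtemp(), 'data.csv')

error_message = None
try:
    with open(path, 'w', newline='') as data:
        writer = csv.DictWriter(data, fieldnames=['id', 'username'])
        writer.writeheader()
    # The with block has ended here, so the file is already closed.
    for row in rows:
        writer.writerow(row)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Note that an `except IOError` clause does not catch this case, because `ValueError` is not an I/O exception class.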

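Best answer

I want to thank everyone who answered my question! Many thanks to @xbello and @G.Anderson. As they both pointed out, the problem was that I had a stray bracket on line 75 ("[event['Sessions']") and had not indented the final open statement correctly.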
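
A hedged sketch of the indentation fix, assuming Python 3: the write loop stays inside the `with` block so the file is still open. It also shows one possible way to number the event columns per user. The in-memory CSV text stands in for users.csv and events.csv, its rows are invented for illustration, and `restval=''` is an assumption about how missing columns should be filled.

```python
import csv
import io
import os
import tempfile
from collections import defaultdict

# Stand-ins for users.csv and events.csv; the rows are invented.
users_csv = io.StringIO("ID,WordPress_Username\n1,user1\n2,user2\n")
events_csv = io.StringIO(
    "ID,Event Action\n1,click\n2,view\n1,add_to_cart\n"
)

# Collect each user's event actions in the order they appear.
actions_by_user = defaultdict(list)
for event in csv.DictReader(events_csv):
    actions_by_user[event['ID']].append(event['Event Action'])

processed = []
for user in csv.DictReader(users_csv):
    row = {'id': user['ID'], 'username': user['WordPress_Username']}
    # Number the columns per user, independent of other users' rows.
    for n, action in enumerate(actions_by_user.get(user['ID'], []), start=1):
        row['event-%d' % n] = action
    processed.append(row)

columns = ['id', 'username', 'event-1', 'event-2']
path = os.path.join(tempfile.mkdtemp(), 'data.csv')

with open(path, 'w', newline='') as data:
    writer = csv.DictWriter(data, fieldnames=columns, restval='')
    writer.writeheader()
    for p in processed:
        writer.writerow(p)  # still inside the with block, file is open

# Read the file back to check what was written.
with open(path, newline='') as data:
    written = list(csv.reader(data))
```

Every key in a row must appear in `fieldnames`, otherwise `csv.DictWriter` raises a `ValueError` by default; passing `extrasaction='ignore'` is the documented way to drop extra keys instead.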

Regarding "python - Processing data into one row per user with Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55129157/
