
python - Automatically extract data from a csv file into specific matrix positions

Reposted. Author: 行者123. Updated: 2023-12-05 04:24:59

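I have a fairly large csv file that I need a program to read, and then enter its data into the correct positions of a zero matrix. A sample block of the csv (file also attached):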

Sector,Service,Data_Point
Bio,Electricity NonEmitting,0
NEElectricity,Electricity NonEmitting,0.5
RE,Electricity NonEmitting,0
Electricity,Electricity NonEmitting,-1
Bio,Electricity Bio,0.8
NEElectricity,Electricity Bio,0
RE,Electricity Bio,0.04
Electricity,Electricity Bio,-2
Bio,Electricity BECCS,0.84
NEElectricity,Electricity BECCS,0
RE,Electricity BECCS,0.4
Electricity,Electricity BECCS,-1
Bio,Ammonia HB,0
Electricity,Ammonia HB,2.8
RE,Ammonia HB,0.06
Ammonia,Ammonia HB,-1
Bio,Biofuel TBD,0.30
Electricity,Biofuel TBD,0.02
RE,Biofuel TBD,0.012
Electricity,CarUse BEV,0.5
RE,CarUse BEV,0
CarUse,CarUse BEV,-1
Hydrogen,CarUse HFCEV,0.2
RE,CarUse HFCEV,0
CarUse,CarUse HFCEV,-1
Bio,NET DAC,0
NEElectricity,NET DAC,10.5
RE,NET DAC,-1

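The catch is that I need it to sort the data by the "Sector" and "Service" columns, i.e. Sector = row and Service = column in the matrix. So if the program reads Sector as Bio (row 1) and Service as Electricity NonEmitting (column 1), it enters the corresponding number from Data_Point (here '0') into row 1, column 1 of the matrix. Or, if it reads Sector as NEElectricity (row 2) with Service again Electricity NonEmitting (column 1), the corresponding Data_Point '0.5' goes into row 2, column 1 of the matrix.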

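Below is the code I wrote, which automatically generates a zero matrix sized by the number of unique elements in the Sector and Service columns. I just can't figure out how to sort the values into the correct matrix positions, so any help is much appreciated.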

import numpy as np
import pandas as pd

# Read the file once instead of once per column.
df = pd.read_csv('Coeff_Sample.csv')

# Sector = rows, Service = columns, as described above.
matrix_row = df["Sector"].nunique()
matrix_column = df["Service"].nunique()

coeff_matrix = np.zeros((matrix_row, matrix_column))

Best regards

Best answer

[Image: Matrix]

Is this the kind of matrix you want to create?

I created this matrix without pandas, using the following source code:

import csv
import numpy as np

rows = []
columns = []
all_rows = []

# Read every record, collecting the Service (row) and Sector (column) labels.
with open('test.csv', 'r', newline='') as read_obj:
    csv_dict_reader = csv.DictReader(read_obj)
    for row in csv_dict_reader:
        columns.append(row['Sector'])
        rows.append(row['Service'])
        all_rows.append(row)

# Sets drop duplicates; note that their iteration order is arbitrary.
rows_set = set(rows)
columns_set = set(columns)
# One extra row and one extra column hold the labels.
coeff_matrix = np.full((len(rows_set)+1, len(columns_set)+1), 0).tolist()

row_list = list(rows_set)
columns_list = list(columns_set)

# The first row holds the Sector labels.
for idx, x in enumerate(columns_list):
    coeff_matrix[0][idx+1] = x

# The first column holds the Service labels.
for idy, y in enumerate(row_list):
    coeff_matrix[idy+1][0] = y

# Find each record's row and column by label, then store its value (as a string).
for e in all_rows:
    sector = e['Sector']
    service = e['Service']
    value = e['Data_Point']
    for row_idx, row in enumerate(coeff_matrix):
        if row[0] == service:
            row_index = row_idx
    for column_idx, column in enumerate(coeff_matrix[0]):
        if column == sector:
            column_index = column_idx

    coeff_matrix[row_index][column_index] = value

np_coeff_matrix = np.asarray(coeff_matrix)

It has a lot of loops in it, though. There may be a faster way to do this with pandas or list/np.array functions.
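One loop-free possibility is a pandas pivot. The following is only a minimal sketch, not part of the original answer. It assumes each (Sector, Service) pair appears at most once (`DataFrame.pivot` raises a `ValueError` on duplicates), that rows and columns should follow the order of first appearance in the file, and that missing combinations should be 0. The `SAMPLE` string and the `build_coeff_table` helper are hypothetical names introduced for illustration. The sample is a subset of the question's data inlined so the snippet runs on its own; with the real file you would pass `pd.read_csv('Coeff_Sample.csv')` instead.

```python
import io

import pandas as pd

# Subset of the question's sample data, inlined for a self-contained sketch.
SAMPLE = """Sector,Service,Data_Point
Bio,Electricity NonEmitting,0
NEElectricity,Electricity NonEmitting,0.5
RE,Electricity NonEmitting,0
Electricity,Electricity NonEmitting,-1
Bio,Electricity Bio,0.8
NEElectricity,Electricity Bio,0
RE,Electricity Bio,0.04
Electricity,Electricity Bio,-2
Electricity,Ammonia HB,2.8
Ammonia,Ammonia HB,-1
"""


def build_coeff_table(df):
    """Hypothetical helper: Sector labels become rows, Service labels columns."""
    # Keep labels in order of first appearance rather than sorted order.
    sectors = pd.unique(df["Sector"])
    services = pd.unique(df["Service"])
    return (
        df.pivot(index="Sector", columns="Service", values="Data_Point")
        .reindex(index=sectors, columns=services)
        .fillna(0.0)  # combinations absent from the file stay at zero
    )


df = pd.read_csv(io.StringIO(SAMPLE))
table = build_coeff_table(df)      # labelled DataFrame
coeff_matrix = table.to_numpy()    # plain float matrix, labels dropped
```

Here `table` keeps the Sector and Service labels, so a value can be looked up by name (for example `table.loc["NEElectricity", "Electricity NonEmitting"]`), while `coeff_matrix` is the bare NumPy array. If the file may contain duplicate pairs, `pivot_table` with an explicit aggregation function would be one alternative.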

Regarding python - automatically extracting data from a csv file into specific matrix positions, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73387241/
