python - CSV Python column/row averages

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 06:18:59

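I am trying to get the result shown just below without using pandas. Basically, I want the averages for the healthy patients and for the ill patients in a CSV file. The file has 303 patients and 14 columns (so the column indexes run from 0 to 13). Some data is missing and has been replaced with ?. Column 13 separates ill patients from healthy ones: any patient with a value greater than 0 is ill, and any patient with a value equal to or below 0 is healthy. I found a way to split them, but I don't know how to add up the rows to get the averages for healthy and ill patients separately. Any ideas on how to proceed would be great.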

 Please enter a training file name: train.csv
Total Lines Processed: 303
Total Healthy Count: 164
Total Ill Count: 139
Averages of Healthy Patients:
[52.59, 0.56, 2.79, 129.25, 242.64, 0.14, 0.84, 158.38, 0.14, 0.59, 1.41, 0.27, 3.77, 0.00]
Averages of Ill Patients:
[56.63, 0.82, 3.59, 134.57, 251.47, 0.16, 1.17, 139.26, 0.55, 1.57, 1.83, 1.13, 5.80, 2.04]
Seperation Values are:
[54.61, 0.69, 3.19, 131.91, 247.06, 0.15, 1.00, 148.82, 0.34, 1.08, 1.62, 0.70, 4.79, 1.02]
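A note on the last line of that output: the separation values appear to be the midpoint between the healthy and ill averages of each column (for example, (52.59 + 56.63) / 2 = 54.61). The question does not state this, so treat it as a reading of the numbers rather than a given. Under that assumption, a minimal sketch of the calculation, using the first four values of each list above and a hypothetical helper name `separation_values`:

```python
# First four averages copied from the sample output above
healthy_avg = [52.59, 0.56, 2.79, 129.25]
ill_avg = [56.63, 0.82, 3.59, 134.57]


def separation_values(healthy, ill):
    """Midpoint of the two averages, column by column (assumed definition)."""
    return [(h + i) / 2 for h, i in zip(healthy, ill)]


sep = separation_values(healthy_avg, ill_avg)
print(["%.2f" % v for v in sep])
# prints ['54.61', '0.69', '3.19', '131.91']
```

These match the first four separation values in the sample output.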

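My code still has a long way to go; I'm just looking for a simple way to get the patients' averages. My current approach only picks up column 13, but I need every column, as in the output above. Any help on how I should approach this would be great. Thanks a lot.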

import csv

# turn the csv file into a list of lists
with open('train.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    csv_data = list(reader)

# ill patients: column 13 is greater than 0
i_list = []
for row in csv_data:
    if (row and int(row[13]) > 0):
        i_list.append(int(row[13]))

# healthy patients: column 13 is 0 or below
H_list = []
for row in csv_data:
    if (row and int(row[13]) <= 0):
        H_list.append(int(row[13]))

Icount = len(i_list)
IPavg = sum(i_list)/len(i_list)
Hcount = len(H_list)
HPavg = sum(H_list)/len(H_list)

# count the lines in the file
with open("train.csv") as file:
    numline = len(file.readlines())

print(numline)
print("Total amount of healthy patients " + str(Hcount))
print("Total amount of ill patients " + str(Icount))
print("Averages of healthy patients " + str(HPavg))
print("Averages of ill patients " + str(IPavg))

Example of the data, as requested in the comments:

CSV file
A    B     C    D    N (and so on through column 13)
10   .50   ?    44   0
4    4.5   20   34   0
12   ?     33   23   3   (this one would be an ill patient)
11   3.2   32   33   0

Best answer

Here is a full walkthrough (in the comments). Read through them if you want to get comfortable with Python.

import csv

# Turn the csv file into a list of lists
# (newline='' is what the csv module recommends when opening files in Python 3)
with open('train.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    csv_data = list(reader)

# Create two lists to hold the patients' rows,
# and two more lists to collect the sum of each column.
# The sum lists must start at 0 so we can add to them directly.
iList = []
iList_sum = [0] * 13

hList = []
hList_sum = [0] * 13

# Use a single loop over the data instead of one loop per group
for row in csv_data:
    # If column 13 is greater than 0, the patient is ill
    if (row and int(row[13]) > 0):
        # This appends the whole line/row for storing,
        # which is what you want (instead of saving only one cell at a time)
        iList.append(row)

    # Otherwise column 13 is less than or equal to 0, so the patient is healthy.
    # Blank rows are skipped.
    elif row:
        hList.append(row)

# Use these to verify the data and make sure we collected the right thing
# print(iList)
# [['67', '1', '4', '160', '286', '0', '2', '108', '1', '1.5', '2', '3', '3', '2'], ['67', '1', '4', '120', '229', '0', '2', '129', '1', '2.6', '2', '2', '7', '1']]
# print(hList)
# [['63', '1', '1', '145', '233', '1', '2', '150', '0', '2.3', '3', '0', '6', '0'], ['37', '1', '3', '130', '250', '0', '0', '187', '0', '3.5', '3', '0', '3', '0']]

# We could use list comprehensions, but since this is a beginner task, let's go with the basics.
# Note: float() raises ValueError on a missing '?' cell, so those cells
# need to be skipped or replaced before this will run on the full file.

# Loop through all the rows of the ill patients
for ill_data in iList:

    # Loop through the cells of each row and add them to the column sums.
    # len(ill_data) - 1 leaves out the last column (index 13, the ill/healthy label).
    for i in range(0, len(ill_data) - 1):
        iList_sum[i] += float(ill_data[i])


# Now repeat the process for the healthy patients
for healthy_data in hList:

    # Loop through the cells of each row and add them to the column sums
    for i in range(0, len(healthy_data) - 1):
        hList_sum[i] += float(healthy_data[i])

# Using a list comprehension, go through each number in the ill sums
# (one per column) and divide it by the length of iList, i.e. the number of
# ill patients found in the csv file. So if there are 22 ill patients,
# len(iList) is 22. The brackets around the whole expression make the
# result a Python list.

ill_avg = [ill / len(iList) for ill in iList_sum]
hlt_avg = [hlt / len(hList) for hlt in hList_sum]

# Use ill_avg and hlt_avg however you need, e.g. print them
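The question mentions that missing cells are stored as ?, and calling float() on such a cell raises ValueError. One possible way to deal with that, sketched here under stated assumptions rather than taken from the answer, is to keep a separate count per column and skip the ? cells. The helper name `column_averages`, the choice to skip missing cells (instead of, say, counting them as 0), and the 0.0 returned for a column with no data are all assumptions; the sample rows are the four rows from the question's example.

```python
def column_averages(rows, missing="?"):
    """Average each column over rows, skipping cells equal to `missing`.

    Assumption: a column with no usable cells gets an average of 0.0.
    """
    if not rows:
        return []
    width = len(rows[0])
    sums = [0.0] * width
    counts = [0] * width
    for row in rows:
        for i, cell in enumerate(row[:width]):
            if cell.strip() == missing:
                continue  # missing value: leave this column's sum and count alone
            sums[i] += float(cell)
            counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# The four example rows from the question (last column is the ill/healthy label)
sample = [
    ["10", ".50", "?", "44", "0"],
    ["4", "4.5", "20", "34", "0"],
    ["12", "?", "33", "23", "3"],
    ["11", "3.2", "32", "33", "0"],
]
label = len(sample[0]) - 1
ill = [r for r in sample if int(r[label]) > 0]
healthy = [r for r in sample if int(r[label]) <= 0]

ill_avg = column_averages(ill)
healthy_avg = column_averages(healthy)
print(["%.2f" % v for v in healthy_avg])
print(["%.2f" % v for v in ill_avg])
```

Because each column is divided by its own count, a ? in one column does not distort the averages of the others. Whether the expected output in the question was produced this way is not stated, so the results may differ slightly from it.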

Regarding python - CSV Python column/row averages, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36581253/
