
python - Case/IfElse statement in Python from a CSV file

Reposted. Author: 行者123. Updated: 2023-12-01 03:11:29

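I have a csv file (original.csv) containing a unique ID column (uid) and columns that I want to evaluate. I then want to create a new file (result.csv) with the unmodified uid and new columns based on the evaluation.

My original file looks like this: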

uid,var01,var02,var03,var04,var05
1,2,3,2,3,1
2,2,2,2,2,1
3,,2,2,1,1
4,2,2,2,1,1
5,1,2,2,1,2
6,3,,2,3,2
7,3,,1,1,1
8,2,3,1,,3
9,3,1,,3,
10,,3,2,3,3

I want to do an evaluation with the same logic as this (written in SQL): case when var01 = 1 then 1 else 0 end as var01_new, case when var02 = 1 then 1 else 0 end as var02_new, ...

The result would look like this:

uid,var01_new,var02_new,var03_new,var04_new,var05_new
1,0,0,0,0,1
2,0,0,0,0,1
3,0,0,0,1,1
4,0,0,0,1,1
5,1,0,0,1,0
6,0,0,0,0,0
7,0,0,1,1,1
8,0,0,1,0,0
9,0,1,0,0,0
10,0,0,0,0,0

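Given the size of the real file (about 20M rows, 50+ columns), I would like to keep the solution in base Python rather than memory-limited packages such as Pandas and Numpy. I tried modifying this S/O question, but I couldn't get it to work for my use case.

I tried this code, but it did not work.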

>>> import csv
>>>
>>> sourcepath = "/Users/me/python_case_statement.csv"
>>> destpath = "/Users/me/python_case_statement_flat.csv"
>>>
>>> with open(sourcepath, "rb") as source, open(destpath, "wb") as dest:
...     reader = csv.reader(source, delimiter = ',', quotechar='"')
...     writer = csv.writer(dest, delimiter = ',', quotechar='"')
...     headers = reader.next()
...     writer.writerow(headers)
...     for rownum, row in enumerate(reader):
...         'uid' = 'uid'
...         if 'var01' == 1:
...             'var01_new' == 1
...         else:
...             'var01_new' == 0
...         row.append(result)
...         writer.writerow(row)
...
File "<stdin>", line 7
SyntaxError: can't assign to literal
>>>

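Best answer

Python is not a purely declarative language like SQL. It is procedural, so you have to describe the control flow, although it does have many declarative constructs. So: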

>>> import csv
>>> import io
>>> s = """uid,var01,var02,var03,var04,var05
... 1,2,3,2,3,1
... 2,2,2,2,2,1
... 3,,2,2,1,1
... 4,2,2,2,1,1
... 5,1,2,2,1,2
... 6,3,,2,3,2
... 7,3,,1,1,1
... 8,2,3,1,,3
... 9,3,1,,3,
... 10,,3,2,3,3"""
>>> reader = csv.reader(io.StringIO(s))
>>> result = io.StringIO()
>>> writer = csv.writer(result)

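The above just lets us pretend we are working with files by using streams (io.StringIO). You can do the same thing with real files opened in a with statement. Now, the crux of the problem: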

>>> header = next(reader)
>>> writer.writerow(header[:1] + ["{}_new".format(v) for v in header[1:]])  # uid keeps its name
55
>>> for row in reader:
...     new_row = [row[0]]  # uid stays the same
...     new_row.extend(1 if c == '1' else 0 for c in row[1:])
...     writer.writerow(new_row)
...
13
13
13
13
13
13
13
13
13
14
>>> print(result.getvalue())
uid,var01_new,var02_new,var03_new,var04_new,var05_new
1,0,0,0,0,1
2,0,0,0,0,1
3,0,0,0,1,1
4,0,0,0,1,1
5,1,0,0,1,0
6,0,0,0,0,0
7,0,0,1,1,1
8,0,0,1,0,0
9,0,1,0,0,0
10,0,0,0,0,0

>>>
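
The conditional expression in that loop plays the role of the SQL `case when ... then 1 else 0 end`. As a minimal sketch, it can be factored into a small helper. The name `case_flag` and its `match` parameter are hypothetical and not part of the original answer. Two details matter here: the csv module yields every cell as a string, so the comparison is against `'1'` rather than `1`, and blank cells fall through to 0, much as a NULL would in the SQL version.

```python
def case_flag(cell, match="1"):
    """Return 1 if the CSV cell equals `match`, else 0 (blank cells give 0)."""
    # csv.reader yields every cell as a string, so compare against "1", not 1.
    return 1 if cell == match else 0


# The uid 9 row from the sample data, as csv.reader would return it.
row = ["9", "3", "1", "", "3", ""]
new_row = [row[0]] + [case_flag(c) for c in row[1:]]
print(new_row)  # ['9', 0, 1, 0, 0, 0]
```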

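I used comprehension constructs and conditional expressions, which allow a nicer, more declarative way of transforming data. Without them, you can do the same thing with if-else statements and by building up each row step by step: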

>>> result = io.StringIO()
>>> reader = csv.reader(io.StringIO(s))
>>> writer = csv.writer(result)
>>> header = next(reader)
>>> new_header = [header[0]]  # uid keeps its name
>>> for name in header[1:]:
...     new_header.append("{}_new".format(name))
...
>>> writer.writerow(new_header)
55
>>> for row in reader:
...     new_row = [row[0]]  # uid stays the same
...     for c in row[1:]:
...         if c == '1':
...             new_row.append(1)
...         else:
...             new_row.append(0)
...     writer.writerow(new_row)
...
13
13
13
13
13
13
13
13
13
14
>>> print(result.getvalue())
uid,var01_new,var02_new,var03_new,var04_new,var05_new
1,0,0,0,0,1
2,0,0,0,0,1
3,0,0,0,1,1
4,0,0,0,1,1
5,1,0,0,1,0
6,0,0,0,0,0
7,0,0,1,1,1
8,0,0,1,0,0
9,0,1,0,0,0
10,0,0,0,0,0
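
For the real files rather than io.StringIO, the same loop can be wrapped around two `with`-opened file handles. The following is a sketch under assumptions, not the original answer's code: the function name `recode_csv` and its parameters are hypothetical, it assumes Python 3, and it assumes the first column is always uid. Because rows are read and written one at a time, memory use should stay roughly constant regardless of how many rows the file has.

```python
import csv


def recode_csv(src_path, dst_path, match="1"):
    """Stream src_path to dst_path, recoding every non-uid column to 0/1."""
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        # Keep the uid column name; suffix the evaluated columns with _new.
        writer.writerow(header[:1] + ["{}_new".format(name) for name in header[1:]])
        for row in reader:
            # Only the current row is held in memory.
            writer.writerow(row[:1] + [1 if c == match else 0 for c in row[1:]])
```

With the file names from the question, this would be called as `recode_csv("original.csv", "result.csv")`.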

Regarding python - Case/IfElse statement in Python from a CSV file, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42864955/
