I'm new to Python and to coding in general, so apologies in advance for any silly questions. My program needs to split an existing log file into several *.csv files (run1.csv, run2.csv, ...) based on the keyword "MYLOG". Whenever the keyword appears, it should start copying the two required columns into a new file, until the keyword appears again. When it's done, there should be as many csv files as there are occurrences of the keyword.
---
53.2436 EXP MYLOG: START RUN specs/run03_block_order.csv
53.2589 EXP TextStim: autoDraw = None
53.2589 EXP TextStim: autoDraw = None
55.2257 DATA Keypress: t
57.2412 DATA Keypress: t
59.2406 DATA Keypress: t
61.2400 DATA Keypress: t
63.2393 DATA Keypress: t
...
89.2314 EXP MYLOG: START BLOCK scene [specs/run03_block01.csv]
89.2336 EXP Imported specs/run03_block01.csv as conditions
89.2339 EXP Created sequence: sequential, trialTypes=9
...
---
[Edit]: The output for each file (run*.csv) should look like this:
onset type
53.2436 EXP
53.2589 EXP
53.2589 EXP
55.2257 DATA
57.2412 DATA
59.2406 DATA
61.2400 DATA
...
---
The program creates as many run*.csv files as needed, but I cannot get the required columns stored in the new files. All I end up with are empty csv files. If I change the counter variable condition to == 1, it just creates one big file containing the desired columns.
Thanks again!
import csv

QUERY = 'MYLOG'

with open('localizer.log', 'rt') as log_input:
    i = 0
    for line in log_input:
        if QUERY in line:
            i = i + 1
            with open('run' + str(i) + '.csv', 'w') as output:
                reader = csv.reader(log_input, delimiter=' ')
                writer = csv.writer(output)
                content_column_A = [0]
                content_column_B = [1]
                for row in reader:
                    content_A = list(row[j] for j in content_column_A)
                    content_B = list(row[k] for k in content_column_B)
                    writer.writerow(content_A)
                    writer.writerow(content_B)
Looking at the code, a few things are possibly wrong:
- The csv reader should take a file handle, not a single line.
- The reader delimiter should not be a single space character, since the actual delimiter in the log appears to be a variable number of consecutive spaces (see the sketch after this list).
- The looping logic seems a bit off, mixing up files, lines, and rows.
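To make the delimiter point concrete, here is a minimal sketch (the sample line just mimics a line from the log above): with delimiter=' ', csv.reader turns every extra space into an empty field, while str.split() with no argument collapses runs of whitespace.

import csv

sample = '53.2436  EXP  MYLOG: START RUN specs/run03_block_order.csv'

# csv.reader treats every single space as a separator,
# so consecutive spaces produce empty fields:
print(next(csv.reader([sample], delimiter=' ')))
# -> ['53.2436', '', 'EXP', '', 'MYLOG:', 'START', 'RUN', 'specs/run03_block_order.csv']

# str.split() with no argument collapses any run of whitespace:
print(sample.split()[:2])
# -> ['53.2436', 'EXP']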
You are probably looking for something like the code below (pending clarification in the question):
import csv

NEW_LOG_DELIMITER = 'MYLOG'

def write_buffer(_index, buffer):
    """
    This function takes an index and a buffer.
    The buffer is just an iterable of iterables (ex a list of lists)
    Each buffer item is a row of values.
    """
    filename = 'run{}.csv'.format(_index)
    with open(filename, 'w') as output:
        writer = csv.writer(output)
        writer.writerow(['onset', 'type'])  # adding the heading
        writer.writerows(buffer)

current_buffer = []
_index = 1

with open('localizer.log', 'rt') as log_input:
    for line in log_input:
        # will deal ok with multi-space as long as
        # you don't care about the last column
        fields = line.split()[:2]
        if NEW_LOG_DELIMITER not in line or not current_buffer:
            # If it's the first line (the current_buffer is empty)
            # or the line does NOT contain "MYLOG" then
            # collect it until it's time to write it to file.
            current_buffer.append(fields)
        else:
            write_buffer(_index, current_buffer)
            _index += 1
            current_buffer = [fields]  # EDIT: fixed bug, new buffer should not be empty

if current_buffer:
    # We are now out of the loop,
    # if there's an unwritten buffer then write it to file.
    write_buffer(_index, current_buffer)
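As a quick sanity check, assuming run1.csv was produced by the script above, you can read it back and confirm each row holds just the two columns:

import csv

# Hypothetical check: print every row of the first generated file
with open('run1.csv', newline='') as f:
    for row in csv.reader(f):
        print(row)

(The csv docs also recommend opening files passed to csv.writer with newline='' so the writer controls line endings itself; without it you may see blank lines between rows on Windows.)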