I'm new to Python and to coding in general, so apologies in advance for any silly questions. My program needs to split an existing log file into several *.csv files (run1.csv, run2.csv, ...) based on the keyword "MYLOG". Whenever the keyword appears, it should start copying the two required columns into a new file, until the keyword appears again. When it's done, there should be as many csv files as there are occurrences of the keyword.
---
53.2436 EXP MYLOG: START RUN specs/run03_block_order.csv
53.2589 EXP TextStim: autoDraw = None
53.2589 EXP TextStim: autoDraw = None
55.2257 DATA Keypress: t
57.2412 DATA Keypress: t
59.2406 DATA Keypress: t
61.2400 DATA Keypress: t
63.2393 DATA Keypress: t
...
89.2314 EXP MYLOG: START BLOCK scene [specs/run03_block01.csv]
89.2336 EXP Imported specs/run03_block01.csv as conditions
89.2339 EXP Created sequence: sequential, trialTypes=9
...
---
[Edit]: The output for each file (run*.csv) should look like this:
onset type
53.2436 EXP
53.2589 EXP
53.2589 EXP
55.2257 DATA
57.2412 DATA
59.2406 DATA
61.2400 DATA
...
---
The program creates as many run*.csv files as needed, but I cannot get the required columns stored in the new files. All I end up with are empty csv files. If I change the counter variable condition to == 1, it just creates one big file containing the desired columns.
Thanks again!
import csv

QUERY = 'MYLOG'

with open('localizer.log', 'rt') as log_input:
    i = 0
    for line in log_input:
        if QUERY in line:
            i = i + 1
            with open('run' + str(i) + '.csv', 'w') as output:
                reader = csv.reader(log_input, delimiter=' ')
                writer = csv.writer(output)
                content_column_A = [0]
                content_column_B = [1]
                for row in reader:
                    content_A = list(row[j] for j in content_column_A)
                    content_B = list(row[k] for k in content_column_B)
                    writer.writerow(content_A)
                    writer.writerow(content_B)
Looking at the code, a few things are possibly wrong:
- The csv reader should take a file handle, not a single line.
- The reader delimiter should not be a single space character, since the actual delimiter in the log appears to be a variable number of consecutive spaces (see the sketch after this list).
- The looping logic seems a bit off, mixing up files, lines, and rows.
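To make the delimiter point concrete, here is a minimal sketch (the sample line just mimics a line from the log above): with delimiter=' ', csv.reader turns every extra space into an empty field, while str.split() with no argument collapses runs of whitespace.

import csv

sample = '53.2436  EXP  MYLOG: START RUN specs/run03_block_order.csv'

# csv.reader treats every single space as a separator,
# so consecutive spaces produce empty fields:
print(next(csv.reader([sample], delimiter=' ')))
# -> ['53.2436', '', 'EXP', '', 'MYLOG:', 'START', 'RUN', 'specs/run03_block_order.csv']

# str.split() with no argument collapses any run of whitespace:
print(sample.split()[:2])
# -> ['53.2436', 'EXP']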
You are probably looking for something like the code below (pending clarification in the question):
import csv

NEW_LOG_DELIMITER = 'MYLOG'

def write_buffer(_index, buffer):
    """
    This function takes an index and a buffer.
    The buffer is just an iterable of iterables (ex a list of lists)
    Each buffer item is a row of values.
    """
    filename = 'run{}.csv'.format(_index)
    with open(filename, 'w') as output:
        writer = csv.writer(output)
        writer.writerow(['onset', 'type'])  # adding the heading
        writer.writerows(buffer)

current_buffer = []
_index = 1

with open('localizer.log', 'rt') as log_input:
    for line in log_input:
        # will deal ok with multi-space as long as
        # you don't care about the last column
        fields = line.split()[:2]
        if NEW_LOG_DELIMITER not in line or not current_buffer:
            # If it's the first line (the current_buffer is empty)
            # or the line does NOT contain "MYLOG" then
            # collect it until it's time to write it to file.
            current_buffer.append(fields)
        else:
            write_buffer(_index, current_buffer)
            _index += 1
            current_buffer = [fields]  # EDIT: fixed bug, new buffer should not be empty

if current_buffer:
    # We are now out of the loop,
    # if there's an unwritten buffer then write it to file.
    write_buffer(_index, current_buffer)
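As a quick sanity check, assuming run1.csv was produced by the script above, you can read it back and confirm each row holds just the two columns:

import csv

# Hypothetical check: print every row of the first generated file
with open('run1.csv', newline='') as f:
    for row in csv.reader(f):
        print(row)

(The csv docs also recommend opening files passed to csv.writer with newline='' so the writer controls line endings itself; without it you may see blank lines between rows on Windows.)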