Python: loop through a list of CSV files and check for values?

Reposted. Author: 行者123. Updated: 2023-12-04 09:46:18

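I have five .csv files with the same fields in the same order, and I need to process them like this:

  • Get the list of files
  • Read each file into a DataFrame
  • Check whether a column of alphanumeric codes contains a specific value (different for each file). For example, check whether column1 contains PT333 for the file data1:

    column1   column2   column3
    PT389     LA        image.jpg
    PT372     NY        image2.jpg

  • If the column contains that value, print the value along with the file name/variable name I assigned to that file, then rename that DataFrame to output1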

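I tried the following, but I don't know how to make it loop and do the same thing for each file. At the moment it returns the numbers, but I also want it to return the DataFrame name, and I want it to go through all the files (a to e) and check for every value in the numbers list.

This is what I have: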
    import os
    import glob
    import pandas as pd
    from os.path import expanduser

    home = expanduser("~")
    os.chdir(home + '/files/')

    data = glob.glob('data*.csv')
    data

    # If you have tips on how to loop through these rather than
    # have a line for each one, open to feedback
    a = pd.read_csv(data[0], encoding='ISO-8859-1', error_bad_lines=False)
    b = pd.read_csv(data[1], encoding='ISO-8859-1', error_bad_lines=False)
    c = pd.read_csv(data[2], encoding='ISO-8859-1', error_bad_lines=False)
    d = pd.read_csv(data[3], encoding='ISO-8859-1', error_bad_lines=False)
    e = pd.read_csv(data[4], encoding='ISO-8859-1', error_bad_lines=False)
    filenames = [a,b,c,d,e]
    filelist= ['a','b','c','d','e']

    # I am aware that this part is repetitive. Unsure how to fix this,
    # I keep getting errors
    # Any help appreciated
    numbers = ['PT333', 'PT121', 'PT111', 'PT211', 'PT222']
    def type():
        for i in a.column1:
            if i == numbers[0]:
                print(numbers[0])
            elif i == numbers[1]:
                print(numbers[1])
            elif i == numbers[2]:
                print(numbers[2])
            elif i == numbers[3]:
                print(numbers[3])
            elif i == numbers[4]:
                print(numbers[4])
    type()

I'd also welcome any constructive criticism on how to reduce the repeated code and make things smoother. TIA

Best answer

Try this:

    for file in glob.glob('data*.csv'):          # loop through each file
        df = pd.read_csv(file,                   # create the DataFrame of the file
                         encoding='ISO-8859-1',
                         on_bad_lines='skip')    # error_bad_lines=False before pandas 1.3
        result = (
            df.where(df.isin(numbers))   # keep matching cells, NaN everywhere else
              .melt()['value']           # melt the DF into a single 'value' Series
              .dropna()                  # remove the NaNs (non-matches)
              .unique().tolist()         # return the unique values as a list
        )
        if result:                               # if there are any results
            print(file, ', '.join(result))       # print the file name and the results

The result expression is wrapped in parentheses so that the inline comments are safe to copy and paste. If you split it with backslash line continuations instead, remove any comments and trailing whitespace after each backslash, otherwise you will get a SyntaxError.
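The question also asks for a different expected value per file, checked in column1 only, with the file's name reported next to the match. A minimal sketch of that variant follows. It assumes the sorted data*.csv files pair up one-to-one with the entries of numbers, and check_files is a hypothetical helper name, not something from the original answer. Looping over paths replaces the separate variables a to e.

```python
import glob
import os

import pandas as pd


def check_files(directory, numbers, column="column1"):
    """Return {file name: expected value} for files whose column contains it.

    Assumption: the sorted data*.csv files pair up one-to-one with `numbers`.
    """
    paths = sorted(glob.glob(os.path.join(directory, "data*.csv")))
    matches = {}
    for path, expected in zip(paths, numbers):
        df = pd.read_csv(path, encoding="ISO-8859-1")
        # Compare as strings so every cell is tested the same way.
        if (df[column].astype(str) == expected).any():
            matches[os.path.basename(path)] = expected
    return matches
```

Calling check_files(os.path.expanduser("~/files"), numbers) and printing each key and value of the returned dict gives the file name together with the value found in it.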

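As mentioned, you should also be able to do the same thing without a DataFrame:

    for file in glob.glob('data*.csv'):
        # glob returns paths, not file objects, so open each file and read it as text
        with open(file, encoding='ISO-8859-1') as f:
            data = f.read()
        for num in numbers:
            if num in data:          # plain substring match anywhere in the file
                print(file, num)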
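One caveat with a plain substring test is that it matches anywhere in the file, so PT333 would also match a cell such as PT3330 or a value in a different column. A hedged sketch that compares whole cells in one column using the standard-library csv module is shown here. The helper name values_in_column is hypothetical, and the column name column1 and the encoding are taken from the question.

```python
import csv


def values_in_column(path, wanted, column="column1", encoding="ISO-8859-1"):
    """Return the values from `wanted` that appear as whole cells in `column`."""
    wanted = set(wanted)
    found = set()
    with open(path, newline="", encoding=encoding) as handle:
        for row in csv.DictReader(handle):
            # Missing or short rows yield None, so fall back to an empty string.
            cell = (row.get(column) or "").strip()
            if cell in wanted:
                found.add(cell)
    return sorted(found)
```

This can be dropped into the same loop over glob.glob('data*.csv'), printing the file name whenever the returned list is non-empty.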

Regarding "Python: loop through a list of CSV files and check for values?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/62090623/
