
python - Sorting a CSV file into different CSVs by column value

Reposted. Author: 行者123. Updated: 2023-11-28 18:15:54

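I am still a beginner at programming and have run into a problem with my code. I searched here for a solution, but unfortunately nothing helped.

What I am trying to do: I have a CSV file (which I imported from multiple .txt files). One of my columns lists years from 2015 down to 1991, and I want to sort all rows of the file into different CSVs according to their year. My current code looks like this (although I have changed it quite a bit while trying out tips I found on this site):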

einzel = pd.read_csv("501-1000.csv", sep='\t',header=0,index_col=False,usecols = ("TI","AB","PY","DI"),dtype = str)

with open("501-1000.csv", "r",encoding="utf_8"):

    for row in einzel:
        if einzel["PY"] == ["2015","2014","2013","2012","2011"]:
            with open("a.csv","a") as out:
                writer.writerow(row)
        elif einzel["PY"] == ["2010","2009","2008","2007","2006"]:
            with open("b.csv","a") as out:
                writer.writerow(row)
        elif einzel["PY"] == ["2005","2004","2003","2002","2001"]:
            with open("c.csv","a") as out:
                writer.writerow(row)
        elif einzel["PY"] == ["2000","1999","1998","1997","1996"]:
            with open("d.csv","a") as out:
                writer.writerow(row)
        elif einzel["PY"] == ["1995","1994","1993","1992","1991"]:
            with open("e.csv","a") as out:
                writer.writerow(row)

Now... this doesn't work, and I get an error:

ValueError: Arrays were different lengths: 489 vs 5

The traceback is:

ValueError                                Traceback (most recent call last)
<ipython-input-10-72280961cb7d> in <module>()
19 # writer = csv.writer(out)
20 for row in einzel:
---> 21 if einzel["PY"] == ["2015","2014","2013","2012","2011"]:
22 with open("a.csv","a") as out:
23 writer.writerow(row)

~\Anaconda3\lib\site-packages\pandas\core\ops.py in wrapper(self, other, axis)
859
860 with np.errstate(all='ignore'):
--> 861 res = na_op(values, other)
862 if is_scalar(res):
863 raise TypeError('Could not compare %s type with Series' %

~\Anaconda3\lib\site-packages\pandas\core\ops.py in na_op(x, y)
763
764 if is_object_dtype(x.dtype):
--> 765 result = _comp_method_OBJECT_ARRAY(op, x, y)
766 else:
767

~\Anaconda3\lib\site-packages\pandas\core\ops.py in _comp_method_OBJECT_ARRAY(op, x, y)
741 y = y.values
742
--> 743 result = lib.vec_compare(x, y, op)
744 else:
745 result = lib.scalar_compare(x, y, op)

pandas\_libs\lib.pyx in pandas._libs.lib.vec_compare()

ValueError: Arrays were different lengths: 489 vs 5

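I searched for this error here, but unfortunately none of the solutions worked, or I did not understand them. I also started with something like this, but it did not work either: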

with open("501-1000.csv", "r",encoding="utf_8") as inp:
    #reader = csv.reader(inp)
    #writer = csv.writer(out)

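I would be very glad for any hints or corrections, and if there is anything wrong with the way I have asked the question, I will fix it. This is my first post.

Best answer

Here is a pandas solution.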

import pandas as pd

# map each output file name to the set of publication years it should receive
filemap_dict = {'a': set(range(2011, 2016)),
                'b': set(range(2006, 2011)),
                'c': set(range(2001, 2006)),
                'd': set(range(1996, 2001)),
                'e': set(range(1991, 1996))}

# check your mappings are mutually exclusive: no year may appear in two sets
all_years = set().union(*filemap_dict.values())
assert len(all_years) == sum(len(v) for v in filemap_dict.values()), "Year ranges are not mutually exclusive!"

# load data; the file has a header row (header=0, as in the question);
# dtype is not set to str so that pandas can infer PY as numeric and
# match it against the integer years above
cols = ['TI', 'AB', 'PY', 'DI']
df = pd.read_csv('501-1000.csv', sep='\t', header=0, index_col=False, usecols=cols)

# cycle through filemap_dict, slice and export to csv
for k, v in filemap_dict.items():
    df[df['PY'].isin(v)].to_csv(k + '.csv', index=False)
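
For background on the error in the question: `einzel["PY"] == [...]` asks pandas to compare the 489-row column element by element with a 5-item list, which is why it complains about different lengths. `isin` instead tests each row for membership and returns one boolean per row. The sketch below illustrates this on a small made-up DataFrame (the rows are invented, not taken from the real file) and also shows a possible one-pass variant that labels each row with its bucket and groups on that label. The helper `bucket_for` is hypothetical and assumes the five-year ranges from the question and that every year value is present and numeric.

```python
import pandas as pd

# Toy stand-in for 501-1000.csv; these rows are invented for illustration.
df = pd.DataFrame({
    "TI": ["paper one", "paper two", "paper three", "paper four"],
    "PY": ["2015", "2008", "2011", "1993"],
})

# isin() asks "is this row's year one of these?" and returns one boolean per row.
recent = ["2015", "2014", "2013", "2012", "2011"]
mask = df["PY"].isin(recent)
print(mask.tolist())

# Hypothetical helper: bucket letter for a year, following the five-year
# ranges from the question (2011-2015 -> "a", ..., 1991-1995 -> "e").
def bucket_for(year):
    year = int(year)
    if not 1991 <= year <= 2015:
        return None  # years outside the ranges get no bucket
    return "abcde"[(2015 - year) // 5]

buckets = df["PY"].map(bucket_for)

# One pass over the data: each group holds the rows destined for one file.
groups = {name: part for name, part in df.groupby(buckets)}
for name, part in groups.items():
    print(name, part["PY"].tolist())
    # part.to_csv(name + ".csv", index=False)  # uncomment to write the files
```

Whether the dictionary of sets or the grouping variant reads better is mostly a matter of taste; the grouping variant only avoids filtering the frame once per output file.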

Regarding python - sorting a CSV file into different CSVs by column value, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48423851/
