
python - Join multiple CSV files

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:28:08

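I have a folder containing multiple CSV files. Each file has a date column and a value column. I want to merge all of them into a single file whose first column holds the date (identical across all files) and whose remaining columns hold the values from each individual file, i.e. (date, value_file1, value_file2, ...).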

Any suggestions on how to achieve this with a simple Python script, or even with Unix commands?

Thanks for your help!

Best answer

I suggest using a tool like csvkit's csvjoin.
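Applied to the layout in the question, the call would name the shared date column with -c and list every file in the folder. Here is a minimal Python sketch of that, not a definitive recipe. It assumes the join column is literally named "date" and that csvjoin is on PATH. The helper names build_csvjoin_command and run_csvjoin are made up for illustration.

```python
import glob
import os
import shutil
import subprocess


def build_csvjoin_command(folder, column="date"):
    """Return the csvjoin argument list for every *.csv file in folder."""
    # Sort so the output columns come out in a predictable order.
    files = sorted(glob.glob(os.path.join(folder, "*.csv")))
    # -c names the join column; a single name is used for every input file.
    return ["csvjoin", "-c", column, *files]


def run_csvjoin(folder, output_path, column="date"):
    """Run csvjoin and write its STDOUT to output_path (needs csvkit installed)."""
    if shutil.which("csvjoin") is None:
        raise RuntimeError("csvjoin not found; install it with: pip install csvkit")
    # Build the command before creating the output file, and keep output_path
    # outside `folder` so a previous result is not picked up as an input.
    command = build_csvjoin_command(folder, column)
    with open(output_path, "w", newline="") as out:
        subprocess.run(command, stdout=out, check=True)
```

Calling run_csvjoin("data", "merged.csv") would then be roughly equivalent to typing csvjoin -c date data/*.csv > merged.csv in a shell.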

pip install csvkit
$ csvjoin --help
usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
               [-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-v] [-l]
               [--zero] [-c COLUMNS] [--outer] [--left] [--right]
               [FILE [FILE ...]]

Execute a SQL-like join to merge CSV files on a specified column or columns.

positional arguments:
  FILE                  The CSV files to operate on. If only one is specified,
                        it will be copied to STDOUT.

optional arguments:
  -h, --help            show this help message and exit
  -d DELIMITER, --delimiter DELIMITER
                        Delimiting character of the input CSV file.
  -t, --tabs            Specifies that the input CSV file is delimited with
                        tabs. Overrides "-d".
  -q QUOTECHAR, --quotechar QUOTECHAR
                        Character used to quote strings in the input CSV file.
  -u {0,1,2,3}, --quoting {0,1,2,3}
                        Quoting style used in the input CSV file. 0 = Quote
                        Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 =
                        Quote None.
  -b, --doublequote     Whether or not double quotes are doubled in the input
                        CSV file.
  -p ESCAPECHAR, --escapechar ESCAPECHAR
                        Character used to escape the delimiter if --quoting 3
                        ("Quote None") is specified and to escape the
                        QUOTECHAR if --doublequote is not specified.
  -z MAXFIELDSIZE, --maxfieldsize MAXFIELDSIZE
                        Maximum length of a single field in the input CSV
                        file.
  -e ENCODING, --encoding ENCODING
                        Specify the encoding the input CSV file.
  -S, --skipinitialspace
                        Ignore whitespace immediately following the delimiter.
  -v, --verbose         Print detailed tracebacks when errors occur.
  -l, --linenumbers     Insert a column of line numbers at the front of the
                        output. Useful when piping to grep or as a simple
                        primary key.
  --zero                When interpreting or displaying column numbers, use
                        zero-based numbering instead of the default 1-based
                        numbering.
  -c COLUMNS, --columns COLUMNS
                        The column name(s) on which to join. Should be either
                        one name (or index) or a comma-separated list with one
                        name (or index) for each file, in the same order that
                        the files were specified. May also be left
                        unspecified, in which case the two files will be
                        joined sequentially without performing any matching.
  --outer               Perform a full outer join, rather than the default
                        inner join.
  --left                Perform a left outer join, rather than the default
                        inner join. If more than two files are provided this
                        will be executed as a sequence of left outer joins,
                        starting at the left.
  --right               Perform a right outer join, rather than the default
                        inner join. If more than two files are provided this
                        will be executed as a sequence of right outer joins,
                        starting at the right.

Note that the join operation requires reading all files into memory. Don't try
this on very large files.
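
The memory caveat above matters for large inputs. Since the question says every file carries the same dates, one alternative is to read the files in lockstep and write one merged row at a time, so no file is ever held in memory in full. The sketch below uses only the standard library and is not a drop-in replacement for csvjoin. It assumes each file has a header row, exactly two columns (date first, value second), and rows in the same date order. The function name merge_by_row and the value_<filename> column naming are illustrative choices.

```python
import csv
import glob
import os
from contextlib import ExitStack
from itertools import zip_longest


def merge_by_row(folder, output_path, pattern="*.csv"):
    """Merge two-column (date, value) CSV files side by side, one row at a time."""
    # Collect inputs before creating the output; keep output_path outside
    # `folder` so an earlier result is not treated as an input on a rerun.
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    if not paths:
        raise ValueError(f"no files matching {pattern!r} in {folder!r}")

    with ExitStack() as stack:
        readers = [
            csv.reader(stack.enter_context(open(path, newline="")))
            for path in paths
        ]
        writer = csv.writer(stack.enter_context(open(output_path, "w", newline="")))

        # Assumption: every file starts with a header row, which is skipped.
        for reader in readers:
            next(reader, None)

        # Name each value column after its source file, e.g. value_file1.
        stems = [os.path.splitext(os.path.basename(path))[0] for path in paths]
        writer.writerow(["date"] + [f"value_{stem}" for stem in stems])

        for rows in zip_longest(*readers):
            if any(row is None for row in rows):
                raise ValueError("input files have different numbers of rows")
            dates = {row[0] for row in rows}
            if len(dates) != 1:
                raise ValueError(f"date mismatch across files: {sorted(dates)}")
            writer.writerow([rows[0][0]] + [row[1] for row in rows])
```

If the files do not share the same dates in the same order, this approach raises an error rather than guessing, and a real join such as csvjoin is the safer choice.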

Regarding python - Join multiple CSV files, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29813964/
