gpt4 book ai didi

mysql - 使用 bash 将多个 .csv 文件导入到 mysql

转载 作者:行者123 更新时间:2023-11-29 15:39:26 24 4
gpt4 key购买 nike

我想使用 bash 脚本首先将文件从 google 驱动器复制到本地驱动器并重命名(以删除空格并添加前缀),然后将这些 .csv 文件导入到 mysql 数据库中。这些 .csv 有不同的列,因此我需要从每个 .csv 中获取列名称,并使用这些列名称创建一个表,然后导入数据。

如果有任何帮助,我将不胜感激。

我在这里找到了一些代码 https://ericlondon.com/2011/04/10/a-bash-shell-script-to-import-a-large-number-of-csv-files-into-mysql.html我已经修改了但我无法让它工作。我承认我不明白这一切的作用。

这是我目前的脚本:

#!/bin/bash

# show commands being executed, per debug
set -x

# define database connectivity
_db="dbname"
_db_user="user"
_db_password="password"

# delete existing files
rm -rf /path/to/*.csv

# copy files to local disk
cp /old/path/to/*.csv /path/to/

# define directory containing CSV files
_csv_directory="/path/to"

# go into directory
cd $_csv_directory || exit

# edit file name
rename "s/ //g" *.csv
rename "s/^/tp/g" *.csv

# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`

# loop through csv files
for _csv_file in ${_csv_files[@]}
do

# remove file extension
_csv_file_extensionless=`echo "$_csv_file" | sed 's/\(.*\)\..*/\1/'`

# define table name
_table_name="${_csv_file_extensionless}"

# get header columns from CSV file
_header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' | sed 's/"//' | sed 's/ /_/g'`
_header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g' | sed 's/(//g' | sed 's/)//g'`

# ensure table exists
mysql -u $_db_user -p$_db_password $_db << eof
CREATE TABLE IF NOT EXISTS \`$_table_name\` ENGINE=MyISAM DEFAULT CHARSET=utf8
eof

# loop through header columns
for _header in "${_header_columns[@]}"
do

# add column
mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column IF NOT EXISTS \`$_header\` text"

done

# import csv into mysql
mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\r\n" --columns=$_header_columns_string -u $_db_user -p$_db_password $_db $_csv_directory/$_csv_file

done
exit

这是我为每个 .csv 获得的更新输出:

++ for _csv_file in ${_csv_files[@]}
+++ echo tpTriplexBC.csv
+++ sed 's/\(.*\)\..*/\1/'
++ _csv_file_extensionless=tpTriplexBC
++ _table_name=tpTriplexBC
+++ head -1 /path/to/tpTriplexBC.csv
+++ tr , '\n'
+++ sed 's/^"//'
+++ sed 's/"$//'
+++ sed 's/ /_/g'
+++ sed 's/(//g'
+++ sed 's/)//g'
++ _header_columns='ProductName
ProductID
Quantity
Tax
Paper_Type
PriceExpress
Production_Days_Express
PriceStandard
Production_Days_Standard
PriceSaver
Production_Days_Saver
PriceSameday
Production_Days_Sameday
Price_Just_Print
Price_File_Check
Price_File_Check_with_Proofing
Price_File_Assist
Artwork_Tax
Size
Flat_Width_mm
Flat_Height_mm
Finished_Width_mm
Finished_Height_mm
productionDataJSON
'pdatedAt"
+++ head -1 /path/to/tpTriplexBC.csv
+++ sed 's/ /_/g'
+++ sed 's/"//g'
+++ sed 's/(//g'
+++ sed 's/)//g'
++ _header_columns_string=$'ProductName,ProductID,Quantity,Tax,Paper_Type,PriceExpress,Production_Days_Express,PriceStandard,Production_Days_Standard,PriceSaver,Production_Days_Saver,PriceSameday,Production_Days_Sameday,Price_Just_Print,Price_File_Check,Price_File_Check_with_Proofing,Price_File_Assist,Artwork_Tax,Size,Flat_Width_mm,Flat_Height_mm,Finished_Width_mm,Finished_Height_mm,productionDataJSON,updatedAt\r'
++ mysql -u user -ppassword dbname
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `ProductName` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `ProductID` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Quantity` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Tax` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Paper_Type` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `PriceExpress` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Production_Days_Express` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `PriceStandard` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Production_Days_Standard` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `PriceSaver` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Production_Days_Saver` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `PriceSameday` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Production_Days_Sameday` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Price_Just_Print` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Price_File_Check` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Price_File_Check_with_Proofing` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Price_File_Assist` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Artwork_Tax` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Size` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Flat_Width_mm` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Flat_Height_mm` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Finished_Width_mm` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `Finished_Height_mm` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexBC` ADD COLUMN IF NOT EXISTS `productionDataJSON` text'
++ for _header in ${_header_columns[@]}
++ mysql -u user -ppassword dbname '--execute=ALTER TABLE `tpTriplexB` text'COLUMN IF NOT EXISTS `updatedAt"
'RROR 1166 (42000) at line 1: Incorrect column name 'updatedAt"
++ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' $'--columns=ProductName,ProductID,Quantity,Tax,Paper_Type,PriceExpress,Production_Days_Express,PriceStandard,Production_Days_Standard,PriceSaver,Production_Days_Saver,PriceSameday,Production_Days_Sameday,Price_Just_Print,Price_File_Check,Price_File_Check_with_Proofing,Price_File_Assist,Artwork_Tax,Size,Flat_Width_mm,Flat_Height_mm,Finished_Width_mm,Finished_Height_mm,productionDataJSON,updatedAt\r' -u user -ppassword dbname /path/to/tpTriplexBC.csv
mysqlimport: Error: 1054, Unknown column 'updatedAt' in 'field list', when using table: tpTriplexBC
++ exit

列已插入到每个表中,但没有数据。还创建了我不想要的名为 -1 和 ls 的表。

csv 的第一行是:“产品名称”、“产品 ID”、“数量”、“税费”、“特殊表面处理”、“层压板”、“打印面”、“ Material ”、“价格标准”、“生产天数标准”、“PriceSaver” ,“节省生产天数”,“当天定价”,“当天生产天数”,“只需打印价格”,“价格文件检查”,“价格文件检查与校对”,“价格文件协助”,“艺术品税”,“尺寸","平面宽度(毫米)","平面高度(毫米)","成品宽度(毫米)","成品高度(毫米)","生产数据JSON","updatedAt"

header 的八进制转储输出:

0000000   "   P   r   o   d   u   c   t   N   a   m   e   "   ,   "   P
0000020 r o d u c t I D " , " Q u a n t
0000040 i t y " , " T a x " , " P a p e
0000060 r T y p e " , " P r i c e E x
0000100 p r e s s " , " P r o d u c t i
0000120 o n D a y s E x p r e s s "
0000140 , " P r i c e S t a n d a r d "
0000160 , " P r o d u c t i o n D a y
0000200 s S t a n d a r d " , " P r i
0000220 c e S a v e r " , " P r o d u c
0000240 t i o n D a y s S a v e r "
0000260 , " P r i c e S a m e d a y " ,
0000300 " P r o d u c t i o n D a y s
0000320 S a m e d a y " , " P r i c e
0000340 J u s t P r i n t " , " P r
0000360 i c e F i l e C h e c k " ,
0000400 " P r i c e F i l e C h e c
0000420 k w i t h P r o o f i n g "
0000440 , " P r i c e F i l e A s s
0000460 i s t " , " A r t w o r k T a
0000500 x " , " S i z e " , " F l a t
0000520 W i d t h ( m m ) " , " F l a
0000540 t H e i g h t ( m m ) " , "
0000560 F i n i s h e d W i d t h (
0000600 m m ) " , " F i n i s h e d H
0000620 e i g h t ( m m ) " , " p r o
0000640 d u c t i o n D a t a J S O N "
0000660 , " u p d a t e d A t " \r \n

最佳答案

查看 CSV header 的 od 输出,您会发现 CSV 行末尾有一个回车符和换行符(\r\n顺序)。 CSV 是否来自 Windows 下的处理? \r\n 是 Windows 下常见的文本行终止符,而 Linux 只使用 \n\r 作为字符串的一部分包含在内,并且不被视为行终止的一部分。它还可能产生奇怪的输出,因为 \r 本身在输出时将光标位置重置为第一列,并且后续字符显示在先前可能在同一行上显示的任何内容的顶部.

我建议首先在 CSV 文件上运行 dos2unix,然后进行处理。我认为这将清除奇怪的错误并帮助文件其余部分的处理正常工作。

关于mysql - 使用 bash 将多个 .csv 文件导入到 mysql,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57833142/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com