
database - How do I specify leading field names with mongoimport?

Reposted. Author: 可可西里. Updated: 2023-11-01 09:53:57

I am importing a very large CSV file into MongoDB. It has the following format:

"zzzàms@hotmail.com","12071988"
"zzzг ms@hotmail.com","12071988"
"zzпїѕпїѕmmbbii2@bk.ru","MA15042002"
"zzпїѕпїѕmmbbii2@list.ru","MA15042002"
"zzпїѕпїѕmmbbii2@rambler.ru","MA15042002"
"zzпїѕпїѕmmbbii2@yandex.ru","MA15042002"

However, I am not sure how many fields/columns will follow the email field.

I have already imported the file with this command:

mongoimport -d emails -c second --file all.csv --type csv --fields email, number

However, any fields/columns after the number field get the default names "field2", "field3", and so on.

{ "_id" : ObjectId("5a5cd95e598f1e910d353e3b"), "email" : "00-amber-00@embarqmail.com", " number" : "number1", "field2" : "number2" }

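How can I make everything after the number field go into the same column, so that it is all classified as "number"?

Sometimes a single entry can have 40 columns.

I would rather not modify the CSV file unless it is really necessary.

Sorry, English is not my first language. Thanks.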

Best answer

You can use a Unix command such as awk to convert each line to JSON according to your own logic, then pipe the result to mongoimport through stdin.

Sample file

saravana@ubuntu:~$ cat sample-doc.txt 
"zzzàms@hotmail.com","12071988"
"zzzг ms@hotmail.com","12071988"
"zzпїѕпїѕmmbbii2@bk.ru","MA15042002"
"zzпїѕпїѕmmbbii2@list.ru","MA15042002"
"zzпїѕпїѕmmbbii2@rambler.ru","MA15042002","34534"
"zzпїѕпїѕmmbbii2@yandex.ru","MA15042002","1232434","3435435","53534"

Converting to JSON with awk: the email, followed by an array of numbers

saravana@ubuntu:~$ cat sample-doc.txt | awk 'BEGIN{FS=","}{print "{ email :" $1 ", numbers : [ " substr($0,length($1)+2) " ] } " }'
{ email :"zzzàms@hotmail.com", numbers : [ "12071988" ] }
{ email :"zzzг ms@hotmail.com", numbers : [ "12071988" ] }
{ email :"zzпїѕпїѕmmbbii2@bk.ru", numbers : [ "MA15042002" ] }
{ email :"zzпїѕпїѕmmbbii2@list.ru", numbers : [ "MA15042002" ] }
{ email :"zzпїѕпїѕmmbbii2@rambler.ru", numbers : [ "MA15042002","34534" ] }
{ email :"zzпїѕпїѕmmbbii2@yandex.ru", numbers : [ "MA15042002","1232434","3435435","53534" ] }
saravana@ubuntu:~$
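
The awk one-liner above prints unquoted keys (`email`, `numbers`), which the mongoimport run shown in this answer accepted. As a minimal sketch, not part of the original answer, the same idea can be written to emit strictly valid JSON with quoted keys, which is easier to check with other JSON tools. The function name `csv_to_json` is hypothetical, and the sketch assumes every field is already double-quoted and contains no embedded commas or escaped quotes, as in the sample file.

```shell
# Hypothetical helper: reads CSV lines on stdin, writes one JSON document per line.
# Assumption: every field is already double-quoted and contains no embedded
# commas or escaped quotes, as in the sample file.
csv_to_json() {
  awk 'BEGIN { FS = "," }
  {
    # Everything after the first field and the comma that follows it.
    rest = substr($0, length($1) + 2)
    # $1 and rest keep their original double quotes, so only the keys need quoting.
    printf "{ \"email\": %s, \"numbers\": [ %s ] }\n", $1, rest
  }'
}

# Preview one converted line without importing anything.
printf '%s\n' '"a@example.com","123","456"' | csv_to_json
# prints: { "email": "a@example.com", "numbers": [ "123","456" ] }
```

The output of `csv_to_json < sample-doc.txt` can be piped to mongoimport in the same way as the one-liner. If the real data contains commas inside quoted fields, a proper CSV parser would be needed instead of splitting on `,`.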

mongoimport reading from stdin

saravana@ubuntu:~$ cat sample-doc.txt | awk 'BEGIN{FS=","}{print "{ email :" $1 ", numbers : [ " substr($0,length($1)+2) " ] } " }' | mongoimport --type json --db test --collection emailnos -v
2018-01-17T09:58:11.559+0530 reading from stdin
2018-01-17T09:58:11.559+0530 using fields:
2018-01-17T09:58:11.561+0530 connected to: localhost
2018-01-17T09:58:11.561+0530 ns: test.emailnos
2018-01-17T09:58:11.561+0530 connected to node type: standalone
2018-01-17T09:58:11.561+0530 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-01-17T09:58:11.561+0530 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-01-17T09:58:11.726+0530 imported 6 documents

Collection

> db.emailnos.find()
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da90"), "email" : "zzzàms@hotmail.com", "numbers" : [ "12071988" ] }
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da91"), "email" : "zzпїѕпїѕmmbbii2@list.ru", "numbers" : [ "MA15042002" ] }
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da92"), "email" : "zzпїѕпїѕmmbbii2@rambler.ru", "numbers" : [ "MA15042002", "34534" ] }
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da93"), "email" : "zzпїѕпїѕmmbbii2@yandex.ru", "numbers" : [ "MA15042002", "1232434", "3435435", "53534" ] }
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da94"), "email" : "zzzг ms@hotmail.com", "numbers" : [ "12071988" ] }
{ "_id" : ObjectId("5a5ed0dbead4f5f7ae68da95"), "email" : "zzпїѕпїѕmmbbii2@bk.ru", "numbers" : [ "MA15042002" ] }
>

Regarding "database - How do I specify leading field names with mongoimport?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48267551/
