
php - Storing a Python dictionary of 10,000 lines in MySQL instead of printing it

Reposted. Author: 可可西里. Updated: 2023-11-01 07:52:30

See Update 2 below: I did as you suggested.

Update: see the important updates below.

I want to store the following data in a MySQL database:

{'url': 'http://dom1', 'name': 'miller', 'name2': 'phil man', 'email-adress': 'waddehadde@hotmail.com'}
{'url': 'http://dom2', 'name': 'jonboy', 'name2': 'Josef dude', 'email-adress': 'waddehadde@hotmail.com'}

I have a very simple dataset, but it is fairly large: about 10,000 records. From the question "How do I connect to a MySQL database in Python" I found that I could probably use peewee:

import peewee
from peewee import *

db = MySQLDatabase('jonhydb', user='john', passwd='megajonhy')

class Book(peewee.Model):
    author = peewee.CharField()
    title = peewee.TextField()

    class Meta:
        database = db

Book.create_table()
book = Book(author="me", title='Peewee is cool')
book.save()
for book in Book.filter(author="me"):
    print(book.title)

Peewee is cool

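I know of two ways to do this: one is to store it in a file on the server, the other is to keep it in a database, using MySQL. I could use PHP to store it in a MySQL database. Simple, right? But yesterday I found out that Python is even better. I installed Python on my Linux distribution.

But this seems too complicated to me.

So this brings me to my question: is there a way to store this dataset using MySQL and Python with peewee, or some similar lightweight ORM?

First of all, thank you very much for the quick answer.

I am using openSUSE 13.1. The MySQL database is set up and running.

First I had to install peewee and json. I had some issues installing peewee (see below). Later I tried to install simplejson as a replacement for json, and after that I guessed peewee would be installable (see below).

Problems installing peewee: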

martin@linux-70ce:~>
martin@linux-70ce:~> git clone https://github.com/coleifer/peewee.git
Klone nach 'peewee'...
remote: Reusing existing pack: 5673, done.
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 5686 (delta 2), reused 0 (delta 0)
Empfange Objekte: 100% (5686/5686), 3.54 MiB | 102.00 KiB/s, done.
Löse Unterschiede auf: 100% (3468/3468), done.
Prüfe Konnektivität... Fertig
martin@linux-70ce:~> cd peewee
martin@linux-70ce:~/peewee> python setup.py install
running install
error: can't create or remove files in install directory

The following error occurred while trying to add or remove files in the
installation directory:

[Errno 13] Permission denied: '/usr/lib/python2.7/site-packages/test-easy-install-5717.write-test'

The installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

/usr/lib/python2.7/site-packages/

Perhaps your account does not have write access to this directory? If the
installation directory is a system-owned directory, you may need to sign in
as the administrator or "root" account. If you do not have administrative
access to this machine, you may wish to choose a different installation
directory, preferably one that is listed in your PYTHONPATH environment
variable.

For information on other options, you may wish to consult the
documentation at:
Please make the appropriate changes for your system and try again.
martin@linux-70ce:~/peewee>

After that, as mentioned above, I installed simplejson and then tried to install peewee again. This time I think I had more luck:

martin@linux-70ce:~/peewee> git clone https://github.com/coleifer/peewee.git
Klone nach 'peewee'...
remote: Reusing existing pack: 5673, done.
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 5686 (delta 2), reused 0 (delta 0)
Empfange Objekte: 100% (5686/5686), 3.54 MiB | 309.00 KiB/s, done.
Löse Unterschiede auf: 100% (3468/3468), done.
Prüfe Konnektivität... Fertig
martin@linux-70ce:~/peewee>

Here is the full code I am running:

import urllib
import urlparse
import re
from peewee import *
import json

db = MySQLDatabase('cpan', user='root',passwd='rimbaud')

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database


User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):

    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


data = json.load() #your json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

And this is the result:

martin@linux-70ce:~/perl> python cpan6.py
python: can't open file 'cpan6.py': [Errno 2] No such file or directory
martin@linux-70ce:~/perl> python cpan5.py
Traceback (most recent call last):
  File "cpan5.py", line 7, in <module>
    from peewee import *
ImportError: No module named peewee
martin@linux-70ce:~/perl>
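
The ImportError says that the interpreter running the script cannot find peewee. The second session above shows only a `git clone`, with no install step after it, which may be the reason. A minimal way to check what a given interpreter can see, assuming Python 3 (the log above is from Python 2.7, where `importlib.util` is not available) and using a helper name, `is_importable`, made up for this illustration:

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Print the interpreter in use, then probe for the modules the script needs.
    print(sys.executable)
    for name in ("json", "peewee"):
        print(name, "found" if is_importable(name) else "missing")
```

If peewee is reported as missing, the install has to be repeated for exactly the interpreter printed on the first line.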

Well, now I am a bit clueless. I would love to hear from you. Thanks in advance!

Update 2: I did as you suggested:

martin@linux-70ce:~/peewee> sudo python setup.py install 

linux-70ce:/home/martin/perl # cd ..
linux-70ce:/home/martin # cd peewee/
linux-70ce:/home/martin/peewee # sudo python setup.py install
running install
running bdist_egg
running egg_info
creating peewee.egg-info
writing peewee.egg-info/PKG-INFO
writing top-level names to peewee.egg-info/top_level.txt
writing dependency_links to peewee.egg-info/dependency_links.txt
writing manifest file 'peewee.egg-info/SOURCES.txt'
reading manifest file 'peewee.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'peewee.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-i686/egg
running install_lib
running build_py
creating build
creating build/lib
copying peewee.py -> build/lib
copying pwiz.py -> build/lib
creating build/lib/playhouse
copying playhouse/tests_shortcuts.py -> build/lib/playhouse
copying playhouse/read_slave.py -> build/lib/playhouse
copying playhouse/tests_kv.py -> build/lib/playhouse
copying playhouse/tests_pwiz.py -> build/lib/playhouse
copying playhouse/gfk.py -> build/lib/playhouse
copying playhouse/test_utils.py -> build/lib/playhouse
copying playhouse/apsw_ext.py -> build/lib/playhouse
copying playhouse/__init__.py -> build/lib/playhouse
copying playhouse/tests_pool.py -> build/lib/playhouse
copying playhouse/tests_apsw.py -> build/lib/playhouse
copying playhouse/kv.py -> build/lib/playhouse
copying playhouse/postgres_ext.py -> build/lib/playhouse
copying playhouse/signals.py -> build/lib/playhouse
copying playhouse/sqlcipher_ext.py -> build/lib/playhouse
copying playhouse/tests_gfk.py -> build/lib/playhouse
copying playhouse/tests_read_slave.py -> build/lib/playhouse
copying playhouse/pool.py -> build/lib/playhouse
copying playhouse/tests_csv_loader.py -> build/lib/playhouse
copying playhouse/tests_berkeleydb.py -> build/lib/playhouse
copying playhouse/djpeewee.py -> build/lib/playhouse
copying playhouse/tests_test_utils.py -> build/lib/playhouse
copying playhouse/migrate.py -> build/lib/playhouse
copying playhouse/csv_loader.py -> build/lib/playhouse
copying playhouse/tests_postgres.py -> build/lib/playhouse
copying playhouse/tests_migrate.py -> build/lib/playhouse
copying playhouse/tests_djpeewee.py -> build/lib/playhouse
copying playhouse/tests_sqlite_ext.py -> build/lib/playhouse
copying playhouse/berkeleydb.py -> build/lib/playhouse
copying playhouse/proxy.py -> build/lib/playhouse
copying playhouse/tests_signals.py -> build/lib/playhouse
copying playhouse/sqlite_ext.py -> build/lib/playhouse
copying playhouse/tests_sqlcipher_ext.py -> build/lib/playhouse
copying playhouse/shortcuts.py -> build/lib/playhouse
creating build/bdist.linux-i686
creating build/bdist.linux-i686/egg
copying build/lib/peewee.py -> build/bdist.linux-i686/egg
creating build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_shortcuts.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/read_slave.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_kv.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_pwiz.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/gfk.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/test_utils.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/apsw_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/__init__.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_pool.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_apsw.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/kv.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/postgres_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/signals.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/sqlcipher_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_gfk.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_read_slave.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/pool.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_csv_loader.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_berkeleydb.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/djpeewee.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_test_utils.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/migrate.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/csv_loader.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_postgres.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_migrate.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_djpeewee.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_sqlite_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/berkeleydb.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/proxy.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_signals.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/sqlite_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/tests_sqlcipher_ext.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/playhouse/shortcuts.py -> build/bdist.linux-i686/egg/playhouse
copying build/lib/pwiz.py -> build/bdist.linux-i686/egg
byte-compiling build/bdist.linux-i686/egg/peewee.py to peewee.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_shortcuts.py to tests_shortcuts.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/read_slave.py to read_slave.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_kv.py to tests_kv.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_pwiz.py to tests_pwiz.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/gfk.py to gfk.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/test_utils.py to test_utils.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/apsw_ext.py to apsw_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_pool.py to tests_pool.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_apsw.py to tests_apsw.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/kv.py to kv.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/postgres_ext.py to postgres_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/signals.py to signals.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/sqlcipher_ext.py to sqlcipher_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_gfk.py to tests_gfk.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_read_slave.py to tests_read_slave.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/pool.py to pool.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_csv_loader.py to tests_csv_loader.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_berkeleydb.py to tests_berkeleydb.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/djpeewee.py to djpeewee.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_test_utils.py to tests_test_utils.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/migrate.py to migrate.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/csv_loader.py to csv_loader.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_postgres.py to tests_postgres.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_migrate.py to tests_migrate.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_djpeewee.py to tests_djpeewee.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_sqlite_ext.py to tests_sqlite_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/berkeleydb.py to berkeleydb.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/proxy.py to proxy.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_signals.py to tests_signals.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/sqlite_ext.py to sqlite_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/tests_sqlcipher_ext.py to tests_sqlcipher_ext.pyc
byte-compiling build/bdist.linux-i686/egg/playhouse/shortcuts.py to shortcuts.pyc
byte-compiling build/bdist.linux-i686/egg/pwiz.py to pwiz.pyc
creating build/bdist.linux-i686/egg/EGG-INFO
installing scripts to build/bdist.linux-i686/egg/EGG-INFO/scripts
running install_scripts
running build_scripts
creating build/scripts-2.7
copying and adjusting pwiz.py -> build/scripts-2.7
changing mode of build/scripts-2.7/pwiz.py from 644 to 755
creating build/bdist.linux-i686/egg/EGG-INFO/scripts
copying build/scripts-2.7/pwiz.py -> build/bdist.linux-i686/egg/EGG-INFO/scripts
changing mode of build/bdist.linux-i686/egg/EGG-INFO/scripts/pwiz.py to 755
copying peewee.egg-info/PKG-INFO -> build/bdist.linux-i686/egg/EGG-INFO
copying peewee.egg-info/SOURCES.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying peewee.egg-info/dependency_links.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying peewee.egg-info/top_level.txt -> build/bdist.linux-i686/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/peewee-2.2.5-py2.7.egg' and adding 'build/bdist.linux-i686/egg' to it
removing 'build/bdist.linux-i686/egg' (and everything under it)
Processing peewee-2.2.5-py2.7.egg
Copying peewee-2.2.5-py2.7.egg to /usr/lib/python2.7/site-packages
Adding peewee 2.2.5 to easy-install.pth file
Installing pwiz.py script to /usr/bin

Installed /usr/lib/python2.7/site-packages/peewee-2.2.5-py2.7.egg
Processing dependencies for peewee==2.2.5
Finished processing dependencies for peewee==2.2.5
linux-70ce:/home/martin/peewee #

Latest update: after installing peewee correctly, I ran the script. Now look what happened.

import urllib
import urlparse
import re
import peewee
import json

db = MySQLDatabase('cpan', user='root',passwd='rimbaud')

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database


User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


data = json.load() #your json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

I got this error back:

Traceback (most recent call last):
  File "cpan5.py", line 10, in <module>
    db = MySQLDatabase('cpan', user='root',passwd='rimbaud')
NameError: name 'MySQLDatabase' is not defined
linux-70ce:/home/martin/perl #
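
The NameError follows from the import style: this version of the script uses `import peewee`, which binds only the module name, so the class would have to be written as `peewee.MySQLDatabase` (and likewise `peewee.Model`, `peewee.TextField`). The sketch below shows the difference between the two import forms. It uses the standard `json` module as a stand-in so that it runs without peewee installed, and it assumes Python 3:

```python
# Namespace produced by "import json": only the module name is bound.
plain_import = {}
exec("import json", plain_import)

# Namespace produced by "from json import *": the public names are bound directly.
star_import = {}
exec("from json import *", star_import)

print("json" in plain_import, "loads" in plain_import)  # True False
print("json" in star_import, "loads" in star_import)    # False True
```

The same rule applies to peewee: either keep `import peewee` and prefix every name with `peewee.`, or use `from peewee import *` as in the earlier version of the script.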

Latest update, as of July 14:

Assuming everything works now, I have it all set up. So far so good, but it fails at some point.

import urllib
import urlparse
import re
# import peewee
import json
from peewee import *



#from peewee import MySQLDatabase ('cpan', user='root',passwd='rimbaud')


db = MySQLDatabase('cpan', user='root',passwd='rimbaud')

class User(Model):
    name = TextField()
    cname = TextField()
    email = TextField()
    url = TextField()

    class Meta:
        database = db # this model uses the cpan database


User.create_table() #ensure table is created


url = "http://search.cpan.org/author/?W"
html = urllib.urlopen(url).read()
for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
    alk = urlparse.urljoin(url, lk)

    data = { 'url':alk, 'name':name, 'cname':capname }

    phtml = urllib.urlopen(alk).read()
    memail = re.search('<a href="mailto:(.*?)">', phtml)
    if memail:
        data['email'] = memail.group(1)


data = json.load('email') #your json data file here

for entry in data: #assuming your data is an array of JSON objects
    user = User.create(name=entry["name"], cname=entry["cname"],
        email=entry["email"], url=entry["url"])
    user.save()

I guess there has to be a data file: one that the script creates while parsing... right?
martin@linux-70ce:~/perl> python cpan_100.py
Traceback (most recent call last):
  File "cpan_100.py", line 47, in <module>
    data = json.load('email') #your json data file here
  File "/usr/lib/python2.7/json/__init__.py", line 286, in load
    return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'
martin@linux-70ce:~/perl>
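
The AttributeError comes from what `json.load` expects: a file-like object with a `.read()` method, not a string. `json.loads` is the variant that takes a string of JSON text. A small sketch of the difference, assuming Python 3 and a made-up sample string:

```python
import io
import json

# Made-up sample; any valid JSON text would do.
text = '[{"name": "miller", "url": "http://dom1"}]'

# json.loads parses a string that already holds JSON text.
from_string = json.loads(text)

# json.load reads from a file-like object (anything with a .read() method).
from_file = json.load(io.StringIO(text))

print(from_string == from_file)  # True

# Passing a plain string to json.load fails as in the traceback above.
try:
    json.load("email")
except AttributeError as exc:
    print("AttributeError:", exc)
```

Note that the scraping loop in the script never writes a file, so there is nothing for `json.load` to read unless the script saves its results first.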

I look forward to hearing from you.

Best answer

Assuming you want to use Python and peewee, I would do something like the following:

from peewee import *
import json

db = MySQLDatabase('jonhydb', user='john',passwd='megajonhy')

class User(Model):
    name = TextField()
    name2 = TextField()
    email_address = TextField()
    url = TextField()

    class Meta:
        database = db  # this model uses the jonhydb database

User.create_table()  # ensure table is created

# 'data.json' is a placeholder name for your JSON data file
with open('data.json') as f:
    data = json.load(f)

for entry in data:  # assuming your data is an array of JSON objects
    # User.create() builds the row and saves it, so no extra save() is needed
    User.create(name=entry["name"], name2=entry["name2"],
                email_address=entry["email-adress"], url=entry["url"])
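
One caveat: the sample records in the question are printed Python dictionaries with single quotes, which the `json` module would reject. If the data really is stored in that form, one line per record, `ast.literal_eval` could read it instead. The following is only a sketch under that assumption. The helper name `parse_records` is made up, and the field names follow the `User` model above:

```python
import ast

# Two sample lines copied from the question; a real run would read about 10,000.
lines = [
    "{'url': 'http://dom1', 'name': 'miller', 'name2': 'phil man', 'email-adress': 'waddehadde@hotmail.com'}",
    "{'url': 'http://dom2', 'name': 'jonboy', 'name2': 'Josef dude', 'email-adress': 'waddehadde@hotmail.com'}",
]


def parse_records(raw_lines):
    """Turn printed dict lines into rows keyed by the model's field names."""
    rows = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = ast.literal_eval(line)  # evaluates literals only, no arbitrary code
        rows.append({
            "name": record["name"],
            "name2": record["name2"],
            # Default to an empty string when a record has no address.
            "email_address": record.get("email-adress", ""),
            "url": record["url"],
        })
    return rows


rows = parse_records(lines)
print(len(rows))  # 2
```

Each resulting row could then be stored with `User.create(**row)`.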

For more on "php - Storing a Python dictionary of 10,000 lines in MySQL instead of printing it", see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/24635811/
