gpt4 book ai didi

python - 使用 sqlalchemy 将 csv 文件加载到数据库中

转载 作者:IT老高 更新时间:2023-10-28 22:16:34 28 4
gpt4 key购买 nike

我想将 csv 文件加载到数据库中

最佳答案

由于 SQLAlchemy 的强大功能,我也在一个项目中使用它。它的力量来自于与数据库“对话”的面向对象的方式,而不是硬编码难以管理的 SQL 语句。更不用说,它也快了很多。

直截了本地回答你的问题,是的!使用 SQLAlchemy 将数据从 CSV 存储到数据库中是小菜一碟。这是一个完整的工作示例(我使用了 SQLAlchemy 1.0.6 和 Python 2.7.6):

from numpy import genfromtxt
from time import time
from datetime import datetime
from sqlalchemy import Column, Integer, Float, Date
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

def Load_Data(file_name):
data = genfromtxt(file_name, delimiter=',', skip_header=1, converters={0: lambda s: str(s)})
return data.tolist()

Base = declarative_base()

class Price_History(Base):
#Tell SQLAlchemy what the table name is and if there's any table-specific arguments it should know about
__tablename__ = 'Price_History'
__table_args__ = {'sqlite_autoincrement': True}
#tell SQLAlchemy the name of column and its attributes:
id = Column(Integer, primary_key=True, nullable=False)
date = Column(Date)
opn = Column(Float)
hi = Column(Float)
lo = Column(Float)
close = Column(Float)
vol = Column(Float)

if __name__ == "__main__":
t = time()

#Create the database
engine = create_engine('sqlite:///csv_test.db')
Base.metadata.create_all(engine)

#Create the session
session = sessionmaker()
session.configure(bind=engine)
s = session()

try:
file_name = "t.csv" #sample CSV file used: http://www.google.com/finance/historical?q=NYSE%3AT&ei=W4ikVam8LYWjmAGjhoHACw&output=csv
data = Load_Data(file_name)

for i in data:
record = Price_History(**{
'date' : datetime.strptime(i[0], '%d-%b-%y').date(),
'opn' : i[1],
'hi' : i[2],
'lo' : i[3],
'close' : i[4],
'vol' : i[5]
})
s.add(record) #Add all the records

s.commit() #Attempt to commit all the records
except:
s.rollback() #Rollback the changes on error
finally:
s.close() #Close the connection
print "Time elapsed: " + str(time() - t) + " s." #0.091s

(注意:这不一定是执行此操作的“最佳”方式,但我认为这种格式对于初学者来说非常易读;它也非常快:插入 251 条记录只需 0.091 秒!)

我认为,如果您逐行浏览它,您会发现使用起来是多么轻而易举。注意缺少 SQL 语句——万岁!我还冒昧地使用 numpy 在两行中加载 CSV 内容,但如果您愿意,也可以不使用它。

如果您想与传统的做法进行比较,这里有一个完整的示例供引用:

import sqlite3
import time
from numpy import genfromtxt

def dict_factory(cursor, row):
d = {}
for idx, col in enumerate(cursor.description):
d[col[0]] = row[idx]
return d


def Create_DB(db):
#Create DB and format it as needed
with sqlite3.connect(db) as conn:
conn.row_factory = dict_factory
conn.text_factory = str

cursor = conn.cursor()

cursor.execute("CREATE TABLE [Price_History] ([id] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE, [date] DATE, [opn] FLOAT, [hi] FLOAT, [lo] FLOAT, [close] FLOAT, [vol] INTEGER);")


def Add_Record(db, data):
#Insert record into table
with sqlite3.connect(db) as conn:
conn.row_factory = dict_factory
conn.text_factory = str

cursor = conn.cursor()

cursor.execute("INSERT INTO Price_History({cols}) VALUES({vals});".format(cols = str(data.keys()).strip('[]'),
vals=str([data[i] for i in data]).strip('[]')
))


def Load_Data(file_name):
data = genfromtxt(file_name, delimiter=',', skiprows=1, converters={0: lambda s: str(s)})
return data.tolist()


if __name__ == "__main__":
t = time.time()

db = 'csv_test_sql.db' #Database filename
file_name = "t.csv" #sample CSV file used: http://www.google.com/finance/historical?q=NYSE%3AT&ei=W4ikVam8LYWjmAGjhoHACw&output=csv

data = Load_Data(file_name) #Get data from CSV

Create_DB(db) #Create DB

#For every record, format and insert to table
for i in data:
record = {
'date' : i[0],
'opn' : i[1],
'hi' : i[2],
'lo' : i[3],
'close' : i[4],
'vol' : i[5]
}
Add_Record(db, record)

print "Time elapsed: " + str(time.time() - t) + " s." #3.604s

(注意:即使是“旧”方式,这绝不是最好的方式,但它非常易读,并且是 SQLAlchemy 方式与“旧”方式的“一对一”翻译方式。)

注意SQL语句:一个是创建表,另一个是插入记录。另外,请注意,维护长 SQL 字符串与简单的类属性添加相比要麻烦一些。到目前为止喜欢 SQLAlchemy?

当然,至于您的外键查询。 SQLAlchemy 也有能力做到这一点。下面是一个外键赋值的类属性示例(假设 ForeignKey 类也已从 sqlalchemy 模块导入):

class Asset_Analysis(Base):
#Tell SQLAlchemy what the table name is and if there's any table-specific arguments it should know about
__tablename__ = 'Asset_Analysis'
__table_args__ = {'sqlite_autoincrement': True}
#tell SQLAlchemy the name of column and its attributes:
id = Column(Integer, primary_key=True, nullable=False)
fid = Column(Integer, ForeignKey('Price_History.id'))

将“fid”列作为 Price_History 的 id 列的外键。

希望有帮助!

关于python - 使用 sqlalchemy 将 csv 文件加载到数据库中,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31394998/

28 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com