gpt4 book ai didi

python - 如何加速 SQLAlchemy 查询?

转载 作者:太空狗 更新时间:2023-10-29 21:12:11 28 4
gpt4 key购买 nike

我有一个超过 1000 万行的表。大约有 50 多列。该表存储传感器数据/参数。假设我需要查询一整天或 86,400 秒的数据。完成此查询大约需要 20 秒或更长时间。

我在几个列上添加了单独的索引,例如 recordTimestamp(在捕获数据时存储)、deviceId(传感器的标识)、positionValid(GPS 地理定位是否有效)。然后我添加了一个包含所有三列的复合索引。

下面是我的查询:

t1 = time.time()
conn = engine.connect()
select_statement = select([Datatable]).where(and_(
Datatable.recordTimestamp >= start_date,
Datatable.recordTimestamp <= end_date,
Datatable.deviceId == device_id,
Datatable.positionValid != None,
Datatable.recordTimestamp % query_interval == 0))
lol_data = conn.execute(select_statement).fetchall()
conn.close()
t2 = time.time()
time_taken = t2 - t1
print('Select: ' + time_taken)

下面是我的 EXPLAIN ANALYZE 语句:

EXPLAIN ANALYZE SELECT datatable.id, datatable."createdAt", datatable."analogInput01", datatable."analogInput02", datatable."analogInput03", datatable."analogInput04", datatable."analogInput05", datatable."analogInput06", datatable."analogInput07", datatable."canEngineRpm", datatable."canEngineTemperature", datatable."canFuelConsumedLiters", datatable."canFuelLevel", datatable."canVehicleMileage", datatable."deviceId", datatable."deviceTemperature", datatable."deviceInternalVoltage", datatable."deviceExternalVoltage", datatable."deviceAntennaCut", datatable."deviceEnum", datatable."deviceVehicleMileage", datatable."deviceSimSignal", datatable."deviceSimStatus", datatable."iButton01", datatable."iButton02", datatable."recordSequence", datatable."recordTimestamp", datatable."accelerationAbsolute", datatable."accelerationBrake", datatable."accelerationBump", datatable."accelerationTurn", datatable."accelerationX", datatable."accelerationY", datatable."accelerationZ", datatable."positionAltitude", datatable."positionDirection", datatable."positionSatellites", datatable."positionSpeed", datatable."positionLatitude", datatable."positionLongitude", datatable."positionHdop", datatable."positionMovement", datatable."positionValid", datatable."positionEngine" FROM datatable WHERE datatable."recordTimestamp" >= 1519744521 AND datatable."recordTimestamp" <= 1519745181 AND datatable."deviceId" = '864495033990901' AND datatable."positionValid" IS NOT NULL AND datatable."recordTimestamp" % 1 = 0;

下面是 SELECT 的 EXPLAIN ANALYZE 的结果:

Index Scan using "ix_dataTable_recordTimestamp" on dataTable (cost=0.44..599.35 rows=5 width=301) (actual time=0.070..10.487 rows=661 loops=1)
Index Cond: (("recordTimestamp" >= 1519744521) AND ("recordTimestamp" <= 1519745181))
Filter: (("positionValid" IS NOT NULL) AND (("deviceId")::text = '864495033990901'::text) AND (("recordTimestamp" % 1) = 0))
Rows Removed by Filter: 6970
Planning time: 0.347 ms
Execution time: 10.658 ms

而下面是用Python计算时间的结果:

Select:  47.98712515830994 
JSON: 0.19731807708740234

下面是我的代码分析:

10302 function calls (10235 primitive calls) in 12.612 seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:882(execute)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/sql/elements.py:267(_execute_on_connection)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1016(_execute_clauseelement)
1 0.000 0.000 12.592 12.592 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1111(_execute_context)
1 0.000 0.000 12.590 12.590 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py:506(do_execute)
1 12.590 12.590 12.590 12.590 {method 'execute' of 'psycopg2.extensions.cursor' objects}
1 0.000 0.000 0.017 0.017 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/result.py:1113(fetchall)
1 0.000 0.000 0.017 0.017 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/result.py:1080(_fetchall_impl)
1 0.008 0.008 0.017 0.017 {method 'fetchall' of 'psycopg2.extensions.cursor' objects}

最佳答案

尝试使用内置的 Postgres COPY或者如果你真的需要在 Python 中检索结果(例如你不能通过 COPY 直接写入磁盘)你可以通过 psycopgs 使用 COPY copy_expert功能:

cur = conn.cursor()

outputquery = "COPY ({0}) TO STDOUT WITH CSV HEADER".format(query)

with open('resultsfile', 'w') as f:
cur.copy_expert(outputquery, f)

conn.close()

这应该避免一起序列化。

关于python - 如何加速 SQLAlchemy 查询?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/49097346/

28 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com