
python - Fastest way to keep data in memory with Redis in Python


I need to save some large arrays once and load them many times in a Flask application, using Python 3. I originally stored these arrays on disk with the json library. To speed things up, I switched to Redis on the same machine, storing each array as a JSON string. I am wondering why I see no improvement (it actually takes longer on the server I use) even though Redis keeps the data in RAM. I suspect the JSON serialization is not optimized, but I don't know how to speed it up:

import json
import redis
import os
import time

current_folder = os.path.dirname(os.path.abspath(__file__))
file_path = os.path.join(current_folder, "my_file")

my_array = [1]*10000000

with open(file_path, 'w') as outfile:
    json.dump(my_array, outfile)

start_time = time.time()
with open(file_path, 'r') as infile:
    my_array = json.load(infile)
print("JSON from disk : ", time.time() - start_time)

r = redis.Redis()
my_array_as_string = json.dumps(my_array)
r.set("my_array_as_string", my_array_as_string)

start_time = time.time()
my_array_as_string = r.get("my_array_as_string")
print("Fetch from Redis:", time.time() - start_time)

start_time = time.time()
my_array = json.loads(my_array_as_string)
print("Parse JSON :", time.time() - start_time)

Results:

JSON from disk  : 1.075700044631958
Fetch from Redis: 0.078125
Parse JSON : 1.0247752666473389

EDIT: It seems that fetching from Redis is actually fast, but the JSON parsing is slow. Is there a way to fetch the array directly from Redis without the JSON serialization step? That's what we do with pyMySQL, and it's fast.

Best Answer

Update (Nov 8, 2019): re-ran the same tests on Python 3.6.

Results:

Dump time (slowest first): JSON > msgpack > pickle > marshal
Load time (slowest first): JSON > pickle > msgpack > marshal
Size (largest first): marshal > JSON > pickle > msgpack

+---------+---------------+---------------+----------+
| package | dump time (s) | load time (s) | size (B) |
+---------+---------------+---------------+----------+
| json    | 0.00134       | 0.00079       | 30049    |
| pickle  | 0.00023       | 0.00019       | 20059    |
| msgpack | 0.00031       | 0.00012       | 10036    |
| marshal | 0.00022       | 0.00010       | 50038    |
+---------+---------------+---------------+----------+

I tried pickle vs json vs msgpack vs marshal.

Pickle was much slower than JSON in these original (Python 2) runs. msgpack is at least 4x faster than JSON. MsgPack looks like the best option you have.

EDIT: Also tried marshal. Marshal is faster than JSON, but slower than msgpack.

Time taken (slowest first): Pickle > JSON > Marshal > MsgPack
Space used (largest first): Marshal > Pickle > JSON > MsgPack
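
Given that ranking, the question's Redis round-trip can drop json entirely and use msgpack.packb / msgpack.unpackb on the same key. Here is a minimal sketch of that swap (the key name is illustrative, and msgpack-python >= 1.0 is assumed so the binary defaults apply):

import time

import msgpack
import redis

r = redis.Redis()
my_array = [1] * 10000000

# packb returns bytes, which redis-py stores verbatim -- no JSON string needed.
r.set("my_array_as_msgpack", msgpack.packb(my_array))

start_time = time.time()
packed = r.get("my_array_as_msgpack")  # raw bytes back from Redis
my_array = msgpack.unpackb(packed)     # decoded back into a Python list
print("Fetch + unpack from Redis:", time.time() - start_time)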

import time
import json
import pickle
import msgpack
import marshal
import sys

# Benchmark: serialize and deserialize a 10,000-element list with each library.
array = [1] * 10000

start_time = time.time()
json_array = json.dumps(array)
print("JSON dumps: ", time.time() - start_time)
print("JSON size: ", sys.getsizeof(json_array))
start_time = time.time()
_ = json.loads(json_array)
print("JSON loads: ", time.time() - start_time)

# --------------

start_time = time.time()
pickled_object = pickle.dumps(array)
print("Pickle dumps: ", time.time() - start_time)
print("Pickle size: ", sys.getsizeof(pickled_object))
start_time = time.time()
_ = pickle.loads(pickled_object)
print("Pickle loads: ", time.time() - start_time)

# --------------

start_time = time.time()
package = msgpack.dumps(array)
print("Msg Pack dumps: ", time.time() - start_time)
print("MsgPack size: ", sys.getsizeof(package))
start_time = time.time()
_ = msgpack.loads(package)
print("Msg Pack loads: ", time.time() - start_time)

# --------------

start_time = time.time()
m_package = marshal.dumps(array)
print("Marshal dumps: ", time.time() - start_time)
print("Marshal size: ", sys.getsizeof(m_package))
start_time = time.time()
_ = marshal.loads(m_package)
print("Marshal loads: ", time.time() - start_time)

Results:

JSON dumps:  0.000760078430176
JSON size: 30037
JSON loads: 0.000488042831421
Pickle dumps: 0.0108790397644
Pickle size: 40043
Pickle loads: 0.0100247859955
Msg Pack dumps: 0.000202894210815
MsgPack size: 10040
Msg Pack loads: 7.58171081543e-05
Marshal dumps: 0.000118017196655
Marshal size: 50042
Marshal loads: 0.000118970870972
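
A note on the pickle numbers: this original run was on Python 2, where the plain pickle module is pure Python and defaults to protocol 0 (ASCII), which is why it trails JSON so badly here; Python 3 ships the C implementation and a binary default protocol, which is why pickle is competitive in the 2019 update above. A small sketch of pinning the protocol explicitly, which should narrow the gap on either version:

import pickle

array = [1] * 10000

# The default protocol on Python 2 is 0 (ASCII and slow); asking for the
# highest available binary protocol typically makes dumps/loads much faster
# and the payload smaller.
packed = pickle.dumps(array, protocol=pickle.HIGHEST_PROTOCOL)
assert pickle.loads(packed) == array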

For python - Fastest way to keep data in memory with Redis in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52298118/
