
python - Am I using LMDB incorrectly? It says the environment mapsize limit was reached after 0 insertions

Reposted · Author: 搜寻专家 · Updated: 2023-10-30 19:57:31

I am trying to create an LMDB database for my Caffe machine-learning project, but LMDB throws an error on the very first attempt to insert a data point, saying the environment mapsize limit has been reached.

Here is the code that tries to populate the database:

import numpy as np
from PIL import Image
import os
import lmdb
import random
# my data structure for holding image/label pairs
from serialization import DataPoint

class LoadImages(object):
    def __init__(self, image_data_path):
        self.image_data_path = image_data_path
        self.dirlist = os.listdir(image_data_path)

        # find the number of images that are to be read from disk
        # in this case there are 370 images.
        num = len(self.dirlist)

        # shuffle the list of image files so that they are read in a random order
        random.shuffle(self.dirlist)

        map_size = num*10

        j=0

        # load images from disk
        for image_filename in os.listdir(image_data_path):
            # check that every image belongs to either category _D_ or _P_
            assert (image_filename[:3] == '_D_' or image_filename[:3] == '_P_'), "ERROR: unknown category"

        # set up the LMDB database object
        env = lmdb.open('image_lmdb', map_size=map_size)
        with env.begin(write=True) as txn:

            # iterate over (shuffled) list of image files
            for image_filename in self.dirlist:
                print "Loading " + str(j) + "th image from disk - percentage complete: " + str((float(j)/num) * 100) + " %"

                # open the image
                with open(str(image_data_path + "/" + image_filename), 'rb') as f:
                    image = Image.open(f)
                    npimage = np.asarray(image, dtype=np.float64)

                # discard alpha channel, if necessary
                if npimage.shape[2] == 4:
                    npimage = npimage[:,:,:3]
                    print image_filename + " had its alpha channel removed."

                # get category
                if image_filename[:3] == '_D_':
                    category = 0
                elif image_filename[:3] == '_P_':
                    category = 1

                # wrap image data and label into a serializable data structure
                datapoint = DataPoint(npimage, category)
                serialized_datapoint = datapoint.serialize()

                # a database key
                str_id = '{:08}'.format(j)

                # put the data point in the LMDB
                txn.put(str_id.encode('ascii'), serialized_datapoint)

                j+=1
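To see the scale of the problem: with `map_size = num * 10`, a 370-image dataset gets a map of only 3700 bytes, while a single serialized float64 image is already far larger. A rough sketch of the arithmetic (the 256×256 RGB resolution is an assumption for illustration, not stated in the question):

```python
# Rough size arithmetic; the image resolution here is hypothetical.
num_images = 370
height, width, channels = 256, 256, 3   # assumed image shape
bytes_per_pixel = 8                     # float64, as in the question's code

bytes_per_image = height * width * channels * bytes_per_pixel
total_needed = num_images * (bytes_per_image + 1)  # +1 for the label byte

map_size_in_question = num_images * 10  # what the question's code passes

print(bytes_per_image)       # 1572864 bytes (~1.5 MiB) per image
print(map_size_in_question)  # 3700 bytes: full before the first record fits
```

Even without exact dimensions, the map is orders of magnitude too small, which is why the very first `txn.put` fails.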

I also wrote a small data structure, used above, that holds an image and its label and serializes them:

import numpy as np

class DataPoint(object):
    def __init__(self, image=None, label=None, dtype=np.float64):
        self.image = image
        if self.image is not None:
            self.image = self.image.astype(dtype)
        self.label = label

    def serialize(self):
        image_string = self.image.tobytes()
        label_string = chr(self.label)
        datum_string = label_string + image_string
        return datum_string

    def deserialize(self, string):
        image_string = string[1:]
        label_string = string[:1]
        image = np.fromstring(image_string, dtype=np.float64)
        label = ord(label_string)
        return DataPoint(image, label)
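A quick way to sanity-check this framing (one label byte followed by the raw float64 bytes) is a round trip. This sketch uses the stdlib `array` module instead of NumPy so it stands alone; note that, as in `deserialize` above, only a flat sequence of values comes back, because the image shape is not stored in the byte string:

```python
from array import array

def serialize(pixels, label):
    # one label byte followed by the raw float64 bytes, mirroring DataPoint
    return bytes([label]) + array('d', pixels).tobytes()

def deserialize(blob):
    # first byte is the label; the rest is a flat run of float64 values
    label = blob[0]
    pixels = array('d')
    pixels.frombytes(blob[1:])
    return list(pixels), label

blob = serialize([0.0, 0.5, 1.0], 1)
pixels, label = deserialize(blob)
print(label)   # 1
print(pixels)  # [0.0, 0.5, 1.0]
```

Each record is therefore `1 + 8 * num_pixels` bytes, which is the figure that matters when sizing the LMDB map.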

Here is the error:

/usr/bin/python2.7 /home/hal9000/PycharmProjects/Caffe_Experiments_0.6/gather_images.py
Loading 0th image from disk - percentage complete: 0.0 %
Traceback (most recent call last):
  File "/home/hal9000/PycharmProjects/Caffe_Experiments_0.6/gather_images.py", line 69, in <module>
    g = LoadImages(path)
  File "/home/hal9000/PycharmProjects/Caffe_Experiments_0.6/gather_images.py", line 62, in __init__
    txn.put(str_id.encode('ascii'), serialized_datapoint)
lmdb.MapFullError: mdb_put: MDB_MAP_FULL: Environment mapsize limit reached

Best Answer

The map size is the maximum size of the entire database in bytes, including metadata. It looks like you passed in the expected number of records instead.

You need to increase this number.
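A common way to pick `map_size` is to estimate the serialized size of the dataset in bytes and add generous headroom; on 64-bit Linux the map is sparse, so oversizing costs little disk space up front. A sketch, where the per-record size is an assumption for illustration:

```python
num_records = 370
bytes_per_record = 1_572_865     # hypothetical: one label byte + a float64 image

estimated = num_records * bytes_per_record
map_size = estimated * 2         # 2x headroom for keys, metadata, B-tree overhead

print(map_size)  # ~1.16 GB

# The environment would then be opened as (not executed here):
#   env = lmdb.open('image_lmdb', map_size=map_size)
```

The key point is that `map_size` is measured in bytes of total database capacity, not in number of records.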

Regarding "python - Am I using LMDB incorrectly? It says the environment mapsize limit was reached after 0 insertions", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37642885/
