
python - Improving pytesseract text recognition from images

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:39:49

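I am trying to read captchas using the pytesseract module. It returns the correct text most of the time, but not always.

Here is the code that reads the image, manipulates it, and extracts the text from it.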

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    # "erase" the small white points in the resulting mask
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # resize the image
    img = cv2.resize(mask, (0, 0), fx=3, fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

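For the image manipulation, I got help from this Stack Overflow post.

Here is the original captcha image:

[image: original captcha]

This is the image produced after processing:

[image: captcha after processing]

However, using pytesseract, I get the text: AX#7rL

Can anyone guide me on how to improve the success rate to 100%?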

Best answer

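Since the resulting image has tiny holes in it, a morphological transformation, specifically cv2.MORPH_CLOSE, can close the holes and smooth the image here.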

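Threshold to obtain a binary image (black and white)

[image: thresholded binary image]

Perform morphological operations to close small holes in the foreground

[image: image after closing]

Invert the image to get the result

[image: inverted result]

4X#7rL

Applying cv2.GaussianBlur() before passing the image to Tesseract may also help.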

import cv2
import pytesseract

# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()

Regarding python - Improving pytesseract text recognition from images, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57210342/
