opencv - How can I remove the noise from a given image so that the OCR output is perfect?

Reposted. Author: 行者123. Updated: 2023-12-02 17:38:06

enter image description here

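I applied Otsu thresholding to this Bengali text image and ran OCR on it with tesseract, but the output is very poor. What preprocessing should I apply to remove the noise? I would also like to deskew the image, since it is slightly tilted.
My code is below.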

import tesserocr
from PIL import Image
import cv2
import codecs
image = cv2.imread("crop2.bmp", 0)
(thresh, bw_img) = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

img = Image.fromarray(bw_img)
text = tesserocr.image_to_text(img, lang='ben')
file = codecs.open("output_text", "w", "utf-8")
file.write(text)
file.close()

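Best answer

You can remove the noise by deleting small connected components, which may improve accuracy. You will also need to find the best value for the size threshold below which a component counts as noise.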

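import cv2
import numpy as np

# Load as grayscale and invert-threshold so text and noise become white (255).
img = cv2.imread(r'D:\Image\st5.png', 0)
ret, bw = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY_INV)

# Label the white components. connectivity and ltype are passed by keyword
# because the second positional parameter of the Python binding is `labels`.
connectivity = 4
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(
    bw, connectivity=connectivity, ltype=cv2.CV_32S)

# Drop label 0 (the background); the last column of stats is the area.
sizes = stats[1:, -1]
nb_components = nb_components - 1
min_size = 50  # threshold value for small noisy components
img2 = np.zeros(output.shape, np.uint8)

# Keep only the components that are at least min_size pixels.
for i in range(0, nb_components):
    if sizes[i] >= min_size:
        img2[output == i + 1] = 255

# Invert back to black text on a white background.
res = cv2.bitwise_not(img2)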

Image after noise removal:

enter image description here

Regarding "opencv - How can I remove the noise from a given image so that the OCR output is perfect?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48177052/
