
java - Optical Braille recognition using OpenCV

Reposted. Author: 塔克拉玛干. Last updated: 2023-11-02 20:23:00

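I am trying to recognize the Braille characters in a document, with the goal of converting a Braille document to plain text. I am using OpenCV with Java for the image processing.

First, I imported the image of the Braille document: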

Image of the original Braille document

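Then I did some image processing to binarize the original image. From what I have read, the important steps are:

  • Convert the image to grayscale
  • Reduce the noise
  • Enhance the edge contrast
  • Binarize the image

This is the code I used: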

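import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// Class name chosen for illustration; the original snippet showed only main()
public class BrailleBinarization {

    public static void main(String[] args) {
        // The OpenCV native library must be loaded before any OpenCV call
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat imgGrayscale = new Mat();

        // Flag 1 (IMREAD_COLOR) loads the image as 3-channel BGR
        Mat image = Imgcodecs.imread("C:/Users/original_braille.jpg", 1);

        // Convert to grayscale
        Imgproc.cvtColor(image, imgGrayscale, Imgproc.COLOR_BGR2GRAY);

        // Reduce noise, then binarize with a local (adaptive) threshold
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.adaptiveThreshold(imgGrayscale, imgGrayscale, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY_INV, 5, 4);

        // Remove speckle noise and re-binarize with Otsu's threshold
        Imgproc.medianBlur(imgGrayscale, imgGrayscale, 3);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_OTSU);

        // Smooth the dot outlines and binarize once more
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_OTSU);

        Imgcodecs.imwrite("C:/Users/Jean-Baptiste/Desktop/Reconnaissance_de_formes/result.jpg", imgGrayscale);
    }
}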

This step gives me the following result:

Image Binarization

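I know the quality of this image could be improved to get better results, but I have no experience with the different image processing techniques. Can I improve the quality of my filters?

After that, I want to segment the image to detect the individual characters of the document. I want to separate the characters so that I can convert them to text.

For example, I drew the separation lines on the document by hand: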

Separation lines

However, I have not found a solution for this step. Is it possible to do the same thing with OpenCV?
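For the final conversion to text, a minimal sketch of decoding one isolated cell may help frame what the segmentation has to deliver. It assumes a standard 6-dot cell already reduced to a 3x2 grid of 0/1 flags, which is not something the code in this post produces. The names `DOT_BITS`, `LETTERS`, `cell_to_mask` and `decode` are hypothetical, and the letter table is deliberately partial (a to j only). The bit values follow the Unicode Braille Patterns block, where a pattern's code point is U+2800 plus one bit per raised dot.

```python
# Dot numbering in a 6-dot cell: 1-2-3 down the left column, 4-5-6 down the right.
# Keys are (row, column) positions in the 3x2 grid, values are Unicode bit flags.
DOT_BITS = {
    (0, 0): 0x01, (1, 0): 0x02, (2, 0): 0x04,
    (0, 1): 0x08, (1, 1): 0x10, (2, 1): 0x20,
}

# Partial table for illustration only: letters a to j
LETTERS = {
    0x01: "a", 0x03: "b", 0x09: "c", 0x19: "d", 0x11: "e",
    0x0B: "f", 0x1B: "g", 0x13: "h", 0x0A: "i", 0x1A: "j",
}


def cell_to_mask(cell):
    """Turn a 3x2 grid of 0/1 flags (1 = raised dot) into a dot bitmask."""
    mask = 0
    for (row, col), bit in DOT_BITS.items():
        if cell[row][col]:
            mask |= bit
    return mask


def decode(cell):
    """Return (Unicode Braille character, letter or '?' if not in the table)."""
    mask = cell_to_mask(cell)
    return chr(0x2800 + mask), LETTERS.get(mask, "?")


# Dots 1, 4 and 5 raised
print(decode([[1, 1], [0, 1], [0, 0]]))
```

Mapping to the Unicode pattern first keeps the image analysis separate from the language-specific table, which would have to be completed for the Braille code actually used in the document.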

Best answer

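Here is a small script that finds the lines in your image. It is in Python, since I do not have the Java version of OpenCV installed, but I think you will be able to follow the idea of the algorithm anyway.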

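Finding the vertical lines is not as easy, because the spacing between dots depends on which letters follow each other. You could perhaps try a template matching algorithm with some common letters. Given that at this point you know the height of the letters, it should not be too hard.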
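As a rough illustration of the template matching idea, here is a naive version on plain Python lists: a small template is slid across one band of text and the position with the fewest differing pixels wins. The data is a made-up toy example and `match_template` is a hypothetical helper, not an OpenCV function. On real images, `cv2.matchTemplate` would be the usual choice.

```python
def match_template(band, template):
    """Slide `template` across `band` and return (best_x, mismatches).

    Both arguments are lists of rows of 0/1 pixels with the same height.
    The score is the number of differing pixels (sum of absolute differences),
    so lower is better and 0 is a perfect match.
    """
    height = len(template)
    t_width = len(template[0])
    b_width = len(band[0])
    best_x, best_score = None, None
    for x in range(b_width - t_width + 1):
        score = sum(
            band[y][x + dx] != template[y][dx]
            for y in range(height)
            for dx in range(t_width)
        )
        # Keep the first position with the lowest score
        if best_score is None or score < best_score:
            best_x, best_score = x, score
    return best_x, best_score


# Toy band, three pixels high, with dots near both ends
band = [
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
# Template for the pattern we are looking for
template = [
    [1, 1],
    [0, 1],
    [0, 0],
]
print(match_template(band, template))  # (4, 0)
```

Because the template height equals the band height, the search is one-dimensional, which is the benefit of knowing the letter height beforehand.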

Of course, this whole approach assumes that the document is not rotated.

import numpy as np
import cv2

# This is just a transposition of your code into Python
img = cv2.imread('L1ZzA.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(3,3),0)
thres = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,5,4)
blur2 = cv2.medianBlur(thres,3)
ret2,th2 = cv2.threshold(blur2,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
blur3 = cv2.GaussianBlur(th2,(3,3),0)
ret3,th3 = cv2.threshold(blur3,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

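# Find connected components and extract the mean height and width.
# connectivity and ltype are passed by keyword: in the Python binding the
# 2nd and 3rd positional arguments are the labels and stats outputs.
# Connectivity must be 4 or 8 for 2D images, and ltype CV_32S or CV_16U.
output = cv2.connectedComponentsWithStats(255-th3, connectivity=8, ltype=cv2.CV_32S)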
mean_h = np.mean(output[2][:,cv2.CC_STAT_HEIGHT])
mean_w = np.mean(output[2][:,cv2.CC_STAT_WIDTH])

# Find empty rows: the summed intensity of the inverted row
# (255 per dot pixel) must be below mean_h/2
empty_rows = []
for i in range(th3.shape[0]):
    if np.sum(255-th3[i,:]) < mean_h/2.0:
        empty_rows.append(i)

# Group rows by labels
d = np.ediff1d(empty_rows, to_begin=1)

good_rows = []
good_labels = []
label = 0

# 1: assign labels to each row
# based on whether they are following each other or not (i.e. diff >1)
for i in range(1,len(empty_rows)-1):
    if d[i+1] == 1:
        good_labels.append(label)
        good_rows.append(empty_rows[i])

    elif d[i] > 1 and d[i+1] > 1:
        label = good_labels[len(good_labels)-1] + 1

# 2: find the mean row value associated with each label, and color that line in green in the original image
for i in range(label):
    frow = np.mean(np.asarray(good_rows)[np.where(np.asarray(good_labels) == i)])
    img[int(frow),:,1] = 255

# Display the image with the green rows
cv2.imshow('test',img)
cv2.waitKey(0)
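The row-finding idea in the script can be illustrated in isolation. The sketch below is a simplified variant, not the script's exact logic: it treats a row as empty only when it contains no dot pixel at all, groups consecutive empty rows into runs, and returns the middle of each run. It uses plain Python lists on a made-up toy image, and `separator_rows` is a hypothetical name. With NumPy, the profile would be `binary.sum(axis=1)`.

```python
def separator_rows(binary):
    """Return one separator row index per run of consecutive empty rows.

    `binary` is a list of rows; each pixel is 1 for a dot and 0 for background.
    """
    # Horizontal projection profile: number of dot pixels in each row
    profile = [sum(row) for row in binary]
    empty = [y for y, count in enumerate(profile) if count == 0]

    # Group consecutive empty rows into runs
    runs = []
    for y in empty:
        if runs and y == runs[-1][-1] + 1:
            runs[-1].append(y)
        else:
            runs.append([y])

    # Use the middle of each run as the separator line
    return [(run[0] + run[-1]) // 2 for run in runs]


# Toy image: two bands of dots separated by blank rows
toy = [
    [0, 0, 0, 0, 0, 0],  # row 0
    [0, 1, 0, 0, 1, 0],  # row 1
    [0, 1, 0, 0, 0, 0],  # row 2
    [0, 0, 0, 0, 0, 0],  # row 3
    [0, 0, 0, 0, 0, 0],  # row 4
    [0, 0, 0, 0, 0, 0],  # row 5
    [0, 0, 1, 0, 1, 0],  # row 6
    [0, 0, 0, 0, 1, 0],  # row 7
    [0, 0, 0, 0, 0, 0],  # row 8
    [0, 0, 0, 0, 0, 0],  # row 9
]
print(separator_rows(toy))  # [0, 4, 8]
```

On a real scan, blank rows also appear between the dot rows inside a single cell, so the lengths of the runs would likely be needed to tell those gaps apart from the gaps between lines of text.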

Regarding "java - Optical Braille recognition using OpenCV", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50683212/
