
python - Detect initial/sketch on text pages

Reposted. Author: 行者123. Updated: 2023-12-04 00:50:14

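I want to get the coordinates of the box around the initial ("H") on the following page (and of similar boxes around other initials, so OpenCV template matching is not an option):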

enter image description here

Following this tutorial, I tried to solve the problem with OpenCV contours:

import cv2
import matplotlib.pyplot as plt

page = "image.jpg"

# read the image
image = cv2.imread(page)

# convert to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# create a binary thresholded image
_, binary = cv2.threshold(gray, 0,150,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
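# draw all contours (contourIdx=-1 draws every contour, an index such as 3 draws only that one)
image = cv2.drawContours(image, contours, -1, (0, 255, 0), 2)
# put the annotated image on the current figure so savefig has something to write
plt.imshow(image)
plt.savefig("result.png")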

The result is, of course, not what I want:

enter image description here

Does anyone know a workable algorithm (and possibly an implementation of it) that would give a simple solution for my task?

Best answer

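You can find the target region by filtering contours. There are at least two filter criteria you can use. One is to filter by area: discard contours that are too small or too large until only the contour you are looking for remains. The other is to compute the extent of each contour. The extent is the ratio of the contour's area to the area of its bounding rectangle. You are looking for a square-like contour, so its extent should be close to 1.0.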

Let's look at the code:

# imports:
import cv2
import numpy as np

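# Placeholder image location (not given in the answer); adjust to your own file:
path = ""
fileName = "image.jpg"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR (OpenCV's default channel order) to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)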

# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

The first part of the code gives you a binary image that you can use as a mask for computing contours.

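Now, let's filter the contours. Let's start with the area approach. You need to define a range between a minimum area and a maximum area and filter out everything that falls outside it. I heuristically determined an area range from 30000 to 150000 pixels: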

# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)
    print("Contour Area: "+str(currentArea))

    # Set an area range:
    minArea = 30000
    maxArea = 150000

    if minArea < currentArea < maxArea:

        # Get the contour's bounding rectangle:
        boundRect = cv2.boundingRect(c)

        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

# Show the result once all matching rectangles are drawn:
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)

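Once a contour passes the area filter, you can compute its bounding rectangle with cv2.boundingRect. This gives you the x, y coordinates of the rectangle's top-left corner along with its width and height. After that, just draw the rectangle on the deep copy of the original input.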

Now, let's look at the second option, which uses the contour's extent. The for loop is modified as follows:

# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)

    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(c)

    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Calculate extent:
    extent = float(currentArea)/(rectWidth *rectHeight)
    print("Extent: " + str(extent))

    # Set the extent filter, look for an extent close to 1.0:
    delta = abs(1.0 - extent)
    epsilon = 0.1

    if delta < epsilon:

        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )

# Show the result once all matching rectangles are drawn:
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)

Both approaches produce the same result.

Regarding python - Detect initial/sketch on text pages, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67288866/
