
python - Automatic image enhancement for scanned documents

Reposted. Author: 太空宇宙. Updated: 2023-11-03 23:10:26

I am developing automatic image enhancement based on a Microsoft paper, Whiteboard scanning and image enhancement.

In the "white balance and image enhancement" section, they give the enhancement steps:

First, they estimate the background of the scanned document or detected whiteboard:

1. "Divide the whiteboard region into rectangular cells. The cell size should be roughly the same as what we expect the size of a single character on the board (15 by 15 pixels in our implementation)."

Then:

2. "Sort the pixels in each cell by their luminance values. Since the ink absorbs the incident light, the luminance of the whiteboard pixels is higher than stroke pixels’. The whiteboard color within the cell is, therefore, the color with the highest luminance. In practice, we average the colors of the pixels in the top 25 percentile in order to reduce the error introduced by sensor noise"

Then:

3. "Filter the colors of the cells by locally fitting a plane in the RGB space. Occasionally there are cells that are entirely covered by pen strokes, the cell color computed in Step 2 is consequently incorrect. Those colors are rejected as outliers by the locally fitted plane and are replaced by the interpolated values from its neighbors."

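My questions concern the second and third steps:

How do they obtain the luminance values? Should I convert the input image to the YUV color space and take the luminance from the Y channel, or just work in the RGB color space?

How do I fit a local plane in RGB space?

Here is my Python code. I tried to build cells from the input image and take the luminance values from the YUV color space; the simple result I get does not look right compared with what they obtain in the paper.

Python code: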

import cv2
import numpy as np



## Return a list of cells from a given image
def SubImage(image):
    Cells = []
    CellRows = []
    # Cut the image into horizontal strips of height CellSize
    for i in range(0, rows // CellSize):
        subIm = image[i*CellSize:(i+1)*CellSize, :]
        CellRows.append(subIm)
    # Cut each strip into CellSize-wide cells (row-major order)
    for img in CellRows:
        for i in range(0, cols // CellSize):
            subIm = img[:, i*CellSize:(i+1)*CellSize]
            Cells.append(subIm)
    return Cells


## Take the brightest luminance value of each cell
def GetLuminance(Cells):
    luminance = []
    for cel in Cells:
        luminance.append(cel.max())
    return luminance


## Estimate the background color of the whiteboard
def UniformBackground(CelImage, img, luminance):
    a = 0

    # Divide every pixel by its cell's luminance, clamped to 1
    for c in range(0, len(CelImage)):
        cel = CelImage[c]
        for i in range(0, cel.shape[0]):
            for j in range(0, cel.shape[1]):
                cel[i,j] = min(1, cel[i,j] / luminance[c])
    # Write the processed cells back into the image
    for i in range(0, rows // CellSize):
        for j in range(0, cols // CellSize):
            img[i*CellSize:(i+1)*CellSize, j*CellSize:(j+1)*CellSize] = CelImage[a]
            a = a + 1

if __name__ == '__main__':
    img = cv2.imread('4.png')
    CellSize = 15
    rows, cols, depth = img.shape

    # Trim the working size down to a whole number of cells
    if (rows % CellSize != 0):
        rows = rows - rows % CellSize

    if (cols % CellSize != 0):
        cols = cols - cols % CellSize

    yuvImg = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    # Get cells from Y channel
    CellsY = SubImage(yuvImg[:,:,0])
    CellsB = SubImage(img[:,:,0])
    CellsG = SubImage(img[:,:,1])
    CellsR = SubImage(img[:,:,2])

    # Get Luminance From Y cells
    LuminanceY = GetLuminance(CellsY)

    # Uniform Background
    UniformBackground(CellsB, img[:,:,0], LuminanceY)
    UniformBackground(CellsG, img[:,:,1], LuminanceY)
    UniformBackground(CellsR, img[:,:,2], LuminanceY)

    #bgrImg = cv2.cvtColor(imgB, cv2.COLOR_GRAY2BGR)
    #print imgB
    cv2.imwrite('unifrom.jpg', img)

Input whiteboard image:

[image: white Board image]

Output image:

[image: Output image]

Expected output:

[image: expected Output]

Best answer

Let's go through it step by step:

  1. "Sort the pixels in each cell by their luminance values"

Yes, you have to convert the image to another color space that has a luminance component, for example the Lab color space.
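A minimal sketch of getting a per-pixel luminance channel. With OpenCV, the L channel would come from `cv2.cvtColor(img, cv2.COLOR_BGR2LAB)[:, :, 0]`. To keep this example free of the OpenCV dependency, the hypothetical helper `luma_from_bgr` uses the Rec. 601 luma weights instead; that choice is an assumption made for illustration, not something the paper specifies.

```python
import numpy as np

def luma_from_bgr(img_bgr):
    """Return a float luminance image from an 8-bit BGR image (OpenCV channel order)."""
    img = img_bgr.astype(np.float64)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    # Rec. 601 luma weights; an assumption, any luminance-like channel would do.
    return 0.299 * r + 0.587 * g + 0.114 * b

# Tiny synthetic example: one white pixel and one pure-blue pixel (BGR order).
demo = np.array([[[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
luma = luma_from_bgr(demo)
```

Whichever channel is used, what matters for the next step is only the ordering of pixels by brightness inside each cell.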

... In practice, we average the colors of the pixels in the top 25 percentile in order to reduce the error introduced by sensor noise

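This means that once you have the Lab image, you split it into channels, take the L channel image, and compute its histogram with, say, 100 bins (I am exaggerating), then keep only the pixels that fall into the whitest bins (say 75 to 100). Once you have found the white pixels in each cell, remember them! For example, you can create a mask image in which every pixel is 0 except those selected as "white".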
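The per-cell selection can be sketched roughly as follows. This version uses `np.percentile` rather than an explicit histogram, which is a substitution made for brevity. The function name `cell_background` and its parameters are hypothetical, and the stand-in luminance in the demo is just the channel mean.

```python
import numpy as np

def cell_background(img_bgr, luma, cell=15, top_fraction=0.25):
    """Estimate one background colour per cell from its brightest pixels.

    Returns (colors, mask): colors has shape (n_rows, n_cols, 3); mask marks
    the pixels that were selected as "whiteboard".
    """
    n_rows, n_cols = luma.shape[0] // cell, luma.shape[1] // cell
    colors = np.zeros((n_rows, n_cols, 3), dtype=np.float64)
    mask = np.zeros(luma.shape, dtype=bool)
    for i in range(n_rows):
        for j in range(n_cols):
            ys = slice(i * cell, (i + 1) * cell)
            xs = slice(j * cell, (j + 1) * cell)
            cell_luma = luma[ys, xs]
            # Pixels at or above the 75th percentile count as the brightest 25%
            # (ties can make the selection larger than 25%).
            threshold = np.percentile(cell_luma, 100 * (1 - top_fraction))
            bright = cell_luma >= threshold
            mask[ys, xs] = bright
            # Average the colours of the selected pixels only.
            colors[i, j] = img_bgr[ys, xs][bright].mean(axis=0)
    return colors, mask

# Synthetic 2x2-cell image: uniform background with one dark "stroke".
demo = np.full((30, 30, 3), 200, dtype=np.uint8)
demo[5:10, 5:10] = 20
luma = demo.astype(np.float64).mean(axis=2)  # stand-in luminance
colors, mask = cell_background(demo, luma)
```

The returned mask plays the role of the "remember them" step: it records which pixels were treated as whiteboard so they can be reused when fitting the plane.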

Filter the colors of the cells by locally fitting a plane in the RGB space

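Now back to RGB space. As you can see, the whiteboard gets darker as it recedes. If you plot the RGB colors of the whiteboard pixels as 3D points in a 3D world whose axes are R, G and B, you get a scatter of points that approximates a plane (because all of these whiteboard colors have gray tones). Now take the points you marked as "whiteboard" in the previous step and fit a plane to them. How do you fit a plane? You can use a least-squares method (the original answer links to one), but from the way they wrote it in the paper, I think they had RANSAC in mind.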
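Both options can be sketched in a few lines of NumPy. This follows the reading given in this answer (a plane through points in a 3D color space) and fits one global plane. The paper's "locally fitted" plane may well mean a separate fit per neighborhood of cells, so treat this as an illustration only. The names `fit_plane`, `plane_distance` and `ransac_plane`, and the values of `n_iter` and `tol`, are hypothetical choices.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points: returns (centroid, unit normal)."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, vt[-1]

def plane_distance(points, centroid, normal):
    """Unsigned distance of each point from the plane."""
    return np.abs((points - centroid) @ normal)

def ransac_plane(points, n_iter=200, tol=5.0, seed=0):
    """Very small RANSAC loop; n_iter and tol are illustrative guesses."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        # Three random points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        centroid, normal = fit_plane(sample)
        inliers = plane_distance(points, centroid, normal) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit by least squares on all inliers of the best candidate.
    centroid, normal = fit_plane(points[best_inliers])
    return centroid, normal, best_inliers

# Synthetic points on a known plane, plus five outliers standing in for
# cells that are entirely covered by pen strokes.
rng = np.random.default_rng(1)
xy = rng.uniform(0, 255, size=(100, 2))
z = 0.5 * xy[:, 0] + 0.25 * xy[:, 1] + 10
pts = np.column_stack([xy, z])
pts[:5, 2] += 80
centroid, normal, inliers = ransac_plane(pts)
```

Points flagged as outliers would then be the candidates for replacement by values interpolated from their neighbors, as step 3 of the paper describes.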

About python - automatic image enhancement for scanned documents, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51306598/
