
c++ - How to detect a Christmas tree?

Reposted. Author: bug小助手. Updated: 2023-10-28 01:31:22

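Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?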









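I'm looking for a solution that works on all of these images. Approaches that require training Haar cascade classifiers, or that rely on template matching, are therefore not very interesting.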

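I'm looking for something that can be written in any programming language, as long as it uses only open-source technologies. The solution must be tested with the images shared in this question. There are 6 input images, and the answer should display the result of processing each of them. Finally, for each output image, red lines must be drawn to surround the detected tree.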

How would you go about programmatically detecting the trees in these images?

Best answer

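I have an approach which I think is interesting and a bit different from the rest. The main difference, compared to some of the other approaches, is in how the image segmentation step is performed: I used the DBSCAN clustering algorithm from Python's scikit-learn. It is well suited to finding somewhat amorphous shapes that may not have a single clear centroid.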

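At the top level, my approach is fairly simple and can be broken down into about 3 steps. First, I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test: any pixel with a value above 220 on a 0-255 scale (where black is 0 and white is 255) is saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and which stand out well against the blue-green background that is prevalent in most of the photos. I convert the RGB image to HSV space and require that the hue is either less than 0.2 on a 0.0-1.0 scale (roughly the border between yellow and green) or greater than 0.95 (the border between purple and red). In addition, I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically "or"-ed together, and the resulting matrix of black-and-white binary images is shown below: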

Christmas trees, after thresholding on HSV as well as monochrome brightness
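
To make the two-threshold test concrete, here is a minimal sketch on four hand-picked pixels. The function name `threshold_mask` and the sample colors are made up for illustration, and the grayscale value is approximated with ITU-R 601 luma weights instead of going through PIL, so treat it as an approximation of the step described above rather than the code used to produce the figures.

```python
import numpy as np
import matplotlib.colors as colors

def threshold_mask(rgb, hue_lo=0.2, hue_hi=0.95, sat_thr=0.7,
                   val_thr=0.7, mono_thr=220):
    """Boolean mask: bright pixels OR saturated red/yellow pixels."""
    rgb = np.asarray(rgb, dtype=np.uint8)
    # Assumption: approximate monochrome brightness with ITU-R 601 weights
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Convert (uint, 0-255) RGB to (float, 0.0-1.0) HSV
    hsv = colors.rgb_to_hsv(rgb.astype(float) / 255.0)
    hue, sat, val = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    # Red or yellow hue, and both saturated and bright
    warm = ((hue < hue_lo) | (hue > hue_hi)) & (sat > sat_thr) & (val > val_thr)
    # Plain monochrome brightness test
    bright = gray > mono_thr
    return warm | bright

# One row of four illustrative pixels
pixels = np.array([[[255, 255, 255],   # white: passes the brightness test
                    [255, 0, 0],       # saturated red: passes the HSV test
                    [255, 220, 0],     # saturated yellow: passes the HSV test
                    [0, 90, 110]]],    # dim blue-green background: rejected
                  dtype=np.uint8)
mask = threshold_mask(pixels)
print(mask)
```

Note that the white pixel fails the HSV test (its saturation is zero) and the yellow pixel fails the brightness test, which is why the two masks are combined with a logical "or".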

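You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, and a few of the images also have some other small clusters, corresponding either to lights in the windows of some of the buildings or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and to label each pixel correctly with a cluster membership ID number.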

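For this task I chose DBSCAN. A very good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, is available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown below: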

DBSCAN clustering output
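
As a small illustration of the "amorphous shapes" point, the sketch below runs scikit-learn's DBSCAN on synthetic data: a ring, a compact blob sitting at the ring's center, and one stray point. The data and the `eps`/`min_samples` values are invented for this example and are not the settings used on the tree images.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# A ring of radius 10: its centroid is the origin, which is not on the ring
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
ring = 10.0 * np.column_stack([np.cos(angles), np.sin(angles)])

# A compact 5x5 grid of points centered on the origin
gx, gy = np.meshgrid(np.arange(-1.0, 1.01, 0.5), np.arange(-1.0, 1.01, 0.5))
blob = np.column_stack([gx.ravel(), gy.ravel()])

# One isolated point far from everything else
outlier = np.array([[30.0, 30.0]])

points = np.vstack([ring, blob, outlier])
labels = DBSCAN(eps=1.0, min_samples=3).fit(points).labels_

# Label -1 is DBSCAN's marker for points that belong to no cluster
n_clusters = len(set(labels.tolist()) - {-1})
print(n_clusters, labels[-1])
```

Because DBSCAN grows clusters by chaining together nearby points, the ring and the blob should come out as two separate clusters even though they share the same centroid, and the stray point should be labeled as noise.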

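A few things are worth noting about the clustering output shown above. First, DBSCAN requires the user to set a "proximity" parameter to regulate its behavior. This effectively controls how far apart a pair of points must be for the algorithm to declare a new, separate cluster rather than agglomerating a test point onto an already existing cluster. I set this value to 0.04 times the length of each image's diagonal. Since the images vary in size from roughly VGA up to about HD 1080, this kind of scale-relative definition is critical.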
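
To give a feel for what the scale-relative proximity threshold means in pixels, here is a small sketch. The helper name `pixel_eps` is made up for illustration, and the two resolutions are nominal VGA and HD 1080 sizes, not the exact sizes of the six test images.

```python
from math import hypot

def pixel_eps(height, width, proxthresh=0.04):
    """Proximity threshold in pixels, as a fraction of the image diagonal."""
    return proxthresh * hypot(height, width)

vga = pixel_eps(480, 640)     # diagonal is exactly 800 pixels
hd = pixel_eps(1080, 1920)    # diagonal is roughly 2203 pixels
print(vga, hd)
```

A fixed threshold of about 32 pixels would be reasonable at VGA size but would split a tree into many fragments at HD 1080, where the same fraction of the diagonal comes out to roughly 88 pixels.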

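Another point worth noting is that the DBSCAN algorithm as implemented in scikit-learn has memory limits that are fairly challenging for some of the larger images in this sample. For a few of the larger images, I therefore had to "decimate" the thresholded pixels (i.e., retain only every 3rd or 4th pixel and drop the others) before clustering in order to stay within this limit. As a result of this culling, the remaining individual sparse pixels are difficult to see on some of the larger images. So, for display purposes only, the color-coded pixels in the images above have been "dilated" just slightly so that they stand out better. This is purely a cosmetic operation for the sake of the narrative; although there are comments mentioning this dilation in my code, rest assured that it has nothing to do with any calculations that actually matter.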
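
The decimation step can be sketched as follows. The helper name `decimate` and the 12000-point array are hypothetical; the cap of 5000 points is the default `maxpoints` value used in the source code further down.

```python
import numpy as np
from math import ceil

def decimate(points, maxpoints=5000):
    """Keep every k-th row so that at most maxpoints rows remain."""
    n = len(points)
    if n <= maxpoints:
        return points
    # Smallest integer stride that brings the count down to the cap
    step = int(ceil(n / float(maxpoints)))
    return points[::step]

# 12000 fake (row, col) pixel coordinates
pts = np.arange(24000).reshape(12000, 2)
small = decimate(pts)
print(len(pts), len(small))
```

With 12000 candidate pixels the stride works out to 3, so every 3rd pixel is kept. Because the stride is rounded up, the result can end up well below the cap (4000 points here).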

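Once the clusters have been identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" by the total number of member pixels, although one could just as easily use some metric that gauges physical extent instead) and compute the convex hull of that cluster. The convex hull then becomes the tree border. The six convex hulls computed with this method are shown below in red: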

Christmas trees with their calculated borders
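
The final step can be sketched on toy data as below. The function name `largest_cluster_hull`, the points, and the labels are invented for illustration; unlike the figures above, this sketch returns the hull's corner points rather than line segments, and it assumes the chosen cluster has at least three non-collinear points.

```python
import numpy as np
from scipy.spatial import ConvexHull

def largest_cluster_hull(points, labels):
    """Return the convex hull corners of the cluster with the most points."""
    labels = np.asarray(labels)
    valid = labels[labels >= 0]          # ignore the noise label (-1)
    if valid.size == 0:
        return None
    biggest = np.bincount(valid).argmax()
    members = points[labels == biggest]
    hull = ConvexHull(members)           # needs >= 3 non-collinear points
    return members[hull.vertices]

points = np.array([[0, 0], [4, 0], [4, 4], [0, 4], [2, 2],   # cluster 0
                   [10, 10], [11, 10], [10, 11],             # cluster 1
                   [50, 50]], dtype=float)                   # noise
labels = [0, 0, 0, 0, 0, 1, 1, 1, -1]
border = largest_cluster_hull(points, labels)
print(border)
```

Cluster 0 has the most members, so its hull is returned; the interior point (2, 2) is not part of the border.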

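The source code is written for Python 2.7.6 and depends on numpy, scipy, matplotlib and scikit-learn (it also imports PIL to load the images). I've divided it into two parts. The first part is responsible for the actual image processing: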

from PIL import Image
import numpy as np
import matplotlib.colors as colors
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    purple-red region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q] List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7,
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for the brightness threshold
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                  (hsvimg[:,:,1] > satthr)),
                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[boolidx] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[gryimg > monothr] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[::int(ceil(float(nsample)/maxpoints))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull
    unique_labels = set(labels)
    maxclustpt = 0
    borderseg = []
    for k in unique_labels:
        # Label -1 marks unclustered noise points, not a real cluster
        if k == -1:
            continue
        class_members = np.flatnonzero(labels == k)
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice

The second part is a user-level script which calls the first file (imported as the module findtree) and generates all of the plots above:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
# The window title is set through the figure manager; the older
# canvas.set_window_title shortcut is gone from recent Matplotlib releases
figthresh.canvas.manager.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.manager.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.manager.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.manager.set_window_title('Trees with Borders')

for ii, name in enumerate(fname):
    # Open the file and convert to rgb image (drops any alpha channel)
    rgbimg = np.asarray(Image.open(name).convert('RGB'))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)
    unique_labels = set(labels)
    # Generate a unique color for each cluster
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    lblcol = dict(zip(unique_labels, plcol))
    # Cluster label of -1 indicates no cluster membership;
    # override default color with black
    lblcol[-1] = [0.0, 0.0, 0.0, 1.0]
    for lbl, pix in zip(labels, Xslice):
        col = lblcol[lbl]
        # Raw version
        clustimg[pix[0],pix[1],:] = col[:3]
        # Dilated just for display
        axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col,
                     markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with red borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()

Regarding "c++ - How to detect a Christmas tree?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20772893/
