algorithm - Viola-Jones' face detection claims 180k features

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 02:11:54

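I have been implementing an adaptation of Viola-Jones' face detection algorithm. The technique relies on placing a 24x24 pixel subframe within an image, and then placing rectangular features inside it at every position and in every possible size.

These features can consist of two, three or four rectangles. Examples are shown below.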

Rectangle features

They claim the exhaustive set is over 180k (section 2):

Given that the base resolution of the detector is 24x24, the exhaustive set of rectangle features is quite large, over 180,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.

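The following statements are not explicitly stated in the paper, so they are assumptions on my part:

  1. There are only 2 two-rectangle features, 2 three-rectangle features and 1 four-rectangle feature. The logic behind this is that we are observing the difference between the highlighted rectangles, not explicitly the color or luminance or anything of that sort.
  2. We cannot define feature type A as a 1x1 pixel block; it must be at least 1x2 pixels. Likewise, type D must be at least 2x2 pixels, and the same rule applies accordingly to the other features.
  3. We cannot define feature type A as a 1x3 pixel block, because the middle pixel cannot be partitioned, and subtracting it from itself is identical to a 1x2 pixel block; this feature type is only defined for even widths. Likewise, the width of feature type C must be divisible by 3, and the same rule applies accordingly to the other features.
  4. We cannot define a feature with a width and/or height of 0. Therefore, we iterate x and y up to 24 minus the size of the feature.

Based on these assumptions, I have counted the exhaustive set: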

const int frameSize = 24;
const int features = 5;
// All five feature types:
const int feature[features][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};

int count = 0;
// Each feature:
for (int i = 0; i < features; i++) {
    int sizeX = feature[i][0];
    int sizeY = feature[i][1];
    // Each position:
    for (int x = 0; x <= frameSize-sizeX; x++) {
        for (int y = 0; y <= frameSize-sizeY; y++) {
            // Each size fitting within the frameSize:
            for (int width = sizeX; width <= frameSize-x; width+=sizeX) {
                for (int height = sizeY; height <= frameSize-y; height+=sizeY) {
                    count++;
                }
            }
        }
    }
}

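The result is 162,336.

The only way I found to approximate the "over 180,000" that Viola & Jones speak of is to drop assumption #4 and introduce bugs into the code. This involves changing four lines respectively to: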
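The count can also be cross-checked without enumerating positions, since under the assumptions above the horizontal and vertical choices are independent and their counts multiply. The following is only a sketch of that check; the helper names placements_1d, count_shape and count_all are made up for illustration and do not appear in the paper or in the code above.

```c
/* Number of ways to place, on a line of n pixels, every segment whose
 * length is a positive multiple of `step` (assumptions 2 to 4):
 * a segment of length len has n - len + 1 possible start positions. */
long placements_1d(int n, int step) {
    long total = 0;
    for (int len = step; len <= n; len += step)
        total += n - len + 1;
    return total;
}

/* Features of one base shape (sx by sy cells) in an n-by-n subframe.
 * Width/x and height/y are chosen independently, so the counts multiply. */
long count_shape(int n, int sx, int sy) {
    return placements_1d(n, sx) * placements_1d(n, sy);
}

/* Total over the five base shapes assumed in the question. */
long count_all(int n) {
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long total = 0;
    for (int i = 0; i < 5; i++)
        total += count_shape(n, shapes[i][0], shapes[i][1]);
    return total;
}
```

For a 24x24 subframe this works out to 144 * 300 = 43,200 for each two-rectangle shape, 92 * 300 = 27,600 for each three-rectangle shape and 144 * 144 = 20,736 for the four-rectangle shape, so count_all(24) is 2 * 43,200 + 2 * 27,600 + 20,736 = 162,336.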

for (int width = 0; width < frameSize-x; width+=sizeX)
for (int height = 0; height < frameSize-y; height+=sizeY)

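The result is then 180,625. (Note that this effectively prevents the features from ever touching the right and/or bottom edge of the subframe.)

Now of course the question: have they made a mistake in their implementation? Does it make any sense to consider features with a surface of zero? Or am I looking at this the wrong way?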
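To make the effect of that change concrete, the modified loops can be wrapped in a function that also tallies how many of the counted features have zero area. This is a sketch only: the function name count_variant and the degenerate counter are illustrative additions, not part of the original code or of the paper.

```c
/* Variant with the two modified size loops: sizes start at 0 and the
 * bound is a strict '<', so zero-width and zero-height features are
 * counted and no feature can reach the right or bottom edge.
 * *degenerate receives the number of counted features with zero area. */
long count_variant(int frameSize, long *degenerate) {
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long count = 0;
    *degenerate = 0;
    for (int i = 0; i < 5; i++) {
        int sizeX = shapes[i][0];
        int sizeY = shapes[i][1];
        /* Each position (unchanged from the original loops): */
        for (int x = 0; x <= frameSize - sizeX; x++) {
            for (int y = 0; y <= frameSize - sizeY; y++) {
                /* Each size, using the modified bounds: */
                for (int width = 0; width < frameSize - x; width += sizeX) {
                    for (int height = 0; height < frameSize - y; height += sizeY) {
                        count++;
                        if (width == 0 || height == 0)
                            (*degenerate)++;
                    }
                }
            }
        }
    }
    return count;
}
```

Called with frameSize = 24 this returns 180,625, and the degenerate counter comes out nonzero, which shows that part of that total consists of features with no surface at all.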

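Best answer

On closer inspection, your code looks correct to me, which makes one wonder whether the original authors had a bug. I guess someone ought to look at how OpenCV implements it!

Nonetheless, one suggestion to make it easier to understand is to flip the order of the for loops: go over all sizes first, then loop over the possible positions for a given size: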

#include <stdio.h>
int main()
{
    int i, x, y, sizeX, sizeY, width, height, count, c;

    /* All five shape types */
    const int features = 5;
    const int feature[][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    const int frameSize = 24;

    count = 0;
    /* Each shape */
    for (i = 0; i < features; i++) {
        sizeX = feature[i][0];
        sizeY = feature[i][1];
        printf("%dx%d shapes:\n", sizeX, sizeY);

        /* each size (multiples of basic shapes) */
        for (width = sizeX; width <= frameSize; width+=sizeX) {
            for (height = sizeY; height <= frameSize; height+=sizeY) {
                printf("\tsize: %dx%d => ", width, height);
                c=count;

                /* each possible position given size */
                for (x = 0; x <= frameSize-width; x++) {
                    for (y = 0; y <= frameSize-height; y++) {
                        count++;
                    }
                }
                printf("count: %d\n", count-c);
            }
        }
    }
    printf("%d\n", count);

    return 0;
}

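This gives the same result as before: 162,336.

To verify it, I tested the case of a 4x4 window and checked all cases by hand (easy to count, since the 1x2/2x1 and 1x3/3x1 shapes are the same, only rotated by 90 degrees):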

2x1 shapes:
        size: 2x1 => count: 12
        size: 2x2 => count: 9
        size: 2x3 => count: 6
        size: 2x4 => count: 3
        size: 4x1 => count: 4
        size: 4x2 => count: 3
        size: 4x3 => count: 2
        size: 4x4 => count: 1
1x2 shapes:
        size: 1x2 => count: 12             +-----------------------+
        size: 1x4 => count: 4              |     |     |     |     |
        size: 2x2 => count: 9              |     |     |     |     |
        size: 2x4 => count: 3              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x4 => count: 2              |     |     |     |     |
        size: 4x2 => count: 3              +-----+-----+-----+-----+
        size: 4x4 => count: 1              |     |     |     |     |
3x1 shapes:                                |     |     |     |     |
        size: 3x1 => count: 8              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x3 => count: 4              |     |     |     |     |
        size: 3x4 => count: 2              +-----------------------+
1x3 shapes:
        size: 1x3 => count: 8                  Total Count = 136
        size: 2x3 => count: 6
        size: 3x3 => count: 4
        size: 4x3 => count: 2
2x2 shapes:
        size: 2x2 => count: 9
        size: 2x4 => count: 3
        size: 4x2 => count: 3
        size: 4x4 => count: 1
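The same check can be repeated for any window size by turning the hard-coded frame size into a parameter. A minimal sketch, assuming the same five base shapes and the size-first loop order used above; count_features is a hypothetical helper name chosen for this example.

```c
/* Count rectangle features in a frameSize-by-frameSize window,
 * looping over sizes first and positions second. */
long count_features(int frameSize) {
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long count = 0;
    for (int i = 0; i < 5; i++) {
        int sizeX = shapes[i][0];
        int sizeY = shapes[i][1];
        /* Each size: multiples of the base shape that fit in the window */
        for (int width = sizeX; width <= frameSize; width += sizeX) {
            for (int height = sizeY; height <= frameSize; height += sizeY) {
                /* Each top-left position that keeps the feature inside */
                for (int x = 0; x <= frameSize - width; x++) {
                    for (int y = 0; y <= frameSize - height; y++) {
                        count++;
                    }
                }
            }
        }
    }
    return count;
}
```

With this helper, count_features(4) gives the hand-checked total of 136 and count_features(24) gives 162,336.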

Regarding "algorithm - Viola-Jones' face detection claims 180k features", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/1707620/
