
tensorflow - What effect do the image resizer dimensions have when using the default config of the Object Detection API?

Reposted. Author: 行者123. Updated: 2023-12-03 00:36:58

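I am trying to train a model with TensorFlow's Object Detection API. I am using the sample config for Faster R-CNN ResNet-101 ( https://github.com/tensorflow/models/blob/master/object_detection/samples/configs/faster_rcnn_resnet101_voc07.config ).
The following snippet is the part of the config file that I don't fully understand: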

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

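My questions are:

  1. What exactly do min_dimension and max_dimension mean? Does it mean the input image will be resized to 600x1024 or 1024x600?
  2. If my images come in different sizes, and some of them are considerably larger than 600x1024 (or 1024x600), can I, or should I, increase the values of min_dimension and max_dimension?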

My question comes from this post: TensorFlow Object Detection API Weird Behaviour

In that post, the author answered the question himself:

Then I decided to crop the input image and provide that as an input. Just to see if the results improve and it did!
It turns out that the dimensions of the input image were much larger than the 600 x 1024 that is accepted by the model. So, it was scaling down these images to 600 x 1024 which meant that the cigarette boxes were losing their details :)

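That post uses the same config as I do. I am not sure whether these parameters can be changed, since they may be the default or recommended settings for this particular model, faster_rcnn_resnet101.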

Best answer

After some testing, I think I have found the answer. Please correct me if anything is wrong.

In the .config file:

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

The image resizer is set up in 'object_detection/builders/image_resizer_builder.py':

if image_resizer_config.WhichOneof(
    'image_resizer_oneof') == 'keep_aspect_ratio_resizer':
  keep_aspect_ratio_config = image_resizer_config.keep_aspect_ratio_resizer
  if not (keep_aspect_ratio_config.min_dimension
          <= keep_aspect_ratio_config.max_dimension):
    raise ValueError('min_dimension > max_dimension')
  return functools.partial(
      preprocessor.resize_to_range,
      min_dimension=keep_aspect_ratio_config.min_dimension,
      max_dimension=keep_aspect_ratio_config.max_dimension)
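
The builder does not resize anything itself: it validates the two values and returns a callable with them already bound. A minimal sketch of that pattern follows. `fake_resize_to_range` and `build_resizer` are hypothetical stand-ins written for illustration, not part of the Object Detection API.

```python
import functools


def fake_resize_to_range(image, min_dimension=None, max_dimension=None):
    # Hypothetical stand-in for preprocessor.resize_to_range: it only
    # reports the arguments it received instead of resizing anything.
    return (image, min_dimension, max_dimension)


def build_resizer(min_dimension, max_dimension):
    # Same sanity check as in the builder excerpt above.
    if not min_dimension <= max_dimension:
        raise ValueError('min_dimension > max_dimension')
    # Bind the config values now; the image is supplied later.
    return functools.partial(fake_resize_to_range,
                             min_dimension=min_dimension,
                             max_dimension=max_dimension)


resizer = build_resizer(600, 1024)
result = resizer('some_image')
print(result)  # ('some_image', 600, 1024)
```

Swapping the two values, as in `build_resizer(1024, 600)`, raises the `ValueError`, which is why min_dimension may never exceed max_dimension in the config.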

It then uses the 'resize_to_range' function from 'object_detection/core/preprocessor.py':

  with tf.name_scope('ResizeToRange', values=[image, min_dimension]):
    image_shape = tf.shape(image)
    orig_height = tf.to_float(image_shape[0])
    orig_width = tf.to_float(image_shape[1])
    orig_min_dim = tf.minimum(orig_height, orig_width)

    # Calculates the larger of the possible sizes
    min_dimension = tf.constant(min_dimension, dtype=tf.float32)
    large_scale_factor = min_dimension / orig_min_dim
    # Scaling orig_(height|width) by large_scale_factor will make the smaller
    # dimension equal to min_dimension, save for floating point rounding errors.
    # For reasonably-sized images, taking the nearest integer will reliably
    # eliminate this error.
    large_height = tf.to_int32(tf.round(orig_height * large_scale_factor))
    large_width = tf.to_int32(tf.round(orig_width * large_scale_factor))
    large_size = tf.stack([large_height, large_width])

    if max_dimension:
      # Calculates the smaller of the possible sizes, use that if the larger
      # is too big.
      orig_max_dim = tf.maximum(orig_height, orig_width)
      max_dimension = tf.constant(max_dimension, dtype=tf.float32)
      small_scale_factor = max_dimension / orig_max_dim
      # Scaling orig_(height|width) by small_scale_factor will make the larger
      # dimension equal to max_dimension, save for floating point rounding
      # errors. For reasonably-sized images, taking the nearest integer will
      # reliably eliminate this error.
      small_height = tf.to_int32(tf.round(orig_height * small_scale_factor))
      small_width = tf.to_int32(tf.round(orig_width * small_scale_factor))
      small_size = tf.stack([small_height, small_width])

      new_size = tf.cond(
          tf.to_float(tf.reduce_max(large_size)) > max_dimension,
          lambda: small_size, lambda: large_size)
    else:
      new_size = large_size

    new_image = tf.image.resize_images(image, new_size,
                                       align_corners=align_corners)

From the code above, we can see that an image of size 800*1000 ends up as an output image of size 600*750.
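
The same arithmetic can be checked without TensorFlow. The sketch below mirrors the logic of `resize_to_range` in plain Python. `resized_shape` is a hypothetical helper name, and ordinary Python floats stand in for the float32 tensors, so borderline rounding cases may differ from the real function.

```python
def resized_shape(height, width, min_dimension=600, max_dimension=1024):
    """Return (new_height, new_width), following the logic of resize_to_range."""
    # Candidate 1: scale so that the shorter side equals min_dimension.
    large_scale = min_dimension / min(height, width)
    large = (round(height * large_scale), round(width * large_scale))
    if not max_dimension:
        return large
    # Candidate 2: scale so that the longer side equals max_dimension.
    small_scale = max_dimension / max(height, width)
    small = (round(height * small_scale), round(width * small_scale))
    # Use candidate 2 only when candidate 1 would exceed max_dimension.
    return small if max(large) > max_dimension else large


print(resized_shape(800, 1000))   # (600, 750)
print(resized_shape(1000, 4000))  # (256, 1024)
print(resized_shape(300, 400))    # (600, 800)
```

The second call shows the max_dimension branch taking over for a very wide image, and the third shows that images smaller than min_dimension are scaled up, not left alone.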

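In other words, this image resizer always resizes your input image according to the "min_dimension" and "max_dimension" settings, and it preserves the aspect ratio. The shorter side is scaled to min_dimension, unless that would push the longer side above max_dimension, in which case the longer side is scaled to max_dimension instead. So the output is not forced to exactly 600x1024 or 1024x600.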
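
On the second question, one way to judge whether larger values might help is to look at how much an image is shrunk under a given setting. The sketch below computes that scale factor while ignoring the integer rounding done by the real code. `effective_scale` is a hypothetical helper, and the 1200/2048 setting is an arbitrary illustration, not a recommended value.

```python
def effective_scale(height, width, min_dimension, max_dimension):
    # Scale that brings the shorter side to min_dimension ...
    scale = min_dimension / min(height, width)
    # ... unless the longer side would then exceed max_dimension.
    if max(height, width) * scale > max_dimension:
        scale = max_dimension / max(height, width)
    return scale


# A 3000x4000 photo containing an object roughly 100 pixels wide.
for min_dim, max_dim in [(600, 1024), (1200, 2048)]:
    s = effective_scale(3000, 4000, min_dim, max_dim)
    print(min_dim, max_dim, 'object width after resize:', round(100 * s))
```

With the default setting the object shrinks to about a fifth of its original width, which matches the loss of detail described in the quoted post. Larger values keep more detail, but the resized images are larger, so memory use and compute per image go up.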

The original question and answer can be found on Stack Overflow: https://stackoverflow.com/questions/45137835/
