tensorflow - Using tf.image.crop_and_resize

Reposted by: 行者123 Last updated: 2023-11-30 08:34:38

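I am working on an ROI pooling layer for fast-rcnn, and I am used to working with tensorflow. I found that tf.image.crop_and_resize can act as the ROI pooling layer.

But I have tried many times and never got the result I expected. Or is the result I got actually the correct one?

Here is my code: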

import cv2
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt 

img_path = r'F:\IMG_0016.JPG'
img = cv2.imread(img_path)
img = img.reshape([1,580,580,3])
img = img.astype(np.float32)
#img = np.concatenate([img,img],axis=0)

img_ = tf.Variable(img) # img shape is [1,580,580,3] after the reshape above
boxes = tf.Variable([[100,100,300,300],[0.5,0.1,0.9,0.5]])
box_ind = tf.Variable([0,0])
crop_size = tf.Variable([100,100])

#b = tf.image.crop_and_resize(img,[[0.5,0.1,0.9,0.5]],[0],[50,50])
c = tf.image.crop_and_resize(img_,boxes,box_ind,crop_size)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
a = c.eval(session=sess)

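plt.imshow(a[0])  # crop for the first box, [100,100,300,300]
plt.figure()      # new figure so the second crop does not overwrite the first
plt.imshow(a[1])  # crop for the second box, [0.5,0.1,0.9,0.5]
plt.show()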

Here are my original image and the results: a0, a1
If I am doing it wrong, can anyone teach me how to use this function? Thanks.

Best answer

Actually, there is no problem with Tensorflow here.

From the doc of tf.image.crop_and_resize (emphasis mine):

boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4]. The i-th row of the tensor specifies the coordinates of a box in the box_ind[i] image and is specified in normalized coordinates [y1, x1, y2, x2]. A normalized coordinate value of y is mapped to the image coordinate at y * (image_height - 1), so as the [0, 1] interval of normalized image height is mapped to [0, image_height - 1] in image height coordinates. We do allow y1 > y2, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.

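The boxes argument needs normalized coordinates. That is why you get a black box with your first set of coordinates, [100,100,300,300] (not normalized, and no extrapolation value provided), but not with your second set, [0.5,0.1,0.9,0.5].

As for why matplotlib shows you gibberish for the second crop, that is simply because you are using the wrong data type. Quoting the matplotlib documentation of plt.imshow (emphasis mine):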
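If the first box was meant as pixel coordinates (an assumption about the asker's intent), it can be converted before being passed as boxes. A minimal sketch, where `normalize_boxes` is a hypothetical helper and not part of TensorFlow; it divides by `image_height - 1` and `image_width - 1` to invert the mapping quoted from the doc:

```python
import numpy as np

def normalize_boxes(boxes_px, image_height, image_width):
    """Convert pixel boxes [y1, x1, y2, x2] to normalized coordinates.

    Hypothetical helper: inverts the documented mapping
    y_pixel = y_normalized * (image_height - 1), and likewise for x.
    """
    boxes_px = np.asarray(boxes_px, dtype=np.float32)
    scale = np.array(
        [image_height - 1, image_width - 1, image_height - 1, image_width - 1],
        dtype=np.float32,
    )
    return boxes_px / scale

# The asker's first box, read as pixels in a 580x580 image.
boxes = normalize_boxes([[100, 100, 300, 300]], 580, 580)
print(boxes)  # every value now lies inside [0, 1]
```

The resulting array can be used in place of the first row of boxes in the question's code.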

All values should be in the range [0 .. 1] for floats or [0 .. 255] for integers. Out-of-range values will be clipped to these bounds.

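Because you are using floats outside the [0,1] range, matplotlib clips your values to 1. That is why you get those colored pixels (solid red, solid green, solid blue, or a mix of these). Cast your array to uint8 to get an image that makes sense.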

plt.imshow(a[1].astype(np.uint8))
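The effect of the data type can be reproduced without plotting. The sketch below only imitates the clipping described in the matplotlib quote with `np.clip`; both function names are hypothetical, and the sample pixel values are made up for illustration:

```python
import numpy as np

def as_imshow_clips_floats(crop):
    """Imitate the documented behaviour for float input: clip to [0, 1]."""
    return np.clip(crop, 0.0, 1.0)

def to_displayable(crop):
    """Round a float crop holding 0..255 values and cast it to uint8."""
    return np.clip(np.rint(crop), 0, 255).astype(np.uint8)

# One made-up pixel with float32 channel values in the 0..255 range.
crop = np.array([[[0.0, 0.4, 200.0]]], dtype=np.float32)

print(as_imshow_clips_floats(crop))  # the 200.0 channel saturates at 1.0
print(to_displayable(crop))          # uint8 values that imshow reads as 0..255
```

Dividing the float array by 255.0 instead of casting would be an equivalent way to bring it into the [0, 1] range expected for floats.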


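Edit: As requested, I will dive a bit deeper into tf.image.crop_and_resize.

[When providing non-normalized coordinates and no extrapolation value], why do I just get a blank result?

Quoting the doc: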

Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.

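So normalized coordinates outside [0,1] are allowed. But they still need to be normalized! With your example, [100,100,300,300], a normalized value of 100 maps to pixel 100 * (580 - 1) = 57900, far beyond the 580-pixel image. In the answer's illustration, the coordinates you give form the red square, and your original image is the little green dot in the upper left corner! The default value of the extrapolation_value argument is 0, so the values outside the frame of the original image are filled in as [0,0,0], hence the black.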
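The arithmetic behind this can be checked directly. A small sketch using the mapping from the quoted doc; `box_to_pixels` and `lies_outside_image` are hypothetical helper names, and the 580x580 size comes from the question's reshape:

```python
def box_to_pixels(box, image_height, image_width):
    """Map a normalized box [y1, x1, y2, x2] to pixel coordinates."""
    y1, x1, y2, x2 = box
    return (
        y1 * (image_height - 1),
        x1 * (image_width - 1),
        y2 * (image_height - 1),
        x2 * (image_width - 1),
    )

def lies_outside_image(box, image_height, image_width):
    """True if the box does not overlap the image frame at all."""
    py1, px1, py2, px2 = box_to_pixels(box, image_height, image_width)
    outside_y = max(py1, py2) < 0 or min(py1, py2) > image_height - 1
    outside_x = max(px1, px2) < 0 or min(px1, px2) > image_width - 1
    return outside_y or outside_x

print(box_to_pixels([100, 100, 300, 300], 580, 580))       # (57900, 57900, 173700, 173700)
print(lies_outside_image([100, 100, 300, 300], 580, 580))  # True
print(lies_outside_image([0.5, 0.1, 0.9, 0.5], 580, 580))  # False
```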

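But if your use case needs another value, you can provide it. The pixels outside the image will take an RGB value of extrapolation_value%256 on each channel (once the result is cast to uint8 for display). This option is useful if the zone you need to crop is not fully included in your original image. (A possible use case would be sliding windows, for example.)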
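To make the role of extrapolation_value concrete, here is a simplified NumPy sketch of the sampling rule described in the doc. It is an illustrative reimplementation for one image and one box, not TensorFlow's actual kernel; `crop_and_resize_sketch` is a hypothetical name, and the sketch assumes both crop dimensions are greater than 1:

```python
import numpy as np

def crop_and_resize_sketch(image, box, crop_size, extrapolation_value=0.0):
    """Bilinear crop of one [H, W, C] image for one normalized box [y1, x1, y2, x2]."""
    height, width, channels = image.shape
    crop_h, crop_w = crop_size
    y1, x1, y2, x2 = box
    # Documented mapping: normalized y -> y * (image_height - 1), same for x.
    ys = np.linspace(y1, y2, crop_h) * (height - 1)
    xs = np.linspace(x1, x2, crop_w) * (width - 1)
    # Start from the fill value; only samples inside the image are overwritten.
    out = np.full((crop_h, crop_w, channels), extrapolation_value, dtype=np.float32)
    for i, y in enumerate(ys):
        if y < 0 or y > height - 1:
            continue  # this whole output row samples outside the image
        y_lo = int(np.floor(y))
        y_hi = min(y_lo + 1, height - 1)
        wy = y - y_lo
        for j, x in enumerate(xs):
            if x < 0 or x > width - 1:
                continue  # this sample lies outside the image
            x_lo = int(np.floor(x))
            x_hi = min(x_lo + 1, width - 1)
            wx = x - x_lo
            top = (1 - wx) * image[y_lo, x_lo] + wx * image[y_lo, x_hi]
            bottom = (1 - wx) * image[y_hi, x_lo] + wx * image[y_hi, x_hi]
            out[i, j] = (1 - wy) * top + wy * bottom
    return out

# A made-up uniform gray image standing in for the 580x580 photo.
image = np.full((580, 580, 3), 200.0, dtype=np.float32)

# Non-normalized box: every sample is outside, so the crop is all zeros (black).
black = crop_and_resize_sketch(image, [100, 100, 300, 300], (4, 4))

# Box hanging over the bottom-right corner, as a sliding window might.
partial = crop_and_resize_sketch(
    image, [0.5, 0.5, 1.5, 1.5], (4, 4), extrapolation_value=128.0
)
```

In `partial`, the samples that fall inside the image keep the image value while the rest hold the fill value, which is the situation the sliding-window use case runs into at image borders.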

Regarding tensorflow - using tf.image.crop_and_resize, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51843509/
