python - Homography for image transformation is not working

Reposted by: 太空宇宙 | Updated: 2023-11-03 21:40:22

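I am trying to do multi-scale template matching to detect a template, then paste a PNG onto the detected region using alpha blending, and use a homography to transform the image. I do this on a live webcam capture, but after applying the homography I don't get the expected result. I'll go through my code part by part, in the order described above.

1) Multi-scale template matching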

import cv2
import numpy as np
import imutils


def main():
    template1 = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\templates\\template1.jpg")
    template2 = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\templates\\temp.jpg")
    templates = [template1, template2]

    for i in range(len(templates)):
        templates[i] = cv2.cvtColor(templates[i], cv2.COLOR_BGR2GRAY)
        templates[i] = cv2.Canny(templates[i], 50, 140)
        templates[i] = cv2.GaussianBlur(templates[i],(5,5),0)
        templates[i] = imutils.resize(templates[i], width=50)

    (tH, tW) = templates[0].shape[:2]
    # print(tH)
    # print(tW)
    # cv2.imshow("Template", template)

    cap = cv2.VideoCapture(0)

    if cap.isOpened():
        ret, frame = cap.read()
    else:
        ret = False

    # loop over the frames to find the template
    while ret:
        # load the image, convert it to grayscale, and initialize the
        # bookkeeping variable to keep track of the matched region
        ret, frame = cap.read()
        if not ret:
            # stop if the camera did not deliver a frame
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        found = None

        # loop over the scales of the image
        for scale in np.linspace(0.2, 1.0, 20)[::-1]:
            # resize the image according to the scale, and keep track
            # of the ratio of the resizing
            resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
            r = gray.shape[1] / float(resized.shape[1])

            # if the resized image is smaller than the template, then break
            # from the loop
            if resized.shape[0] < tH or resized.shape[1] < tW:
                print("frame is smaller than the template")
                break

            # detect edges in the resized, grayscale image and apply template
            # matching to find the template in the image
            edged = cv2.Canny(resized, 50, 160)
            blurred = cv2.GaussianBlur(edged,(5,5),0)

            # start below any possible score so that result is always set
            curr_max = -np.inf
            index = 0
            result = None

            # find the best match
            for i in range(len(templates)):
                # perform matchtemplate
                res = cv2.matchTemplate(blurred, templates[i], cv2.TM_CCOEFF)
                # get the highest correlation value of the result
                maxVal = res.max()
                # if the correlation is highest thus far, store the value and index of template
                if maxVal > curr_max:
                    curr_max = maxVal
                    index = i
                    result = res

            print(index)
            # result = cv2.matchTemplate(edged, templates[index], cv2.TM_CCOEFF)
            (_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)

            # if we have found a new maximum correlation value, then update
            # the bookkeeping variable
            if found is None or maxVal > found[0]:
                found = (maxVal, maxLoc, r)

        # unpack the bookkeeping variable and compute the (x, y) coordinates
        # of the bounding box based on the resized ratio
        # print(found)
        (_, maxLoc, r) = found
        (startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
        (endX, endY) = (int((maxLoc[0] + tW) * r), int((maxLoc[1] + tH) * r))

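This works fine, as I expected, and without errors. I can get the (startX, startY) and (endX, endY) values to draw a bounding box around the detected region.

2) Pasting a PNG onto the detected region using alpha blending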
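As a quick sanity check of the coordinate bookkeeping, here is a small worked example of how the ratio r maps a match found in a downscaled frame back to full-frame coordinates. The frame width, scale, match location, and template size are made-up numbers for illustration, and `to_frame_box` is a hypothetical helper that is not part of the code above.

```python
def to_frame_box(max_loc, template_size, r):
    """Map a match location in the resized image back to the original frame."""
    t_w, t_h = template_size
    start = (int(max_loc[0] * r), int(max_loc[1] * r))
    end = (int((max_loc[0] + t_w) * r), int((max_loc[1] + t_h) * r))
    return start, end


# Assumed values: a 640-pixel-wide frame searched at scale 0.5
frame_width = 640
scale = 0.5
resized_width = int(frame_width * scale)
r = frame_width / float(resized_width)

# Assumed match at (100, 40) in the resized image, with a 50x30 template
start, end = to_frame_box((100, 40), (50, 30), r)
print(r, start, end)
```

The template keeps its size while the frame shrinks, so the box grows by the factor r when it is mapped back to the full frame.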

        cropped = frame[startY:endY, startX:endX]
        cv2.imshow("cropped", cropped)

        # Read the foreground image with alpha channel
        foreGroundImage = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\tattoo2.png", -1)
        # Read background image
        background = cropped
        dim = (background.shape[1], background.shape[0])
        foreGroundImage = cv2.resize(foreGroundImage, dim)

        # Split png foreground image
        b, g, r, a = cv2.split(foreGroundImage)

        # Save the foreground BGR content into a single object
        foreground = cv2.merge((b, g, r))

        # Save the alpha information into a single Mat
        alpha = cv2.merge((a, a, a))

        # background = cv2.resize(background, dim, interpolation = cv2.INTER_AREA)

        # Convert uint8 to float
        foreground = foreground.astype(float)
        background = background.astype(float)
        alpha = alpha.astype(float) / 255

        # Perform alpha blending
        foreground = cv2.multiply(alpha, foreground)
        beta = 1.0 - alpha
        background = cv2.multiply(beta, background)
        outImage = cv2.add(foreground, background)
        outImage = outImage/255
        cv2.imshow("outImage", outImage)
        print(outImage.shape)

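Here I crop the detected part of the frame and paste the PNG onto it. outImage is the output of this step. I also get it as expected.

3) Transforming the image using a homography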

        # Read source image.
        im_src = outImage.copy()
        size = im_src.shape

        # Create a vector of source points.
        pts_src = np.array(
            [
                [0, 0],
                [size[1] - 1, 0],
                [size[1] - 1, size[0] - 1],
                [0, size[0] - 1]
            ], dtype=float
        )

        # Read destination image
        im_dst = frame.copy()
        cv2.imshow("im_dst", im_dst)

        # Create a vector of destination points.
        pts_dst = np.array(
            [
                [startX, startY],
                [endX, startY],
                [endX, endY],
                [startX, endY]
            ]
        )

        # Calculate Homography between source and destination points
        h, status = cv2.findHomography(pts_src, pts_dst)

        # Warp source image
        im_temp = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))

        # Black out polygonal area in destination image.
        # fillConvexPoly expects 32-bit integer points
        cv2.fillConvexPoly(im_dst, pts_dst.astype(np.int32), 0, 16)

        # Add warped source image to destination image.
        im_dst = im_dst + im_temp


        cv2.imshow("Final", im_dst)
        cv2.imshow("frame2222", frame)


        if cv2.waitKey(1) == 27:
            break

    cv2.destroyAllWindows()
    cap.release()


if __name__ == "__main__":
    main()

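Here I want to paste the alpha-blended outImage onto the given points of the frame. When I replace im_src = outImage.copy() with im_src = cv2.imread("someimage.png") and run it, it works fine. I can read an image and paste it onto the frame, but I can't do the same with outImage. It would be great if you could help me solve this. Let me know if you need the images I used or the output.

Edit:

With im_src = cv2.imread("someimage.png"): someimage.png is displayed over the template.

With im_src = outImage.copy(): the rest of the frame is white.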

Best answer

The problem

In the bad case, i.e. with im_src = outImage.copy(), the image has dtype float64. Look at this line:

outImage = outImage/255

and you will notice that its values range from 0 to 1. Then you have:

im_dst = frame.copy()

and

im_temp = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))
...
im_dst = im_dst + im_temp

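This means im_dst is of type CV_8UC3 (numpy.uint8 in NumPy terms), because it is copied from the camera frame, so its values lie between 0 and 255. You then add two images with different types and different value ranges, which yields a float image whose background still holds values from 0 to 255. For float images, imshow displays any value >= 1 as white.

In the good case the types are the same, so the problem does not occur.

Solution:

One option is: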
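The mismatch is easy to reproduce with NumPy alone. The sketch below uses small made-up arrays in place of the camera frame and the warped image (`frame_like` and `warped_like` are hypothetical names), and shows that the sum is promoted to float64 while the background keeps its 0-255 values.

```python
import numpy as np

# Stand-in for the uint8 camera frame, with one blacked-out pixel
# (the area cleared by fillConvexPoly)
frame_like = np.full((2, 2, 3), 120, dtype=np.uint8)
frame_like[0, 0] = 0

# Stand-in for the warped float image: zeros everywhere except the
# pasted region, which holds values in the 0..1 range
warped_like = np.zeros((2, 2, 3), dtype=np.float64)
warped_like[0, 0] = 0.5

mixed = frame_like + warped_like

print(mixed.dtype)   # promoted to float64
print(mixed[0, 0])   # pasted region, still in the 0..1 range
print(mixed[1, 1])   # background, still in the 0..255 range
```

cv2.imshow multiplies floating-point pixel values by 255 before display, so every background pixel in such an array saturates to white, while the pasted region looks correct.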

im_src = np.uint8(outImage.copy() * 255)

But if you don't need outImage as float for anything else, simply replace:

outImage = cv2.add(foreground, background)
outImage = outImage/255

with:

outImage = cv2.add(foreground, background, dtype=cv2.CV_8U)
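As a rough illustration of the same idea without OpenCV, the sketch below does the blend in float and converts the result back to uint8 before it is combined with the frame. The tiny arrays and the helper name `blend_to_uint8` are assumptions made for this example; explicit rounding and clipping stand in for the saturating conversion that OpenCV performs.

```python
import numpy as np


def blend_to_uint8(foreground, background, alpha):
    """Alpha-blend two uint8 BGR images; alpha is a uint8 mask (0-255)."""
    # Scale alpha to 0..1 and add a channel axis so it broadcasts over BGR
    a = alpha.astype(np.float64)[..., np.newaxis] / 255.0
    blended = a * foreground.astype(np.float64) \
        + (1.0 - a) * background.astype(np.float64)
    # Round, clip, and return to uint8 so the result matches the frame's type
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


# Made-up inputs: a bright foreground, a darker background, a varying mask
foreground = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.full((2, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[255, 0], [128, 64]], dtype=np.uint8)

out_image = blend_to_uint8(foreground, background, alpha)
print(out_image.dtype)
print(out_image[:, :, 0])
```

Because the output is uint8 in the 0-255 range, it has the same type and range as a frame read from the camera.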

Suggestions:

I noticed a few things that can be done faster (with fewer operations). These are just a few changes I'd suggest:

1) This:

    # Split png foreground image
    b, g, r, a = cv2.split(foreGroundImage)

    # Save the foreground BGR content into a single object
    foreground = cv2.merge((b, g, r))

is equivalent to:

    # Take the alpha channel of the PNG foreground image
    a = foreGroundImage[:,:,3]

    # Take the foreground BGR content as a single object
    foreground = foreGroundImage[:,:,0:3]
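A small check of this equivalence, using a made-up 4-channel array in place of the PNG (`fg_bgra` is a hypothetical stand-in for `foreGroundImage`, and the per-channel copy plus `np.dstack` stands in for split and merge). One difference worth keeping in mind is that slicing returns views into the original array rather than copies, so call `.copy()` if the slices will be modified independently.

```python
import numpy as np

# Made-up 2x2 image with 4 channels (B, G, R, A)
fg_bgra = np.arange(2 * 2 * 4, dtype=np.uint8).reshape(2, 2, 4)

# Slicing, as suggested above
a = fg_bgra[:, :, 3]
foreground = fg_bgra[:, :, 0:3]

# Plain-NumPy stand-in for split + merge: copy each channel, then restack
b_ch, g_ch, r_ch, a_ch = (fg_bgra[:, :, i].copy() for i in range(4))
merged = np.dstack((b_ch, g_ch, r_ch))

print(np.array_equal(foreground, merged))      # same pixel values
print(np.array_equal(a, a_ch))                 # same alpha channel
print(np.shares_memory(foreground, fg_bgra))   # slice is a view, not a copy
```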

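2) The last part can be done entirely with a resize and a copy. Unless you plan to rotate the image or do something similar, a homography is overkill.

Something like: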

im_dst[startY:endY, startX:endX] = cv2.resize(im_src, (endX-startX, endY-startY))
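To see why a resize and a copy are enough here: when the corners of an axis-aligned image are mapped to the corners of an axis-aligned box, the homography reduces to a scale plus a translation. The sketch below builds that matrix directly with NumPy and checks it against the corner correspondences. The source size and box coordinates are made-up values, and `rect_to_rect_matrix` is a hypothetical helper.

```python
import numpy as np


def rect_to_rect_matrix(src_w, src_h, start_x, start_y, end_x, end_y):
    """3x3 matrix mapping the source image corners onto an axis-aligned box."""
    sx = (end_x - start_x) / (src_w - 1)
    sy = (end_y - start_y) / (src_h - 1)
    return np.array([
        [sx, 0.0, start_x],
        [0.0, sy, start_y],
        [0.0, 0.0, 1.0],
    ])


# Assumed sizes: a 51x21 source image and a box from (200, 80) to (300, 140)
src_w, src_h = 51, 21
start_x, start_y, end_x, end_y = 200, 80, 300, 140

pts_src = np.array([[0, 0], [src_w - 1, 0],
                    [src_w - 1, src_h - 1], [0, src_h - 1]], dtype=float)
pts_dst = np.array([[start_x, start_y], [end_x, start_y],
                    [end_x, end_y], [start_x, end_y]], dtype=float)

H = rect_to_rect_matrix(src_w, src_h, start_x, start_y, end_x, end_y)

# Apply H in homogeneous coordinates and divide by the last component
homogeneous = np.hstack([pts_src, np.ones((4, 1))]) @ H.T
mapped = homogeneous[:, :2] / homogeneous[:, 2:3]

print(H)
print(np.allclose(mapped, pts_dst))
```

The bottom row of the matrix stays at (0, 0, 1), so there is no perspective component; a full homography only pays off once the destination is rotated or no longer a rectangle.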

For python - Homography for image transformation is not working, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55966345/
