
c++ - Unhandled exception on cudaMemcpy2D

Reposted. Author: 行者123. Updated: 2023-12-02 10:06:50

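I am new to C++ (as well as CUDA and OpenCV), so I apologize for any mistakes on my side.
I have existing code that uses CUDA. Until recently it took a decoded .png as input, but now I use a camera to generate live images. These images are the new input to the code. Here it is: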

using namespace cv;

INT height = 2160;
INT width = 3840;
Mat image(height, width, CV_8UC3);
size_t pitch;
uint8_t* image_gpu;

// capture image
VideoCapture camera(0);
camera.set(CAP_PROP_FRAME_WIDTH, width);
camera.set(CAP_PROP_FRAME_HEIGHT, height);
camera.read(image);

// here I checked that image is definitely still a CV_8UC3 Mat with the initial height and width; and it is

cudaMallocPitch(&image_gpu, &pitch, width * 4, height);

// here I use cv::Mat::data to get the pointer to the data of the image:
cudaMemcpy2D(image_gpu, pitch, image.data, width*4, width*4, height, cudaMemcpyHostToDevice);

The code compiles, but on the last line (cudaMemcpy2D) I get an "Exception thrown" error:
Exception thrown at 0x00007FFE838D6660 (nvcuda.dll) in realtime.exe: 0xC0000005: Access violation reading location 0x000001113AE10000.

Google did not give me an answer, and I do not know how to proceed from here.

Thanks for any hints!

Best answer

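A fairly generic way to copy an OpenCV Mat to device memory allocated with cudaMallocPitch is to use the step member of the Mat object. Also, when allocating the device memory, you need a clear picture of how that memory is laid out and how the Mat object will be copied into it. Here is a simple example that demonstrates the procedure for a video frame captured with VideoCapture.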

#include <cstdint>
#include <iostream>
#include <cuda_runtime.h>
#include <opencv2/opencv.hpp>

using std::cout;
using std::endl;

// Size in bytes of one channel element for the given Mat type (0 if unsupported).
size_t getPixelBytes(int type)
{
    switch(type)
    {
        case CV_8UC1:
        case CV_8UC3:
            return sizeof(uint8_t);
        case CV_16UC1:
        case CV_16UC3:
            return sizeof(uint16_t);
        case CV_32FC1:
        case CV_32FC3:
            return sizeof(float);
        case CV_64FC1:
        case CV_64FC3:
            return sizeof(double);
        default:
            return 0;
    }
}

int main()
{
    cv::VideoCapture cap(0);
    cv::Mat frame;

    if(cap.grab())
    {
        cap.retrieve(frame);
    }
    else
    {
        cout<<"Cannot read video"<<endl;
        return -1;
    }

    uint8_t* gpu_image;
    size_t gpu_pitch;

    // Bytes occupied by one channel element of a pixel. VideoCapture mostly
    // returns CV_8UC3 frames, so pixelBytes is usually 1, but just in case.
    size_t pixelBytes = getPixelBytes(frame.type());

    // Number of actual data bytes in one row (channels() is a member function).
    size_t frameRowBytes = frame.cols * frame.channels() * pixelBytes;

    // Allocate pitch-linear memory on the device.
    cudaMallocPitch(&gpu_image, &gpu_pitch, frameRowBytes, frame.rows);

    // Copy the frame to device memory. frame.step is the host row stride in bytes.
    cudaMemcpy2D(gpu_image, gpu_pitch, frame.ptr(), frame.step, frameRowBytes, frame.rows, cudaMemcpyHostToDevice);

    // Rest of the code ...

    cudaFree(gpu_image);
    return 0;
}

Disclaimer:
The code was written in the browser and has not been tested. Please add CUDA error checking as needed.

Regarding c++ - Unhandled exception on cudaMemcpy2D, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59820663/
