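Is it possible to pre-allocate variables in CPU/GPU memory from MexGateway code written in Visual Studio?

Problem description

I am trying to write MexGateway code that passes two variables from MATLAB to a compiled MEX file, copies them to a CUDA kernel, processes them, and returns the result to MATLAB. I need to call this MEX file inside a for loop in MATLAB.

The problem is that both inputs are very large in my application, and only one of them (called Device_Data in the code below) changes from one iteration to the next. So I am looking for a way to pre-allocate the stable input so that it is not removed from the GPU on every iteration of my for loop. I should add that I really need to do this in my Visual Studio code, inside the MexGateway code (I do not want to do it in MATLAB). Is there a solution?

Here is my code (I have compiled it and it works fine):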

#include <cuda_runtime.h>
#include "device_launch_parameters.h"
#include <stdio.h>
#include "cuda.h"
#include <iostream>
#include <mex.h>
#include "MexFunctions.cuh"  // presumably where gpuErrchk is defined

// Element-wise add: each thread updates exactly one element.
// (Looping over all N elements in every thread would race and add N times.)
__global__ void add(int* Device_Data, int* Device_MediumX, int N) {
    int TID = threadIdx.y * blockDim.x + threadIdx.x;
    if (TID < N) {
        Device_Data[TID] = Device_Data[TID] + Device_MediumX[TID];
    }
}

void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {

    int N = 128;
    int* MediumX;
    int* Data;
    int* Data_New;

    // Both inputs are expected to be int32 arrays with N elements.
    // mxGetData returns the raw buffer; mxGetPr is meant for double data.
    MediumX = (int*)mxGetData(prhs[0]);
    Data = (int*)mxGetData(prhs[1]);

    plhs[0] = mxCreateNumericMatrix(N, 1, mxINT32_CLASS, mxREAL);
    Data_New = (int*)mxGetData(plhs[0]);

    int ArrayByteSize = sizeof(int) * N;
    int* Device_MediumX; // device pointer to the X coordinates of the medium
    gpuErrchk(cudaMalloc((int**)&Device_MediumX, ArrayByteSize));
    gpuErrchk(cudaMemcpy(Device_MediumX, MediumX, ArrayByteSize, cudaMemcpyHostToDevice));

    int* Device_Data; // device pointer to the data that changes on every call
    gpuErrchk(cudaMalloc((int**)&Device_Data, ArrayByteSize));
    gpuErrchk(cudaMemcpy(Device_Data, Data, ArrayByteSize, cudaMemcpyHostToDevice));

    dim3 block(N, 1);
    dim3 grid(1); //SystemSetup.NumberOfTransmitter
    add<<<grid, block>>>(Device_Data, Device_MediumX, N);

    // cudaMemcpy takes the byte count as its third argument.
    gpuErrchk(cudaMemcpy(Data_New, Device_Data, ArrayByteSize, cudaMemcpyDeviceToHost));

    // Destroys every device allocation, so nothing survives to the next call.
    cudaDeviceReset();
}

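Solution

Yes, as long as you have MATLAB's Distributed Computing Toolbox / Parallel Computing Toolbox.

The toolbox provides gpuArray objects for use in ordinary MATLAB code, but it also has a C interface through which you can get and set the GPU addresses of these MATLAB arrays.

You can find the documentation here: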

https://uk.mathworks.com/help/parallel-computing/gpu-cuda-and-mex-programming.html?s_tid=CRUX_lftnav

For example, for the first input of the MEX file:

mxInitGPU(); // must be called once before any other mxGPU* function
mxGPUArray const *dataHandler = mxGPUCreateFromMxArray(prhs[0]); // can be CPU or GPU; copies to the GPU if it is not already there
float const *d_data = static_cast<float const *>(mxGPUGetDataReadOnly(dataHandler)); // get the device pointer itself (assuming float data)
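
To show how this could fit the code from the question, here is a minimal sketch of a complete gateway built on the mxGPUArray interface. It is an illustration under assumptions rather than a tested implementation: it assumes both inputs are int32 arrays of equal length, that the first input is the stable one and the second is the one that changes, and the kernel name addKernel and the error identifiers are made up for this example.

```cuda
#include "mex.h"
#include "gpu/mxGPUArray.h"

// Element-wise add, one thread per element (hypothetical kernel name).
__global__ void addKernel(int *data, int const *mediumX, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += mediumX[i];
    }
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxInitGPU();  // required before any other mxGPU* call

    if (nrhs != 2) {
        mexErrMsgIdAndTxt("addOnGpu:nrhs", "Two inputs required: mediumX, data.");
    }

    // Stable input: if prhs[0] is already a gpuArray, no host-to-device copy happens.
    mxGPUArray const *mediumX = mxGPUCreateFromMxArray(prhs[0]);
    // Changing input: copied into a new GPU array so the kernel can write to it
    // without touching the caller's variable.
    mxGPUArray *data = mxGPUCopyFromMxArray(prhs[1]);

    // Assumption of this sketch: both inputs are int32 with the same element count.
    if (mxGPUGetClassID(mediumX) != mxINT32_CLASS ||
        mxGPUGetClassID(data) != mxINT32_CLASS ||
        mxGPUGetNumberOfElements(mediumX) != mxGPUGetNumberOfElements(data)) {
        mxGPUDestroyGPUArray(mediumX);
        mxGPUDestroyGPUArray(data);
        mexErrMsgIdAndTxt("addOnGpu:input", "Inputs must be int32 arrays of equal size.");
    }

    int const n = static_cast<int>(mxGPUGetNumberOfElements(data));
    int const *d_mediumX = static_cast<int const *>(mxGPUGetDataReadOnly(mediumX));
    int *d_data = static_cast<int *>(mxGPUGetData(data));

    if (n > 0) {
        int const threadsPerBlock = 256;
        int const blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        addKernel<<<blocks, threadsPerBlock>>>(d_data, d_mediumX, n);
    }

    // Return the result as a gpuArray; the data stays on the device.
    plhs[0] = mxGPUCreateMxArrayOnGPU(data);

    // Release only the MEX-side handles. There is no cudaDeviceReset() here,
    // so the caller's gpuArray stays allocated between calls.
    mxGPUDestroyGPUArray(mediumX);
    mxGPUDestroyGPUArray(data);
}
```

With this layout the stable input would be converted once before the loop, for example with gpuArray(int32(...)), and the same gpuArray passed on every iteration, so only the changing input is uploaded each time. A file like this is normally built with mexcuda, and the returned gpuArray can be brought back to host memory with gather when it is needed there.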

cuda memory mex
