c - Can't return a pointer in CUDA
For some reason, this code seems to work:
    bool * copyhosttodevice(bool * h_input, size_t numelems) {
        bool * d_output;
        cudaMalloc((void **) &d_output, numelems*sizeof(bool));
        checkCudaErrors(cudaMemcpy((void *)d_output, (void *)h_input, numelems*sizeof(bool), cudaMemcpyHostToDevice));
        return d_output;
    }
but this one generates an error:
    bool * copydevicetohost(bool * d_input, size_t numelems) {
        bool * h_output;
        cudaMalloc((void **) &h_output, numelems*sizeof(bool));
        cudaMemcpy((void *)h_output, (void *)d_input, numelems*sizeof(bool), cudaMemcpyDeviceToHost);
        return h_output;
    }
I'm running this remotely, in the Udacity class on parallel programming.
The output when I call the second function is:
    We are unable to execute your code. Did you set the grid and/or block size correctly?
    Your code compiled!
So I am getting a runtime error. When I remove pieces of the second function, it becomes clear that the error is being generated by the cudaMemcpy() call.
Thanks in advance!
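In the second piece of code you are using cudaMalloc to allocate h_output, and then passing it to the device-to-host copy as the host pointer. That is wrong: cudaMalloc returns a device pointer, while h_output should be a host pointer. The code should look like this: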
    bool * copydevicetohost(bool * d_input, size_t numelems) {
        bool * h_output;
        h_output = (bool *)malloc(numelems*sizeof(bool));
        cudaMemcpy((void *)h_output, (void *)d_input, numelems*sizeof(bool), cudaMemcpyDeviceToHost);
        return h_output;
    }
i.e. use a host memory allocation routine (malloc, C++ new, or perhaps cudaMallocHost if you want pinned host memory), not the device memory allocation API.
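For reference, here is a minimal sketch of both directions with the return code of every CUDA call checked, so a failing call is reported by name instead of surfacing as a generic runtime error. The names check, toDevice and toHost are hypothetical and not from the original post; check is only a stand-in for an error-checking macro such as the checkCudaErrors used in the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the post): report a failed CUDA call and abort.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Host -> device: the destination lives on the device, so it comes from cudaMalloc.
bool *toDevice(const bool *h_input, size_t numelems) {
    bool *d_output = NULL;
    check(cudaMalloc((void **)&d_output, numelems * sizeof(bool)), "cudaMalloc");
    check(cudaMemcpy(d_output, h_input, numelems * sizeof(bool),
                     cudaMemcpyHostToDevice), "cudaMemcpy (host to device)");
    return d_output;  // caller releases with cudaFree
}

// Device -> host: the destination lives on the host, so it comes from malloc.
bool *toHost(const bool *d_input, size_t numelems) {
    bool *h_output = (bool *)malloc(numelems * sizeof(bool));
    if (h_output == NULL) {
        fprintf(stderr, "malloc failed\n");
        exit(EXIT_FAILURE);
    }
    check(cudaMemcpy(h_output, d_input, numelems * sizeof(bool),
                     cudaMemcpyDeviceToHost), "cudaMemcpy (device to host)");
    return h_output;  // caller releases with free
}
```

A round trip would call toDevice on a host array, pass the returned pointer to toHost, and then release the two buffers with cudaFree and free respectively. Each pointer has to be released by the API that matches the one that allocated it.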