pycuda - Why is my rather trivial CUDA program erring with certain arguments?
I have created a simple CUDA program for practice. It only copies data from one array to another:
    import pycuda.driver as cuda
    import pycuda.autoinit
    import numpy as np
    from pycuda.compiler import SourceModule

    # Global constants
    N = 2 ** 20  # Array size
    a = np.linspace(0, 1, N)
    e = np.empty_like(a)
    block_size_x = 512

    # Instantiate block and grid sizes.
    block_size = (block_size_x, 1, 1)
    grid_size = (N / block_size_x, 1)

    # Create the CUDA kernel, and run it.
    mod = SourceModule("""
    __global__ void D2x_kernel(double* a, double* e, int N) {
        int tid = blockDim.x * blockIdx.x + threadIdx.x;
        if (tid > 0 && tid < N - 1) {
            e[tid] = a[tid];
        }
    }
    """)
    func = mod.get_function('D2x_kernel')
    func(a, cuda.InOut(e), np.int32(N), block=block_size, grid=grid_size)

However, I get this error:

    pycuda._driver.LogicError: cuLaunchKernel failed: invalid value

When I get rid of the second argument in my kernel function, double* e, and call the kernel without the argument e, the error goes away. What is the meaning of this error?
Your a array does not exist in device memory, so I suspect that PyCUDA is ignoring (or otherwise mishandling) the first argument to your kernel and only passing in e and N, so you get an error because the kernel was expecting three arguments and only received two. Removing double* e from your kernel definition may eliminate the error message you are getting, but your kernel still will not work properly.

There should be a quick fix for this: wrap a in a cuda.In() call, which tells PyCUDA to copy a to the device before launching the kernel. That is, your kernel launch line should be:

    func(cuda.In(a), cuda.InOut(e), np.int32(N), block=block_size, grid=grid_size)

Edit: Also, do you realize that your kernel is not copying the first and last elements of a to e? Your if (tid > 0 && tid < N - 1) statement is preventing that. To copy the entire array, it should be if (tid < N).
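The effect of the two bounds checks can be seen on the host, without a GPU. The following is only a minimal sketch in plain Python that emulates one thread per index; the helper name emulate_copy and the five-element input are hypothetical and chosen for illustration, and it does not model block or grid dimensions.

```python
def emulate_copy(a, guard):
    """Emulate the copy kernel on the host: one 'thread' per index tid."""
    n = len(a)
    e = [None] * n  # None marks elements that no thread ever wrote
    for tid in range(n):
        if guard(tid, n):
            e[tid] = a[tid]
    return e


a = [0.0, 0.25, 0.5, 0.75, 1.0]

# Guard from the question: the first and last elements are skipped.
original = emulate_copy(a, lambda tid, n: tid > 0 and tid < n - 1)

# Guard suggested in the answer: every element is copied.
fixed = emulate_copy(a, lambda tid, n: tid < n)

print(original)  # [None, 0.25, 0.5, 0.75, None]
print(fixed)     # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In the real program e comes from np.empty_like(a), so the two skipped positions would hold whatever uninitialized values were already there rather than None.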