Kernel launcher

A colleague of mine used C++ variadic templates to mimic a CUDA-style kernel launcher on the CPU with OpenMP:

template<typename F, typename... Ts>
void launch2D(const dim3 & numBlocks, const dim3 & blockDim, F & f, Ts&&... ts)
{
	for (int bx=0;bx<numBlocks.x;++bx)
	for (int by=0;by<numBlocks.y;++by)
	{
		#pragma omp parallel num_threads(blockDim.x*blockDim.y)
		{
			const int tn = omp_get_thread_num();
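			// x is the fastest-varying index, so decompose by blockDim.x
			// (using blockDim.y here is only correct for square blocks)
			const int tx = tn % blockDim.x;
			const int ty = tn / blockDim.x;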
			f(numBlocks, blockDim, {bx,by}, {tx,ty}, ts...);
		}
	}
}

// ...

	const dim3 threadsperBlock {BlockSize,BlockSize};
	const dim3 numBlocks{N/threadsperBlock.x,N/threadsperBlock.y};
	launch2D(numBlocks, threadsperBlock, matrix_multiplication_kernel<BlockSize>, a.data(), b.data(), c.data(), N);

The launch2D call is roughly analogous to the CUDA triple-chevron launch syntax:

matrix_multiplication_kernel<BlockSize><<<numBlocks, threadsperBlock>>>(a.data(), b.data(), c.data(), N);
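One caveat with the OpenMP version: num_threads is only a request, and the runtime may grant fewer threads than blockDim.x*blockDim.y, in which case some thread indices would never run. Below is a minimal sketch of a variant that loops over the thread indices instead, so every index is visited regardless of the team size. The dim3 struct, matmul_kernel, and demo_matmul are hypothetical stand-ins written for this example; they are not the colleague's code and not the CUDA types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CUDA's dim3, so the sketch builds without CUDA headers.
struct dim3 {
    unsigned x, y, z;
    dim3(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1) : x(x_), y(y_), z(z_) {}
};

// Calls f once per (block, thread) index pair. The thread indices are ordinary
// loop iterations, so all of them run even if OpenMP is disabled or grants
// fewer threads than blockDim.x * blockDim.y.
template <typename F, typename... Ts>
void launch2D(const dim3& numBlocks, const dim3& blockDim, F&& f, Ts&&... ts)
{
    for (unsigned bx = 0; bx < numBlocks.x; ++bx)
    for (unsigned by = 0; by < numBlocks.y; ++by)
    {
        #pragma omp parallel for collapse(2)
        for (unsigned ty = 0; ty < blockDim.y; ++ty)
        for (unsigned tx = 0; tx < blockDim.x; ++tx)
            f(numBlocks, blockDim, dim3{bx, by}, dim3{tx, ty}, ts...);
    }
}

// Hypothetical kernel: each (block, thread) pair computes one element of
// c = a * b for row-major n x n matrices.
void matmul_kernel(const dim3& /*numBlocks*/, const dim3& blockDim,
                   const dim3& blockIdx, const dim3& threadIdx,
                   const double* a, const double* b, double* c, unsigned n)
{
    const unsigned col = blockIdx.x * blockDim.x + threadIdx.x;
    const unsigned row = blockIdx.y * blockDim.y + threadIdx.y;
    double sum = 0.0;
    for (unsigned k = 0; k < n; ++k)
        sum += a[row * n + k] * b[k * n + col];
    c[row * n + col] = sum;
}

// Multiplies a 4x4 matrix by the identity and returns the product.
std::vector<double> demo_matmul()
{
    const unsigned n = 4;
    std::vector<double> a(n * n), b(n * n, 0.0), c(n * n, -1.0);
    for (unsigned i = 0; i < n * n; ++i) a[i] = static_cast<double>(i);
    for (unsigned i = 0; i < n; ++i) b[i * n + i] = 1.0;  // identity matrix

    const dim3 threadsPerBlock{2, 2};
    assert(n % threadsPerBlock.x == 0 && n % threadsPerBlock.y == 0);
    const dim3 numBlocks{n / threadsPerBlock.x, n / threadsPerBlock.y};
    launch2D(numBlocks, threadsPerBlock, matmul_kernel, a.data(), b.data(), c.data(), n);
    return c;
}
```

The trade-off is that threads within a block are no longer guaranteed to run concurrently, so a kernel that needs a block-wide barrier would not work with this variant.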

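I suppose it's possible to do something similar with Fypp, Fortran and OpenMP/OpenACC/CUDA. I came up with the following solution, but it lacks encapsulation: the kernel body is pasted in as text, so it silently depends on the loop index i declared inside the macro and on a, x and y from the enclosing scope.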

#:def LAUNCH1D(kernel, n)
block
integer :: i
    !$omp parallel do simd
    do i = 1, ${n}$
        $:kernel
    end do
    !$omp end parallel do simd
end block
#:enddef

#:call LAUNCH1D
y(i) = a*x(i) + y(i)
#:nextarg
n
#:endcall
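For comparison, here is a rough sketch of the same SAXPY in the C++ launcher style, to show the kind of encapsulation the macro is missing. The names launch1D and saxpy are hypothetical and chosen for this example. The kernel is a callable that receives the index as an argument and names the data it uses in its own capture, instead of relying on identifiers that happen to exist at the expansion site.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D analogue of the launcher: calls kernel(i) for i in [0, n).
template <typename F>
void launch1D(std::size_t n, F&& kernel)
{
    #pragma omp parallel for simd
    for (std::size_t i = 0; i < n; ++i)
        kernel(i);
}

// Returns a*x + y. The loop index is passed to the kernel explicitly, and the
// lambda captures a, x and y from this function's scope only.
std::vector<double> saxpy(double a, const std::vector<double>& x, std::vector<double> y)
{
    assert(x.size() == y.size());
    launch1D(y.size(), [&](std::size_t i) { y[i] = a * x[i] + y[i]; });
    return y;
}
```

A Fortran counterpart would presumably need the macro to generate a procedure or take the index name as a parameter; that is left open here.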
