Functions | |
__global__ void | gpuGenerateVerletList (FLOAT_ARRAY_TYPE *dPos, int *dVlists, size_t pitchDVlists, int *dVcount, float *dmax, float rListSq, int N, float boxSize, bool *doUpdate) |
Kernel function to generate Verlet lists for all particles. | |
__global__ void | gpuGenerateVerletListSmem (FLOAT_ARRAY_TYPE *dPos, int *dVlists, size_t pitchDVlists, int *dVcount, float *dmax, float rListSq, int N, float boxSize, bool *doUpdate) |
Kernel function to generate Verlet lists for all particles - using registers and shared memory. | |
__global__ void | gpuGenerateVerletListSmemVar1 (FLOAT_ARRAY_TYPE *dPos, int *dVlists, size_t pitchDVlists, int *dVcount, float *dmax, float rListSq, int N, float boxSize, bool *doUpdate) |
Kernel function to generate Verlet lists for all particles - using shared memory. | |
__global__ void | gpuGenerateVerletListVar1 (FLOAT_ARRAY_TYPE *dPos, int *dVlists, size_t pitchDVlists, int *dVcount, float *dmax, float rListSq, int N, float boxSize, bool *doUpdate) |
Kernel function to generate Verlet lists for all particles - using registers. |
This module contains all functions regarding the Verlet List procedure
__global__ void gpuGenerateVerletList | ( | FLOAT_ARRAY_TYPE * | dPos, | |
int * | dVlists, | |||
size_t | pitchDVlists, | |||
int * | dVcount, | |||
float * | dmax, | |||
float | rListSq, | |||
int | N, | |||
float | boxSize, | |||
bool * | doUpdate | |||
) |
To speedup short range force calculations, this kernel function calculates for each particle a list of possible interaction partners sitting inside a certain radius around the particle's position.
[in] | dPos | Particle position array containing N particles |
[out] | dVlists | Array holding in each column the Verlet List for one particle |
[in] | pitchDVlists | Pitch value for accessing the vlists array via CUDA in the right way. |
[out] | dVcount | This vector stores for each particle the corresponding Verlet list length. |
[in,out] | dmax | Maximum particle displacement is set to zero |
[in] | rListSq | Squared cutoff radius of the Verlet List |
[in] | N | Number of particles |
[in] | boxSize | Edge length of the cubic simulation box |
[in,out] | doUpdate | Flag for Verlet List update indication is set to \ false |
Definition at line 28 of file gpu_tools.cu.
References FLOAT_ARRAY_TYPE, and gpuDistPBC().
__global__ void gpuGenerateVerletListSmem | ( | FLOAT_ARRAY_TYPE * | dPos, | |
int * | dVlists, | |||
size_t | pitchDVlists, | |||
int * | dVcount, | |||
float * | dmax, | |||
float | rListSq, | |||
int | N, | |||
float | boxSize, | |||
bool * | doUpdate | |||
) |
To speedup short range force calculations, this function calculates for each particle a list of possible interaction partners sitting inside a certain radius around the particle's position.
Inside each thread the position of the reference particle is stored inside a register. Then each thread block pulls a block of particles position into shared memory. Finally each thread receives via broadcast the shared memory positions and generates the Verlet List for its reference particle.
numThreadsPerBlock*sizeof
(FLOAT_ARRAY_TYPE) [in] | dPos | Particle position array containing N particles |
[out] | dVlists | Array holding in each column the Verlet List for one particle |
[in] | pitchDVlists | Pitch value for accessing the vlists array via CUDA in the right way. |
[out] | dVcount | This vector stores for each particle the corresponding Verlet list length. |
[in,out] | dmax | Maximum particle displacement is set to zero |
[in] | rListSq | Squared cutoff radius of the Verlet List |
[in] | N | Number of particles |
[in] | boxSize | Edge length of the cubic simulation box |
[in,out] | doUpdate | Flag for Verlet List update indication is set to \ false |
Definition at line 86 of file gpu_tools.cu.
References FLOAT_ARRAY_TYPE, and gpuDistPBC().
Referenced by main().
__global__ void gpuGenerateVerletListSmemVar1 | ( | FLOAT_ARRAY_TYPE * | dPos, | |
int * | dVlists, | |||
size_t | pitchDVlists, | |||
int * | dVcount, | |||
float * | dmax, | |||
float | rListSq, | |||
int | N, | |||
float | boxSize, | |||
bool * | doUpdate | |||
) |
To speedup short range force calculations, this function calculates for each particle a list of possible interaction partners sitting inside a certain radius around the particle's position.
Inside each thread the position of the reference particle is stored in shared memory. For all interacting particles - exept the current thread block - the coordinates are read from global memory, those of the current block are read from shared memory.
numThreadsPerBlock*sizeof
(FLOAT_ARRAY_TYPE) \- The performance of gpuGenerateVerletListSmem is sligtly better. N
has to be equal to numThreadsPerBlock*numBlocks
, otherwise the synchonization fails![in] | dPos | Particle position array containing N particles |
[out] | dVlists | Array holding in each column the Verlet List for one particle |
[in] | pitchDVlists | Pitch value for accessing the vlists array via CUDA in the right way. |
[out] | dVcount | This vector stores for each particle the corresponding Verlet list length. |
[in,out] | dmax | Maximum particle displacement is set to zero |
[in] | rListSq | Squared cutoff radius of the Verlet List |
[in] | N | Number of particles |
[in] | boxSize | Edge length of the cubic simulation box |
[in,out] | doUpdate | Flag for Verlet List update indication is set to \ false |
Definition at line 128 of file gpu_tools.cu.
References FLOAT_ARRAY_TYPE, and gpuDistPBC().
__global__ void gpuGenerateVerletListVar1 | ( | FLOAT_ARRAY_TYPE * | dPos, | |
int * | dVlists, | |||
size_t | pitchDVlists, | |||
int * | dVcount, | |||
float * | dmax, | |||
float | rListSq, | |||
int | N, | |||
float | boxSize, | |||
bool * | doUpdate | |||
) |
To speedup short range force calculations, this function calculates for each particle a list of possible interaction partners sitting inside a certain radius around the particle's position.
[in] | dPos | Particle position array containing N particles |
[out] | dVlists | Array holding in each column the Verlet List for one particle |
[in] | pitchDVlists | Pitch value for accessing the vlists array via CUDA in the right way. |
[out] | dVcount | This vector stores for each particle the corresponding Verlet list length. |
[in,out] | dmax | Maximum particle displacement is set to zero |
[in] | rListSq | Squared cutoff radius of the Verlet List |
[in] | N | Number of particles |
[in] | boxSize | Edge length of the cubic simulation box |
[in,out] | doUpdate | Flag for Verlet List update indication is set to \ false |
Definition at line 57 of file gpu_tools.cu.
References FLOAT_ARRAY_TYPE, and gpuDistPBC().