c - How to treat 64-bit words on a CUDA device?


I want to handle 64-bit words directly on the CUDA platform (e.g. uint64_t variables). I understand that the address space, the registers and the SP architecture are all 32-bit based.

I actually got this to work correctly (on my compute capability 1.1 card):

__global__ void test64Kernel(uint64_t *words) { (*words) <<= 56; }

But I do not know, for example, how this affects register usage or the number of operations per clock cycle.

Which data types you can use does not depend on whether addresses are 32-bit, 64-bit or anything else. In your example you have a pointer (32-bit, 64-bit, 3-bit (!), it does not matter) to a 64-bit unsigned integer.

64-bit integers are supported in CUDA, but every 64-bit value holds twice as much data as a 32-bit value, so it will use more registers, and arithmetic operations will take longer (adding two 64-bit integers is simply expanded into operations on the smaller data type, using carries to push into the next sub-word). The compiler is an optimizing compiler, so it will try to minimize the impact of this.
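
To make the "expanded with carries" point concrete, here is a minimal sketch of a 64-bit addition built from two 32-bit additions. It is an illustration only, not what the compiler literally generates: real hardware typically has add-with-carry instructions, so no explicit comparison is needed. The names U64Parts, split64, join64, add64_via_32 and the HOST_DEVICE macro are all made up for this example. The code is written so that it should compile both as plain C++ and, under nvcc, as host/device code.

```cpp
#include <cstdint>

// Under nvcc, make the helpers callable from host and device code;
// under a plain C++ compiler the qualifier expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper type: one 64-bit value held as two 32-bit halves,
// which is roughly how it occupies two 32-bit registers.
struct U64Parts {
    std::uint32_t lo;  // low 32 bits
    std::uint32_t hi;  // high 32 bits
};

// Split a 64-bit value into its 32-bit halves.
HOST_DEVICE inline U64Parts split64(std::uint64_t v) {
    return U64Parts{static_cast<std::uint32_t>(v),
                    static_cast<std::uint32_t>(v >> 32)};
}

// Reassemble the two halves into one 64-bit value.
HOST_DEVICE inline std::uint64_t join64(U64Parts p) {
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// 64-bit addition expressed as two 32-bit additions plus a carry.
HOST_DEVICE inline U64Parts add64_via_32(U64Parts a, U64Parts b) {
    U64Parts r;
    r.lo = a.lo + b.lo;                              // wraps modulo 2^32
    std::uint32_t carry = (r.lo < a.lo) ? 1u : 0u;   // wrapped around => carry out
    r.hi = a.hi + b.hi + carry;                      // carry pushed into the high word
    return r;
}
```

For example, join64(add64_via_32(split64(x), split64(y))) should give the same result as x + y for any two uint64_t values. The point is simply that one 64-bit operation costs roughly two 32-bit operations and two registers per operand.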

Note that double-precision floating point, which is also 64-bit, is only supported on devices with compute capability 1.3 or higher (i.e. 1.3 or 2.0 at the time of writing).
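
If you want to check this at run time, something along the following lines should work. This is a sketch: supports_double and device_supports_double are hypothetical helper names. The CUDA part uses cudaGetDeviceProperties and the major/minor fields of cudaDeviceProp from the runtime API, and it is only compiled under nvcc.

```cpp
// Pure helper: true if compute capability major.minor is at least 1.3,
// the threshold for double-precision support mentioned above.
inline bool supports_double(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 3);
}

#ifdef __CUDACC__
#include <cuda_runtime.h>

// Query the given device and apply the check above.
// Returns false if the device properties cannot be read.
inline bool device_supports_double(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return false;
    }
    return supports_double(prop.major, prop.minor);
}
#endif
```

A compute capability 1.1 card like the one in the question would fail this check, so double would not be usable there even though uint64_t works.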

