Basic Concepts: Writing OpenCL code for single and double precision
Using the double-precision floating-point type double in OpenCL kernels requires an extension. AMD provides cl_khr_fp64 on newer high-end hardware, and a not fully compliant cl_amd_fp64 extension on other hardware. NVIDIA and Intel support cl_khr_fp64, so those drivers need no special handling.
The code below is based on a page on Bealto written by Eric Bainville. I added extra typedefs, removed a constant, and added DOUBLE_SUPPORT_AVAILABLE for easier fallback.
```c
#if CONFIG_USE_DOUBLE

#if defined(cl_khr_fp64)  // Khronos extension available?
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#define DOUBLE_SUPPORT_AVAILABLE
#elif defined(cl_amd_fp64)  // AMD extension available?
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#define DOUBLE_SUPPORT_AVAILABLE
#endif

#endif // CONFIG_USE_DOUBLE

#if defined(DOUBLE_SUPPORT_AVAILABLE)

// double
typedef double real_t;
typedef double2 real2_t;
typedef double3 real3_t;
typedef double4 real4_t;
typedef double8 real8_t;
typedef double16 real16_t;
#define PI 3.14159265358979323846

#else

// float
typedef float real_t;
typedef float2 real2_t;
typedef float3 real3_t;
typedef float4 real4_t;
typedef float8 real8_t;
typedef float16 real16_t;
#define PI 3.14159265359f

#endif
```
The OpenCL C compiler defines a macro for each available extension, such as cl_khr_fp64 in this example. Testing for that macro tells you whether the extension can be enabled with #pragma OPENCL EXTENSION cl_khr_fp64 : enable.
Now use the defined constant(s) and the real_t, real2_t, etc. types instead of float or double. To switch between double and single precision, pass a definition of CONFIG_USE_DOUBLE as a compilation option to clBuildProgram (for example -D CONFIG_USE_DOUBLE=1). If the device has no double support, the code falls back to single precision.
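One practical consequence of the silent fallback is that the host has to know which precision the kernel ended up with, otherwise its buffers will have the wrong element size. The following is a minimal host-side sketch in plain C, not part of the original code. The helper names has_extension and make_build_options are hypothetical. It assumes you have already fetched the device's space-separated extension string (for instance with clGetDeviceInfo and CL_DEVICE_EXTENSIONS) and only shows how the option string for clBuildProgram could be assembled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the space-separated extension list
 * contains `name` as a whole token, 0 otherwise. */
static int has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += len; /* partial match, keep searching */
    }
    return 0;
}

/* Hypothetical helper: writes the build options into buf and returns 1 if
 * the kernels will be compiled with real_t = double, 0 for float.
 * The result tells the host which element size to use for its buffers. */
static int make_build_options(const char *extensions, int want_double,
                              char *buf, size_t size)
{
    int use_double = want_double &&
        (has_extension(extensions, "cl_khr_fp64") ||
         has_extension(extensions, "cl_amd_fp64"));

    /* -D NAME=VALUE is a standard preprocessor option for clBuildProgram. */
    snprintf(buf, size, "-D CONFIG_USE_DOUBLE=%d", use_double);
    return use_double;
}
```

The resulting string would be passed as the options argument of clBuildProgram, and the return value can select sizeof(double) or sizeof(float) when allocating buffers. The kernel-side check stays in place, so the two sides agree as long as the extension string matches the macros the compiler defines.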