In the Basic Concepts series I try to give an alternative description to what is said everywhere else. This time my eye fell on the convenience methods that exist in two places, which were introduced to be nice to developers with, for example, C/C++ and/or graphics backgrounds. Too often I see explanations that start from the convenience functions and present the “preferred” functions as a sort of bonus for the cases the convenience functions cannot handle. Below I go the other way around, and I hope it gives a better understanding. I assume you have already read another explanation, so this is a second view rather than a first introduction.

Vector Elements

Vectors can be seen as structs on which a computation is applied to all elements at the same time. Each element can be accessed by .sX with X being 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, depending on the number of elements in the vector: a 16-vector has all 16 and an 8-vector only 0 to 7. The convenience methods are .x, .y, .z, .w, .hi, .lo, .even and .odd. Below are the methods as defined in the standard. The abbreviation N.D. stands for not defined. For 3-vectors some methods are not explicitly undefined, but it is vague how they should be implemented, so I marked them with “??”.

Convenience alternative	vector2	vector3	vector4	vector8	vector16
.x	.s0	.s0	.s0	N.D.	N.D.
.y	.s1	.s1	.s1	N.D.	N.D.
.z	.s2	.s2	.s2	N.D.	N.D.
.w	.s3	N.D.	.s3	N.D.	N.D.
.hi	.s1	??	.s23	.s4567	.s89ABCDEF
.lo	.s0	??	.s01	.s0123	.s01234567
.even	.s0	??	.s02	.s0246	.s02468ACE
.odd	.s1	??	.s13	.s1357	.s13579BDF
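The pattern behind the last four rows can be written down as index arithmetic. Below is a minimal sketch in plain C (not OpenCL C, so it compiles without a driver) that models which .sX indices each selector picks for a vector of n elements. The function names are hypothetical and only cover n = 2, 4, 8 and 16; the vague 3-vector case is left out on purpose.

```c
#include <stddef.h>

/* Hypothetical helpers: each one writes the .sX indices a selector picks
   for an n-element vector (n = 2, 4, 8 or 16) into out[] and returns how
   many indices were written (always n / 2). */

/* .lo selects the lower half: 0 .. n/2 - 1 */
size_t lo_indices(size_t n, size_t out[]) {
    for (size_t i = 0; i < n / 2; ++i) out[i] = i;
    return n / 2;
}

/* .hi selects the upper half: n/2 .. n - 1 */
size_t hi_indices(size_t n, size_t out[]) {
    for (size_t i = 0; i < n / 2; ++i) out[i] = n / 2 + i;
    return n / 2;
}

/* .even selects indices 0, 2, 4, ... */
size_t even_indices(size_t n, size_t out[]) {
    for (size_t i = 0; i < n / 2; ++i) out[i] = 2 * i;
    return n / 2;
}

/* .odd selects indices 1, 3, 5, ... */
size_t odd_indices(size_t n, size_t out[]) {
    for (size_t i = 0; i < n / 2; ++i) out[i] = 2 * i + 1;
    return n / 2;
}
```

Calling hi_indices(8, out) fills out with 4, 5, 6, 7, which matches the .s4567 entry in the table.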

To get an idea of what a float4 is, here is an (incomplete) description:

struct float4 {
    float s0, s1, s2, s3;               /* numeric element access */
    float x, y, z, w;                   /* convenience names for s0..s3 */
    float2 hi, lo, odd, even;           /* halves of a float4 are float2 */
    float2 s01, s02, s03, s10, s12, s13, s20, s21, s23, s30, s31, s32;
    float2 xy, xz, xw, yx, yz, yw, zx, zy, zw, wx, wy, wz;
    float3 s012, s021, s023, s032, s031, s013, … /* etc */
    float3 xyz, xzy, xzw, … /* etc */
    float4 s0123, s0132, … /* etc */
    /* etc - see remark below */
};

We are missing e.g. float8 s10123422, but selections like that are quite hard to list in a struct. Note that the specification does allow repeated elements when a vector is read (on the right-hand side of an assignment), but not when it is written to. Just try whether .s0011 and .xxyy work with your drivers.
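To make clear what a selection such as .s0011 means, here is a small sketch in plain C (again a model, not OpenCL C). It takes the hex digits after ".s" as a string and gathers the matching elements, repetitions included. Both swizzle_read and hex_index are hypothetical names made up for this illustration.

```c
#include <stddef.h>

/* Map one hex digit of an .sX selector to an element index, or -1 if the
   character is not a hex digit. */
static int hex_index(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical model of reading v.sXYZ...: src has n elements, sel holds
   the digits after ".s" (for example "0011"). The selected elements are
   copied to dst in order. Returns the number of elements written, or -1
   if a digit is invalid or out of range for an n-element vector (dst may
   then be partially written). */
int swizzle_read(const float *src, size_t n, const char *sel, float *dst) {
    int count = 0;
    for (; *sel != '\0'; ++sel, ++count) {
        int i = hex_index(*sel);
        if (i < 0 || (size_t)i >= n) return -1;
        dst[count] = src[i];
    }
    return count;
}
```

With src = {1, 2, 3, 4}, the selector "0011" gives {1, 1, 2, 2}, while "4" is rejected because a 4-vector only has elements 0 to 3.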

Conversions

Next are conversions between types. The fully specified function is convert_destType<_sat><_roundingMethod>. Most developers are familiar with explicit conversions like:

float a = 5.6f;
int b = (int) a; // = 5

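In OpenCL this is the convenience function; it only works with scalars and offers one rounding mode without saturation. An explicit conversion ‘(destType)’ to an integer type can be described as ‘convert_destType_rtz’ (or ‘convert_destType’, as round-toward-zero is the default for integer destinations). That is why the example above gives 5 and not 6. For floating-point destinations the default of convert_ is round-to-nearest-even (_rte).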

You do use (type) when you want to widen a scalar to a vector. For example:

float8 f = (float8) 1.0f;

If you get used to convert_, you don’t have to think about which method to use depending on whether it’s a scalar or a vector, whether you need the default rounding or another rounding mode, and whether you need saturation or not. As a bonus, here are the rounding modes with two example values, each with both signs.
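Saturation is the part of convert_ that a cast cannot express, so a small model may help. The sketch below is plain C, not OpenCL C, and the two function names are hypothetical. It assumes an 8-bit unsigned char and compares an int-to-uchar conversion with and without _sat.

```c
/* Hypothetical model of convert_uchar_sat(int): values outside the
   destination range are clamped to the nearest representable value.
   Assumes unsigned char is 8 bits wide (range 0..255). */
unsigned char convert_uchar_sat_model(int x) {
    if (x < 0) return 0;
    if (x > 255) return 255;
    return (unsigned char)x;
}

/* Hypothetical model of convert_uchar(int) without _sat: an out-of-range
   integer follows the C conversion rules and wraps modulo 256. */
unsigned char convert_uchar_model(int x) {
    return (unsigned char)x;
}
```

For an input of 300 the saturating model returns 255, while the plain conversion wraps around to 44.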

float	convert_int_rte	convert_int_rtz	convert_int_rtp	convert_int_rtn
+1.6f	2	1	2	1
-1.6f	-2	-1	-1	-2
+1.4f	1	1	2	1
-1.4f	-1	-1	-1	-2
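The table can be reproduced with a few lines of plain C. This is only a model of the four rounding modes, built on the truncating C cast; the function names are hypothetical, and it assumes a finite input that fits comfortably in an int.

```c
/* Hypothetical plain C models of the OpenCL rounding modes for
   float -> int. Assumes x is finite and well inside the int range. */

/* _rtz: round toward zero, which is what the C cast does */
int convert_int_rtz_model(float x) {
    return (int)x;
}

/* _rtn: round toward negative infinity (floor) */
int convert_int_rtn_model(float x) {
    int t = (int)x;
    return ((float)t > x) ? t - 1 : t;
}

/* _rtp: round toward positive infinity (ceiling) */
int convert_int_rtp_model(float x) {
    int t = (int)x;
    return ((float)t < x) ? t + 1 : t;
}

/* _rte: round to nearest, ties go to the even neighbour */
int convert_int_rte_model(float x) {
    int lo = convert_int_rtn_model(x);
    float frac = x - (float)lo;          /* in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1;  /* exact tie: pick the even one */
}
```

Feeding -1.6f to the four models gives -2, -1, -1 and -2 for rte, rtz, rtp and rtn, the same as the second row of the table.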

Thank you

Thank you for your time; I hope you liked the alternative view. Check out the rest of the series while it is still small.

StreamHPC communications

Basic Concepts: OpenCL Convenience Methods for Vector Elements and Type Conversions
