Tag Archives: OpenCL

If you are looking for the samples in one zip-file, scroll down.

This sentence “NVIDIA’s Industry-Leading Support For OpenCL” was proudly used on NVIDIA’s OpenCL page last year. It seems that NVIDIA saw a great future for OpenCL on their GPUs. But when CUDA began borrowing the idea of using LLVM for compiling kernels, NVIDIA’s support for OpenCL slowly started to fade instead. Since with LLVM CUDA-kernels can be loaded in OpenCL and vice versa, this could have brought the two techniques more together.

What is the cause for this decreased support for OpenCL? Did they suddenly got aware LLVM would decrease any advantage of CUDA over OpenCL and therefore decreased support for OpenCL? Or did they decide so long ago, as their last OpenCL-conformant product on Windows is from July 2010? We cannot be sure, but we do know NVIDIA does not have an official statement on the matter.

The latest action demonstrating NVIDIA’s reduced support of OpenCL is the absence of the samples in their GPGPU-SDK. NVIDIA removed them without notice or clear statement on their position on OpenCL. Therefore we decided to start a petition to get these OpenCL samples back. The only official statement on the removal of the samples was on LinkedIn:

All of our OpenCL code samples are available at http://developer.nvidia.com/opencl, and the latest versions all work on the new Kepler GPUs.
They are released as a separate download because developers using OpenCL don’t need the rest of the CUDA Toolkit, which is getting to be quite large.
Sorry if this caused any alarm, we’re just trying to make life a little easier for OpenCL developers.

Best regards,

Will.

William Ramey
Sr. Product Manager, GPU Computing
NVIDIA Corporation

Read more …

Screenshot from Intel’s “God Rays” demo

This article is still work-in-progress

Intel has just released its OpenCL bit CPU-drivers, version 2013 bèta. It has support for OpenCL 1.1 (not 1.2 as for the CPU) on Intel HD Graphics 4000/2500 of the 3rd generation Core processors (Windows only). The release notes mention support for Windows 7 and 8, but the download-site only mentions windows 8. Support under Linux is limited to 64 bits.

The release notes mention:

  • General performance improvements for many OpenCL* kernels running on CPU.
  • Preview Tool: Kernel Builder (Windows)
  • Preview Feature: support of  kernel source code hotspots analysis with the Intel VTuneT Amplifier XE 2011 update 3 or higher.
  • The GNU Project Debugger (GDB) debugging support on Linux operating systems.
  • New OpenCL 1.2 extensions supported by the CPU device:
    • cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics
    • cl_khr_fp16
    • cl_khr_gl_sharing
    • cl_khr_gl_event
    • cl_khr_d3d10_sharing
    • cl_khr_dx9_media_sharing
    • cl_khr_d3d11_sharing.
  • OpenCL 1.1 extensions that were changed in OpenCL 1.2:
    • Device Fission supports both OpenCL 1.1 EXT API’s and also OpenCL* 1.2 fission core features
    • Media Sharing support intel 1.1 media sharing extension and also the 1.2 KHR media sharing extension
    • Printf extension is aligned with OpenCL 1.2 core feature.

Check the release notes for full information.

The drivers can be found on http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk-2013/. Installation is simple. For Windows there is a installer. If you have Linux, make sure you remove any previous version of Intel’s openCL drivers. If you have a Debian-based Linux, use the command ‘alien’ to convert the rpm to deb, and make sure ’libnuma1‘ is installed. There are requirements for libc 2.11 or 2.12 – more information on that later as Ubuntu 12.04 has libc6 2.15.

Read more …

If you want to see what is coming up in the market of consumer-technology (PC, mobile and tablet), then NVIDIA can tell you the most. The company is very flexible, and shows time after time it really knows in which markets is currently operates and can enter. I sometimes strongly disagree with their marketing, but watch them closely as they are in the most important markets to define the near future in: PCs, Mobile/Tablet and HPC.
You might think I completely miss interconnects (buses between processors, devices and memory) and memory-technologies as clouds have a large need for high-speed data-transport, but the last 20 years have shown that this is a quite stable developing market based on IP-selling to the hardware-vendors. With the acquisition of Cray’s interconnect technology, we have seen this is serious business for Intel, so things might change indeed. For this article I want to focus on NVIDIA’s choices.

The Khronos Group gave some talks on their technologies in Shanghai China on the 17th of March 2012. Neil Trevett did some interesting remarks on the position of NVidia on OpenCL I would like to share with you. Neil Trevett is both an important member of Khronos and employee of NVidia. To be more precise, he is the Vice President Mobile Content of NVidia and the president of Khronos. I think we can take his comments serious, but we must be very careful as these are mixed with his personal opinions.

Regular readers of the blog have seen I am not enthusiastic at all about NVidia’s marketing, but am a big fan of their hardware. And exactly I am very positive they are bold enough in the industry to position themselves very well with the fast-changing markets of the upcoming years. Having said that, let’s go to the quotes.

All quotes were from this video. Best you can do is to start at 41:50 till 45:35.

At 44:05 he states: “In the mobile I think space CUDA is unlikely to be widely adopted“, and explains: “A party API in the mobile industry doesn’t really meet market needs“. Then continues with his vision on OpenCL: “I think OpenCL in the mobile is going to be fundamental to bring parallel computation to mobile devices” and then “and into the web through WebCL“.

Also interesting at 44:55: “In the end NVidia doesn’t really mind which API is used, CUDA or OpenCL. As long as you are get to use great GPUs“. He ends with a smile, as “great GPUs” refers to NVidia’s of course. :)

At 45:10 he puts NVidia’s plans on HPC, before getting back to : “NVidia is going to support both [CUDA and OpenCL] in HPC. In Mobile it’s going to be all OpenCL“.

At 45:23 he repeats his statements: “In the mobile space I expect OpenCL to be the primary tool“.

Read more …

By exception, another PDF-Monday.

OpenCL vs. OpenMP: A Programmability Debate. The one moment OpenCL and the other mom ent OpenMP produces faster code. From the conclusion: “OpenMP is more productive, while OpenCL is portable for a larger class of devices. Performance-wise, we have found a large variety of ratios between the two solutions, depending on the application, dataset sizes, compilers, and architectures.”

Improving Performance of OpenCL on CPUs. Focusing on how to optimise OpenCL. From the abstract: “First, we present a static analysis and an accompanying optimization to exclude code regions from control-flow to data-flow conversion, which is the commonly used technique to leverage vector instruction sets. Second, we present a novel technique to implement barrier synchronization.”

Variants of Mersenne Twister Suitable for Graphic Processors. Source-code at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MTGP/

Accelerating the FFTD method using SSE and GPUs. “The Finite-Difference Time-Domain (FDTD) method is a computational technique for modelling the behaviour of electromagnetic waves in 3D space”. This is a project-plan, but describes the theories pretty well. Read more …

You have run-time and compile-time of the C-code and of the OpenCL-code. It is very important to make clear when you talk about compile-time of the kernel as this can be confusing. Compile-time of the kernel is at run-time of the software after the compute-devices have been queried. The OpenCL-compiler can make better optimised code when you give as much information as possible. One of the methods is using Function Qualifiers. A function qualifier is notated as a kernel-attribute:

__kernel __attribute__((qualifier(qualification)))  void foo ( …. ) { …. }

There are three qualifiers described in OpenCL 1.x. Let’s walk through them one by one. You can also read them here in the official documentation, with more examples.

Read more …

Get in contact now!

We offer training in GPU-programming (OpenCL, CUDA, etc),
and consultancy-services for performance engineering.

Mail to info@streamcomputing.eu or fill in below form.

The web-form currently does not work.
Please send an e-mail while we resolve the issue.