Appendix D. OpenCL on mobile devices

 

Each new generation of mobile devices provides more capabilities than the last, and it’s only a matter of time before high-performance embedded computing becomes a serious priority. A great deal of this performance will be provided by the devices’ GPUs, so it’s important to understand how OpenCL operates on handheld and mobile devices.

Chapter 10 of the OpenCL 1.1 standard defines the OpenCL Embedded Profile. These are the criteria that embedded devices must meet to be considered OpenCL-compliant. The requirements are a subset of the rules that apply to desktop systems, so there’s nothing significantly new or different to learn. But when you’re porting OpenCL code to run on a tablet computer or smartphone, it’s crucial to know which capabilities are available and which aren’t.

First, OpenCL provides a macro that kernels can check to see if the target implements the embedded profile: __EMBEDDED_PROFILE__. If this macro is defined (it’s set to 1 on embedded-profile devices), the kernel can access only an abridged set of OpenCL capabilities. Otherwise, the macro is undefined and the kernel can access all the capabilities defined by the full profile.
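As a minimal sketch of how a kernel might test for the embedded profile, the code below branches on __EMBEDDED_PROFILE__ at compile time. The names embedded_profile_in_use and report_profile are hypothetical and chosen only for illustration. The shim macros at the top are also an assumption of this sketch: they exist solely so the same file builds as ordinary host C, where the OpenCL qualifiers aren't available.

```c
/* Hypothetical shims (not part of OpenCL): when this file is built as
   ordinary host C, __OPENCL_VERSION__ is undefined, so the OpenCL
   qualifiers are defined away. An OpenCL compiler skips this block. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* Returns 1 when compiled for an embedded-profile device, 0 otherwise.
   __EMBEDDED_PROFILE__ is predefined as 1 by embedded-profile
   compilers and is left undefined by full-profile compilers. */
int embedded_profile_in_use(void)
{
#ifdef __EMBEDDED_PROFILE__
    return 1;   /* only the abridged feature set is available */
#else
    return 0;   /* full-profile capabilities are available */
#endif
}

/* Hypothetical kernel: stores the result in a one-element buffer so
   the host can read back which profile the kernel was built for. */
__kernel void report_profile(__global int *result)
{
    *result = embedded_profile_in_use();
}
```

In real code, the same #ifdef test would typically guard the sections of a kernel that depend on full-profile features, with a fallback path in the #ifdef branch for embedded targets.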

The differences between the embedded profile and the full profile fall into two main categories: numerical processing and image processing. This appendix discusses both, and we’ll start by examining how embedded OpenCL processes numbers.

D.1. Numerical processing

D.2. Image processing

D.3. Summary