Sun Dec 11 16:18:29 EST 2016


on the Radeon HD 7800 Pitcairn
[    86.336] (--) RADEON(0): Chipset: "PITCAIRN" (ChipID = 0x6811)

This would be OpenCL.



OpenGL renderer string: Gallium 0.4 on AMD PITCAIRN (DRM 2.43.0 / 4.6.0-1-amd64, LLVM 3.8.1)

# apt-get install mesa-opencl-icd


see also git/opencl archive

tom@zoe:~/opencl$ ./devices.elf
1. Platform
  Version: OpenCL 1.1 Mesa 13.0.2
  Name: Clover
  Vendor: Mesa
  Extensions: cl_khr_icd
1. Device: AMD PITCAIRN (DRM 2.43.0 / 4.6.0-1-amd64, LLVM 3.9.0)
 1.1 Hardware version: OpenCL 1.1 Mesa 13.0.2
 1.2 Software version: 13.0.2
 1.3 OpenCL C version: OpenCL C 1.1 
 1.4 Parallel compute units: 20

What's the difference between compute units (20) and stream cores (Wikipedia lists 1024-1280)?

A CU is roughly equivalent to an independent CPU core.
Each CU is subdivided into stream cores, programmed using SIMT.

GCN = Graphics Core Next

The Graphics Core Next microarchitecture (first generation codenamed
"Southern Islands") combines 64 shader processors with 4 TMUs into a
compute unit (CU). The ROPs are not part of the CU; they sit in
separate render back-ends.
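
A quick sanity check that the two numbers agree, as a sketch only: with 64
shader processors per CU, the 20 CUs reported by devices.elf give 1280 stream
cores, which is the upper end of the Wikipedia range. By the same arithmetic
the lower end (1024) would mean 16 CUs. The function names are made up for
this note and do not come from any API.

```python
# Stream cores = compute units * shader processors per CU.
SHADERS_PER_CU = 64  # from the GCN description in these notes


def stream_cores(compute_units):
    """Total stream cores for a GCN part with the given number of CUs."""
    return compute_units * SHADERS_PER_CU


def compute_units(stream_core_count):
    """Inverse: number of CUs implied by a stream core count."""
    return stream_core_count // SHADERS_PER_CU


print(stream_cores(20))     # 20 CUs, as reported by devices.elf
print(compute_units(1024))  # low end of the Wikipedia range
```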

Each Compute Unit consists of:
- a CU Scheduler
- a Branch & Message Unit
- 4 SIMD Vector Units (each 16 lanes wide)
- 4 64KiB VGPR files
- 1 scalar unit
- a 4 KiB GPR file
- a local data share of 64 KiB
- 4 Texture Filter Units
- 16 Texture Fetch Load/Store Units
- a 16 KiB L1 Cache.
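
Adding up the list above, as a rough sketch: the sizes are the ones listed,
the 20 CUs come from the devices.elf output, and the dictionary layout and
names are only for illustration. The device-wide totals are plain
multiplication and ignore anything shared between CUs.

```python
# Per-CU resources taken from the list above (sizes in KiB).
CU = {
    "simd_units": 4,
    "lanes_per_simd": 16,
    "vgpr_kib_per_simd": 64,
    "lds_kib": 64,
    "l1_kib": 16,
}
COMPUTE_UNITS = 20  # reported by devices.elf

lanes_per_cu = CU["simd_units"] * CU["lanes_per_simd"]        # 4 * 16
vgpr_kib_per_cu = CU["simd_units"] * CU["vgpr_kib_per_simd"]  # 4 * 64

print("vector lanes per CU:", lanes_per_cu)
print("VGPR per CU (KiB):", vgpr_kib_per_cu)
print("VGPR, whole device (KiB):", COMPUTE_UNITS * vgpr_kib_per_cu)
print("LDS, whole device (KiB):", COMPUTE_UNITS * CU["lds_kib"])
print("L1, whole device (KiB):", COMPUTE_UNITS * CU["l1_kib"])
```

The 4 x 16 = 64 vector lanes per CU are the same 64 "shader processors"
mentioned earlier.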

Four compute units are wired to share a 16 KiB instruction cache and a
32 KiB scalar data cache. These are backed by the L2 cache.

A SIMD-VU operates on 16 elements at a time (per cycle), while an SU
can operate on one at a time (one/cycle). In addition the SU handles some
other operations like branching.
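
To put the 16-per-cycle figure in context, a small idealized calculation.
Assumption not stated in these notes: GCN groups work-items into wavefronts
of 64, so one vector instruction for a whole wavefront needs several passes
through a 16-lane SIMD-VU. The function name is hypothetical, and this is
arithmetic only, not a timing model of the hardware.

```python
SIMD_LANES = 16      # elements a SIMD-VU handles per cycle (from the notes)
WAVEFRONT_SIZE = 64  # assumption: GCN wavefront width, not from the notes


def cycles_per_wavefront_op(wavefront_size=WAVEFRONT_SIZE, lanes=SIMD_LANES):
    """Cycles for one SIMD-VU to run one vector instruction over a wavefront."""
    return -(-wavefront_size // lanes)  # ceiling division


print(cycles_per_wavefront_op())
```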

This seems interesting: