The flagship implementation, dubbed IMG AXT-64-2048, is rated at 2Tflop/s for computation, 64Gpixel/s for graphics processing and 8Tops for artificial intelligence processing.

IMG A-Series delivers improvement at the same clock speed and process compared with currently shipping PowerVR devices, claims the firm, offering 2.5x the performance, 8x faster machine learning processing and 60% lower power. "It is the fastest GPU intellectual property ever released," said Imagination, listing: automotive, IoT artificial intelligence, digital TV, set-top box, over-the-top, phones and servers amongst potential applications.

Geometry processing in maximum configuration A-Series GPU (left)

The 2.5x performance increase is in computational speed/mm² compared with a recent Series 9 core, running the computation-heavy Manhattan benchmark, PowerVR product director Kristof Beets told Electronics Weekly.

In the language used to compare CPU cores, according to Beets, Imagination's earlier GPU cores were like CISC (complex instruction set) CPUs, whereas the new A-Series GPUs are RISC-like in nature, with a reduced instruction set and therefore simpler hardware.

"GPUs traditionally had complex ALUs [arithmetic logic units] like CISC compared with RISC. If you simplify the instruction set, you get higher hardware utilisation," he explained.

"With CISC, the onus is on the compiler to keep hardware filled with work. With a RISC-ish architecture in the A-Series, and having gone ultra-wide as well (A-Series has 128 ALUs operating in parallel), A-Series is much easier for compilation," said Beets.

With all those busses in parallel, won't any silicon need an awful lot of metal layers?

"No," said Beets: "Actually, congestion is much better in this [A-Series] design as everything is much more regular. The older GPUs have fewer ALUs, but a lot of multiplexers."

A-Series ALU

For comparison, the earlier (Rogue) ALU

As well as 128 ALUs, there are 32 of what Beets calls "more-than-ALUs" (diagram right), which are far more capable than the simple versions, and are intended for sine, cos, log calculations and atomic operations, amongst other functions. Both pipelines can be used at the same time.

Looking at the maximum configuration (AXT-64-2048, diagram left), there are four identical blocks dubbed scaleable processing units (SPUs).

Within each of these SPUs is a square representing 128 32-bit ALUs, each of which can perform multiply and add simultaneously, giving 256 flop in parallel.

Looking closely, there is another square behind that in each SPU: a second set of ALUs, pushing capacity to 512 flop in parallel.

Configurations are available with one, two, three or four of these dual-ALU SPUs. In the pictured maximum configuration, there are four, taking its capacity to 2,048 simultaneous 32-bit floating point operations, to which can be added the capacity of the associated more-than-ALUs.
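The arithmetic behind those figures can be sketched in a few lines of Python. The ALU counts and the 2Tflop/s rating come from the figures above; the constant and function names are illustrative only, and the implied clock is a back-of-envelope derivation rather than a number quoted by Imagination. The more-than-ALUs are left out of the count.

```python
# Figures taken from the article; all names here are illustrative.
ALUS_PER_SET = 128       # 32-bit ALUs in one set
FLOP_PER_ALU = 2         # multiply and add performed simultaneously
ALU_SETS_PER_SPU = 2     # the second "square" behind the first
HEADLINE_FLOPS = 2e12    # the quoted 2Tflop/s rating


def flop_per_clock(spus: int) -> int:
    """Peak simultaneous 32-bit floating point operations for a given SPU count."""
    if not 1 <= spus <= 4:
        raise ValueError("configurations have one, two, three or four SPUs")
    return spus * ALU_SETS_PER_SPU * ALUS_PER_SET * FLOP_PER_ALU


for spus in range(1, 5):
    print(f"{spus} SPU(s): {flop_per_clock(spus)} flop in parallel")

# Derived, not quoted: the clock at which 2,048 flop per cycle reaches 2Tflop/s.
implied_clock_hz = HEADLINE_FLOPS / flop_per_clock(4)
print(f"implied clock: {implied_clock_hz / 1e6:.0f} MHz")
```

On these assumptions the four-SPU part would need a clock of roughly 1GHz to reach its headline rating.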

Pixel process flow in A-Series rendering the gold surface (top right) over the geometry tile-by-tile (right)

Imagination is claiming, although not yet quantifying, power savings from the A-Series architecture ("It has lower power compared to competitors at the same clock and process") and foresees this giving it an advantage in phones used for gaming.

"Most mobile GPUs struggle to deliver consistent gaming experiences due to thermal constraints. They are fast for a while, then slow, and then never recover," said Imagination. "IMG A-Series delivers sustained performance for extended game play at consistent frame rates avoiding thermal panic clock throttling, glitches or over-heating."

The firm's tile-based deferred rendering, where only what needs to be visible on the screen is drawn, is part of power saving, as is active dynamic voltage and frequency scaling, controlled by low-latency deadline scheduling algorithms. "If parts of the GPU aren't fully utilised or needed, they are immediately slowed down or even put to sleep to ensure power efficiency," said Imagination.

For development, there is a tool-set and SDK (software development kit) as well as some on-GPU hardware: for example, counters have been included to report per-tile processing (in the form of a visual heat map in the tools) to allow application developers to focus additional hardware at bottle-neck-causing tricky parts of an image. API standards including OpenGL ES, Vulkan and OpenCL are supported.

Computational flow in A-Series (right)

Suffixes from the earlier Series-n nomenclature are retained.

The A-Series will be split into:

IMG A-Series cores are:

Concurrent tasks in A-Series

Within each group of 128 parallel ALUs, all ALUs perform the same operation simultaneously, so any multi-tasking within a 128 ALU block is time-sliced on to it.

"To save power and time during task-switching, ALUs have register banks to keep the data from a large number of threads local," said Beets.
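A toy Python model may help picture this. It treats one block as 128 lanes that all apply the same operation in a given time-slice, and keeps each task's data in its own local register set so that switching tasks moves nothing. The round-robin order, the task names and the `run_time_sliced` helper are assumptions made for illustration; the scheduling policy of the real hardware is not described here.

```python
from collections import deque

LANES = 128  # ALUs in one block, per the article


def run_time_sliced(tasks):
    """Time-slice tasks onto one 128-lane block.

    tasks maps a task name to (op, steps): op is applied to every lane in
    lockstep, once per time-slice, for the given number of slices.
    Round-robin ordering is an assumption of this sketch.
    """
    # Each task keeps its own register set, so a switch copies no data.
    registers = {name: [float(i) for i in range(LANES)] for name in tasks}
    queue = deque((name, op, steps) for name, (op, steps) in tasks.items())
    schedule = []
    while queue:
        name, op, steps = queue.popleft()
        # One time-slice: all 128 lanes perform the same operation.
        registers[name] = [op(x) for x in registers[name]]
        schedule.append(name)
        if steps > 1:
            queue.append((name, op, steps - 1))
    return schedule, registers


schedule, regs = run_time_sliced({
    "shade": (lambda x: x * 2.0, 2),  # hypothetical graphics task
    "blend": (lambda x: x + 1.0, 1),  # hypothetical second task
})
print(schedule)
print(regs["shade"][3], regs["blend"][3])
```

The point of the sketch is only that the block never runs two different operations at once: the tasks interleave in time while their data stays put.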

To map and prioritise tasks on to the various time-slices available across up to four SPUs, Imagination has created an operating system called HyperLane, which can treat all available GPU resources as up to eight virtual GPUs, dubbed HyperLanes, each of which can run more than one task.

As well as 32-bit floating point data, for artificial intelligence processing each ALU can work with 8-bit weighted data. HyperLane has a feature called AI Synergy which enables sufficient GPU graphics performance to be delivered, while allocating spare resources to implement programmable AI. "AI Synergy delivers programmable AI in the lowest silicon area, while a unified software stack enables flexibility and performance," said the firm. The resource split between graphics and AI processing can be dynamic. HyperLane technology can also isolate protected content for rights management. All IMG A-Series GPUs support up to eight HyperLanes.
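As a rough consistency check, the flagship's headline ratings (2Tflop/s and 8Tops) differ by the same factor as the two data widths. The short Python sketch below only compares the quoted numbers; reading this as four 8-bit operations fitting where one 32-bit operation would is an assumption, not something Imagination has stated.

```python
# Headline ratings quoted for the AXT-64-2048; names are illustrative.
FP32_RATE = 2e12  # 2Tflop/s
AI_RATE = 8e12    # 8Tops

FP32_BITS = 32
AI_BITS = 8

rate_ratio = AI_RATE / FP32_RATE     # how much faster the AI rating is
width_ratio = FP32_BITS // AI_BITS   # how many 8-bit values fit in 32 bits

print(f"rating ratio: {rate_ratio:.0f}x, data width ratio: {width_ratio}x")
```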

HyperLane also interacts with virtualisation hardware which is controlled by a separate on-board microcontroller running hypervisor code. Separated by hardware virtualisation, up to eight programmes can be run independently.

Visit link:
Imagination boosts parallelism in GPUs to speed mobile graphics and AI - Electronics Weekly

December 5, 2019 at 4:47 am by Mr HomeBuilder