Future AMD GPU Architecture Could Implement BFloat16 in Hardware

A future AMD graphics architecture could implement BFloat16 floating-point capability in silicon. Updates to AMD's ROCm libraries on GitHub dropped a big hint that the company is implementing the compute standard, which has significant advantages over the FP16 format implemented by current-generation AMD GPUs. BFloat16 offers a significantly higher range than FP16, which caps out at just 6.55 x 10^4, forcing certain AI researchers to fall back to the relatively inefficient FP32 math hardware.

BFloat16 trades precision for range: it uses three fewer significand bits than FP16 (8 bits versus 11, counting the implicit leading bit), but carries 8 exponent bits where FP16 offers only 5. Because BFloat16 is essentially a truncated FP32, it is also more resilient to overflow and underflow when converting to and from FP32 than FP16 is.

The addition of BFloat16 is more of a 'future-proofing' measure by AMD. Atomic operations in modern 3D game rendering are unlikely to benefit from BFloat16 in comparison to FP16. BFloat16, however, will pay huge dividends to the AI machine-learning community.

Sources: ROCm (GitHub), dylan522p (Reddit), Dr Nick Higham
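The "truncated FP32" relationship can be sketched in a few lines of Python: a BFloat16 value is just the top 16 bits of an FP32 bit pattern. This is a simplified illustration (real hardware typically uses round-to-nearest-even rather than plain truncation); the function names are placeholders, not any ROCm API.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    # Reinterpret the float as its 32-bit pattern, then keep the top
    # 16 bits (sign, 8 exponent bits, 7 mantissa bits). Plain truncation
    # is used here for clarity; hardware usually rounds instead.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16

def bf16_bits_to_fp32(b: int) -> float:
    # Expand BFloat16 back to FP32 by zero-padding the low 16 bits.
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

# FP16 overflows above ~6.55e4, but BFloat16 shares FP32's 8-bit
# exponent, so a value like 1e20 survives the round trip, losing
# only mantissa precision (at most ~0.8% relative error here).
x = 1e20
roundtrip = bf16_bits_to_fp32(fp32_to_bf16_bits(x))
print(roundtrip)
```

This makes the conversion cost visible: dropping to BFloat16 and back never changes the exponent, which is why overflow and underflow cannot occur in the FP32 round trip, unlike with FP16.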