MaxiMoff: Designing Matrix Multiplication Accelerator for Effective Multiply-Add Operations Offloading
S. Kim, Dongho Ha, Seunghwan Sung, Won Woo Ro
IF 5.4 · IEEE Transactions on Emerging Topics in Computing
Abstract

Contemporary GPU architectures integrate specialized computing units for matrix multiplication, called matrix multiplication units (MXUs), to process neural network applications efficiently. However, because MXUs are limited to matrix multiplication, GPUs underutilize their computing resources when running applications that do not involve matrix multiplication. Furthermore, prior work that leverages MXUs for general-purpose computing is constrained by static analysis, limiting its adaptability and hardware utilization efficiency. This study observes that techniques emulating high-bitwidth multiplication with low-bitwidth operations transform a single high-bitwidth Multiply-and-Add (MAD) operation into a low-bitwidth dot-product operation. Leveraging this observation, we propose MaxiMoff, a novel GPU architecture that dynamically offloads MAD instructions so that both general-purpose cores and MXUs are utilized. With this extended design, MaxiMoff achieves an average speedup of 1.39× and reduces total energy consumption by 17.3%.
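The key observation above can be illustrated with a small sketch (hypothetical, not from the paper's artifact): splitting each high-bitwidth operand into low-bitwidth limbs turns one high-bitwidth MAD into a dot product of limb products with position weights, which is exactly the operation shape a dot-product/MXU datapath consumes.

```python
# Illustrative sketch, assuming 16-bit operands split into 8-bit limbs.
# One high-bitwidth MAD (acc + a*b) becomes a low-bitwidth dot product:
# [ah*bh, ah*bl, al*bh, al*bl] . [2^16, 2^8, 2^8, 1].

def split8(x):
    """Split a 16-bit value into (high, low) 8-bit limbs."""
    return x >> 8, x & 0xFF

def mad_via_dot_product(a, b, acc):
    """Compute acc + a*b using only 8-bit-operand multiplies."""
    ah, al = split8(a)
    bh, bl = split8(b)
    # Partial products and their positional scale factors form the
    # two vectors of a dot product.
    products = [ah * bh, ah * bl, al * bh, al * bl]
    scales   = [1 << 16, 1 << 8, 1 << 8, 1]
    return acc + sum(p * s for p, s in zip(products, scales))

print(mad_via_dot_product(51234, 4321, 7) == 7 + 51234 * 4321)  # True
```

In hardware, the four low-bitwidth products map onto a dot-product unit in one step rather than being summed sequentially; the sketch only shows why the arithmetic decomposition is exact.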

Keywords
Matrix multiplication, Speedup, Leverage, Energy consumption, Adaptability, Process (computing), Multiplication, Limiting, Hardware acceleration
Type
article
IF / Citations
5.4 / 0
Publication Year
2025