Contemporary GPU architectures integrate specialized computing units for matrix multiplication, called matrix multiplication units (MXUs), to process neural network applications efficiently. However, because MXUs are limited to matrix multiplications, GPUs underutilize their computing resources when applications do not involve matrix multiplications. Furthermore, prior work that leverages MXUs for general-purpose computing relies on static analysis, limiting its adaptability and hardware utilization efficiency. This study observes that techniques emulating high-bitwidth multiplications with low-bitwidth ones transform a single high-bitwidth Multiply-and-Add (MAD) operation into a low-bitwidth dot-product operation. Leveraging this observation, we propose <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">MaxiMoff</italic>, a novel GPU architecture that dynamically utilizes both general-purpose cores and MXUs when computing MAD instructions. With this extended design, MaxiMoff achieves an average speedup of 1.39× and reduces total energy consumption by 17.3%.
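The key observation above — that a high-bitwidth MAD can be emulated as a low-bitwidth dot product — can be illustrated with a minimal Python sketch. This is not the paper's implementation; the function name, limb width, and limb count are illustrative assumptions. The idea is to split one operand into low-bitwidth limbs and pair each limb with an appropriately shifted copy of the other operand, so the original `a * b + c` becomes a dot product plus an accumulate:

```python
def mad_via_dot(a, b, c, chunk_bits=8, chunks=4):
    """Emulate the high-bitwidth MAD a*b + c (a up to chunk_bits*chunks bits)
    as a dot product over low-bitwidth limbs of a. Illustrative sketch only."""
    mask = (1 << chunk_bits) - 1
    # Split a into little-endian low-bitwidth limbs: a = sum(limbs[i] << (chunk_bits*i))
    limbs = [(a >> (chunk_bits * i)) & mask for i in range(chunks)]
    # Shifted copies of b form the second dot-product operand
    shifted = [b << (chunk_bits * i) for i in range(chunks)]
    # One high-bitwidth MAD becomes one low-bitwidth dot product plus accumulate
    return sum(x * y for x, y in zip(limbs, shifted)) + c
```

For example, `mad_via_dot(0xDEADBEEF, 12345, 678)` reproduces `0xDEADBEEF * 12345 + 678` exactly, since the limbs of `a` recombine losslessly. In hardware, such a dot product maps naturally onto an MXU row, which is the mapping this abstract alludes to.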