Contemporary GPU architectures integrate specialized computing units for matrix multiplication, called matrix multiplication units (MXUs), to process neural network applications efficiently. However, because MXUs are limited to matrix multiplications, GPUs underutilize their computing resources when applications do not involve matrix multiplications. Furthermore, prior work that leverages MXUs for general-purpose computing relies on static analysis, limiting its adaptability and hardware utilization efficiency. This study observes that techniques emulating high-bitwidth multiplications with low-bitwidth ones transform a single high-bitwidth Multiply-and-Add (MAD) operation into a low-bitwidth dot-product operation. Leveraging this observation, we propose <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">MaxiMoff</italic>, a novel GPU architecture that dynamically utilizes both general-purpose cores and MXUs when computing MAD instructions. With this extended design, MaxiMoff achieves an average speedup of 1.39× and reduces total energy consumption by 17.3%.
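The key observation above — that a high-bitwidth MAD can be emulated as a low-bitwidth dot product — can be illustrated with a minimal Python sketch. This is not the paper's implementation; the function name, limb width, and limb count are illustrative assumptions. The idea is to split one operand into low-bitwidth limbs and pair each limb with an appropriately shifted copy of the other operand, so the original `a * b + c` becomes a dot product plus an accumulate:

```python
def mad_via_dot(a, b, c, chunk_bits=8, chunks=4):
    """Emulate the high-bitwidth MAD a*b + c (a up to chunk_bits*chunks bits)
    as a dot product over low-bitwidth limbs of a. Illustrative sketch only."""
    mask = (1 << chunk_bits) - 1
    # Split a into little-endian low-bitwidth limbs: a = sum(limbs[i] << (chunk_bits*i))
    limbs = [(a >> (chunk_bits * i)) & mask for i in range(chunks)]
    # Shifted copies of b form the second dot-product operand
    shifted = [b << (chunk_bits * i) for i in range(chunks)]
    # One high-bitwidth MAD becomes one low-bitwidth dot product plus accumulate
    return sum(x * y for x, y in zip(limbs, shifted)) + c
```

For example, `mad_via_dot(0xDEADBEEF, 12345, 678)` reproduces `0xDEADBEEF * 12345 + 678` exactly, since the limbs of `a` recombine losslessly. In hardware, such a dot product maps naturally onto an MXU row, which is the mapping this abstract alludes to.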