Recent research on data-intensive computing systems has demonstrated that system throughput and latency are critically dependent on memory read bandwidth, highlighting the need for fast memory read operations. Although spin-transfer torque magnetic random-access memory (STT-MRAM) has emerged as a promising alternative to CMOS-based embedded memories, STT -MRAM continues to face challenges related to read speed and energy efficiency. This paper introduces a novel read scheme that enhances read speed and energy in successive read operations by alternating read paths between data and reference cells. This approach effectively mitigates worst-case read scenarios by balancing the read voltage swings. HSPICE simulations using 28nm CMOS technology show a 31.5% improvement in read speed and 48.8% reduction in energy consumption compared to the previous approach. SCALE-Sim system simulations also demonstrate that applying the proposed read scheme to STT-MRAM embedded memories in AI accelerators shows a significant reduction in memory energy for CNN inference tasks compared to the SRAM embedded memory.