This paper proposes a PAM4 receiver DSP architecture based on a feed-forward equalizer (FFE) and a maximum likelihood sequence detector (MLSD), targeting improved bit-error-rate (BER) performance and reduced hardware complexity. Two key contributions are presented: (1) an FFE tap coefficient optimization framework that tunes the FFE so that its output has an inter-symbol interference (ISI) memory length of two, which improves MLSD performance, and (2) a Top-K selection technique that reduces the computational complexity of the MLSD by limiting the number of branch metric computations per trellis stage. The coefficients are trained in two stages, using least-mean-square (LMS) adaptation and a cross-entropy loss based on pseudo-labels derived from the MLSD outputs. Simulation results show a two-orders-of-magnitude improvement in BER over conventional LMS-based FFE coefficient optimization. To mitigate MLSD complexity, the Top-K approach uses a pre-computed lookup table to select only the K most relevant branches for metric computation. The proposed architecture is synthesized and evaluated in a 28 nm CMOS process and is also implemented on the RealDigital RFSoC4x2 FPGA board for demonstration. Compared with the full-metric MLSD architecture, the proposed reduced-complexity MLSD achieves a 74% reduction in area and a 65% reduction in power consumption while maintaining BER performance.
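
As an illustration of the Top-K idea, the following minimal Python sketch runs a Viterbi-style MLSD over a PAM4 trellis and restricts each stage to K branches read from a pre-computed lookup table. It is a sketch under stated assumptions, not the architecture evaluated in this paper. The target response `H`, the reading of "memory length of two" as two previous symbols (16 states, 64 branches), the 64-bin input quantizer, the ranking rule (distance between the bin center and the noiseless branch output), and the fallback to a full stage when no selected branch extends a surviving path are all hypothetical choices made for the example.

```python
import itertools
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])  # PAM4 symbol levels
H = np.array([1.0, 0.5, 0.2])              # assumed target response (ISI memory 2)

# Each branch is (a[n-2], a[n-1], a[n]) as level indices: 4^3 = 64 branches.
BRANCHES = np.array(list(itertools.product(range(4), repeat=3)))
EXPECTED = (H[0] * LEVELS[BRANCHES[:, 2]]
            + H[1] * LEVELS[BRANCHES[:, 1]]
            + H[2] * LEVELS[BRANCHES[:, 0]])       # noiseless branch outputs
FROM_STATE = BRANCHES[:, 0] * 4 + BRANCHES[:, 1]   # state = (a[n-2], a[n-1])
TO_STATE = BRANCHES[:, 1] * 4 + BRANCHES[:, 2]     # state = (a[n-1], a[n])
N_STATES = 16


def build_topk_lut(k, n_bins=64, lo=-6.0, hi=6.0):
    """For each quantized input level, store the k branches whose noiseless
    output is closest to that level (assumed ranking rule)."""
    centers = np.linspace(lo, hi, n_bins)
    dist = np.abs(centers[:, None] - EXPECTED[None, :])
    return centers, np.argsort(dist, axis=1, kind="stable")[:, :k]


def mlsd(rx, lut=None):
    """Viterbi detection. With lut=None every branch metric is computed;
    otherwise only the branches listed in the lookup table are evaluated.
    Returns (detected level indices, number of branch metrics computed)."""
    all_branches = np.arange(len(BRANCHES))
    pm = np.zeros(N_STATES)      # path metrics, unknown start state
    history, n_metrics = [], 0
    for y in rx:
        active = all_branches
        if lut is not None:
            centers, table = lut
            active = table[np.argmin(np.abs(centers - y))]
            # Assumed safeguard: if no selected branch extends a surviving
            # path, evaluate the full stage instead.
            if not np.isfinite(pm[FROM_STATE[active]]).any():
                active = all_branches
        new_pm = np.full(N_STATES, np.inf)
        prev = np.full(N_STATES, -1)
        for b in active:
            metric = pm[FROM_STATE[b]] + (y - EXPECTED[b]) ** 2
            n_metrics += 1
            if metric < new_pm[TO_STATE[b]]:
                new_pm[TO_STATE[b]] = metric
                prev[TO_STATE[b]] = FROM_STATE[b]
        pm = new_pm
        history.append(prev)
    # Trace back from the best final state; the newest symbol is state % 4.
    state = int(np.argmin(pm))
    out = []
    for prev in reversed(history):
        out.append(state % 4)
        state = int(prev[state])
    return np.array(out[::-1]), n_metrics


# Noise-free demonstration data (random symbols, fixed seed).
rng = np.random.default_rng(0)
tx = rng.integers(0, 4, size=200)
a = LEVELS[tx]
rx = H[0] * a[2:] + H[1] * a[1:-1] + H[2] * a[:-2]

decoded_full, metrics_full = mlsd(rx)
decoded_topk, metrics_topk = mlsd(rx, lut=build_topk_lut(k=16))
```

In this sketch the lookup table is built once, so the per-stage cost at run time is a quantizer lookup followed by K metric updates instead of 64. How K, the quantizer resolution, and the ranking rule trade off against BER under noise is not something this example establishes.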