For resource-constrained on-device AI environments, system efficiency is the key factor for successful deployment (Fig. 1): area efficiency (TOPS/mm²), to run the user-demanded AI inference task while keeping the device at palm scale, and power efficiency (TOPS/W), to maximize application usage on a limited battery. To save resources, network precision is quantized, and sparsity-exploitable ternary inputs/weights show the best outcome: the hardware demonstration in [1] achieves 82% lower energy per inference than binary. In terms of system architecture, analog designs generally achieve high efficiency, since custom bitcells are compactly integrated and perform multiply-and-accumulate (MAC) operations in parallel with low energy consumption. However, two challenges degrade this efficiency gain. 1) A bitcell requires an additional latch to store a ternary weight, which reduces macro area efficiency. 2) In analog compute-in-memory (CIM), increasing the number of MACs worsens the energy and signal margin, which degrades operation accuracy and efficiency. Furthermore, analog-to-digital converters (ADCs) occupy 41% of macro area [2] and 72% of latency [3], so minimal ADC overhead is crucial for analog CIM system efficiency. In this paper, we present a highly resource-efficient ternary-latch-based CIM macro featuring: 1) a refined single-latch ternary input/weight cell enabled by 28 nm ternary-CMOS (T-CMOS) technology; and 2) analog CIM performance enhancement through energy-efficient, signal-margin-improved MAC operation with a double-sampling ternary ADC.
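To make the sparsity argument concrete, the following is a minimal behavioral sketch (illustrative only, not the macro's circuit implementation) of ternary quantization to {−1, 0, +1} and a zero-skipping MAC; the `threshold` parameter and function names are assumptions for illustration. Every zero input or weight yields a zero product, which a ternary CIM macro can skip without spending switching energy:

```python
import numpy as np

def ternarize(x, threshold=0.5):
    """Quantize real values to {-1, 0, +1}; zeros create exploitable sparsity.
    `threshold` is a hypothetical tuning parameter, not from the paper."""
    q = np.zeros_like(x, dtype=np.int8)
    q[x > threshold] = 1
    q[x < -threshold] = -1
    return q

def sparse_ternary_mac(inputs, weights):
    """Accumulate products of ternary operands, skipping zero products
    (the operations an energy-efficient CIM macro need not perform).
    Returns (accumulated sum, number of nonzero products)."""
    acc, ops = 0, 0
    for a, w in zip(inputs, weights):
        if a == 0 or w == 0:
            continue  # zero product: no MAC energy spent
        acc += int(a) * int(w)
        ops += 1
    return acc, ops

# Example: half of the products vanish and are skipped.
x = ternarize(np.array([0.9, -0.1, -0.8, 0.2]))   # -> [ 1,  0, -1,  0]
w = ternarize(np.array([-0.7, 0.6, 0.9, 0.0]))    # -> [-1,  1,  1,  0]
result, nonzero_ops = sparse_ternary_mac(x, w)    # -> (-2, 2)
```

Here only 2 of 4 products are nonzero, so half of the MAC work is avoided; this per-product skipping is the behavioral analogue of the energy saving ternary operands offer over binary ones.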