K-means clustering is widely used in applications such as classification, recommendation, and image processing because of its simplicity and efficiency. Although it is often deployed on servers, it also runs on mobile platforms for tasks such as sensor data analysis. Mobile devices, however, operate under tight hardware and energy constraints, which makes efficient execution challenging. Prior parallel K-means approaches still underutilize the GPU because of warp divergence, and they leave the CPU idle. This paper proposes Kubism, a novel software technique that disassembles and reassembles the K-means clustering algorithm to maximize CPU and GPU utilization on mobile platforms. Kubism incorporates several key strategies, including reordering operations to minimize unnecessary work, balancing workloads across processing units to avoid idle time, dynamically adjusting task execution based on real-time performance metrics, and distributing computation efficiently between the CPU and GPU. Together, these strategies improve performance by reducing idle periods and making fuller use of the available hardware. In our evaluation on the NVIDIA Jetson AGX Orin platform, Kubism achieves up to a 2.65× speedup in individual clustering iterations and an average 1.23× improvement in end-to-end execution time compared to prior work.