In this paper, we propose a novel scheme that eliminates CPU intervention when a neural network is executed across multiple hardware IPs, such as a GPU and an NPU. We delegate the CPU's role of data synchronization to the GPU and NPU themselves, so that the two IPs work together directly and seamlessly as a producer-consumer pair without CPU intervention. Experimental results show that our scheme reduces execution time and power consumption simultaneously, making it well suited to neural network execution on mobile devices.