Recent system-on-a-chip (SoC) architectures for edge systems incorporate a variety of processing units, such as CPUs, GPUs, and NPUs. Although hardware-based memory protection is crucial for the security of edge systems, conventional mechanisms suffer significant performance degradation in such heterogeneous SoCs due to the increased memory traffic with diverse access patterns from different processing units. To mitigate the overheads, recent studies targeting a specific domain, such as machine learning software or accelerators, proposed techniques based on custom granularities applicable to either counters or MACs, but not both. In response to this challenge, we propose a unified mechanism that supports both multi-granular MACs and counters in a device-independent way. It employs a granularity-aware integrity tree that adapts to various access patterns. The multi-granular tree architecture stores both coarse-grained and fine-grained counters at different levels of the tree. Combined with the multi-granularity technique for MACs, our optimization technique, termed multi-granular MAC&tree, supports four different levels of granularity. Its dynamic detection mechanism can select the most appropriate granularity for different memory regions accessed by heterogeneous processing units. In addition, we combine the multi-granularity support with prior subtree approaches to further reduce the overheads. Our simulation-based evaluation results show that the multi-granular MAC&tree reduces execution time by 14.2% compared to the conventional fixed-granular MAC&tree. Combined with prior subtree techniques, the multi-granular MAC&tree reduces execution time by 21.1% compared to the conventional fixed-granular MAC&tree.