Meta-learning with gradient norm arbitration for sample-aware few-shot learning
Jongmin Lim, Soobin Cha, Heesan Kong, Sung Kuk Shyn, Kwangsu Kim
IF 7.6 (2025)
Knowledge-Based Systems
• We demonstrate that in optimization-based meta-learning, the shared prior knowledge across tasks exerts an imbalanced influence at the sample level within tasks.
• We show that this imbalance leads to a broad loss distribution, where samples well-aligned with prior knowledge exhibit low loss values, while misaligned samples display high loss values.
• Moreover, we experimentally demonstrate that gradients computed based on the mean across a broad loss distribution lead to poor generalization performance since the contributions of high-loss samples are diminished by those of low-loss samples.
• To address this issue, we propose a novel meta-learning approach that arbitrates gradient norms based on sample-aware information, ensuring that high-loss samples misaligned with prior knowledge are adequately represented.
• Experimental results and theoretical analysis demonstrate that the proposed method achieves competitive and generalizable performance compared to existing optimization-based meta-learning methods.

The ability to rapidly adapt to unseen tasks is a fundamental objective in few-shot learning. Recent advances in optimization-based meta-learning have enhanced adaptability by learning sharable prior knowledge across tasks with just a few gradient descent steps. However, we argue that this shared prior knowledge can exert an imbalanced influence on individual samples within tasks, potentially resulting in a broad loss distribution where samples closely aligned with the prior knowledge exhibit low loss values, while others display high loss values. Furthermore, our experiments show that gradients computed as the average from a broad loss distribution tend to be non-representative and low, leading to poor generalization performance since the contribution of high-loss samples is diminished by low-loss samples. To address this, we propose a novel meta-learning method that arbitrates gradient norms based on sample-aware information during task adaptation.
Specifically, we first normalize the gradient vector to reduce the imbalanced influence of prior knowledge on individual samples. Subsequently, the Arbiter, a learnable network, dynamically scales the current gradient norm by analyzing the relationship between the original gradient norms and the weight norms, which indicates the model’s sensitivity and complexity with respect to each sample. In this way, the proposed method, Meta-learning with Gradient Norm Arbitration (Meta-GNA), improves generalization performance by preserving more representative and larger gradients that adequately reflect high-loss samples, which are poorly aligned with prior knowledge. Experimental results show that Meta-GNA improves performance in few-shot classification, particularly in cross-domain scenarios where the imbalance in prior knowledge across samples is more pronounced.
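The two steps described above (normalize the gradient, then rescale it with a learned factor) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-parameter `arbiter` with a softplus output, the flat weight vector, and the learning rate are all hypothetical choices made for illustration. The abstract states only that the Arbiter is a learnable network that relates gradient norms to weight norms.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def arbiter(grad_norm, weight_norm, params):
    """Hypothetical stand-in for the learnable Arbiter.

    Maps (gradient norm, weight norm) to a positive scale using an
    affine function followed by a numerically stable softplus.
    In practice `params` would be meta-learned; here they are given.
    """
    a, b, c = params
    z = a * grad_norm + b * weight_norm + c
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def arbitrated_step(weights, grads, arbiter_params, lr=0.1, eps=1e-12):
    """One inner-loop update with normalized, arbitrated gradients.

    Assumption: a single flat parameter vector; a real model would
    likely apply this per layer or per parameter group.
    """
    g_norm = l2_norm(grads)
    w_norm = l2_norm(weights)
    # Step 1: discard the raw magnitude, keep only the direction.
    unit = [g / (g_norm + eps) for g in grads]
    # Step 2: let the arbiter choose the new gradient norm.
    scale = arbiter(g_norm, w_norm, arbiter_params)
    new_weights = [w - lr * scale * u for w, u in zip(weights, unit)]
    return new_weights, scale


if __name__ == "__main__":
    w = [3.0, 4.0]
    small, s1 = arbitrated_step(w, [0.0, 2.0], (0.0, 0.0, 0.0))
    large, s2 = arbitrated_step(w, [0.0, 200.0], (0.0, 0.0, 0.0))
    print(small, large, s1, s2)
```

With the arbiter parameters fixed at zero, the scale ignores its inputs, so gradients of very different raw magnitude produce the same update. Nonzero parameters would let the step size depend on the original gradient norm and the weight norm, which is the relationship the abstract says the Arbiter analyzes.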
https://doi.org/10.1016/j.knosys.2025.114443
Generalization
Adaptability
Norm (philosophy)
Sample (material)
Distribution (mathematics)
Gradient descent