From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning | Prof. Byoung-Tak Zhang's Lab | Department of Computer Science and Engineering, Seoul National University

preprint | green · Citations: 0 · 2025

From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning

Junseok Park, Hyeonseo Yang, Min Whoo Lee, Wonseok Choi, Minsu Lee, Byoung‐Tak Zhang

ArXiv.org

Abstract

Reinforcement learning (RL) agents often face challenges in balancing exploration and exploitation, particularly in environments where sparse or dense rewards bias learning. Biological systems, such as human toddlers, naturally navigate this balance by transitioning from free exploration with sparse rewards to goal-directed behavior guided by increasingly dense rewards. Inspired by this natural progression, we investigate the Toddler-Inspired Reward Transition in goal-oriented RL tasks. Our study focuses on transitioning from sparse to potential-based dense (S2D) rewards while preserving optimal strategies. Through experiments on dynamic robotic arm manipulation and egocentric 3D navigation tasks, we demonstrate that effective S2D reward transitions significantly enhance learning performance and sample efficiency. Additionally, using a Cross-Density Visualizer, we show that S2D transitions smooth the policy loss landscape, resulting in wider minima that improve generalization in RL models. We also reinterpret Tolman's maze experiments, underscoring the critical role of early free exploratory learning in the context of S2D rewards.
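
The idea of switching from a sparse reward to a potential-based dense reward can be illustrated with a minimal sketch. This is not the authors' implementation. It uses the standard potential-based shaping term F(s, s') = γΦ(s') − Φ(s), which is known to leave optimal policies unchanged. The potential function (negative Euclidean distance to the goal), the hard switch at a fixed `transition_step`, and all function and parameter names are assumptions made for illustration only.

```python
import math


def sparse_reward(reached_goal: bool) -> float:
    """Sparse signal: reward only when the goal is reached."""
    return 1.0 if reached_goal else 0.0


def potential(pos, goal) -> float:
    """Hypothetical potential: negative Euclidean distance to the goal."""
    return -math.dist(pos, goal)


def shaped_reward(base, pos, next_pos, goal, gamma=0.99) -> float:
    """Base reward plus the potential-based term gamma * phi(s') - phi(s)."""
    return base + gamma * potential(next_pos, goal) - potential(pos, goal)


def s2d_reward(step, transition_step, base, pos, next_pos, goal, gamma=0.99) -> float:
    """Sparse reward before transition_step, shaped dense reward afterwards.

    The hard switch at a fixed step is an assumption of this sketch.
    """
    if step < transition_step:
        return base
    return shaped_reward(base, pos, next_pos, goal, gamma)
```

With `gamma=1.0`, moving one unit closer to the goal yields a shaping bonus of one unit once the transition has occurred, while the same move before the transition returns only the sparse base reward.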

Keywords

Toddler, Reinforcement learning, Transition (genetics), Reinforcement, Psychology, Cognitive psychology, Artificial intelligence, Computer science, Developmental psychology, Social psychology

Type

preprint

IF / Citations

- / 0

Source

http://arxiv.org/abs/2501.17842

Publication year

2025