article | gold | Citations: 0 | 2025
Learning graph based individual intrinsic reward for multi-agent reinforcement learning
Seokhun Ju, Seungyub Han, Taehyun Cho, Jungwoo Lee, Taeyoung Lee, Minkyoung Kim, Jinung An
ICT Express (IF 4.2)
Abstract

Designing a reward function is a critical challenge in reinforcement learning. As environments become more complex and tasks grow more difficult, designing a reward function that drives optimal behavior becomes increasingly challenging. To overcome this issue, preference-based reinforcement learning methods learn reward functions from preferences between two trajectories, thereby eliminating the need for a handcrafted reward function. In multi-agent reinforcement learning, the challenge is even greater: complex interactions among agents make designing a single global reward function even more difficult. In this paper, we show that when a single global reward function is learned via preference-based reinforcement learning in a multi-agent setting, it often fails to capture sufficient information for optimal policy learning. Instead, we propose a method for learning individual reward functions that provide additional guidance for each agent’s optimal policy. Our approach, which leverages graph structures and preference-based reinforcement learning, outperforms the method based on learning a single, global reward function.
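
The abstract does not say which preference model the authors use. As an illustration only, a common choice in preference-based reinforcement learning is the Bradley-Terry model, in which the probability that one trajectory is preferred over another depends on the difference of their summed predicted rewards. The sketch below assumes that model; the function names are hypothetical and this is not presented as the paper's method.

```python
import math

def trajectory_return(rewards):
    """Sum of predicted per-step rewards along one trajectory."""
    return sum(rewards)

def preference_probability(rewards_a, rewards_b):
    """Bradley-Terry probability that trajectory A is preferred over B
    (assumed model; the abstract does not specify one)."""
    diff = trajectory_return(rewards_a) - trajectory_return(rewards_b)
    return 1.0 / (1.0 + math.exp(-diff))

def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss; label is 1.0 if A is preferred, 0.0 if B is."""
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))
```

Minimizing this loss over labeled trajectory pairs pushes the learned reward to assign higher returns to preferred trajectories. In a real setup the per-step rewards would come from a trainable reward model rather than from fixed lists.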

Keywords
Reinforcement learning, Reinforcement, Intrinsic motivation, Psychology, Graph, Cognitive psychology, Computer science, Social psychology, Artificial intelligence, Theoretical computer science
Type
article
IF / Citations
4.2 / 0
Publication year
2025
