OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning | Prof. Hyun Duck Choi's Lab | Department of Smart ICT Convergence Engineering, Seoul National University of Science and Technology

preprint · green · Citations: 0 · 2025

OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning

Woo Jin Ahn, Seung Woo Baek, Yongjun Lee, Hyun Duck Choi, Myo Taeg Lim

arXiv.org

Abstract

Reinforcement learning algorithms typically utilize an interactive simulator (i.e., environment) with a predefined reward function for policy training. Developing such simulators and manually defining reward functions, however, is often time-consuming and labor-intensive. To address this, we propose an Offline Simulator (OffSim), a novel model-based offline inverse reinforcement learning (IRL) framework, to emulate environmental dynamics and reward structure directly from expert-generated state-action trajectories. OffSim jointly optimizes a high-entropy transition model and an IRL-based reward function to enhance exploration and improve the generalizability of the learned reward. Leveraging these learned components, OffSim can subsequently train a policy offline without further interaction with the real environment. Additionally, we introduce OffSim$^{+}$, an extension that incorporates a marginal reward for multi-dataset settings to enhance exploration. Extensive MuJoCo experiments demonstrate that OffSim achieves substantial performance gains over existing offline IRL methods, confirming its efficacy and robustness.
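As a rough illustration of the last idea in the abstract (training a policy against learned components rather than the real environment), the sketch below wraps a stand-in transition model and a stand-in reward function in an environment-like object. It is not the authors' implementation: `OfflineSimulator`, `toy_mean`, `toy_reward`, and `rollout` are hypothetical names, the Gaussian noise is an assumed placeholder for a stochastic transition model, and the joint training of the two components described in the paper is not shown.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = List[float]
Action = List[float]


@dataclass
class OfflineSimulator:
    """Environment-like wrapper around a learned dynamics model and reward (hypothetical)."""
    transition_mean: Callable[[State, Action], State]  # assumed: predicts the mean next state
    transition_std: float                              # assumed: fixed Gaussian noise scale
    reward_fn: Callable[[State, Action], float]        # assumed: reward recovered by IRL
    initial_states: Sequence[State]                    # start states drawn from the offline data
    horizon: int = 5
    seed: int = 0

    def __post_init__(self) -> None:
        self._rng = random.Random(self.seed)
        self._state: State = list(self.initial_states[0])
        self._t = 0

    def reset(self) -> State:
        # Sample a start state from the dataset; the real environment is never queried.
        self._state = list(self._rng.choice(list(self.initial_states)))
        self._t = 0
        return list(self._state)

    def step(self, action: Action) -> Tuple[State, float, bool]:
        # Score the current (state, action) pair with the learned reward,
        # then sample the next state from the learned stochastic model.
        reward = self.reward_fn(self._state, action)
        mean = self.transition_mean(self._state, action)
        self._state = [self._rng.gauss(m, self.transition_std) for m in mean]
        self._t += 1
        return list(self._state), reward, self._t >= self.horizon


def toy_mean(state: State, action: Action) -> State:
    """Stand-in dynamics: next state is state plus action."""
    return [s + a for s, a in zip(state, action)]


def toy_reward(state: State, action: Action) -> float:
    """Stand-in reward: negative squared distance of the state from the origin."""
    return -sum(s * s for s in state)


def rollout(sim: OfflineSimulator, policy: Callable[[State], Action]) -> Tuple[float, int]:
    """Run one episode entirely inside the simulator and return (return, steps)."""
    state, total, steps, done = sim.reset(), 0.0, 0, False
    while not done:
        state, reward, done = sim.step(policy(state))
        total += reward
        steps += 1
    return total, steps


# Noise-free demo with a policy that halves the state at every step.
sim = OfflineSimulator(toy_mean, 0.0, toy_reward, [[1.0]], horizon=5)
total, steps = rollout(sim, lambda s: [-0.5 * x for x in s])
print(total, steps)
```

In a full pipeline, the returns collected from such rollouts would feed a policy-optimization step; here the policy is fixed to keep the sketch short.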

Keywords

Reinforcement learning; Function (biology); Offline learning; Generalizability theory; Reinforcement; Online and offline

Type

preprint

IF / Citations

- / 0

Source

http://arxiv.org/abs/2510.15495

Publication year

2025
