Prof. Gyeongsik Moon's Lab | Department of Computer Science and Engineering, Korea University


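Gyeongsik Moon Lab

Prof. Gyeongsik Moon, Department of Computer Science and Engineering, Korea University

The Gyeongsik Moon Lab studies human-centric visual intelligence based on computer vision. It develops core technologies for understanding and generating digital humans, including 3D human pose estimation, hand pose and mesh recovery, whole-body human representation learning, single-image 3D avatar reconstruction, and dynamic clothing modeling, and it focuses on extending these technologies so that they operate robustly in real-world environments.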

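Representative Research Areas

3D human pose and whole-body mesh estimation

Hand pose recognition and interaction representation learning

Single-image 3D avatar reconstruction and dynamic clothing representation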

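Research Output Trends

The figures shown are computed from collected data and may differ slightly from actual values.

Papers published per year over the past five years: 37 in total

Citations per year over the past five years: 673 in total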

Key Publications

1. article | green · Citations: 0 · 2026

Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator

Gyeongsik Moon

ArXiv.org

Accurately recovering hand poses within the body context remains a major challenge in 3D whole-body pose estimation. This difficulty arises from a fundamental supervision gap: whole-body pose estimators are trained on full-body datasets with limited hand diversity, while hand-only estimators, trained on hand-centric datasets, excel at detailed finger articulation but lack global body awareness. To address this, we propose Hand4Whole++, a modular framework that leverages the strengths of both pre-trained whole-body and hand pose estimators. We introduce CHAM (Conditional Hands Modulator), a lightweight module that modulates the whole-body feature stream using hand-specific features extracted from a pre-trained hand pose estimator. This modulation enables the whole-body model to predict wrist orientations that are both accurate and coherent with the upper-body kinematic structure, without retraining the full-body model. In parallel, we directly incorporate finger articulations and hand shapes predicted by the hand pose estimator, aligning them to the full-body mesh via differentiable rigid alignment. This design allows Hand4Whole++ to combine globally consistent body reasoning with fine-grained hand detail. Extensive experiments demonstrate that Hand4Whole++ substantially improves hand accuracy and enhances overall full-body pose quality.

http://arxiv.org/abs/2603.14726

Keywords: Pose; Articulated body pose estimation; Modular design; Context (archaeology); Kinematics; 3D pose estimation; Feature (linguistics); Estimator

2. preprint | green · Citations: 0 · 2026

Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image

Joohyun Kwon, Geonhee Sim, Gyeongsik Moon

arXiv (Cornell University)

Existing single-image 3D human avatar methods primarily rely on rigid joint transformations, limiting their ability to model realistic cloth dynamics. We present DynaAvatar, a zero-shot framework that reconstructs animatable 3D human avatars with motion-dependent cloth dynamics from a single image. Trained on large-scale multi-person motion datasets, DynaAvatar employs a Transformer-based feed-forward architecture that directly predicts dynamic 3D Gaussian deformations without subject-specific optimization. To overcome the scarcity of dynamic captures, we introduce a static-to-dynamic knowledge transfer strategy: a Transformer pretrained on large-scale static captures provides strong geometric and appearance priors, which are efficiently adapted to motion-dependent deformations through lightweight LoRA fine-tuning on dynamic captures. We further propose the DynaFlow loss, an optical flow-guided objective that provides reliable motion-direction geometric cues for cloth dynamics in rendered space. Finally, we reannotate the missing or noisy SMPL-X fittings in existing dynamic capture datasets, as most public dynamic capture datasets contain incomplete or unreliable fittings that are unsuitable for training high-quality 3D avatar reconstruction models. Experiments demonstrate that DynaAvatar produces visually rich and generalizable animations, outperforming prior methods.

https://doi.org/10.48550/arxiv.2603.14772

Keywords: Avatar; Dynamics (music); Limiting; Motion capture; Human motion; Gaussian; Joint (building); USable
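The abstract above mentions adapting a pretrained Transformer through lightweight LoRA fine-tuning. As a rough illustration of the general LoRA idea only (not the DynaAvatar implementation, which the listing does not describe), the sketch below keeps a pretrained weight matrix frozen and adds a trainable low-rank update scaled by `alpha / rank`. The class name `LoRALinear`, its methods, and the toy dimensions are all hypothetical and chosen for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Hypothetical linear layer with a frozen weight and a low-rank update.

    Effective weight: W + (alpha / rank) * B @ A, where only A and B
    would be trained during fine-tuning.
    """

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # pretrained weight, kept frozen
        self.scale = alpha / rank     # common LoRA scaling convention
        # A starts with small random values and B with zeros, so the layer
        # initially behaves exactly like the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to a single input vector x."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def num_trainable(self):
        """Number of parameters in the low-rank factors A and B."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy usage: a 2x3 "pretrained" weight adapted with rank 1.
layer = LoRALinear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], rank=1, alpha=2.0)
print(layer.forward([1.0, 1.0, 1.0]))  # matches the frozen layer before any training
print(layer.num_trainable())           # 5 trainable values versus 6 frozen ones
```

In this toy case the saving is negligible, but for a large `out_dim x in_dim` matrix the factors hold only `rank * (in_dim + out_dim)` values, which is what makes this style of adaptation attractive when dynamic captures are scarce.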
