Prof. Jaesik Park's Lab | Seoul National University, Main Campus (Campus 1), Department of Computer Science and Engineering

Visual and Geometric Intelligence Lab

The Visual and Geometric Intelligence Lab belongs to the Department of Computer Science and Engineering. Its research focuses on 3D shape assembly, image-text alignment, generative adversarial networks (GANs), visual SLAM, and point cloud matching. Over the past three years the lab has published numerous papers, including 'Distilling Diffusion Models into Conditional GANs', '3D Geometric Shape Assembly via Efficient Point Cloud Matching', and 'Extending CLIP's Image-Text Alignment to Referring Image Segmentation'. Its results are strongest in GAN-based image generation and text-image alignment, and the lab also collaborates with a range of companies on applied projects.

3D Shape Assembly

Image-Text Alignment

Generative Adversarial Networks (GANs)

Main Research Areas

Human-centric visual datasets and representation learning

Generative models and vision-based AI

3D visual perception and spatial reconstruction

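Research Output Trends

The figures shown are computed from collected data, and some discrepancies are possible.

Papers published per year, past 5 years: 97 in total

Citations per year, past 5 years: 1,061 in total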

Selected Papers

article

hybrid

Citations: 6

2024

3Doodle: Compact Abstraction of Objects with 3D Strokes

Changwoon Choi, Jaeah Lee, Jaesik Park, Young Min Kim

IF 9.5

ACM Transactions on Graphics

While free-hand sketching has long served as an efficient representation to convey characteristics of an object, sketches are often subjective, deviating significantly from realistic representations. Moreover, sketches are not consistent for arbitrary viewpoints, making it hard to catch 3D shapes. We propose 3Doodle, generating descriptive and view-consistent sketch images given multi-view images of the target object. Our method is based on the idea that a set of 3D strokes can efficiently represent 3D structural information and render view-consistent 2D sketches. We express 2D sketches as a union of view-independent and view-dependent components. 3D cubic Bézier curves indicate view-independent 3D feature lines, while contours of superquadrics express a smooth outline of the volume of varying viewpoints. Our pipeline directly optimizes the parameters of 3D stroke primitives to minimize perceptual losses in a fully differentiable manner. The resulting sparse set of 3D strokes can be rendered as abstract sketches containing essential 3D characteristic shapes of various objects. We demonstrate that 3Doodle can faithfully express concepts of the original images compared with recent sketch generation approaches.

https://doi.org/10.1145/3658156

Abstraction

Computer science

Computer graphics (images)

Artificial intelligence

Computer vision

Programming language

article

Citations: 56

2023

StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis

Minguk Kang, Joonghyuk Shin, Jaesik Park

IF 18.6

IEEE Transactions on Pattern Analysis and Machine Intelligence

Generative Adversarial Network (GAN) is one of the state-of-the-art generative models for realistic image synthesis. While training and evaluating GAN becomes increasingly important, the current GAN research ecosystem does not provide reliable benchmarks for which the evaluation is conducted consistently and fairly. Furthermore, because there are few validated GAN implementations, researchers devote considerable time to reproducing baselines. We study the taxonomy of GAN approaches and present a new open-source library named StudioGAN. StudioGAN supports 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, 12 regularization modules, 3 differentiable augmentations, 7 evaluation metrics, and 5 evaluation backbones. With our training and evaluation protocol, we present a large-scale benchmark using various datasets (CIFAR10, ImageNet, AFHQv2, FFHQ, and Baby/Papa/Granpa-ImageNet) and 3 different evaluation backbones (InceptionV3, SwAV, and Swin Transformer). Unlike other benchmarks used in the GAN community, we train representative GANs, including BigGAN and StyleGAN series in a unified training pipeline and quantify generation performance with 7 evaluation metrics. The benchmark evaluates other cutting-edge generative models (e.g., StyleGAN-XL, ADM, MaskGIT, and RQ-Transformer). StudioGAN provides GAN implementations, training, and evaluation scripts with the pre-trained weights. StudioGAN is available at https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.

https://doi.org/10.1109/tpami.2023.3306436

Computer science

Benchmark (surveying)

Generative grammar

Implementation

Transformer

Artificial intelligence

Taxonomy (biology)

Machine learning

Benchmarking

Generative adversarial network

article

Citations: 11

2022

TextureMe: High-Quality Textured Scene Reconstruction in Real Time

Jungeon Kim, Hyomin Kim, Hyeonseo Nam, Jaesik Park, Seungyong Lee

IF 9.5

ACM Transactions on Graphics

Three-dimensional (3D) reconstruction using an RGB-D camera has been widely adopted for realistic content creation. However, high-quality texture mapping onto the reconstructed geometry is often treated as an offline step that should run after geometric reconstruction. In this article, we propose TextureMe, a novel approach that jointly recovers 3D surface geometry and high-quality texture in real time. The key idea is to create triangular texture patches that correspond to zero-crossing triangles of truncated signed distance function (TSDF) progressively in a global texture atlas. Our approach integrates color details into the texture patches in parallel with the depth map integration to a TSDF. It also actively updates a pool of texture patches to adapt to TSDF changes and minimizes misalignment artifacts that occur due to camera drift and image distortion. Our global texture atlas representation is fully compatible with conventional texture mapping. As a result, our approach produces high-quality textures without utilizing additional texture map optimization, mesh parameterization, or heavy post-processing. High-quality scenes produced by our real-time approach are even comparable to the results from state-of-the-art methods that run offline.

https://doi.org/10.1145/3503926

Projective texture mapping

Computer vision

Bidirectional texture function

Texture mapping

Artificial intelligence

Computer science

Texture atlas

Texture compression

RGB color model

Texture filtering

Government-Funded Projects

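June 2025 to December 2030

KRW 1,050,000,000

AI Star Fellowship Support (Seoul National University)

4D+5S+6R: Pioneering the core technologies for super-intelligent AI agents through spatiotemporal data (4D), multisensory information (5S), and six key robot technologies (6R), and training talent in the field.

Artificial intelligence

Augmented human

Agentic AI

Hyper-personalization

Cognition and reasoning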

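June 2025 to December 2030

KRW 2,000,000,000

AI Star Fellowship Support (Ulsan National Institute of Science and Technology)

This project aims to develop core on-device manufacturing AI technology built on robust VLA (vision-language-action) integrated intelligence, to apply and validate it on the manufacturing floor, and thereby to train world-class, convergence-oriented early-career researchers who will lead innovation in AI-based manufacturing.

Artificial intelligence

Autonomous manufacturing

VLA model

On-device AI

Reinforcement learning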

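March 2025 to December 2027

KRW 1,040,000,000

6DoF Free-Viewpoint Immersive Reproduction Technology for Real-World Spaces

o To enable 6DoF free-viewpoint immersive collaboration that merges the real-world spaces of multiple remote users through XR devices, develop: preprocessing technology for automatic environment setup; spatial reconstruction technology for building a digital version of the real space and updating spatial changes in real time; data lightweighting technology for low-latency space transmission; technology for semantic spatial layout and real-space fusion; and 6DoF free-viewpoint immersive reproduction technology
o (Deliverables) ...

Remote collaboration

3D spatial reconstruction

6DoF visualization

Spatial compositing

Real-time update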

Recent Patents

Status	Filing year	Title	Application number
Registered	2024	Drag-Based Image Editing Apparatus and Image Editing Method	1020240102748
