ALERT Open Dataset and Input-Size-Agnostic Vision Transformer for Driver Activity Recognition Using IR-UWB | Prof. Saewoong Bahk's Lab | Department of Electrical and Computer Engineering, Seoul National University

article | gold · Citations: 0 · 2026

ALERT Open Dataset and Input-Size-Agnostic Vision Transformer for Driver Activity Recognition Using IR-UWB

Jeongjun Park, Sunwook Hwang, Hyeonho Noh, J. Yang, Hyun Jong Yang, Saewoong Bahk

IEEE Access (IF 3.6)

Abstract

Distracted driving contributes to fatal crashes worldwide. To address this, researchers are applying Driver Activity Recognition (DAR) with Impulse Radio Ultra-Wideband (IR-UWB) radar, which offers interference resistance, low power consumption, and privacy. However, two challenges limit its adoption: the lack of large-scale, real-world UWB datasets covering diverse distracted driving behaviors, and the difficulty of adapting fixed-input Vision Transformers (ViTs) to UWB radar data with non-standard dimensions. This work tackles both challenges. We present the ALERT dataset, containing 10,220 radar samples of seven distracted driving activities collected in real driving conditions. We also propose the Input-Size-Agnostic Vision Transformer (ISA-ViT), a framework designed for radar-based DAR. ISA-ViT resizes UWB data to fit ViT input requirements while preserving radar-specific information such as Doppler shifts and phase data. By adjusting patches and using pre-trained positional embedding vectors (PEVs), ISA-ViT avoids the limitations of simple resizing. Additionally, a domain fusion strategy combines range and frequency domain features, enhancing classification accuracy. Comprehensive experiments demonstrate that ISA-ViT achieves 22.68% higher accuracy than the existing ViT method in UWB-based DAR. By publicly releasing the ALERT dataset and detailing our input-size-agnostic strategy, this work paves the way for more robust and scalable distracted driving detection systems in real-world scenarios.
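
The abstract does not say how the range and frequency domain features are computed or combined. As a rough illustration only, the sketch below assumes a radar recording arrives as a frames-by-range-bins matrix, derives a frequency-domain view with a plain DFT along slow time (the frame axis), and stacks the two views as channels. The function names (`slow_time_dft`, `fuse_domains`), the matrix sizes, and the channel-stacking fusion are all hypothetical choices for this example and should not be read as the ISA-ViT method.

```python
import cmath
import math


def slow_time_dft(frames):
    """Magnitude DFT along slow time (the frame axis) for each range bin.

    frames: list of T frames, each a list of R range-bin samples.
    Returns a T x R list: row k holds |X[k]| for every range bin.
    """
    num_frames = len(frames)
    num_bins = len(frames[0])
    spectrum = []
    for k in range(num_frames):
        row = []
        for r in range(num_bins):
            acc = sum(
                frames[t][r] * cmath.exp(-2j * math.pi * k * t / num_frames)
                for t in range(num_frames)
            )
            row.append(abs(acc))
        spectrum.append(row)
    return spectrum


def fuse_domains(frames):
    """Assumed fusion: stack range-domain and frequency-domain views as two channels."""
    return [frames, slow_time_dft(frames)]


# Synthetic input (hypothetical sizes): 16 frames, 4 range bins.
# Range bin 2 carries a periodic motion at 3 cycles per 16 frames.
NUM_FRAMES, NUM_BINS, TARGET_BIN, CYCLES = 16, 4, 2, 3
frames = [
    [
        math.cos(2 * math.pi * CYCLES * t / NUM_FRAMES) if r == TARGET_BIN else 0.0
        for r in range(NUM_BINS)
    ]
    for t in range(NUM_FRAMES)
]

range_view, freq_view = fuse_domains(frames)

# Strongest frequency bin (positive half of the spectrum) at the target range bin.
peak_bin = max(range(NUM_FRAMES // 2), key=lambda k: freq_view[k][TARGET_BIN])
print(peak_bin)  # prints 3
```

In the range view the motion shows up as an oscillation over frames in range bin 2, while in the frequency view it collapses into a single peak at bin 3, which is why the two representations can carry complementary cues for a classifier.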

Keywords

Scalability, Transformer, Radar, Activity recognition, Frequency domain, Doppler radar, Distracted driving, Sensor fusion

Type

article

IF / Citations

3.6 / 0

Full text

https://doi.org/10.1109/access.2026.3663636

Publication year

2026