Distracted driving contributes to fatal crashes worldwide. To address this, researchers are applying Driver Activity Recognition (DAR) with Impulse Radio Ultra-Wideband (IR-UWB) radar, which offers advantages such as interference resistance, low power consumption, and privacy preservation. However, two challenges limit its adoption: the lack of large-scale, real-world UWB datasets covering diverse distracted driving behaviors, and the difficulty of adapting fixed-input Vision Transformers (ViTs) to UWB radar data with non-standard dimensions. This work tackles both challenges. We present the <i>ALERT</i> dataset, which contains 10,220 radar samples of seven distracted driving activities collected in real driving conditions. We also propose the Input-Size-Agnostic Vision Transformer (<i>ISA-ViT</i>), a framework designed for radar-based DAR. <i>ISA-ViT</i> resizes UWB data to fit ViT input requirements while preserving radar-specific information such as Doppler shifts and phase data. By adjusting patches and using pre-trained positional embedding vectors (PEVs), <i>ISA-ViT</i> avoids the limitations of simple resizing. Additionally, a domain fusion strategy combines range-domain and frequency-domain features, enhancing classification accuracy. Comprehensive experiments demonstrate that <i>ISA-ViT</i> achieves 22.68% higher accuracy than the existing ViT method in UWB-based DAR.
By publicly releasing the <i>ALERT</i> dataset and detailing our input-size-agnostic strategy, this work paves the way for more robust and scalable distracted driving detection systems in real-world scenarios.
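To make the idea of reusing pre-trained PEVs on a non-standard input size more concrete, the sketch below shows one common, generic approach: bilinearly resampling a grid of pre-trained positional embeddings to a new patch grid. This is an illustrative assumption, not the ISA-ViT procedure itself, since the abstract does not specify how patches and PEVs are adjusted. The function name `resize_pos_embed` and the toy grid values are hypothetical.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resample an (H x W x D) grid of positional embeddings
    to (new_h x new_w x D). Generic illustration, not the ISA-ViT method."""
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])

    def src_coord(i, new_n, old_n):
        # Map an output index to a source coordinate, aligning the corners.
        if new_n == 1:
            return 0.0
        return i * (old_n - 1) / (new_n - 1)

    out = []
    for i in range(new_h):
        y = src_coord(i, new_h, old_h)
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        wy = y - y0
        row = []
        for j in range(new_w):
            x = src_coord(j, new_w, old_w)
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            wx = x - x0
            # Weighted average of the four surrounding embedding vectors.
            vec = [
                (1 - wy) * ((1 - wx) * grid[y0][x0][d] + wx * grid[y0][x1][d])
                + wy * ((1 - wx) * grid[y1][x0][d] + wx * grid[y1][x1][d])
                for d in range(dim)
            ]
            row.append(vec)
        out.append(row)
    return out


# Hypothetical 2x2 grid of 1-dimensional embeddings, resized to a 3x5 patch grid
# (a stand-in for a non-square radar input; real PEVs have hundreds of dimensions).
pretrained = [[[0.0], [1.0]], [[2.0], [3.0]]]
resized = resize_pos_embed(pretrained, 3, 5)
```

In this toy case the corner embeddings are preserved and interior positions are blends of their neighbours, so a model pre-trained on a square patch grid can be given positional information for a rectangular one without discarding the pre-trained vectors.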