Deep learning models provided accurate segmentation of the surgical instruments and anatomical structures observed during RARP. Both CNN and transformer models showed reliable predictions in the segmentation task; however, CNN models may be more suitable than transformer models for organ segmentation and may be more applicable in unusual cases. Further research with large datasets is needed.