Online Convex Optimization with Stochastic Constraints: Zero Constraint Violation and Bandit Feedback | Prof. Dabeen Lee's Lab | Department of Mathematical Sciences, Seoul National University

Preprint | green · Citations: 1 · 2023

Online Convex Optimization with Stochastic Constraints: Zero Constraint Violation and Bandit Feedback

Yeongjong Kim, Dabeen Lee

arXiv (Cornell University)

Abstract

This paper studies online convex optimization with stochastic constraints. We propose a variant of the drift-plus-penalty algorithm that guarantees $O(\sqrt{T})$ expected regret and zero constraint violation, after a fixed number of iterations, which improves the vanilla drift-plus-penalty method with $O(\sqrt{T})$ constraint violation. Our algorithm is oblivious to the length of the time horizon $T$, in contrast to the vanilla drift-plus-penalty method. This is based on our novel drift lemma that provides time-varying bounds on the virtual queue drift and, as a result, leads to time-varying bounds on the expected virtual queue length. Moreover, we extend our framework to stochastic-constrained online convex optimization under two-point bandit feedback. We show that by adapting our algorithmic framework to the bandit feedback setting, we may still achieve $O(\sqrt{T})$ expected regret and zero constraint violation, improving upon the previous work for the case of identical constraint functions. Numerical results demonstrate our theoretical results.
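For orientation, the following is a minimal sketch of a generic drift-plus-penalty style update with a single virtual queue. It is not the variant proposed in the paper, whose update rules the abstract does not state. The function name `drift_plus_penalty`, the scalar toy problem (loss `(x - c_t)^2`, constraint `x - b_t <= 0` on the interval [0, 1]), and the parameters `V` and `alpha` are all assumptions made for illustration. Setting `V = sqrt(T)` and `alpha = T` is one horizon-dependent choice, used here only to show the kind of dependence on $T$ that the abstract attributes to the vanilla method.

```python
import random


def drift_plus_penalty(T, V, alpha, seed=0):
    """Toy drift-plus-penalty loop on the interval [0, 1] (illustrative only).

    Loss at round t:        f_t(x) = (x - c_t)^2,  c_t near 1.0
    Constraint at round t:  g_t(x) = x - b_t <= 0, b_t near 0.5
    V weights the loss gradient; alpha sets the proximal step size.
    """
    rng = random.Random(seed)
    x, Q = 0.0, 0.0          # decision and virtual queue length
    xs, Qs = [], []
    for _ in range(T):
        xs.append(x)         # decision played this round

        # Sampled after x is chosen (assumed noise model).
        c = 1.0 + rng.uniform(-0.1, 0.1)
        b = 0.5 + rng.uniform(-0.1, 0.1)

        grad_f = 2.0 * (x - c)   # gradient of the loss at x
        g_val = x - b            # constraint value at x
        grad_g = 1.0             # gradient of the constraint at x

        # Primal step: gradient step on V*f_t + Q*g_t, projected onto [0, 1].
        x_new = x - (V * grad_f + Q * grad_g) / (2.0 * alpha)
        x_new = min(1.0, max(0.0, x_new))

        # Virtual queue update using the linearized constraint at x_new.
        Q = max(Q + g_val + grad_g * (x_new - x), 0.0)
        Qs.append(Q)
        x = x_new
    return xs, Qs


if __name__ == "__main__":
    T = 2000
    xs, Qs = drift_plus_penalty(T=T, V=T ** 0.5, alpha=float(T))
    print("average decision over the last half:", sum(xs[T // 2:]) / (T - T // 2))
    print("final virtual queue length:", Qs[-1])
```

In this sketch the virtual queue `Q` grows while the sampled constraint is violated and pulls later decisions back toward the feasible region. The queue length is therefore the quantity whose bounds translate into constraint-violation bounds.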

Keywords

Constraint (computer-aided design), Regret, Queue, Mathematical optimization, Convex optimization, Regular polygon, Mathematics, Online algorithm, Zero (linguistics), Computer science

Type

preprint

IF / Citations

- / 1

Full text

http://arxiv.org/abs/2301.11267

Publication year

2023