When a voice call uses a codec that supports CMR (Codec Mode Request), the receiver of the audio data can ask the sender to change its encoding bitrate in real time, which helps maintain the user experience across varying network environments. Bitrate selection involves a tradeoff, and a choice made at one point in time can affect the next state; reinforcement learning is effective for problems of this kind. In this paper, we propose a method that selects an appropriate bitrate by training a reinforcement learning model with the E-model, which predicts the user experience of audio data, and the network statistics of RTP (Real-time Transport Protocol). For the performance evaluation, we implemented an AMR-WB (Adaptive Multi-Rate Wideband) application, RTP, and a JBM (Jitter Buffer Management) model in the NS3 simulator and simulated both congested and stable network conditions. The results show that, as learning progresses, the AI model outperforms the fixed-bitrate approach by lowering the encoding bitrate in congested networks and raising it in stable networks. This study demonstrates the feasibility of improving the user experience of voice calls by using reinforcement learning to select encoding bitrates in real time according to network conditions.
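As an illustration only, the following minimal Python sketch shows how an E-model rating could be turned into a quality score and how a learned policy might pick an AMR-WB mode. It is not the model evaluated in this paper. The R-to-MOS mapping and the effective equipment impairment formula follow ITU-T G.107 on its narrowband scale (the wideband variant in G.107.1 uses an extended R scale), and the mode list is the standard AMR-WB set. The function names, the epsilon-greedy rule, and the per-mode value list are assumptions made for this example, and any `ie` or `bpl` values passed in would be placeholders rather than measured AMR-WB parameters.

```python
import random

# Standard AMR-WB codec modes in kbit/s.
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def r_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # The cubic dips slightly below 1 for very small R, so clamp to the MOS floor.
    return max(1.0, mos)


def effective_equipment_impairment(ie, bpl, loss_pct):
    """G.107 Ie-eff under random packet loss (burst ratio of 1).

    ie and bpl are codec-specific; callers here supply hypothetical values.
    loss_pct is the packet loss rate in percent, e.g. from RTP statistics.
    """
    return ie + (95.0 - ie) * loss_pct / (loss_pct + bpl)


def select_mode(mode_values, epsilon, rng):
    """Epsilon-greedy choice of a mode index from per-mode value estimates.

    This is an assumed, simplified policy, not the paper's learning algorithm.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(mode_values))
    return max(range(len(mode_values)), key=lambda i: mode_values[i])
```

In a setup like this, the receiver would map the chosen index to an entry of `AMR_WB_MODES_KBPS` and signal it to the sender through CMR, while a MOS estimate derived from the observed loss could serve as the reward.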