From Risk-Neutral to Risk-Sensitive Reinforcement Learning: Actor–Critic vs REINFORCE with Tail-Based Risk Measures

Aprida Siska Lestia, Adhitya Ronnie Effendie, Made Tantrawan, Muhammad Rafli Azrarsyah

Abstract


This study investigates the application of risk-sensitive reinforcement learning to heavy-tailed return series by comparing two algorithms: REINFORCE with baseline (REINFORCE-BL) and episodic batched actor–critic (A2C-B). Initial exploratory analysis reveals an asymmetric return distribution with numerous extreme outliers, which renders variance-based risk measures inadequate and motivates integrating tail-based risk measures, specifically Value at Risk (VaR), Conditional Value at Risk (CVaR), and Entropic Value at Risk (EVaR), into the RL objective function. The study constructs a simple portfolio environment with discrete actions (market entry, market exit, and hold) and trains both algorithms under four scenarios: risk-neutral, VaR, CVaR, and EVaR. Experimental results show that A2C-B consistently outperforms REINFORCE-BL across all scenarios, with higher average long-term rewards, faster convergence, and more stable learning curves. While VaR and CVaR penalties significantly reduce rewards and increase learning volatility for REINFORCE-BL, A2C-B experiences only moderate reward reductions and remains stable. In the EVaR scenario, both algorithms yield high rewards, yet A2C-B retains a slight advantage in stability. These findings indicate that in environments with heavy-tailed returns, employing coherent risk measures (particularly CVaR and EVaR) within an actor–critic framework offers a more compelling trade-off between tail-risk control and average performance, and can serve as a viable baseline for the development of risk-sensitive RL in finance and actuarial science.
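For readers unfamiliar with the three tail-based risk measures named in the abstract, the following is a minimal sketch of their standard empirical estimators. It is not the authors' implementation. The function names, the sign convention (losses are negative returns), the confidence level `alpha`, and the grid search used for EVaR are all assumptions made for illustration.

```python
import math


def empirical_var(losses, alpha):
    """Empirical Value at Risk: the lower alpha-quantile of the losses."""
    ordered = sorted(losses)
    k = max(1, math.ceil(alpha * len(ordered)))  # 1-based rank of the quantile
    return ordered[k - 1]


def empirical_cvar(losses, alpha):
    """Empirical CVaR in the Rockafellar-Uryasev form, evaluated at c = VaR."""
    c = empirical_var(losses, alpha)
    mean_excess = sum(max(l - c, 0.0) for l in losses) / len(losses)
    return c + mean_excess / (1.0 - alpha)


def empirical_evar(losses, alpha, z_grid=None):
    """Empirical EVaR: min over z > 0 of (log E[exp(z L)] - log(1 - alpha)) / z.

    The minimum is approximated on a log-spaced grid, which is an
    assumption of this sketch rather than a prescribed method.
    """
    if z_grid is None:
        z_grid = [10 ** (e / 20) for e in range(-80, 41)]  # 1e-4 to 1e2
    n = len(losses)
    m = max(losses)
    best = math.inf
    for z in z_grid:
        # Log-sum-exp shift by the maximum loss avoids overflow for large z.
        log_mgf = z * m + math.log(sum(math.exp(z * (l - m)) for l in losses) / n)
        best = min(best, (log_mgf - math.log(1.0 - alpha)) / z)
    return best


# Hypothetical toy sample with one extreme loss in the tail.
losses = [-2.0, -1.0, 0.0, 1.0, 10.0]
alpha = 0.8
print(empirical_var(losses, alpha))
print(empirical_cvar(losses, alpha))
print(empirical_evar(losses, alpha))
```

On any sample the three estimates are ordered VaR ≤ CVaR ≤ EVaR, which is why EVaR is usually regarded as the most conservative of the three.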

Keywords


risk-sensitive reinforcement learning; REINFORCE; actor–critic; VaR; CVaR; EVaR.


DOI: https://doi.org/10.18860/cauchy.v11i1.40309


Copyright (c) 2026 Adhitya Ronnie Effendie

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Editorial Office
Mathematics Department,
Universitas Islam Negeri Maulana Malik Ibrahim Malang
Gajayana Street 50 Malang, East Java, Indonesia 65144
Fax (+62) 341 558933
e-mail: cauchy@uin-malang.ac.id

CAUCHY: Jurnal Matematika Murni dan Aplikasi is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.