SharpeRatio@k: Off-Policy Evaluation Using Novel Risk-Return Tradeoff and Efficiency Assessment (IMAGE)
Caption
SharpeRatio@k, a novel evaluation metric for off-policy evaluation (OPE) estimators, effectively measures the risk-return tradeoff of evaluating policies used in reinforcement learning and contextual bandits, a tradeoff that conventional metrics typically ignore, scientists at Tokyo Tech show. Inspired by risk assessment in financial portfolio management, the metric provides a more insightful evaluation of OPE estimators, paving the way for improved policy selection.
Credit
Tokyo Institute of Technology
Usage Restrictions
None
License
Original content