Abstract: This study investigates the design of reward functions for deep reinforcement learning-based source term estimation (STE). Estimating the properties of unknown hazardous gas leakage using a ...