In recent years, renewable energy sources such as photovoltaic (PV) power generation systems have been rapidly integrated into many power grids around the world. The higher penetration of renewable energy resources has made it more difficult to maintain proper voltage using the conventional method of load ratio transformer (LRT) tap changing, because generation fluctuates rapidly as weather conditions change. To solve this problem, reactive power control by the power conditioning systems (PCSs) of PV systems can be used as a voltage regulation resource. Recent work has developed multi-timescale voltage control based on deep reinforcement learning (DRL), with short-term control by PCSs and long-term control by the LRT.
Most of these methods coordinate agents that operate on different control cycles through the reward calculation. However, training becomes unstable because changes in each agent’s strategy are not properly managed, and as a result it may not be possible to control the voltage of the power grid.
This paper proposes a phased training method for DRL-based power grid voltage control that improves the stability of the training process for each agent, where each agent performs either LRT control or PCS control. The effectiveness of the proposed method is verified by numerical simulations using a power grid model with large PV systems.