Data-Driven Condition Monitoring and Fault Detection of Power Transformers Using Machine Learning

Reliable transformer operation is critical for
minimizing downtime and ensuring power system stability.
Dissolved Gas Analysis (DGA) is the most widely used diagnostic
tool, yet ratio-based methods such as the Duval Triangle and
Key Gas Method often fail when gas signatures overlap or faults
are still at an early stage. While machine learning has improved
accuracy, models remain vulnerable to noise, class imbalance, and
overfitting. This paper proposes a CatBoost-based framework
that combines statistical and energy features of H₂, CO, C₂H₂,
and C₂H₄ gases with engineered ratios to capture complex inter-gas dependencies. With tuned hyperparameters, the model
achieved 97.6% overall accuracy and strong class-wise
performance: 98.9% (Normal), 94.7% (Partial Discharge),
93.9% (Low-Energy Discharge), and 89.0% (Low-Temperature
Overheating). Feature importance analysis identified H₂ and gas
ratios as key contributors, while training dynamics showed
rapid and stable convergence. The results demonstrate
robustness, interpretability, and efficiency, highlighting the
framework’s potential for real-time transformer fault detection
and improved power system resilience.
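
To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch, not the authors' implementation: a CatBoost multi-class classifier trained on raw DGA gas concentrations plus engineered inter-gas ratios, followed by a feature-importance readout. The file name `dga_samples.csv`, the column names, the specific ratio definitions, and the hyperparameter values are illustrative assumptions, not values reported in the paper.

```python
# Hedged sketch of a CatBoost-based DGA fault classifier.
# Assumptions (not from the paper): file name, column names, ratio
# definitions, and hyperparameter values.
import pandas as pd
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report


def add_gas_ratios(df: pd.DataFrame) -> pd.DataFrame:
    """Append illustrative inter-gas ratio features (assumed definitions)."""
    eps = 1e-6  # guard against division by zero
    out = df.copy()
    out["C2H2_over_C2H4"] = out["C2H2"] / (out["C2H4"] + eps)
    out["H2_over_CO"] = out["H2"] / (out["CO"] + eps)
    out["total_gas"] = out[["H2", "CO", "C2H2", "C2H4"]].sum(axis=1)
    return out


# Hypothetical dataset: DGA gas concentrations (ppm) plus a fault label
# with classes such as Normal, Partial Discharge, Low-Energy Discharge,
# and Low-Temperature Overheating.
df = pd.read_csv("dga_samples.csv")
X = add_gas_ratios(df.drop(columns=["fault_type"]))
y = df["fault_type"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Assumed hyperparameter values; the paper reports tuning but not
# these exact settings.
model = CatBoostClassifier(
    iterations=500,
    depth=6,
    learning_rate=0.05,
    loss_function="MultiClass",
    verbose=100,
)
model.fit(X_train, y_train, eval_set=(X_test, y_test))

pred = model.predict(X_test).ravel()
print("Accuracy:", accuracy_score(y_test, pred))
print(classification_report(y_test, pred))

# Per-feature importance, e.g. to check whether H2 and the engineered
# ratios dominate, as the abstract reports for the authors' model.
for name, score in sorted(
    zip(X.columns, model.get_feature_importance()), key=lambda t: -t[1]
):
    print(f"{name:>18s}: {score:.2f}")
```

The ratio features mirror common DGA practice (e.g., acetylene-to-ethylene ratios used by Rogers/IEC-style methods), which is one plausible way to expose inter-gas dependencies to a gradient-boosted model.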