Stacking LLM Predictions for Feature Selection in Anomaly Classification

Large language models (LLMs) are increasingly being integrated into machine learning (ML) pipelines, particularly for tasks like feature selection in supervised classification. With the growing diversity of available LLMs, their predictions often complement one another, making ensembles of LLMs a promising approach for a range of ML challenges. In this paper, we propose using stacking to combine the predictions of multiple LLMs. The ML task we focus on is anomaly detection: identifying whether an anomaly has occurred in a system and, if so, classifying its type. The ensemble’s base models are built on feature sets selected by six different LLMs. We demonstrate that stacking can improve on the accuracy of the individual classifiers, and we advocate for stacking as a simple yet effective method for integrating traditional classifiers with LLMs. Additionally, we assess the impact of different base classifiers and meta-classifiers on the performance of the proposed approach.
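A minimal sketch of the stacking setup described above, assuming a scikit-learn implementation. The feature subsets and LLM names are hypothetical placeholders, not the paper's actual selections; in the paper, six LLMs each propose a feature set, and one base classifier is trained per set before a meta-classifier combines their predictions.

```python
# Sketch only: hypothetical feature names and LLM labels below are
# illustrative assumptions, not the paper's actual configuration.
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical feature subsets, one per LLM.
llm_feature_sets = {
    "llm_a": ["temperature", "pressure", "vibration"],
    "llm_b": ["temperature", "flow_rate", "rpm"],
    "llm_c": ["pressure", "vibration", "rpm"],
}

# One base model per LLM: a pipeline that restricts the input
# (a pandas DataFrame) to that LLM's selected columns, then fits
# a classifier on the reduced feature set.
base_models = [
    (name, Pipeline([
        ("select", ColumnTransformer([("keep", "passthrough", cols)])),
        ("clf", RandomForestClassifier(random_state=0)),
    ]))
    for name, cols in llm_feature_sets.items()
]

# The meta-classifier is trained on the base models' out-of-fold
# predictions; StackingClassifier handles the cross-fitting.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
# Usage: stack.fit(X_train, y_train); stack.predict(X_test)
```

Swapping in different base classifiers or a different `final_estimator` reproduces the kind of comparison the paper performs across base models and meta-classifiers.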