• A novel structured prompting methodology that provides context-aware, declarative instructions to coordinate multi-agent collaboration, and an output aggregation mechanism that assigns greater weight to more confident agent responses, improving reliability and interpretability.
• A manually annotated, domain-specific dataset designed to support uncontaminated, realistic performance evaluation.
• A resource-efficient system combining LLaMA 3.1 8B for specialized subtasks and LLaMA 3 70B for synthesis, balancing scalability and performance.
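
To make the confidence-weighted aggregation in the first contribution concrete, the following is a minimal sketch and not the system's actual implementation. It assumes each agent returns a discrete answer together with a self-reported confidence in [0, 1], and that aggregation is a weighted vote over those answers. The function name `aggregate_by_confidence` and the (answer, confidence) pair format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_by_confidence(
    responses: Iterable[Tuple[str, float]],
) -> Tuple[str, float]:
    """Confidence-weighted vote (illustrative assumption, not the paper's exact rule).

    Each response is an (answer, confidence) pair with confidence in [0, 1].
    Returns the answer with the largest summed confidence and that answer's
    share of the total confidence mass.
    """
    totals: Dict[str, float] = defaultdict(float)
    for answer, confidence in responses:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        totals[answer] += confidence

    total_weight = sum(totals.values())
    if total_weight == 0.0:
        raise ValueError("no responses with positive confidence")

    # Ties are broken in favor of the answer that was seen first.
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total_weight


if __name__ == "__main__":
    # One confident agent outweighs two less confident agents that disagree with it.
    agent_outputs = [("A", 0.9), ("B", 0.3), ("B", 0.4)]
    answer, share = aggregate_by_confidence(agent_outputs)
    print(answer, round(share, 4))
```

In this example a simple majority vote would select "B", whereas the weighted vote selects "A" because its single supporting response carries more confidence than the two "B" responses combined. The returned share offers a rough, inspectable indication of how strongly the agents support the final answer.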
