Advancing Brain Tumor Detection via ViRCNN: A Fusion of Vision Transformers and Faster R-CNN

In cancer diagnosis, and in brain tumor detection in particular, highly accurate detection is essential. Deep learning, with its strong object detection capabilities, has emerged as a valuable tool for identifying brain tumors. We introduce ViRCNN, a novel approach that combines the strengths of Faster R-CNN and the Vision Transformer (ViT). This method improves both the accuracy and the efficiency of brain tumor detection in magnetic resonance imaging (MRI) scans. To evaluate ViRCNN, we used the Br35H dataset, which includes 801 MRI images for training, validation, and testing. Our approach demonstrates significant improvements in the Mean Average Precision at an IoU threshold of 0.5 (MAP50) and Recall metrics compared to previous methods. Notably, ViRCNN achieves a 0.9% improvement in the MAP50 score with only 19 million parameters, substantially fewer than the more than 80 million parameters typical of state-of-the-art methods.
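As a rough illustration of the criterion behind the MAP50 and Recall metrics, the sketch below counts a predicted box as a correct detection when its intersection over union (IoU) with a ground-truth box is at least 0.5. The function names `iou` and `recall_at_iou`, the `(x1, y1, x2, y2)` box format, and the greedy one-to-one matching are assumptions made for this example; this is not the evaluation code used for ViRCNN.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(predictions, ground_truths, threshold=0.5):
    """Fraction of ground-truth boxes matched by a prediction.

    Hypothetical helper: each prediction is greedily matched to the
    unmatched ground-truth box with the highest IoU, and the match
    counts only if that IoU reaches the threshold.
    """
    matched = set()
    for pred in predictions:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)
    return len(matched) / len(ground_truths) if ground_truths else 0.0
```

A full MAP50 computation would additionally rank predictions by confidence score and average precision over the resulting precision-recall curve; the sketch covers only the IoU test and the recall count.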