Deepfake technology, leveraging advances in deep learning, has become a significant threat to digital media authenticity, enabling the creation of hyper-realistic yet deceptive videos that challenge existing detection methods. This paper presents a hybrid approach that combines a ResNeXt convolutional neural network (CNN) for frame-level feature extraction with a Long Short-Term Memory (LSTM) network for analyzing temporal dependencies, improving deepfake detection accuracy. The study utilized a balanced dataset comprising videos from FaceForensics++, Celeb-DF, and custom-crafted deepfakes, with preprocessing steps that included facial region cropping, frame standardization, and noise reduction. The proposed model achieved an accuracy of 94.87%, outperforming existing methods by effectively capturing both static and dynamic video features. Key innovations include leveraging the complementary strengths of CNNs and LSTMs to address frame-level and sequential inconsistencies in fake media. The approach is validated through extensive experimentation, demonstrating robustness against evolving generative adversarial techniques. The results establish a strong foundation for scalable, real-time detection applications; future work aims to extend detection to multi-modal data and improve computational efficiency for deployment in resource-constrained environments.
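The CNN-plus-LSTM pipeline described above can be sketched in PyTorch. This is a minimal illustrative sketch, not the paper's implementation: a small convolutional stack stands in for the ResNeXt backbone, and the layer sizes (`feature_dim`, `hidden_dim`) are assumed hyperparameters chosen for brevity. The structure it demonstrates is the one the abstract names: per-frame feature extraction followed by an LSTM over the frame-feature sequence and a binary real/fake classifier.

```python
import torch
import torch.nn as nn

class DeepfakeDetector(nn.Module):
    """Hypothetical sketch of the hybrid CNN+LSTM detector.

    A small conv stack stands in for the ResNeXt frame encoder;
    all dimensions here are illustrative assumptions.
    """

    def __init__(self, feature_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        # Frame-level encoder (stand-in for ResNeXt feature extraction).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feature_dim),
        )
        # LSTM models temporal dependencies across the frame features.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        # Encode every frame independently, then restore the time axis.
        feats = self.encoder(clips.view(b * t, c, h, w)).view(b, t, -1)
        # Classify from the LSTM's final hidden state.
        _, (h_n, _) = self.lstm(feats)
        return self.classifier(h_n[-1])  # logits over {real, fake}
```

In practice the encoder would be replaced by a pretrained ResNeXt with its classification head removed, and the input clips would come from the preprocessed (face-cropped, standardized) frames described above.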