Deepfake videos enable the production of highly convincing manipulated media and therefore pose a serious threat to the integrity of information. These videos are produced using sophisticated machine learning models such as Generative Adversarial Networks (GANs). This study introduces a hybrid deep learning framework that detects deepfakes efficiently by combining ResNext Convolutional Neural Networks (CNNs) for spatial feature extraction with Long Short-Term Memory (LSTM) networks for temporal analysis. By applying transfer learning, the model reduces computational overhead while achieving high accuracy and efficiency. A carefully curated dataset of 1,000 videos, evenly distributed between authentic and fraudulent content, was utilised for training and evaluation. During preprocessing, facial regions were detected and cropped from video frames to produce a high-quality face-only dataset. The proposed model demonstrated its robustness in detecting manipulated content, achieving 95% detection accuracy on the test set; its superiority over baseline techniques was validated using metrics such as precision, recall, and F1-score. This study highlights the importance of developing adaptable detection methods to keep pace with the rapid advancement of deepfake technology. Future work will focus on extending detection capabilities to full-body movements and integrating the framework into readily accessible tools such as browser-based plugins for continuous use.