ENHANCING FAKE NEWS DETECTION MODELS THROUGH MODIFIED COSINE SIMILARITY (MCS) USING SORENSEN-DICE DISTANCE
DOI:
https://doi.org/10.54938/ijemdss.2025.04.2.532
Keywords:
Sorensen-Dice, Misinformation, Natural Language Processing, Jaccard, Classifiers
Abstract
The rapid spread of fake news undermines public trust and highlights the need for more reliable detection models. Traditional approaches, such as LSTM with standard cosine similarity using Euclidean distance, often fail to capture subtle textual relationships. This study introduces a Modified Cosine Similarity (MCS) that replaces Euclidean distance with Sørensen-Dice distance and evaluates its effectiveness across four models: Long Short-Term Memory (LSTM), K-Nearest Neighbor (KNN), Random Forest (RF), and Support Vector Machine (SVM). Baseline results using cosine similarity showed strong performance, with SVM achieving the highest accuracy (0.987) and F1-score (0.986), followed by RF (accuracy 0.979) and KNN (accuracy 0.973). However, enhanced models with MCS demonstrated substantial improvements. LSTM achieved the best results overall (accuracy 0.997, recall 0.998, F1-score 0.997) with reduced cross-entropy loss (0.016), false positive rate (0.005), and false negative rate (0.002). SVM and KNN also showed notable gains with accuracies of 0.995 and 0.991, respectively, while RF recorded high recall (0.995) and competitive performance across metrics. These findings confirm that integrating Sørensen-Dice distance into cosine similarity significantly boosts semantic representation and model performance, making MCS a robust similarity measure for advancing fake news detection.
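The two building blocks named in the abstract, cosine similarity and the Sørensen-Dice distance, can be illustrated with a minimal sketch. The abstract does not give the exact formula by which MCS combines them, so the code below only computes each measure separately and does not attempt to reproduce the authors' method. The whitespace tokenizer, the function names, and the example texts are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


def cosine_similarity(a: str, b: str) -> float:
    """Standard cosine similarity between term-frequency vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def dice_coefficient(a: str, b: str) -> float:
    """Sørensen-Dice coefficient on token sets: 2|A & B| / (|A| + |B|)."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def dice_distance(a: str, b: str) -> float:
    """Sørensen-Dice distance, defined as one minus the coefficient."""
    return 1.0 - dice_coefficient(a, b)


if __name__ == "__main__":
    # Hypothetical headline/body pair, not taken from the paper's dataset.
    headline = "vaccine causes outbreak in city"
    body = "city reports outbreak after vaccine rollout"
    print("cosine similarity:", cosine_similarity(headline, body))
    print("dice distance:", dice_distance(headline, body))
```

Because the Dice coefficient works on token sets, it ignores term frequency and rewards shared vocabulary directly, whereas cosine similarity weights repeated terms; this difference is one plausible reason for combining the two measures.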
License
Copyright (c) 2025 International Journal of Emerging Multidisciplinaries: Social Science

This work is licensed under a Creative Commons Attribution 4.0 International License.





