ENHANCING FAKE NEWS DETECTION MODELS THROUGH MODIFIED COSINE SIMILARITY (MCS) USING SORENSEN-DICE DISTANCE

Authors

  • Aliyu Shuaibu, Bayero University, Kano
  • Hadiza Ali Umar, Bayero University, Kano
  • Yusuf Tijjani Baffa, Bayero University, Kano

DOI:

https://doi.org/10.54938/ijemdss.2025.04.2.532

Keywords:

Sorensen-Dice, Misinformation, Natural Language Processing, Jaccard, Classifiers

Abstract

The rapid spread of fake news undermines public trust and highlights the need for more reliable detection models. Traditional approaches, such as LSTM with standard cosine similarity using Euclidean distance, often fail to capture subtle textual relationships. This study introduces a Modified Cosine Similarity (MCS) that replaces Euclidean distance with Sørensen-Dice distance, and evaluates it across four models: Long Short-Term Memory (LSTM), K-Nearest Neighbor (KNN), Random Forest (RF), and Support Vector Machine (SVM). Baseline models using standard cosine similarity already performed strongly: SVM achieved the highest accuracy (0.987) and F1-score (0.986), followed by RF (accuracy 0.979) and KNN (accuracy 0.973). The MCS-enhanced models improved on these baselines. LSTM achieved the best results overall (accuracy 0.997, recall 0.998, F1-score 0.997), with a cross-entropy loss of 0.016, a false positive rate of 0.005, and a false negative rate of 0.002. SVM and KNN also improved, reaching accuracies of 0.995 and 0.991, respectively, while RF recorded high recall (0.995) and competitive performance across the other metrics. These findings indicate that integrating Sørensen-Dice distance into cosine similarity improves semantic representation and model performance, making MCS a robust similarity measure for fake news detection.
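
The abstract does not state the MCS formula, so the following is only an illustrative sketch of the two standard ingredients it names: cosine similarity, whose denominator is the product of the Euclidean norms, and the Sørensen-Dice coefficient in its common real-vector form, whose denominator is the sum of the squared norms. The whitespace tokenizer, the function names, and the example sentences are assumptions made for illustration; this is not the authors' implementation of MCS.

```python
import math
from collections import Counter


def term_counts(text):
    """Hypothetical helper: lowercase, whitespace-split term frequencies."""
    return Counter(text.lower().split())


def _dot(a, b):
    # Only terms present in both documents contribute to the dot product.
    return sum(a[t] * b[t] for t in a.keys() & b.keys())


def _squared_norm(a):
    return sum(v * v for v in a.values())


def cosine_similarity(a, b):
    """Standard cosine: dot(a, b) / (||a|| * ||b||), using Euclidean norms."""
    denom = math.sqrt(_squared_norm(a)) * math.sqrt(_squared_norm(b))
    return _dot(a, b) / denom if denom else 0.0


def dice_coefficient(a, b):
    """Sørensen-Dice for real vectors: 2 * dot(a, b) / (||a||^2 + ||b||^2)."""
    denom = _squared_norm(a) + _squared_norm(b)
    return 2 * _dot(a, b) / denom if denom else 0.0


def dice_distance(a, b):
    """Distance form: 1 minus the Dice coefficient."""
    return 1.0 - dice_coefficient(a, b)


if __name__ == "__main__":
    doc_a = term_counts("fake news spreads fast")
    doc_b = term_counts("fake news spreads very fast online")
    print("cosine:", cosine_similarity(doc_a, doc_b))
    print("dice coefficient:", dice_coefficient(doc_a, doc_b))
    print("dice distance:", dice_distance(doc_a, doc_b))
```

For non-negative term-count vectors the Dice coefficient never exceeds the cosine similarity, and the two coincide when both vectors have the same Euclidean norm. Dice therefore penalizes length mismatch between documents, whereas cosine ignores it.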

Author Biographies

Hadiza Ali Umar, Bayero University, Kano

Department of Computer Science, Faculty of Computing

Yusuf Tijjani Baffa, Bayero University, Kano

Department of Software Engineering, Faculty of Computing

Published

2025-10-19

How to Cite

Aliyu Shuaibu, Hadiza Ali Umar, & Yusuf Tijjani Baffa. (2025). ENHANCING FAKE NEWS DETECTION MODELS THROUGH MODIFIED COSINE SIMILARITY (MCS) USING SORENSEN-DICE DISTANCE. International Journal of Emerging Multidisciplinaries: Social Science, 4(2). https://doi.org/10.54938/ijemdss.2025.04.2.532

Section

Research Article