Efficient Semantic Detection and Analysis of Misinformation in CBD-Related Tweets Using FAISS and Mistral NeMo Instruct
Document Type
Article
Publication Date
6-19-2025
Abstract
The growing popularity of cannabidiol (CBD) has led to a surge in misinformation, particularly on social media platforms like Twitter, posing risks to public health. This paper presents a scalable method for detecting CBD-related misinformation in a large corpus of tweets. Using approximately 3.7 million tweets collected from 2011 to 2021, we implement a two-step process. First, FAISS (Facebook AI Similarity Search) efficiently identifies tweets that are semantically similar to false claims extracted from FDA warning letters. Second, Mistral NeMo Instruct, used as a zero-shot classifier, labels the retrieved tweets as ‘Misinformation’ or ‘Non-Misinformation’ and provides a justification for each label for transparency. This approach minimizes computational costs while maintaining accuracy, making it a practical tool for large-scale misinformation detection. The framework is scalable and adaptable, and it can be updated as new FDA data or cannabis research emerges.
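The two-step process in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real sentence-embedding model, the brute-force inner-product loop stands in for the nearest-neighbor search that FAISS performs at scale, and the `threshold` value and prompt wording in `build_prompt` are hypothetical. Only the two labels come from the abstract.

```python
import math
from collections import Counter


def embed(text):
    """Toy L2-normalized bag-of-words vector (assumption: stands in for a
    real sentence-embedding model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {tok: c / norm for tok, c in counts.items()}


def inner_product(a, b):
    """Inner product of two sparse vectors; equals cosine similarity here
    because both vectors are unit length."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve_candidates(claims, tweets, threshold=0.5):
    """Step 1: keep tweets whose best-matching false claim scores at least
    `threshold` (a hypothetical cutoff). This is an exhaustive search; a
    similarity-search library would replace this loop for millions of tweets."""
    claim_vecs = [embed(c) for c in claims]
    candidates = []
    for tweet in tweets:
        vec = embed(tweet)
        scores = [inner_product(vec, cv) for cv in claim_vecs]
        best = max(range(len(claims)), key=scores.__getitem__)
        if scores[best] >= threshold:
            candidates.append((tweet, claims[best], scores[best]))
    return candidates


def build_prompt(tweet, claim):
    """Step 2: assemble a zero-shot classification prompt for an
    instruction-tuned model. The wording is hypothetical; only the two
    labels are taken from the abstract."""
    return (
        "Known false claim: " + claim + "\n"
        "Tweet: " + tweet + "\n"
        "Label the tweet as 'Misinformation' or 'Non-Misinformation' "
        "and give a one-sentence justification."
    )


if __name__ == "__main__":
    claims = ["cbd cures cancer"]
    tweets = ["cbd cures cancer fast", "nice weather today"]
    for tweet, claim, score in retrieve_candidates(claims, tweets):
        print(round(score, 3))
        print(build_prompt(tweet, claim))
```

The design point is that the cheap similarity filter runs over the whole corpus, while the more expensive language-model call is made only for the tweets that survive the filter.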
Recommended Citation
J. Turner, M. A. Gulum and M. Kantardzic, "Efficient Semantic Detection and Analysis of Misinformation in CBD-Related Tweets Using FAISS and Mistral NeMo Instruct," 2025 19th International Conference on Semantic Computing (ICSC), Laguna Hills, CA, USA, 2025, pp. 1-8.
