Machine Learning–Driven Sentiment Analysis for Tourism Insights: The Case of North-East Indian States
Published 2026-01-15
Keywords
- Emotion Detection
- Feature Extraction
- Opinion Mining
- Polarity Classification
- Sentiment
Copyright (c) 2025 SANDEEP BHATTACHARJEE (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
This study examines publicly available online reviews of Sikkim, Meghalaya, Arunachal Pradesh, Assam, Nagaland, Mizoram, and Manipur to evaluate how effectively sentiment analysis can improve tourism decision-making and overcome the limits of traditional surveys in capturing emotional and informal experiential insights. Open-access datasets from TripAdvisor, Reddit, and travel blogs (GitHub) were methodically tokenized, lemmatized, and stripped of stop words to produce clean, analysable text. The NRC Emotion Lexicon determined emotional tone, VADER and TextBlob estimated sentiment polarity, and a state-wise sentiment index was generated. Emotion and experience themes were clustered using K-Means. The findings suggest considerably favourable sentiment toward Sikkim and Meghalaya, attributed to natural beauty and hospitality, whereas Assam and Nagaland show mixed sentiment linked to infrastructure constraints. Joy, trust, and surprise are the dominant emotions, while sadness and contempt are associated with service issues. The study concludes that AI-driven sentiment analytics enhances destination branding, service improvement, and policy planning, providing a strong foundation for data-informed tourism growth.
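
The pipeline summarized above (preprocessing, polarity scoring, state-wise aggregation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not define the sentiment index, so the sketch takes it to be the mean review polarity per state; a small hypothetical lexicon stands in for VADER and TextBlob; lemmatization, the NRC emotion step, and K-Means clustering are omitted; and the sample reviews are invented, not drawn from the study's data.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical mini-lexicon standing in for VADER/TextBlob (NOT the real lexicons).
TOY_LEXICON = {
    "beautiful": 1.0, "friendly": 0.8, "clean": 0.5,
    "poor": -0.7, "dirty": -0.8, "delayed": -0.5,
}
# Tiny illustrative stop-word list (assumption; the study's list is not given).
STOP_WORDS = {"the", "and", "a", "was", "were", "is", "but", "to", "of", "in"}


def preprocess(text):
    """Lowercase, tokenize on alphabetic runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def polarity(tokens):
    """Mean lexicon score of matched tokens; 0.0 when no token matches."""
    scores = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    return mean(scores) if scores else 0.0


def state_sentiment_index(reviews):
    """Assumed index: average review polarity per state.

    reviews is an iterable of (state, review_text) pairs.
    """
    by_state = defaultdict(list)
    for state, text in reviews:
        by_state[state].append(polarity(preprocess(text)))
    return {state: mean(vals) for state, vals in by_state.items()}


# Invented reviews, for illustration only.
sample_reviews = [
    ("Sikkim", "The valley was beautiful and the hosts were friendly"),
    ("Sikkim", "Clean rooms but the bus was delayed"),
    ("Assam", "Poor roads and a dirty station"),
]
index = state_sentiment_index(sample_reviews)
```

In a fuller version, the toy scorer would presumably be replaced by the VADER compound score or the TextBlob polarity value, each of which also ranges over [-1, 1], so the aggregation step would remain unchanged.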
