In this month’s research spotlight, COSMOS highlights recent research that integrates AI across multiple stages of its methodology. The study, titled “Decoding YouTube’s Recommendation System: A Comparative Study of Metadata and GPT-4 Extracted Narratives,” explores the role of YouTube’s recommendation system in shaping user experiences. It was published at the 4th International Workshop on Computational Methods for Online Discourse Analysis (BeyondFacts 2024), held from 13 to 17 May in Singapore.

Bias within recommendation systems can create filter bubbles and echo chambers that reinforce users’ existing beliefs, stifle diverse perspectives, and contribute to polarization. For this reason, many studies have investigated YouTube’s recommendation system. The authors of this paper argue that previous studies’ reliance on metadata, such as video titles and descriptions, could perpetuate the very biases researchers aim to address: because metadata often fails to capture the full depth of video content, analyses built on it risk misidentifying the actual content of recommended videos. To overcome these limitations, the authors propose a novel approach that uses the large language model (LLM) GPT-4 to extract narratives from video transcripts, with transcripts collected either through the YouTube API or generated with the OpenAI Whisper model. They test the proposed approach on recommended videos concerning the South China Sea dispute, providing an overview of trends in sentiment, emotion, and toxicity.
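
To make the methodology concrete, here is a minimal sketch of what such a transcript-to-narrative pipeline might look like. It assumes the youtube-transcript-api and openai-whisper Python packages and the OpenAI chat API; the prompt wording and function names are illustrative, not the authors’ actual implementation.

```python
# Minimal sketch of the transcript-to-narrative pipeline described above.
# Package choices (youtube-transcript-api, openai-whisper, openai) and the
# prompt wording are illustrative assumptions, not the authors' implementation.
from youtube_transcript_api import YouTubeTranscriptApi
import whisper
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def get_transcript(video_id: str, audio_path: str | None = None) -> str:
    """Prefer YouTube's own captions; fall back to Whisper on downloaded audio."""
    try:
        segments = YouTubeTranscriptApi.get_transcript(video_id)
        return " ".join(seg["text"] for seg in segments)
    except Exception:
        if audio_path is None:
            raise
        model = whisper.load_model("base")  # small model keeps the sketch cheap
        return model.transcribe(audio_path)["text"]


def extract_narrative(transcript: str) -> str:
    """Ask GPT-4 to distill the central narrative from a raw transcript."""
    # Long transcripts may need chunking to fit the model's context window.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Summarize the central narrative of this video "
                        "transcript in a few sentences, preserving the "
                        "speaker's framing."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```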

The findings revealed significant trends as the depth of content analysis increased. Both video titles and narratives generally shifted from neutral to positive sentiment, but the shift was markedly more evident in the narratives. Emotion analysis indicated an increase in positive emotions, particularly joy, and a decrease in negative emotions such as anger and disgust, again far more pronounced in the narratives than in the metadata. Toxicity analysis, however, presented a contrasting pattern: video titles showed an upward trend in toxicity, peaking at the greatest depth analyzed, while narratives started at a high toxicity level that sharply decreased and then stabilized at a lower level. “Titles, though useful for capturing initial viewer interest, exhibit a weaker and more variable relationship with toxicity, often failing to reflect the deeper sentiment trends present in the narrative content,” the authors explain.
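
For readers who want to run this kind of title-versus-narrative comparison themselves, the sketch below scores both texts at each recommendation depth. The model choices, a default Hugging Face sentiment pipeline, the j-hartmann/emotion-english-distilroberta-base emotion classifier, and the Detoxify library, are stand-ins for illustration and may differ from the tooling the authors used.

```python
# Illustrative comparison of title- vs. narrative-level signals at each
# recommendation depth. The models below (Hugging Face pipelines, the
# Detoxify library) are stand-ins; the paper's exact tooling may differ.
from transformers import pipeline
from detoxify import Detoxify

sentiment = pipeline("sentiment-analysis")
emotion = pipeline("text-classification",
                   model="j-hartmann/emotion-english-distilroberta-base")
toxicity = Detoxify("original")


def score(text: str) -> dict:
    """Sentiment label, dominant emotion, and toxicity score for one text."""
    snippet = text[:512]  # crude stand-in for proper token-level truncation
    return {
        "sentiment": sentiment(snippet)[0]["label"],
        "emotion": emotion(snippet)[0]["label"],
        "toxicity": float(toxicity.predict(snippet)["toxicity"]),
    }


# videos_by_depth maps a recommendation depth to the videos collected there,
# each dict holding a "title" and a GPT-4-extracted "narrative".
def toxicity_by_depth(videos_by_depth: dict[int, list[dict]]) -> dict[int, dict]:
    """Average toxicity of titles vs. narratives at each depth."""
    out = {}
    for depth, videos in sorted(videos_by_depth.items()):
        out[depth] = {
            "title": sum(score(v["title"])["toxicity"]
                         for v in videos) / len(videos),
            "narrative": sum(score(v["narrative"])["toxicity"]
                             for v in videos) / len(videos),
        }
    return out
```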

Dr. Agarwal said, “These findings emphasize the limitations of relying solely on metadata for analyzing YouTube content; they suggest that more in-depth engagement with video content, beyond just titles, is crucial for understanding the full impact of YouTube’s algorithms on user experience.” The researchers advocate for integrating narratives into analytical frameworks to achieve a more nuanced and accurate understanding of video content. This shift in methodology could lead to better insights into the sentiment and toxicity landscape on YouTube, potentially informing more effective platform moderation and algorithmic recommendations.