The International Conference on Multimedia and Image Processing (ICMIP) gathers researchers who, as the name implies, present novel methods in image and multimedia processing each year. In previous years, delegates from different fields have come together to present their research and new ideas on such processing and to establish business and research relationships, meeting in international locations such as China, Malaysia, and Brunei Darussalam. From April 20–22, 2024, the 9th ICMIP will be held in Osaka, Japan. Sponsored by Ritsumeikan University, this year’s ICMIP will take place at the Ritsumeikan University Osaka Ibaraki Campus (OIC). Two studies from COSMOS researchers—viz., Niloofar Yousefi, Mert Can Cakmak, Mainuddin Shaik, and Dr. Nitin Agarwal—have been accepted for publication at this prestigious conference!

The researchers will present their studies, entitled “Emotion Assessment of YouTube Videos using Color Theory” and “Examining Multimodal Emotion Assessment and Resonance with Audience on YouTube.” 

Dr. Agarwal said, “The two studies nicely complement each other: the first study develops novel ways to assess emotions from multimedia content, while the second study integrates the approach from the first study with audio-signal- and text-based emotion analysis to determine which modality (i.e., video, audio, or text) resonates more with content consumers.” He continued, “Such studies demonstrate how COSMOS is pioneering groundbreaking socio-computational research to understand behaviors in the evolving landscape of social media, where multimedia-rich platforms are becoming increasingly prevalent.”

The former paper develops a novel barcode-based emotional color analysis. The research shows how videos establish emotional themes through color: each video is condensed into a barcode summarizing the colors present throughout the video.
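The general idea of a color barcode can be sketched as follows. This is a minimal illustration of the common technique, not the authors' published method: each frame is reduced to its mean RGB color and rendered as one vertical stripe, and the stripes are stacked left to right. The `movie_barcode` function name and parameters are hypothetical.

```python
import numpy as np

def movie_barcode(frames, width_per_frame=1):
    """Condense a sequence of frames into a color barcode.

    Each frame (an H x W x 3 RGB array) is reduced to its mean color,
    drawn as one vertical stripe; stripes are stacked left to right.
    """
    stripes = []
    for frame in frames:
        mean_color = frame.reshape(-1, 3).mean(axis=0)   # average RGB of the frame
        stripe = np.tile(mean_color, (64, width_per_frame, 1))  # 64-px-tall stripe
        stripes.append(stripe)
    return np.concatenate(stripes, axis=1).astype(np.uint8)

# Example: three synthetic 10x10 frames (mostly red, green, blue)
frames = [np.full((10, 10, 3), c, dtype=np.uint8)
          for c in ([200, 0, 0], [0, 200, 0], [0, 0, 200])]
barcode = movie_barcode(frames, width_per_frame=5)
print(barcode.shape)  # -> (64, 15, 3)
```

In practice the frames would be sampled from a real video (e.g., one frame per second), and the resulting barcode gives an at-a-glance view of the film's color palette over time, which can then be mapped to emotional associations.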

The latter study builds upon this research by applying color-barcode-based, audio-based, and text-based emotion analysis to various contexts, such as movie trailers and political news channels. By comparing all three analyses on political datasets and on a movie trailer dataset, Yousefi, Cakmak, and Dr. Agarwal show how multimedia use can differ across contexts.

“Our analysis shows that content like news or incident explanations evokes more emotion through text,” Niloofar explains. “We believe this is likely due to the need to convey important information in a short video, so the creator focuses on clarity rather than visual elements, whereas movie trailers use color and audio to attract viewers. For instance, horror movie trailers often use red and black colors to evoke fear.”