Cosmographers’ Summer 2024 Internships

Internships offer students valuable opportunities that can significantly influence their academic journeys. For the COSMOS research team, these experiences have been particularly beneficial, providing numerous advantages. Internships enable students to tackle challenges from diverse angles, pushing them to rethink assumptions and inspiring innovative thinking. COSMOS welcomes back from summer internships Manohar Koya, Imran Mohammed, Remi Oni, Diwash Poudel, Precious Sani, and Shadi Shajari.

We are excited to announce the recent internships of our cosmographers, who have had the privilege of working with prestigious companies such as W&W AFCO Steel, Acxiom, Amazon, VINCI Education Worldwide, and FedEx, as well as the Arkansas State Department of Commerce.

These internships have infused our cosmographers with fresh ideas, creating a dynamic and collaborative atmosphere that drives innovation. They have also provided opportunities to build professional networks, connect with industry leaders, and establish meaningful relationships. These connections create a vital link between academia and industry, paving the way for future collaborations aimed at enhancing the understanding of real-world challenges, developing a workforce with relevant industry skills, and fostering mutual knowledge exchange.

Through these internships, cosmographers gain essential real-world problem-solving skills that they can apply to our projects upon their return. We are immensely proud of their achievements and are eager to see the remarkable contributions they will continue to make as they advance their academic pursuits, enriched by their experiences. COSMOS welcomes them back from their summer work; read their reports below.

Manohar says, “During my summer internship at FedEx, I had the exciting opportunity to work within the InfoSec organization, specifically with the Risk Data Engineering team. I developed data dashboards and improved the team’s alerting and monitoring systems by building a Python microservice integrated with Teams webhooks for dynamic alerts. A key highlight was presenting this work to Gene Sun, CISO of FedEx and SVP of InfoSec. His insights not only validated the impact of my contributions but also provided a new perspective on the critical importance of information security. This experience significantly enhanced my technical skills and deepened my passion for cybersecurity.”

Imran remarks, “During my internship at the Arkansas State Department of Commerce in the Division of Workforce Services, I had the opportunity to work on various applications, including mainframes, C#, and AWS. I was part of the IT team responsible for managing and developing applications that support unemployed individuals in Arkansas. I gained valuable insights into how the Arkansas state government assists those who register on the website by providing unemployment insurance claims and offering job opportunities tailored to their skillsets. My team owns and manages all the unemployment-related applications, such as Arknet and Ezarc for the State of Arkansas. Overall, it was a great learning experience.”

Remi expresses, “In my summer internship at Amazon AWS Dallas, I worked as a Data Engineer Intern on the Roster Analytics ORCAS team. I developed a robust data quality framework aimed at addressing key challenges in data integrity within our pipeline. The solution was built and implemented using the DQChecker-python-wrapper to conduct data quality checks on both upstream and downstream datasets. This framework not only reduced redundancy but also saved my team more than 20 work hours per month by catching data quality issues early. I became proficient in AWS services like Glue, S3, and QuickSight, and further honed my time management and problem-solving skills. The best part of the internship was participating in virtual game hours and playing table tennis with colleagues after work, which made for a well-rounded and enjoyable experience.”

Diwash says, “During my summer internship for VINCI Education in Dunn Loring, Northern Virginia, I had the opportunity to work on developing an action model using computer vision, which allowed me to dive deep into vision transformers. This experience not only provided valuable technical skills but also gave me a firsthand look at American working culture. The charming surroundings and my first-ever ride on a metro train added to the overall experience, making it both memorable and enriching.”

Precious reports, “During my summer internship at Acxiom LLC, I served as an Information Security Intern, where I acquired substantial knowledge and expertise in secure SDLC and AI Governance, securing information systems and preventing vulnerabilities. I was introduced to various industry-standard tools such as Burp Suite and Snyk, which are instrumental in enhancing the security of web applications. Additionally, I had the privilege of interacting with leadership and gaining valuable insights from their experiences. The skills I acquired during this internship have equipped me to write more secure code and effectively safeguard information assets.”

Shadi reflects, “During my internship at W&W AFCO Steel company, significant efforts were made to refine and enhance barcode detection and processing systems using deep learning techniques. Super-resolution models were trained and debugged, with optimizations applied to improve the accuracy and speed of barcode detection, even under challenging conditions such as rapid movement or poor image quality. Deep learning solutions like OpenCV and PyTorch were integrated to address complex detection issues, and barcode repair mechanisms were implemented to enhance the reliability of the models.”

Dr. Agarwal commented, “These internships represent significant milestones in the careers of our cosmographers. The knowledge and insights they gained during their recent experiences in the data science industry often help ongoing research at COSMOS. We remain committed to encouraging our researchers to engage with the industry beyond the university, and their internships are a testament to their growing expertise and talent.”

COSMOS to Conduct Social Media Analysis Training

From September 26th through 28th, COSMOS will host a hybrid in-person and virtual training course on social media analysis techniques. The course will be taught by Dr. Nitin Agarwal and is based on his US Department of Defense-funded research. Each day will have scheduled instruction from 10:00 AM to 5:00 PM (US Central time), with lunch and coffee breaks.

The training course, supported by grants from the US Army Research Office, will introduce basic social science concepts, theories, and principles that guide model development, data analysis, AI/ML, and inference. Attendees will learn about state-of-the-art developments in mining social media to support business intelligence and decision-making, and about the emerging challenges and opportunities within social media. Participants will be prepared to learn innovative applications of multidisciplinary problem-solving approaches, and come away from the course with the analytical skills to enhance their understanding of the data.

Social media platforms connect people, enabling shared consumption of content, opinions, and experiences, and analyzing this wealth of data can help discover the behaviors of individuals, communities, and organizations. Businesses, academic researchers, and governments can use the study of this data to guide decision-making and to advance the understanding of social and cultural dynamics through the lens of contemporary information and communication technologies (ICTs). The training course will show attendees how to conduct such research, with a near-limitless number of applications: how to analyze social media data about products, services, campaigns, markets, events, customers, and employees; how to segment audiences by geography, demographics, influencers, recommenders, or detractors; and how to measure social media activities. The training will discuss case studies that show the impact of the analyses on domains like security, health, business, policy-making, strategic communication, public affairs, and socio-political and cultural assessment.

All that is required is a computer with an internet connection and familiarity with a programming language such as Python. A limited number of participation scholarships of up to $1,000, supported by the US Army Research Office, will be available.

View the event agenda, read more about the material, or register for the upcoming event at https://cosmos.ualr.edu/training/.

Cosmographer Corner: Dr. Thomas Marcoux, Data Scientist

In this edition of Cosmographer Corner, COSMOS highlights the work of University of Arkansas at Little Rock graduate and former Cosmographer Dr. Thomas Marcoux.

Dr. Marcoux—who is now a Senior Data Scientist working for Bayer—started his graduate education at UA Little Rock in 2015, studying for his master’s in computer science. After starting his PhD, he joined COSMOS as a graduate research assistant in 2018. He received his PhD in computer & information sciences in 2022, and worked as a postdoc for COSMOS post-graduation. We interviewed Dr. Marcoux on where his career is now and what his work at COSMOS entailed, with his responses below.

How did COSMOS fit into your university/secondary education career? How did you come across COSMOS, and what were you studying when you joined COSMOS?

Before my time at UA Little Rock, I was an international exchange student, visiting from my home country of France during the senior year of my bachelor’s in computer science at Université d’Orléans. During that time, I discovered that I liked it here in the US, and that I wanted to pursue a master’s here. I returned to France and worked a little bit there, but I decided I wanted to stay in the US long-term. 

When starting my master’s in computer science at UA Little Rock, I was introduced to COSMOS through a friend, Tuja Khaund, who was working at COSMOS at the time as a graduate research assistant. Eventually I met Dr. Agarwal, joined COSMOS, and worked there for the four years of my PhD as a graduate research assistant. I also returned after graduation and worked as a postdoc at COSMOS for a year. 

How would you describe the “research pipeline” that you worked on while at COSMOS? In other words, what was the specific area in which you researched?

The main three projects I worked on were VTracker, a tool for highlighting narratives, and DatabaseSyncTool. At the time I worked at COSMOS, everyone definitely wore a lot of different hats, so I did a bit of database management, supporting the work of different COSMOS teams. But VTracker was my flagship project.

I also helped work on the COVID misinformation tool, which was done in partnership with the Office of the Attorney General for the state of Arkansas. We put this tool up on the website for tracking misinformation when COVID was developing, matching articles with debunking information and trying to shed some light in all the confusion.

Since leaving COSMOS, what roles/positions/jobs have you had? What is your current work?

When I worked at COSMOS as a postdoc, I continued what I was working on but also did admin work. For example, I would interview people who were applying to COSMOS, since I had seniority at the time.

After my postdoc work, I began work as a data scientist for Bayer, which is where I work now, through the contractor ColaBerry. With Bayer, I work with plant DNA sequencing data—specifically, I work with the crop science division. And what I do is I support the lab. You may imagine the people with the white coats and the beakers, who do the biology work—they send to me their computerized DNA sequencing data, which I then process. There’s a very complex workflow to the data, and we assign metrics to them and, essentially, do quality assurance. Say there’s these plates with DNA samples data: do they pass our checks? Do we need new samples?—That sort of thing.

What positions did COSMOS and your classes at UALR best prepare you for?

First of all, it’s been Python all the way—I use Python that I learned in classes in the lab. It’s the programming language most likely to be used for scientific projects.

It’s interesting, because COSMOS is a bit like a startup—like I said earlier, you have to wear a lot of hats, to be able to do very different things while working with people coming in and out from different backgrounds who all have different skill sets, different cultures. So I would say that one of the most beneficial things about the COSMOS experience, that you may not get in other places, is that it makes you work with people very well. It sort of forces you to manage expectations of yourself and others, because there are some things others may be able to do better. So you develop this mindset of, “Okay, this is what I’m able to do, and what I can do tomorrow.” It makes you more efficient and able to manage group workloads.

If you had to describe the most momentous event at COSMOS, what would it be?

Oh, that’s easy. In 2019, right before COVID, we had the International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS). We flew to Washington, DC and had an Airbnb, and I was very lucky, since for that batch of cosmographers it was our first and last in-person conference before everything became virtual after COVID hit. It was really fun, going with the team. We presented at a panel that was two projects in one room, and found some people who were interested in funding us just by hearing about our projects. I loved the city so much that I decided to move to it! So I credit SBP-BRiMS for making me want to pursue the career in Washington, DC I now have.

What advice would you have for current Cosmographers?

I hinted at it a bit earlier, when talking about managing expectations and hats. COSMOS is a unique place, where most of the students are international. It will provide you with lots of amazing learning opportunities. So take advantage of those. I know it can be stressful and emotionally taxing if you’re struggling to perform. So what I’ve found to be very helpful is to find what you’re good at and try to steer yourself in that direction, while communicating with your team and Dr. Agarwal what your strengths are. For example, you can, if you’re struggling with such and such projects, try to reorient and say, “Hey, I’m not able to do that. Can we do it this way, or can I have some kind of assistance?”

Alongside that, communication is key. People will get stressed about asking for help or communicating that they don’t know how to do something, which, again, I think is difficult to admit because it may feel like it will cost you something. But being able to reach out helps significantly.

Research Spotlight: Narratives and Bias

In this month’s research spotlight, COSMOS highlights recent research that uses AI as an integral part of its methodology. Specifically, the study titled “Decoding YouTube’s Recommendation System: A Comparative Study of Metadata and GPT-4 Extracted Narratives” explores the role of YouTube’s recommendation system in shaping user experiences. The study was published at the 4th International Workshop on Computational Methods for Online Discourse Analysis (BeyondFacts 2024), which took place from 13 to 17 May in Singapore.

Bias within recommendation systems can potentially create filter bubbles and echo chambers that reinforce users’ existing beliefs, stifle diverse perspectives, and contribute to polarization. For this reason, many studies have investigated YouTube’s recommendation system. The authors of this paper argue that previous studies’ reliance on metadata, such as video titles and descriptions, could perpetuate the very biases researchers aim to address. Since metadata often fails to capture the full depth of video content, relying on it risks misidentifying the actual content of recommended YouTube videos. To overcome these limitations, the authors propose a novel approach that uses the large language model (LLM) GPT-4 to extract narratives from video transcripts, with transcripts collected either from the YouTube API or using the OpenAI Whisper model. In particular, they test the proposed approach on recommended videos concerning the South China Sea dispute, providing an overview of the trends in sentiments, emotions, and toxic elements.

The findings revealed significant trends as the depth of content analysis increased. Both YouTube video titles and narratives generally showed a shift from neutral to positive sentiments, but the shift was significantly more evident in narratives. Emotion analysis indicated an increase in positive emotions, particularly joy, and a decrease in negative emotions such as anger and disgust, again especially and to a much greater degree in the narratives than in metadata. However, toxicity analysis presented a contrasting pattern: while video titles displayed an upward trend in toxicity, peaking at the greatest depth analyzed, narratives showed a high initial toxicity level that sharply decreased and stabilized at lower depths. “Titles, though useful for capturing initial viewer interest, exhibit a weaker and more variable relationship with toxicity, often failing to reflect the deeper sentiment trends present in the narrative content,” they explain.

Dr. Agarwal said, “These findings emphasize the limitations of relying solely on metadata for analyzing YouTube content; they suggest that more in-depth engagement with video content, beyond just titles, is crucial for understanding the full impact of YouTube’s algorithms on user experience.” The researchers advocate for integrating narratives into analytical frameworks to achieve a more nuanced and accurate understanding of video content. This shift in methodology could lead to better insights into the sentiment and toxicity landscape on YouTube, potentially informing more effective platform moderation and algorithmic recommendations.

COSMOS Makes a Splash at the 30th AMCIS 2024

The annual Americas Conference on Information Systems (AMCIS), a conference of the Association for Information Systems (AIS), is a leading venue for academic research on information systems from across North and South America. This year, the theme was “Elevating Life Through Digital Social Entrepreneurship.” Namely, the conference highlighted the disparities in access to technology, especially in marginalized communities, and emphasized the importance of providing equal digital opportunities to promote inclusive growth. Additionally, it explored the challenges of developing sustainable business models that prioritize social impact.

From August 15 to 17, 2024, the 30th AMCIS conference was held in Salt Lake City, Utah. This year COSMOS had five studies at AMCIS! Several cosmographers visited Salt Lake City to present the studies. The following is a list of the papers that were published at the conference:

  • Studying the Influence of Toxicity Intensity on Its Propagation Using Epidemiological Models
  • Toxicity Prediction in Reddit
  • A Computational Approach to Analyze Identity Formation: A Case Study of Brazil Insurrection
  • Role of Co-occurring Words on Mobilization in Brazilian Social Movements 
  • Analyzing Anomalous Engagement and Commenter Behavior on YouTube

In this first article of a two-part series, we summarize two papers on online toxicity, “Studying the Influence of Toxicity Intensity on Its Propagation Using Epidemiological Models” and “Toxicity Prediction in Reddit.” While both are concerned with tracking and classifying toxic online content, the former compares epidemiological models of how toxicity spreads on Twitter/X, while the latter predicts toxicity based on the hierarchy of comments on Reddit.

“Studying the Influence of Toxicity Intensity on Its Propagation Using Epidemiological Models” specifically compares epidemiological models that segment the population into different groups based on the “infection” of toxic posting. Namely, the authors compare the SIR (Susceptible-Infected-Recovered), SIS (Susceptible-Infected-Susceptible), and STRS (Susceptible-Toxic-Recovered-Susceptible) models. As the model names suggest, each model segments the population and determines whether individuals are susceptible to, are transmitting, or are recovering from the effects of online toxicity. They address two key research questions:

  1. Which epidemiological model is most effective in explaining toxicity diffusion on social media?
  2. How do different levels of toxicity affect the spread of toxic content?

They found that the SIS model, while providing some insight, was the least accurate of the three. Adding the Recovered state in the SIR model improved accuracy, but SIR was still not as accurate as the STRS model: the latter’s ability to capture the cyclical nature of reinfection in toxicity spread made it the most robust model for understanding toxicity diffusion. Additionally, even when datasets were split into moderately and highly toxic posts, the STRS model had the lowest error rates for both kinds of toxicity.
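For readers who want a feel for how the three models differ, the following is a minimal Python sketch rather than the authors' implementation. It assumes the textbook forms of the SIR and SIS equations and treats STRS as an SIRS-style model in which recovered users drift back to the susceptible group. The function name `simulate`, the rate parameters `beta`, `gamma`, and `omega`, and all numeric values are hypothetical choices for illustration only.

```python
def simulate(model, beta, gamma, omega=0.0, s0=990.0, t0=10.0, r0=0.0,
             steps=1000, dt=0.1):
    """Euler-integrate one compartment model; return a list of (S, T, R) states.

    Illustrative assumptions, not the paper's fitted equations:
      beta  = rate at which susceptible users start posting toxic content
      gamma = rate at which toxic users stop posting toxic content
      omega = rate at which recovered users become susceptible again (STRS only)
    """
    if model not in ("SIR", "SIS", "STRS"):
        raise ValueError(f"unknown model: {model}")
    s, t, r = s0, t0, r0
    n = s + t + r  # total population is constant in all three models
    history = [(s, t, r)]
    for _ in range(steps):
        newly_toxic = beta * s * t / n * dt
        stopping = gamma * t * dt
        if model == "SIR":
            # Users who stop move to Recovered and never return.
            s, t, r = s - newly_toxic, t + newly_toxic - stopping, r + stopping
        elif model == "SIS":
            # Users who stop are immediately susceptible again.
            s, t = s - newly_toxic + stopping, t + newly_toxic - stopping
        else:
            # STRS: recovered users return to Susceptible at rate omega.
            relapsing = omega * r * dt
            s, t, r = (s - newly_toxic + relapsing,
                       t + newly_toxic - stopping,
                       r + stopping - relapsing)
        history.append((s, t, r))
    return history


if __name__ == "__main__":
    for name in ("SIR", "SIS", "STRS"):
        s, t, r = simulate(name, beta=0.5, gamma=0.1, omega=0.05)[-1]
        print(f"{name}: toxic share at end = {t / (s + t + r):.3f}")
```

In this sketch, only the STRS branch lets users cycle through a recovered state and back, which is the "reinfection" behavior the study credits for that model's better fit.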

Dr. Agarwal said, “Such results suggest that policymakers and social media platforms could benefit from science-driven strategies tailored to combat the toxic behaviors of different user groups.” 

“Toxicity Prediction in Reddit” proposed a novel approach to predicting and detecting toxic comments on Reddit by analyzing the hierarchical structure of the platform’s conversations. Since Reddit organizes discussion in a hierarchical format akin to a family tree (i.e., structured by parent-child relationships), comments can be traced back through multiple levels of parent comments. The researchers studied the influence of the threaded discourse structure on the spread of toxicity. They address two key research questions:

  1. Can the toxicity of a Reddit comment be predicted by analyzing the toxicity of its immediate parent, grandparent, and great-grandparent comments, as well as the average toxicity of the entire conversation?
  2. Which generational parent comment has the most influence on the toxicity of a child comment?

Their approach used tree structures and machine learning to investigate these questions. The authors found that toxic conversations tend to diminish over time, as toxic content often discourages further participation in discussions; this finding is supported by previous research indicating that toxic comments can deter individuals from engaging. The authors also analyzed the influence of generational parent comments on the toxicity of the last comment in a conversation. Using various machine learning algorithms, they found that the immediate parent comment was the most significant predictor of the last comment’s toxicity, with accuracies ranging from 69% to 75% across communities with different toxicity levels.
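To make the parent-child idea concrete, here is a small Python sketch of how the features named in the first research question could be gathered for one comment. It is an illustration under stated assumptions, not the authors' code: the thread layout (a dictionary mapping comment IDs to a parent ID and a toxicity score), the function name `ancestor_features`, and the choice to average over every comment in the thread are all hypothetical.

```python
from statistics import mean


def ancestor_features(comments, comment_id):
    """Collect the toxicity of up to three generations of parent comments.

    `comments` maps a comment ID to {"parent": ID or None, "toxicity": float}.
    This layout is a hypothetical stand-in for real Reddit data.
    """
    names = ("parent", "grandparent", "great_grandparent")
    features = {}
    current = comments[comment_id]["parent"]
    for name in names:
        if current is None:
            # The thread is shallower than three generations at this point.
            features[name] = None
        else:
            features[name] = comments[current]["toxicity"]
            current = comments[current]["parent"]
    # Assumption: the conversation average covers every comment in the thread.
    features["thread_average"] = mean(c["toxicity"] for c in comments.values())
    return features


if __name__ == "__main__":
    # A made-up four-comment chain: c1 <- c2 <- c3 <- c4.
    thread = {
        "c1": {"parent": None, "toxicity": 0.8},
        "c2": {"parent": "c1", "toxicity": 0.6},
        "c3": {"parent": "c2", "toxicity": 0.1},
        "c4": {"parent": "c3", "toxicity": 0.4},
    }
    print(ancestor_features(thread, "c4"))
```

Feature rows like these, one per comment, could then be passed to any standard classifier to test which generation is the strongest predictor.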

Dr. Agarwal said, “Both of these studies show meaningful trends policymakers and social media platforms can take advantage of when combating online toxicity.”

Hadi Rashid

Hadi Rashid is the Systems Administrator at the COSMOS research center as well as the Emerging Analytics Center (EAC). His main focus is on the automated configuration of pools of computing, storage, and networking resources to build an OpenStack-based cloud and a high-performance computing (HPC) platform in the UA Little Rock EIT’s data-center facility. Previously, Hadi worked as a lecturer and Cisco Networking Academy manager and instructor at the University of Duhok, Iraq. He received an MSc in Computer Science from the University of Duhok (2009) and a BSc in Computer Science (2003) from the University of Mosul, Iraq.

Ahmed Al-Taweel

Ahmed Al-Taweel is a postdoctoral research fellow at the COSMOS research center. He completed his Ph.D. in applied mathematics at UA Little Rock, with research interests in numerical analysis and scientific computing, and received an M.S. in Mathematical Sciences from UA Little Rock (2020). At COSMOS, he works with the research team on developing mathematical models of socio-technical behaviors on a variety of social media platforms. He is currently working on a narrative analysis project.

Chinni Krishna Kongala

Kongala Chinni Krishna is a master’s student pursuing a degree in Information Quality at the University of Arkansas at Little Rock and works on the ETL pipeline at COSMOS. He completed his bachelor’s degree with a major in Electronics and Communication. He worked for one year as a software developer at a multinational company. He is passionate about coding and enthusiastic about learning new skills.

Deepika Kurapati

Deepika graduated with a master’s in Information Science from UA Little Rock. She has worked at COSMOS as a UI/UX designer and graduate research assistant. She is familiar with BI tools like Power BI and Tableau.

Qudirat Akanji

Qudirat Akanji is a data analyst with a master’s degree in business information systems and analytics from UA Little Rock and a bachelor’s degree in international relations. Her work experience cuts across the financial services and technology industries. She enjoys K-dramas, Scrabble, and scrolling through the depths of Wikipedia.