Understanding the Plight of Covid-19 Long Haulers through Computational Analysis of YouTube Content
The coronavirus pandemic has had a drastic impact on countries around the world. In the United States alone, there have been over 98 million Covid-19 cases and over 1 million deaths. One consequence of Covid-19 infection is Post-Acute Sequelae of SARS-CoV-2 infection (PASC), otherwise called Long Covid. People with this syndrome, colloquially called Long Haulers, experience a wide range of symptoms, up to 22, that impact every system of the body. The root cause of PASC, as well as its treatment, continues to mystify doctors. Many of those suffering from PASC have turned to each other on social media for support and guidance. Social media surveillance is the use of social media data as a rich source of information on health-related issues and attitudes (Melton). Because social media apps are pervasive and people are comfortable engaging in health-related discourse online, researchers have employed text mining techniques to gain insights into patient experiences. In this study, I wanted to gain a better understanding of Long Hauler experiences and of how information about Long Haulers is received. I chose YouTube videos as the data source because of the platform's structure: creators make and share videos with their audiences, and discourse then continues in the comment section among viewers and between viewers and creators. I gathered data from three types of content creators: Medical Sources, News Sources and Long Haulers. Medical Sources included content creators who identified as doctors, health insurance companies and medical schools. News Sources included content from news stations ranging from local to international. Lastly, Long Haulers represented first-person accounts from those suffering from PASC themselves. I used Biterm, a topic model created specifically for short texts, to analyze the video transcripts and all top-level comments in the comment sections.
Ultimately, I organized the resulting topics into 20 themes across all of the sources. These themes were: Explanations in Layman’s Terms; Show Housekeeping; Biological Explanations; Sharing Patient Experiences; Negative Experiences; Experts Weighing In; Handling the Long Haul; Taking Treatment into Own Hands; Changes to Daily Life; Choose Homeopathy over Pharmaceuticals; Ingesting Shared Information; Seeking More from Shared Content; Misinformation; Skepticism; Sharing Long Covid Experiences; Complete Distrust of Information; Fears of Hidden, Dishonest, Nihilistic Entity; Interacting with Content Creators; Disillusionment with Traditional Medicine; and Distrust with Health Care System. The results of this thesis could help public health agencies, policymakers, organizations and health researchers understand the symptomatology and experiences related to Long Covid. They could also help these groups understand how information concerning Long Covid is being received.
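To illustrate why a Biterm-style model suits short texts such as comments, the sketch below shows the basic unit such a model works with: a biterm, which is an unordered pair of words that co-occur in the same short text. This is only an illustration of the idea, not the pipeline used in this thesis. The function names, the whitespace tokenization, and the sample comments are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text.

    Each pair is sorted so that ("covid", "long") and ("long", "covid")
    count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(texts):
    """Count biterms over a collection of short texts.

    Assumption: lowercasing plus whitespace splitting stands in for
    whatever preprocessing a real study would apply.
    """
    counts = Counter()
    for text in texts:
        counts.update(extract_biterms(text.lower().split()))
    return counts


# Hypothetical sample comments, invented for this example.
sample_comments = ["long covid fatigue", "covid fatigue"]
biterm_counts = count_biterms(sample_comments)
```

Because the counts are pooled over the whole collection rather than per document, word co-occurrence patterns remain usable even when each individual comment contains only a few words.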