New Approach to the Detection of Social Bot Activity

Sooraj Shah ’24

Figure 1: The key to detecting social bot activity on social media may lie in spotting a lack of variation across a group of accounts rather than in examining individual accounts.

People often find themselves scrolling through social media, interacting with accounts and commenting on their favorite posts. What some users do not know is that many of these accounts are run by social bots, agents created to communicate autonomously with social media users. These bots can respond in ways that are hard to distinguish from human behavior, which gives them the power to influence public discussion and opinion. A study led by Dr. Andrew Schwartz, an associate professor in the Department of Computer Science at Stony Brook University, examined how these social bots manage to pass as human users. To do so, the researchers analyzed Twitter bot accounts for 17 human traits to see how closely the bots imitated human behavior.

The data came from Twitter, a popular social media platform where users post news or personal thoughts publicly. Two data sets were collected. The first, Social SpamBots #1 (SSB1), included 464 bots advertising products on Amazon as well as 464 bots that resembled genuine people (718,975 total tweets). The second, SSB2, included 2,913 accounts promoting an app and 2,913 other users. In total, 2,621,684 tweets were analyzed. The researchers examined 17 human traits for these social bots, including age, personality, sentiment, and emotion. The analysis used language-based personality models, the NRC Word-Emotion Association Lexicon (which associates words with emotions), and an age predictor based on the profiles of these bots.
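To make the lexicon step concrete, here is a minimal sketch of how a word-emotion lexicon can turn a user's tweets into an emotion profile. It is an illustration only, not the study's pipeline: the `TOY_LEXICON` entries, the `emotion_profile` function, and the sample tweets are all made up for this example, and the real NRC lexicon is far larger.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration; NOT the actual NRC lexicon.
TOY_LEXICON = {
    "love": {"joy"},
    "great": {"joy"},
    "afraid": {"fear"},
    "hate": {"anger"},
    "awful": {"anger", "sadness"},
}

def emotion_profile(tweets):
    """Return the share of a user's words that are associated with each emotion."""
    counts = Counter()
    total = 0
    for tweet in tweets:
        # Lowercase the tweet and split it into simple word tokens.
        for word in re.findall(r"[a-z']+", tweet.lower()):
            total += 1
            for emotion in TOY_LEXICON.get(word, ()):
                counts[emotion] += 1
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}

# Made-up tweets from one account.
profile = emotion_profile(["I love this app", "great deal, love it"])
print(profile)
```

A profile like this gives one number per emotion for each account, which is the kind of per-account trait estimate that can later be compared across a whole group.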

The results showed that a bot viewed individually can easily be taken for a human, but when the bots are compared side by side, it becomes clear that they are essentially duplicates of one another. Most of the bots adopted the tone and language of a person in their 20s. Given that 42% of Twitter users are between the ages of 18 and 29, it is not surprising that many of these bots go undetected, since they resemble their target audience. Schwartz found a lack of variability among the bot accounts, most notably in age, gender, and sentiment.

The study found that bot accounts are best recognized by the lack of variability across accounts rather than by characterizing each account individually. When multiple bot accounts are compared, the lack of variation in their human traits stands out against the variation seen in real accounts. Using these findings, Schwartz created an automatic bot detector that is able to accurately predict bot presence. Future research should focus on statistical data such as the timing of posts, the specific language used, and patterns of activity in order to improve the accuracy of bot detection.
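The group-level idea can be sketched in a few lines of code. The example below is a rough illustration under stated assumptions, not the detector built in the study: the trait values are invented, and the function names `trait_spread` and `looks_coordinated`, along with the `ratio` threshold, are hypothetical choices made for this sketch. It measures how much each trait varies across a group of accounts and flags a group whose spread is far smaller than that of a reference group of real users.

```python
from statistics import pvariance

def trait_spread(accounts):
    """Population variance of each trait across a group of accounts."""
    traits = accounts[0].keys()
    return {t: pvariance([a[t] for a in accounts]) for t in traits}

def looks_coordinated(group, reference, ratio=0.25):
    """Flag a group whose spread on every trait is far below the reference group's.

    The 0.25 threshold is an arbitrary value chosen for this illustration.
    """
    g = trait_spread(group)
    r = trait_spread(reference)
    return all(g[t] < ratio * r[t] for t in g)

# Invented trait estimates (age in years, sentiment on a -1 to 1 scale).
humans = [
    {"age": 19.0, "sentiment": -0.4},
    {"age": 34.0, "sentiment": 0.1},
    {"age": 52.0, "sentiment": 0.6},
    {"age": 27.0, "sentiment": -0.1},
]
suspects = [
    {"age": 24.0, "sentiment": 0.30},
    {"age": 25.0, "sentiment": 0.32},
    {"age": 24.5, "sentiment": 0.29},
    {"age": 25.5, "sentiment": 0.31},
]

print(looks_coordinated(suspects, humans))  # the tightly clustered group is flagged
print(looks_coordinated(humans, humans))    # the reference group is not
```

Each suspect account on its own has a plausible age and sentiment, so nothing about a single account gives it away. Only the comparison across the group exposes how little the accounts differ from one another.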

Works Cited

  1. S. Giorgi, L. Ungar, and H.A. Schwartz, Characterizing Social Spambots by Their Human Traits. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 5148–5158 (2021).  doi: 10.18653/v1/2021.findings-acl.457. 
  2. Image Retrieved from: https://i0.pickpik.com/photos/382/953/301/work-desk-computer-night-preview.jpg
