how to discriminate between bots
and humans. User meta-data is considered among the most predictive
and most interpretable features.22,38 We can suggest a few rules
of thumb to infer whether an account
is likely a bot, by comparing its meta-data with that of legitimate users (see
Figure 2). Further work, however, will
be needed to detect sophisticated
strategies that mix human and social bot features
(sometimes referred to as cyborgs). Detecting these bots, or hacked accounts,43
is currently impossible for feature-based systems.
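
The rule-of-thumb comparison described above can be sketched in a few lines. The example below is a minimal illustration, not the method used by any system cited here: it computes a z-score for each meta-data feature of one account relative to a reference set of human accounts, in the spirit of Figure 2. The feature names, the reference values, the suspect account, and the threshold of 2.0 are all made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical meta-data for a small reference set of human accounts.
# All values are invented for illustration only.
HUMAN_ACCOUNTS = [
    {"retweets": 30, "tweets": 300, "username_length": 8, "account_age_days": 1000},
    {"retweets": 40, "tweets": 350, "username_length": 9, "account_age_days": 1200},
    {"retweets": 50, "tweets": 400, "username_length": 10, "account_age_days": 1400},
    {"retweets": 60, "tweets": 450, "username_length": 11, "account_age_days": 1600},
    {"retweets": 70, "tweets": 500, "username_length": 12, "account_age_days": 1800},
]

# Hypothetical account to inspect: many retweets, long name, recently created.
SUSPECT = {"retweets": 200, "tweets": 380, "username_length": 15, "account_age_days": 90}


def z_scores(account, reference):
    """Return each feature's z-score relative to the reference population."""
    scores = {}
    for feature, value in account.items():
        values = [user[feature] for user in reference]
        mu, sigma = mean(values), stdev(values)
        # A constant feature carries no signal, so report 0 rather than divide by 0.
        scores[feature] = (value - mu) / sigma if sigma > 0 else 0.0
    return scores


def flag_features(scores, threshold=2.0):
    """List features that deviate from the reference by at least `threshold` sigmas."""
    return sorted(f for f, z in scores.items() if abs(z) >= threshold)


if __name__ == "__main__":
    scores = z_scores(SUSPECT, HUMAN_ACCOUNTS)
    for feature, z in scores.items():
        print(f"{feature}: {z:+.2f}")
    print("Unusual features:", flag_features(scores))
```

A large deviation on several features at once is only a hint, not proof: as noted above, accounts that mix human and automated behavior can sit well inside the human range on every individual feature.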
Combining Multiple Approaches
Alvisi et al.3 first recognized the need
to adopt complementary detection
techniques to deal effectively
with sybil attacks in social networks.
The Renren Sybil detector37,42 is an
example of a system that explores multiple
dimensions of users’ behaviors,
such as activity and timing information.
Examination of ground-truth clickstream
data shows that real users
spend comparatively more time messaging
and looking at other users’
content (such as photos and videos),
ROC via cross-validation. In addition to
the classification results, Bot or Not?
features a variety of interactive visualizations
that provide insights on the
features exploited by the system (see
Figure 1 for examples).
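
To make the evaluation idea concrete, the sketch below shows one way an area under the ROC curve can be estimated with k-fold cross-validation. It is an assumption-laden toy, not the Bot or Not? pipeline: the nearest-centroid scorer, the two features (a retweet ratio and an account age in years), the data, and the fold assignment are all hypothetical choices made for illustration.

```python
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve: the fraction of (bot, human) pairs in which
    the bot receives the higher score, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_scorer(train_x, train_y):
    """Toy model (an assumption, not a real detector): a higher score means the
    account is closer to the bot centroid than to the human centroid."""
    def centroid(label):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        return [mean(column) for column in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    bot_c, human_c = centroid(1), centroid(0)
    return lambda x: dist(x, human_c) - dist(x, bot_c)


def cross_validated_auc(xs, ys, k=3):
    """Average the AUC over k folds; sample i is held out in fold i % k.
    Every fold must contain both classes in its training and test parts."""
    fold_aucs = []
    for fold in range(k):
        test = [i for i in range(len(xs)) if i % k == fold]
        train = [i for i in range(len(xs)) if i % k != fold]
        score = centroid_scorer([xs[i] for i in train], [ys[i] for i in train])
        fold_aucs.append(auc([score(xs[i]) for i in test], [ys[i] for i in test]))
    return mean(fold_aucs)


# Invented data: [retweet ratio, account age in years]; label 1 = bot, 0 = human.
XS = [[0.90, 0.5], [0.20, 5.0], [0.80, 1.0], [0.10, 6.0], [0.85, 0.3], [0.30, 4.0],
      [0.95, 0.8], [0.25, 7.0], [0.70, 0.6], [0.15, 5.5], [0.90, 0.2], [0.20, 4.5]]
YS = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    print("Cross-validated AUC:", cross_validated_auc(XS, YS))
```

Holding out each fold before fitting the scorer keeps the estimate honest: the accounts used to compute the AUC never influence the centroids they are scored against.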
Bots are continuously changing
and evolving: the analysis of the highly predictive behaviors that feature-based systems can detect may reveal
interesting patterns and provide
unique opportunities to understand
Figure 2. User behaviors that best discriminate social bots from humans.
Social bots retweet more than humans and have longer user names, while they produce fewer tweets,
replies and mentions, and they are retweeted less than humans. Bot accounts also tend to be more recent.
[Chart: z-scores (axis from -3 to 4) for humans and social bots on No. retweets, No. tweets, No. replies, No. mentions, No. times retweeted, Username length, and Account age.]
Figure 1. Common features used for social bot detection. (a) The network of hashtags co-occurring in the tweets of a given user. (b) Various
sentiment signals including emoticon, happiness and arousal-dominance-valence scores. (c) The volume of content produced and consumed (tweeting and retweeting) over time.
[Panels: (A) Network Hashtag Graph, with example hashtags such as followtrain, followback, instantfollowback, belieber and justinbieber; (B) Sentiment Tweet Emoticon and Sentiment Tweet Happiness, split into positive and negative; (C) Temporal Retweet Timestamps and Temporal Tweet Timestamps.]