[9, 19]. These works focused on elements of sociability, demographics,
network characterizations, privacy implications, and similarity measures within social networking. Our work takes the next step by
attempting to determine the validity of an identity in an online social
community by mining self-described data.
The data mining techniques in our approach have a long history of use
in crime detection [17], intrusion detection models [5], and lie detection
[20, 12]. Our algorithm adds to the existing set of tools that can
be used to identify forms of deception in nonverbal communication [2].
Although our approach is feasible and our results are promising,
there remain many topics to investigate in future work. First, a larger
data set would likely improve accuracy. Second, evaluating the present
approach on other social networking sites such as Facebook could shed light on its generalizability
and robustness. Third, if a user is really attempting to pass off a fake
identity, would the classifier be effective in detecting it? Finally, it
would be interesting to experiment with the resiliency of our classifier
to carefully crafted identities and investigate how this classifier could
be used as evidence in computational trust systems.
Roya Feizy is a PhD student at the University of Sussex. She holds degrees in Applied Mathematics from Azad University in Iran and in Multimedia and Computer Science from Middlesex University.
Her research interests include identity and online social networking,
specifically looking at how individuals present themselves online,
whether they are real or fake, and the type and amount of information
they are willing to disclose.
1. Boyd, D. 2007. Why youth (heart) social network sites: The role of
networked publics in teenage social life. In Youth, Identity, and Digital Media, Buckingham, D., Ed. MacArthur Foundation Series on
Digital Learning. MIT Press, Cambridge, MA.
2. Burgoon, J., Adkins, M., et al. 2005. An approach for intent identification
by building on deception detection. In Proceedings of the
Hawaii International Conference on Systems Science (HICSS’05).
3. Casciaro, T. 1998. Seeing things clearly: Social structure, personality,
and accuracy in social network perception. Social Netw. 20. 331-351.
4. Caverlee, J. and Webb, S. 2008. A large-scale study of MySpace:
Observations and implications for online social networks.
In Proceedings of the International Conference on Weblogs and
Social Media (ICWSM).
5. Dokas, P., Ertoz, L., et al. 2002. Data mining for network intrusion
detection. In Proceedings of the NSF Workshop on Next Generation
6. Donath, J. and Boyd, D. 2004. Public displays of connection. BT
Technol. J. 22, 4.
7. Hu, J., Zeng, H., Lin, C., and Chen, Z. 2007. Demographic prediction based on user’s browsing behaviour. In Proceedings of the 16th
International Conference on World Wide Web.
8. Kagal, L., Finin, T., and Joshi, A. 2001. Trust-based security in pervasive computing environments. In IEEE Comm.
9. Maia, M., Almeida, V., and Almeida, J. 2008. Identifying user behaviour in online social networks. In Proceedings of the 1st Workshop on
Social Network Systems, ACM.
10. Mazar, N., Amir, O., and Ariely, D. 2007. The dishonesty of honest
people: A theory of self-concept maintenance. J. Market. Resear. XLV.
11. Mislove, A., Marcon, M., et al. 2007. Measurement and analysis of
online social networks. In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07).
12. Mundinger, J. and Le Boudec, J. 2005. The impact of liars on reputation in social networks. In Proceedings of Social Network Analysis:
Advances and Empirical Applications Forum.
13. Newman, M., Watts, D., and Strogatz, S. 2002. Random graph models of social networks. Proc. Nat. Acad. Science. 2566-2572.
14. Ryberg, T. and Larsen, M. C. 2008. Networked identities: Understanding relationships between strong and weak ties in networked
environments. J. Comput. Assist. Learn. 24. 103-105.
15. Shrivastava, N., Majumder, A., and Rastogi, R. 2008. Mining (social)
network graphs to detect random link attacks. In Proceedings of the
24th International Conference on Data Engineering (ICDE’08).
16. Somanathan, E. and Rubin, R. 2004. The evolution of honesty.
J. Econom. Behav. Organiz. 54. 1-17.
17. Thongtae, P. and Srisuk, S. 2008. An analysis of data mining applications in crime domain. Computer and Information Technology
18. Toma, C. L., Hancock, J. T., and Ellison, N. B. 2008. Separating fact
from fiction: An examination of deceptive self-presentation in online dating profiles. Personality Social Psych. Bull. 34, 8. 1023-1036.
19. Tufekci, Z. 2008. Can you see me now? Audience and disclosure
regulations in online social network sites. Bull. Science Technol.
Society 28, 1. 20-36.
20. Whitty, M. T. 2002. Liar, liar! An examination of how open, supportive and honest people are in chat rooms. Comput. Human Behav. 18.
21. Witten, I. H. and Frank, E. 2000. Data Mining: Practical Machine
Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA.
22. Zou, H., Hastie, T., and Tibshirani, R. 2006. Sparse principal component analysis. J. Computation. Graph. Statistics 15, 2. 265-286.