Figure 1. A “near bipartite core” of fraudsters and accomplices. Honest identities
are not shown. Note that each fraudster has traded with most, but not all,
accomplices; hence it is a near, but incomplete, core.
collaborators who are from Japan, Italy,
Canada, and Greece respectively, and
she believes they likely form a clique
(i.e., every pair has communicated).
With Graphite, Laura sketches a 4-node
clique as the query pattern and assigns
the countries as the node types. But if
she was to use another tool where the
node types cannot be specified, any
4-node cliques will be returned (like a
family of four who all reside in the US),
overwhelmingly Laura with irrelevant
information.
The Holy Grail of anomaly detection
is that the detection happens automatically and human does not need to do
anything at all. While this may seem to
be a distant goal, the Oddball system
[ 4] is a big step towards such goal. The
main idea behind Oddball is that it extracts a set of features (without human
intervention) that summarize each
node’s neighborhood subgraph, called
the node’s “egonet,” which includes
the node’s immediate neighbors and
all edges in the neighborhood. Then,
Oddball uses unsupervised methods
that automatically correlate pairs of
features and pinpoints nodes whose
features significantly deviate from
those of the rest of the nodes. Oddball
can detect several important patterns,
such as near-cliques and near-stars (by
correlating the total edge weight and
total edge count in the egonet). For example, in a who-called-whom network,
the center of a near star could be a
telemarketer who has called many random people, and a near clique could be
a close-knit group of friends.
E-commerce has redefined crime.
We now see new breeds of online
crime where technologically savvy
criminals exploit not only the weaknesses of the human nature, but also
the systems originally designed to protect online shoppers. Many criminals
have learned to cover their tracks with
the large amount of data generated by
e-commerce, and obfuscate law enforcement with multiple fake virtual
identities. As e-commerce thrives and
the online world becomes even more
connected, tools and methods such as
those from graph mining will play an
increasingly important role in untangling the many layers of sophisticated
organization and schemes crafted by
criminals. Will online crime be elimi-
Query Pattern
Near Match
Banker
Businessman
Figure 2. Given a query pattern, such as a money laundering ring (left), the Graphite
system can find both exact and near matches that tolerates a few extra nodes (right).
Fraudsters
Accomplices
nated? Perhaps not. But our effort will
force crooks to resort to more complex
schemes that incur more effort and
higher cost, so crime will be increasingly difficult to commit. Then, perhaps, fewer bad guys would attempt to
get on the wrong side of the law.
Biography
Polo Chau is a Ph. D. student in the Machine Learning
Department at Carnegie Mellon University. His research
intersects graph mining and human-computer interaction.
He builds interactive systems that help analysts explore
and make sense of large graph data, find patterns, detect
fraud, and spot anomalies. His work on fraud detection in
online auctions appeared in the Wall Street Journal and
many other media outlets. He was a Symantec fello w for
two consecutive years, and is an avid designer, having won
many awards.
References
1. Pandit, S., Chau, D.H., Wang, S. and Faloutsos, C.
NetProbe: A fast and scalable system for fraud
Detection in Online Auction Net works. In Proc. WWW
2007, 201-210.
2. McGlohon, M., Bay, S., Anderle, M., Steier, D. and
Faloutsos, C. SNARE: A link analytic system for graph
labeling and risk detection. In Proc. KDD2009, 1265-
1274.
3. Chau, D. H., Faloutsos, C., Tong, H., Hong, J. I. Gallagher,
B. and Eliassi-Rad, T. GRAPHI TE: A visual query system
for large graphs. In Proc. ICDM 2008, 963-966
4. Akoglu, L. McGlohon, M. and Faloutsos, C. OddBall:
Spotting anomalies in weighted graphs. In Proc.
PAKDD 2010, 410-421.