dom, and future data distributions might differ sharply from the labeled examples selected by the algorithm.
To correct the sampling bias, or generalization error, Daniel Hsu, a Ph.D. student at University of California, San Diego, and Dasgupta recommended a scheme where each chosen example has an importance weighting determined by its selection probability. More likely points have higher weightings, and less likely points have lower weightings. As a result, the algorithm could realize when it had selected an unlikely point that was very informative and correct its expectations for any future data.
With several solid years of theoretical work behind them, the active learning researchers are beginning to tackle an even larger challenge than developing the theoretical underpinnings of active learning. Now, they are trying to link theoretical algorithms with large-scale applications.
“There’s a striking difference between what’s known in practice and theory with active learning,” says Hsu. “The theory is trying to catch up with what is known about what works well in practice, and to be able to say something mathematical about what works and what doesn’t.”
The University of Pennsylvania’s Wortman agrees. “[Active learning] is used quite frequently in practice with impressive results, and it’s a subject of increasing interest in the theory community, but there’s still a huge gap between theory and practice. At a fundamental level, we don’t really know ‘why’ the particular active learning algorithms that are used in practice work.”
Toward that goal, some theorists are looking to find general methods of building algorithms that can be used in any situation. General purpose-built algorithms have existed for decades in passive learning. However, with active learning, an algorithm must almost be built from scratch for each new application.
At Carnegie Mellon, Hanneke is working on a type of “meta-algorithm” that can take established passive learning algorithms and churn out an active learning algorithm with passive
learning algorithms as subroutines. His hope is that this new work, which he calls “activized learning,” will have the benefit of using established passive learning, alongside active learning with its guaranteed improvements, on the number of labels needed.
Maria-Florina Balcan, a post-doc-toral student with Microsoft Research New England, believes that active learning algorithms are too delicate for such a general method to exist. Instead, Balcan is broadening the type of interaction that active learning can handle beyond just labeling unlabeled examples. She points out that in other situations another interaction method might be more natural.
Take the example of trying to classify images of people by gender, says Balcan. If the interaction is changed, the teacher might be able complain to the learner that some classifiers are too general and need to be split, or that some are too similar and need to be merged. The teacher might suggest a new class to be added, or notice that the learning is making certain mistakes and suggest a new zone that will eliminate those classification errors.
“Much of the theoretical and practical work on active learning was developed on a very specific model and function of interaction,” says Balcan. “But the interaction could be different. The main direction that I see for improvement in the future is extending the type of interaction between the learning algorithm and the user.”
Given the progress made in the past several years, with more horizons opening up, the future of active learning appears to be very promising. “We now have active learning algorithms providing substantial guarantees on performance while relying on relatively minimal assumptions about the world,” notes John Langford, doctor of learning at Yahoo! Research. “We have a reasonable understanding of where and when they provide performance improvements over passive learning. These algorithms are also practical, often yielding substantial savings in label complexity over passive learning approaches.”
Graeme Stemp-Morlock is a science writer based in Waterloo, ontario, canada.
© 2009 acm 0001-0782/09/0400 $5.00
the computational thinking
that drives the field of computer
science is a key tool for solving
problems, designing systems, and
understanding human behavior
in many disciplines, according
to a panel of international
experts in computer Science
and informatics (cS&i). the
findings of the project, the
research assessment exercise
2008, confirmed the u.K. as a top-
ranked research power among
industrialized countries.
the survey reported an
increased level in the influence
of computer science on
other disciplines, including
bioinformatics, medicine, and
e-health. it also found that more
computer science research
used mathematics to quantify
the complexity and rigor of
calculations. the analysis also
found that research funding for
cS&i for the period 2001 to 2008
more than doubled, from $376
million in 2001 to $763 million,
leading participants to conclude
that continued commitment to
funding research and innovation
is necessary to maintain global
excellence in a battered economy.
the review of research in
cS&i surveyed 81 colleges and
universities and found the
subject not only healthy and
growing, but more rigorous,
interdisciplinary, experimental,
and user-oriented than ever.
“the vitality of the computing field, which is due in large measure to increased investment in research, is directly related to the degree of innovation that emerges from u. K. research institutions,” says acm President dame Wendy hall, who participated in the survey’s computer science panel. “these innovations, in turn, foster research partnerships with startup companies as well as spinouts and collaborations with subject matter experts and multinational corporations. the resulting level of economic activity crosses into all industries, even creating new sectors that provide career opportunities in the computing and information technology field.”
APriL 2009 | voL. 52 | no. 4 | communicAtionS of the Acm
13
References:
Archives