news
IMAGE BY DEPT. OF MICROBIOLOGY, BIOZENTRUM/SCIENCE PHOTO LIBRARY
How do you look for a needle in a haystack when you are not sure what the needle looks like? This is the problem that faces scientists as
they try to deal with increasingly complex datasets. One answer is to turn machine learning loose on the enormous
volumes of data they have captured.
The problem of finding relevant
data in genetic databases is one that Simon Roux, a researcher working at the
U.S. Department of Energy’s Joint Genome Institute, faced when investigating the role that an obscure and little-understood family of viruses plays in
the environment.
There are many types of virus,
called bacteriophages, that infect
bacteria. Many of these either kill
their hosts or are themselves rejected by an immune response. The
bacteriophages that belong to the
family Inoviridae can remain in the
host for long periods. This property
has helped make one such “inovirus,”
known as M13, a popular choice
among bioengineering researchers.
The needle-shaped M13 infects the
Escherichia coli (E. coli) bacterium, an
organism that is very easy to cultivate
under laboratory conditions. When
the bacteria expel the virus particles
they are forced to make by the viral
DNA, the particles are available in
large numbers and can be purified
Learning to See
Machine learning turns the spotlight on elusive viruses.
Science | DOI: 10.1145/3283206
Chris Edwards
Colored transmission electron micrograph of a T4 bacteriophage virus, magnified 100,000 times.