that it was probably someone doing research, but checked the MOO to verify
that it was in fact legitimate.
The participants also collaborated on
a broader scale. During our visit, the site
was dealing with a worldwide security
incident targeting military, educational,
and government sites across the U.S.
and Europe. This was a particularly persistent attack—every time an intrusion
was detected and a vulnerability was
closed, the attackers would come back
using a new exploit. The attackers would
hop from institution to institution, compromising a machine in one place, collecting passwords, and then trying those
passwords on machines at other institutions (as users often have a single password for accounts at different sites).
This broad-based attack required a
broad-based response, so security administrators from affected institutions
formed an ad hoc community to monitor and share information about the attacks, with the goal of tracing the attacks
back to their source. When a compromised machine was found, they would
let it remain compromised so that they
could then trace the attackers and see
where else they were connecting. This
collaboration was like information
warfare: it was important to share information about known compromised
machines and exploits with trusted colleagues, but the information had to be
kept from the attackers. You did not
want the attackers to know that you had
detected their attack and were monitoring their activities. When we first observed them, the security administrators
used conference calls for community
meetings. Later they found a special
encrypted email listserv to keep their
information under wraps—but because
this tool was unmaintained, they had to
adopt and maintain it themselves.
The world of security administration
seems very fluid, with new vulnerabilities and exploits discovered every day.
Though secrecy was a greater concern
than with other sysadmins we observed,
collaboration was the foundation of
their work: sharing knowledge of unfolding events and system status, especially when an attack might be starting
and time was critical.
Conclusion
One of our motivations for studying sysadmins
is the ever-increasing cost of IT
management. Part of this can certainly
be attributed to the fact that computers
get faster and cheaper every year, and
people do not. Yet complexity is also a
huge issue—a Web site today is built
upon a dramatically more complicated
infrastructure than one 15 years ago.
With complexity comes specialization
in IT management. With around-the-clock
operations needed for today’s enterprises,
coordination is also a must.
System administrators need to share
knowledge, coordinate their work, communicate
system status, develop a common
understanding, find and share
expertise, and build trust and develop
relationships. System administration is
inherently collaborative.
Related articles
on queue.acm.org
Error Messages: What’s the Problem?
Paul P. Maglio, Eser Kandogan
http://queue.acm.org/detail.cfm?id=1036499
Oops! Coping with Human Error
in IT Systems
Aaron B. Brown
http://queue.acm.org/detail.cfm?id=1036497
Building Collaboration into IDEs
Li-Te Cheng, Cleidson R.B. de Souza,
Susanne Hupfer, John Patterson, Steven Ros
http://queue.acm.org/detail.cfm?id=966803
References
1. Barrett, R., Kandogan, E., Maglio, P.P., Haber, E.M.,
Prabaker, M., Takayama, L.A. Field studies of
computer system administrators: analysis of system
management tools and practices. In Proceedings of
the Conference on Computer-Supported Collaborative
Work. 2004.
2. Gartner Group/Dataquest. Server Storage and RAID
Worldwide (May 1999).
3. Gelb, J.P. System-managed storage. IBM Systems
Journal 28, 1 (1989), 77–103.
4. ITCentrix. Storage on Tap: Understanding the Business
Value of Storage Service Providers (Mar. 2001).
5. Kandogan, E., Haber, E.M. Security administration
tools and practices. In Security and Usability:
Designing Secure Systems that People Can Use.
L.F. Cranor and S. Garfinkel, Eds. O’Reilly Media,
Sebastopol, 2005, 357–378.
6. Kandogan, E., Maglio, P.P., Haber, E.M., Bailey, J.
(forthcoming). Information Technology Management:
Studies in Large-Scale System Administration. Oxford
University Press.
7. Maglio, P.P., Kandogan, E. Error messages:
What’s the problem? ACM Queue 2, 8 (2004), 50–55.
Eben M. Haber is a research staff member at IBM
Research, Almaden, in San Jose, CA. He studies human-computer interaction, working on projects including data
mining and visualization, ethnographic studies of IT
system administration, and end-user programming tools.
Eser Kandogan is a research staff member at IBM
Research, Almaden, San Jose, CA. His interests include
human interaction with complex systems, ethnographic
studies of system administrators, information
visualization, and end-user programming.
Paul P. Maglio is a research scientist and manager at
IBM Research, Almaden, San Jose, CA. He is working
on a system to compose loosely coupled heterogeneous
models and simulations to inform health and health policy
decisions. Since joining IBM Research, he has worked
on programmable Web intermediaries, attentive user
interfaces, multimodal human-computer interaction, human
aspects of autonomic computing, and service science.