Article development led by
queue.acm.org
Companies have access to more types
of external data than ever before.
How can they integrate it most effectively?
By Stephen Petschulat
Other People’s Data
Every organization bases some of its critical
decisions on external data sources. In addition to
traditional flat file data feeds, Web services and
Web pages are playing an increasingly important
role in data warehousing. The growth of Web
services has made data feeds easily consumable
at the departmental and even end-user
levels. There are now more than 1,500
publicly available Web services and
thousands of data mashups ranging
from retail sales data to weather information to U.S. census data.3 These
mashups are evidence that when users
need information, they will find a way
to get it. An effective enterprise information management strategy must
take into account both internal and external data.
External data sources vary in their
structure and methods of access. Some
are comprehensive and have been a part
of data-warehousing flows for many
years: securities data, corporate information, credit risk data, and address/postal code
lookup. These are typically
structured in a formal manner, contain
the “base” (most detailed) level of data,
and are available through established
data service providers in multiple formats. The most common access method is still flat files over FTP.
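As a rough illustration of the flat-file-over-FTP pattern just described, the following Python sketch downloads a delimited text file and parses it into records. It is a minimal sketch under stated assumptions, not any provider's actual interface: the host, remote path, pipe delimiter, UTF-8 encoding, and column names are all hypothetical, and a real feed would define its own layout.

```python
import csv
import io
from ftplib import FTP


def fetch_flat_file(host: str, remote_path: str,
                    user: str = "anonymous", password: str = "") -> str:
    """Download one text file over FTP and return its contents as a string.

    The host and path are supplied by the caller; nothing here is specific
    to a real data provider. UTF-8 encoding is an assumption.
    """
    chunks = []
    with FTP(host) as ftp:  # connects when a host is given
        ftp.login(user=user, passwd=password)
        # retrbinary calls the callback once per block of bytes received.
        ftp.retrbinary(f"RETR {remote_path}", chunks.append)
    return b"".join(chunks).decode("utf-8")


def parse_flat_file(text: str, delimiter: str = "|") -> list[dict[str, str]]:
    """Parse delimited text whose first row is a header into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# Hypothetical sample standing in for a downloaded postal code feed.
sample = "postal_code|city\n10001|New York\n94105|San Francisco\n"
rows = parse_flat_file(sample)
```

In practice the two steps would be chained, for example `parse_flat_file(fetch_flat_file("ftp.example.com", "feeds/postal.txt"))`, where both arguments are placeholders. Keeping the download separate from the parsing lets the parsing logic be tested without a network connection.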
Web services are well understood