by developers but also in the variety of
content that users get to see. We can
rightly assume each search engine
using the index would apply its own
ranking function and would therefore produce different results. Users would
benefit in that they would not have to
rely on only one or at best a few search
engines but could choose from a variety of engines serving their different
purposes. In that way, an Open Web
Index would foster plurality and restrict the power of single companies
dictating which content is shown to
and consumed by users.
Another benefit would be that the
index would be open to everyone and
would therefore allow for investigating its transparency. However, search
engines built on top of the index could
still be “black boxes” in that they would
not need to make their ranking functions open to anybody.
Possible Applications
While the Open Web Index would first
and foremost make the development of
new Web search engines feasible and
financially attractive, it could also form
the basis for a variety of other applications, whether related to search or not.
In the field of search, the Open
Web Index would also allow for vertical search engines (like image search,
video search, or search in specific areas and on specific topics) to be built.
In vertical search applications, OWI
data could also be used to supplement proprietary data. For instance, a provider
of company information could enrich
its company profiles with Web data.
Apart from search, the OWI could
also form the basis for data analysis
an interface/API to the services built
upon the index. The indexing stage
is divided between basic indexing
and advanced indexing. Basic indexing
provides the data in a form that
services built on top of the index can
process easily and rapidly.
So, while services are allowed to do
their own further indexing to prepare documents, some advanced indexing is
also provided by the open infrastructure. This adds information to the indexed documents
(for example, semantic annotations).
For this, an extensive infrastructure
for data mining and processing is
needed. Services should, however, be
able to decide for themselves to what
extent they want to rely on the preprocessing infrastructure provided by the
Open Web Index. A design principle
should be to allow services a maximum of flexibility.
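The split between basic and advanced indexing can be illustrated with a minimal sketch. Everything named here (IndexedDocument, basic_index, advanced_index, fetch, and the toy "topic" annotation) is a hypothetical stand-in chosen for illustration and is not part of any specified OWI interface. The sketch only shows the design principle that a service decides for itself whether to take the preprocessed annotations or just the basic form.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    """One document as a hypothetical open index might expose it."""
    url: str
    tokens: list[str]  # result of basic indexing
    annotations: dict[str, str] = field(default_factory=dict)  # advanced indexing


def basic_index(url: str, raw_text: str) -> IndexedDocument:
    # Basic indexing: put the text into a form services can process quickly.
    return IndexedDocument(url=url, tokens=raw_text.lower().split())


def advanced_index(doc: IndexedDocument) -> IndexedDocument:
    # Toy stand-in for semantic annotation; a real pipeline would need
    # an extensive data mining infrastructure.
    annotations = dict(doc.annotations)
    if "recipe" in doc.tokens:
        annotations["topic"] = "cooking"
    return IndexedDocument(doc.url, doc.tokens, annotations)


def fetch(url: str, raw_text: str, use_advanced: bool) -> IndexedDocument:
    # The service chooses how much of the shared preprocessing it relies on.
    doc = basic_index(url, raw_text)
    return advanced_index(doc) if use_advanced else doc
```

A service that prefers its own annotation pipeline would call fetch with use_advanced=False and work from the tokens alone, while another could accept the shared annotations unchanged.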
As modern search engines rely
heavily on usage data, this data (most
prominently search queries routed
to the index) is collected and made
available for reuse. The OWI Usage
Data Index allows for this data to be
collected, stored, and queried. So,
while each service can collect and
query its own usage data, every service that wants to access usage data
from the OWI Usage Data Index
should be required to share anonymized usage data with the other services, so that every service profits
from the amassed data. It is clear that
existing search engines like Google
and Bing have a huge lead compared
to new providers, as they have a solid
user base and have already amassed large
amounts of usage data. However,
sharing usage data between the services could at least lessen the cold-start
problem.
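The reciprocity rule for usage data (a service may read the pooled data only if it also contributes) can be sketched as follows. The class UsageDataIndex and its methods share and top_queries are assumptions made for this illustration, not a defined OWI API. Dropping the user identifier is used here as a toy placeholder for anonymization, which in practice would require far more care.

```python
from collections import Counter


class UsageDataIndex:
    """Hypothetical pooled store of search queries shared between services."""

    def __init__(self):
        self._query_counts = Counter()
        self._contributors = set()

    def share(self, service, query_log):
        """Add a service's log of (user_id, query) pairs to the pool.

        User identifiers are discarded; only normalized query strings are kept.
        """
        for _user_id, query in query_log:
            self._query_counts[query.strip().lower()] += 1
        self._contributors.add(service)

    def top_queries(self, service, n=3):
        """Return the n most frequent pooled queries, for contributors only."""
        if service not in self._contributors:
            raise PermissionError(
                f"{service} must share usage data before reading the pool"
            )
        return self._query_counts.most_common(n)
```

Under this sketch, a new engine that has shared even a small log gains access to query frequencies collected by all other contributors, which is the mechanism by which pooling could ease the start of a new service.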
Benefits
The main benefit of such an index
would be for all interested parties to be
able to develop their own applications
without the problem of having to create their own index of the Web, which
is currently an impossible endeavor,
particularly for small- and
medium-sized enterprises, as well as for
non-commercial bodies.
Given a considerable uptake for
such an index, it would foster plurality not only in the use of Web content