the framework supports Web searching in a multilingual world. Post-retrieval analysis techniques
(such as summarization and visualization) were
found to alleviate information overload, although
the extent of such improvement varies across
domains. Summarization and categorization did not
achieve significant improvement in the CBizPort
study. In the SBizPort and AMedPort studies, information visualization achieved significant performance improvement in Web-search results. The
ability to visualize a large number of search results
was essential for good performance in all three
portals.
I recommend that system developers and IT managers incorporate browse support and analysis tools
into their online search systems and portals to augment traditional textual list displays. Such tools can
be used to summarize Web-page textual descriptions
[6], support query formulation [7], visualize emerging events related to their environment and organizations [5], and categorize search results into hierarchies
or maps [4]. However, users must be cautioned that
the tools are still prone to error, due largely to ambiguities in natural-language processing, and carry high computational costs that may not be economical for small
Web sites.
Factors to be considered when adopting the tools
include the extent to which the Web-page collection
provides sufficient statistical information for machine
learning, adequate hardware and software to support
intensive computation, availability of a work force to
improve the Web-site interface and accommodate
new presentation choices, characteristics of the language used, and user IT literacy.
Across a variety of languages and domains, I found
significant differences in the development of Web-search portals, technologies, and language use. For
instance, the growth of Internet use in mainland
China (but relative lack of comprehensive Web search
and browse support) strongly suggests the need for
future improvements. While Web-search technologies in Taiwan are more mature, there is likely room
for new technologies developed specifically for processing Chinese. The strong growth of the Chinese- and Spanish-speaking online populations will likely
persist in the coming years, further emphasizing the
need for better, more integrated Web-search portals
that deliver results in a variety of formats and provide
richer information for the regions and the communities that use the languages. The increasing amount of
Arabic Web content and the expanding Arabic online population, along
with economic and political developments in Arab
regions, will continue to fuel the growth of many Arabic Web sites that remain mostly underdeveloped
today. The research I’ve reported here will likely contribute to a better understanding of related developmental and experimental issues.
40 May 2008/Vol. 51, No. 5 COMMUNICATIONS OF THE ACM
My ongoing work includes developing scalable
techniques to collect, analyze, and visualize Web
information in different languages, studying user
needs in non-English Web search, and exploring the
effect of new techniques in information exploration
and analysis. This effort will contribute to Web
searching and browsing in a multilingual world.
REFERENCES
1. Abbi, R. The Current Status of the Internet in the Arab World.
UNESCO Observatory on the Information Society (2002);
www.unesco.org/cgibin/webworld/portal_observatory/cgi/jump.cgi?ID=2329.
2. China Internet Network Information Center. The 20th Statistical Survey Report on the Internet Development in China; Beijing, China, 2007;
www.cnnic.net.cn/uploadfiles/pdf/2007/7/18/113918.pdf.
3. Chung, W., Bonillas, A., Lai, G., Xi, W., and Chen, H. Supporting
non-English Web searching: An experiment on the Spanish business
and the Arabic medical intelligence portals. Decision Support Systems
42, 3 (Dec. 2006), 1697–1714.
4. Chung, W., Chen, H., and Nunamaker, J. A visual framework for
knowledge discovery on the Web. Journal of Management Information
Systems 21, 4 (Spring 2005), 57–84.
5. Chung, W., Chen, H., Chaboya, L., O’Toole, C., and Atabakhsh, H.
Evaluating event visualization: A usability study of the COPLINK spatio-temporal visualizer. International Journal of Human-Computer
Interaction 62, 1 (Jan. 2005), 127–157.
6. Chung, W., Zhang, Y., Huang, Z., Wang, G., Ong, T.-H., and Chen,
H. Internet searching and browsing in a multilingual world: An experiment on the Chinese Business Intelligence Portal. Journal of the American Society for Information Science and Technology 55, 9 (July 2004),
818–831.
7. Leroy, G., Xu, J., Chung, W., Eggers, S., and Chen, H. An end-user
evaluation of query formulation and results review tools in three meta-search engines. International Journal of Medical Informatics 7, 11–12
(Nov.–Dec. 2007), 780–789.
8. McDonald, D. and Chen, H. Summary in context: Searching versus
browsing. ACM Transactions on Information Systems 24, 1 (Jan. 2006),
111–141.
9. Miniwatts International. Internet Usage Statistics: The Internet Big Picture (updated Nov. 30, 2007); www.internetworldstats.com/stats.htm.
10. Mowshowitz, A. and Kawaguchi, A. Bias on the Web. Commun. ACM
45, 9 (Sept. 2002), 56–60.
11. Spink, A., Ozmutlu, S., Ozmutlu, H., and Jansen, B. U.S. versus European Web searching trends. SIGIR Forum 36, 2 (Fall 2002).
12. Wilson, T. Models of information behavior research. Journal of Documentation 55, 3 (June 1999), 249–270.
WINGYAN CHUNG (wchung@scu.edu) is an assistant professor in
the Department of Operations and Management Information Systems
of the Leavey School of Business at Santa Clara University, Santa
Clara, CA.