the framework supports Web searching in a multilingual world. Post-retrieval analysis techniques (such as summarization and visualization) were found to alleviate information overload, though the extent of the improvement varied across domains. Summarization and categorization did not achieve significant improvement in the CBizPort study. In the SBizPort and AMedPort studies, information visualization achieved significant performance improvement in Web-search results. The ability to visualize a large number of search results was essential for good performance in all three portals.
I recommend that system developers and IT managers incorporate browse support and analysis tools into their online search systems and portals to augment traditional textual list displays. Such tools can be used to summarize Web-page textual descriptions [6], support query formulation [7], visualize emerging events related to their environment and organizations [5], and categorize search results into hierarchies or maps [4]. However, users must be cautioned that the tools are still prone to error, due largely to ambiguities in natural-language processing, and that they carry high computational costs that may not be economical for small Web sites.
Factors to be considered when adopting the tools include the extent to which the Web-page collection provides sufficient statistical information for machine learning, adequate hardware and software to support intensive computation, availability of a work force to improve the Web-site interface and accommodate new presentation choices, characteristics of the language used, and user IT literacy.
Across a variety of languages and domains, I found significant differences in the development of Web-search portals, technologies, and language use. For instance, the growth of Internet use in mainland China, combined with a relative lack of comprehensive Web search and browse support, strongly suggests the need for future improvements. While Web-search technologies in Taiwan are more mature, there is likely room for new technologies developed specifically for processing Chinese. The strong growth of the Chinese- and Spanish-speaking online populations will likely persist in the coming years, further emphasizing the need for better, more integrated Web-search portals that deliver results in a variety of formats and provide richer information for the regions and the communities that use the languages. The growing volume of Arabic Web content and the growing Arabic-speaking online population, along with economic and political developments in Arab regions, will continue to fuel the growth of many Arabic Web sites that remain mostly underdeveloped today. The research I’ve reported here will likely contribute to a better understanding of related developmental and experimental issues.
40 May 2008/Vol. 51, No. 5 COMMUNICATIONS OF THE ACM
My ongoing work includes developing scalable techniques to collect, analyze, and visualize Web information in different languages, studying user needs in non-English Web search, and exploring the effect of new techniques in information exploration and analysis. This effort will contribute to Web searching and browsing in a multilingual world.
REFERENCES
1. Abbi, R. The Current Status of the Internet in the Arab World. UNESCO Observatory on the Information Society (2002); www.unesco.org/cgibin/webworld/portal_observatory/cgi/jump.cgi?ID=2329.
2. China Internet Network Information Center. The 20th Statistical Survey Report on the Internet Development in China; Beijing, China, 2007; www.cnnic.net.cn/uploadfiles/pdf/2007/7/18/113918.pdf.
3. Chung, W., Bonillas, A., Lai, G., Xi, W., and Chen, H. Supporting non-English Web searching: An experiment on the Spanish business and the Arabic medical intelligence portals. Decision Support Systems 42, 3 (Dec. 2006), 1697–1714.
4. Chung, W., Chen, H., and Nunamaker, J. A visual framework for knowledge discovery on the Web. Journal of Management Information Systems 21, 4 (Spring 2005), 57–84.
5. Chung, W., Chen, H., Chaboya, L., O’Toole, C., and Atabakhsh, H. Evaluating event visualization: A usability study of the COPLINK spatio-temporal visualizer. International Journal of Human-Computer Interaction 62, 1 (Jan. 2005), 127–157.
6. Chung, W., Zhang, Y., Huang, Z., Wang, G., Ong, T.-H., and Chen, H. Internet searching and browsing in a multilingual world: An experiment on the Chinese Business Intelligence Portal. Journal of the American Society for Information Science and Technology 55, 9 (July 2004), 818–831.
7. Leroy, G., Xu, J., Chung, W., Eggers, S., and Chen, H. An end-user evaluation of query formulation and results review tools in three meta-search engines. International Journal of Medical Informatics 7, 11–12 (Nov.–Dec. 2007), 780–789.
8. McDonald, D. and Chen, H. Summary in context: Searching versus browsing. ACM Transactions on Information Systems 24, 1 (Jan. 2006), 111–141.
9. Miniwatts International. Internet Usage Statistics: The Internet Big Picture (updated Nov. 30, 2007); www.internetworldstats.com/stats.htm.
10. Mowshowitz, A. and Kawaguchi, A. Bias on the Web. Commun. ACM 45, 9 (Sept. 2002), 56–60.
11. Spink, A., Ozmutlu, S., Ozmutlu, H., and Jansen, B. U.S. versus European Web searching trends. SIGIR Forum 36, 2 (Fall 2002).
12. Wilson, T. Models of information behavior research. Journal of Documentation 55, 3 (June 1999), 249–270.
WINGYAN CHUNG (wchung@scu.edu) is an assistant professor in the Department of Operations and Management Information Systems of the Leavey School of Business at Santa Clara University, Santa Clara, CA.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.