Our work is driven by the vision of a Global InfoBase (GIB): a ubiquitous and
universal information resource, simple to use, up to date, and comprehensive.
The project consists of four interrelated thrusts:
(i) Combining Technologies: integrating technologies for information
retrieval, database management, and hypertext navigation, to achieve a
"universal" information model;
(ii) Personalization: developing tools for personalizing information
management;
(iii) Semantics: using natural-language processing and structural techniques
for analyzing the semantics of Web pages; and
(iv) Data Mining: designing new algorithms for mining information in order to
synthesize new knowledge.
Faculty
Students (full-time and part-time, grad and undergrad)
Alums
- Sepandar D. Kamvar, Taher H. Haveliwala, Christopher D. Manning, and Gene H.
Golub. Extrapolation Methods for Accelerating PageRank Computations. Submitted
to WWW2003.
- Sriram Raghavan and Hector Garcia-Molina. Integrating Diverse Information
Management Systems: A Brief Survey. IEEE Data Engineering Bulletin, December
2001.
- Cheng Yang. Music Database Retrieval Based on Spectral Similarity.
International Symposium on Music Information Retrieval, October 2001.
- T. Haveliwala. Search Facilities for Internet Relay Chat. To appear in
Proceedings of the Joint Conference on Digital Libraries (Poster session),
2002.
- C. Olston and J. Widom. Best-Effort Cache Synchronization with Source
Cooperation. ACM SIGMOD 2002.
- C. Olston and J. Widom. Approximate Caching for Continuous Queries over
Distributed Data Sources. Technical Report, February 2002.
- C. Olston, B. T. Loo, and J. Widom. Adaptive Precision Setting for Cached
Approximate Values. ACM SIGMOD International Conference on Management of
Data, May 2001.
- D. Klein and T. Haveliwala. Concise Labeling of Document Clusters.
Submitted. Technical Report, Stanford University, April 2002.
- Sriram Raghavan and Hector Garcia-Molina. Crawling the Hidden Web.
Proceedings of the 27th Intl. Conf. on Very Large Databases (VLDB),
pp. 129-138, September 2001.
- T. Haveliwala. Topic-Sensitive PageRank. Proceedings of the Eleventh
International World Wide Web Conference, 2002.
- T. Haveliwala, A. Gionis, D. Klein, and P. Indyk. Evaluating Strategies for
Similarity Search on the Web. Proceedings of the Eleventh International World
Wide Web Conference, 2002.
- D. Klein, S. Kamvar, and C. Manning. From Instance-level Constraints to
Space-level Constraints: Making the Most of Prior Knowledge in Data
Clustering. Proceedings of the Nineteenth International Conference on Machine
Learning, 2002.
- S. Kamvar, D. Klein, and C. Manning. Interpreting and Extending Classical
Agglomerative Clustering Algorithms using a Model-Based Approach. Proceedings
of the Nineteenth International Conference on Machine Learning, 2002.
- Glen Jeh and Jennifer Widom. SimRank: A Measure of Structural-Context
Similarity. Technical Report, Computer Science Department, Stanford
University, 2001.
- Glen Jeh and Jennifer Widom. Scaling Personalized Web Search. Technical
Report, Computer Science Department, Stanford University, 2002.
Sites relevant to the project include: DB Group home page, Infolab home page,
NLP Group home page, and Digital Libraries project home page.
Progress Report, 2002
Progress Report, 2001
Last modified: Jan. 22 2003