CAREER: Intelligent Sampling for Learning Complex Query Concepts
Edward Chang
Department of Electrical & Computer Engineering
University of California, Santa Barbara
Contact Information
Edward Chang
Department of Electrical & Computer Engineering
University of California
Santa Barbara, CA 93106
Phone: (805) 893-2971
Fax: (805) 893-3262
Email: echang@ece.ucsb.edu
URL: http://www.mmdb.ece.ucsb.edu/~echang/
WWW PAGE
http://www.mmdb.ece.ucsb.edu/~echang/career.html
List of Supported Students and Staff
Beitao Li: Graduate research assistant
Project Award Information
- Award Number: IIS-0133802
- Duration: five years, 04/1/2002 -- 01/31/2007
(at the beginning of the first year)
- Title: CAREER: Intelligent Sampling for Learning Complex Query Concepts
Keywords
Active learning, personalization, query-concept learning, similarity search
Project Summary
For a multimedia search task, a query concept is hard to articulate, and its
articulation can be subjective. For instance, in an image search, it is
difficult for a user to describe a desired image using low-level features such
as color, shape, and texture. In addition, different users may perceive the
same image differently. Even if an image is perceived similarly, users may use
different vocabulary (i.e., different combinations of low-level features) to
depict it. Furthermore, most users are not trained to specify even simple query
criteria using, for example, Boolean algebra. To make information access easier
and more personal, it is both necessary (for capturing subjective concepts) and
desirable (for relieving users of the need to specify complex query concepts)
to build intelligent search engines that can quickly learn users' query
concepts through active learning.
Project Impact
- Ph.D. students: Beitao Li (directly funded by this grant), Kingshy Goh, Yan Meng, Yi Wu, and Gang Wu
- M.S. students: Gerard Sychay (thesis).
- Education and curriculum development: Created a new course, ECE160:
Multimedia Computing, in Spring 2001. This course introduces multimedia
theories and applications, including cognitive psychology, perceptual
feature extraction, high-dimensional indexing, classification, and machine
learning theories for modeling image semantics. The course was built around
the research proposed for this project, and it was designed to foster
interdisciplinary and undergraduate research.
- Broader Impact: The project's broader impacts
upon information retrieval are potentially substantial. First, rapid
proliferation of multimedia content in digital libraries and on the Web
underscores the increasing importance of having effective multimedia
search tools. Second, intelligent query-concept learners will directly or
indirectly make traditional text-based information retrieval easier and
more personal. Directly, a text collection can employ an intelligent
learner to better capture users' query concepts. Indirectly, for instance,
multimedia data can be added to a text collection so that searches can be
conducted through interfaces that contain pictures and graphics. Even
young students who have not learned Boolean algebra can use images and
graphics to search for stories and books. In addition to bringing benefits
to education, we believe that this research project will further
contribute to making information more accessible for underprivileged users
who are not yet able to enjoy the full benefits of the information revolution.
Goals, Objectives, and Targeted Activities
The goal of the proposed research plan is to make fundamental advances toward
intelligent search engines through the development of online query-concept
learners. The specific targets are as follows:
- To design novel learning algorithms that
grasp a user's query concept quickly despite time, sample, and seeding
constraints.
- To develop techniques that can detect concept
drift during a relevance feedback session, and to handle concept drift
in the learning algorithms.
- To devise multi-resolution image
characterization methods for improving both search accuracy and search
efficiency.
- To ensure that the developed learning algorithms scale with feature
dimension, dataset size, and concept complexity.
- To validate the developed learning algorithms with experimental data
provided by colleagues at IBM Laboratories, Sony, and Benchathlon.
Project References
- Dynamic Partial Function,
B. Li, E. Chang, C.-T. Wu,
IEEE International Conference on Image Processing, New York, September,
2002.
- On Learning Perceptual Distance Functions for
Image Retrieval (Invited),
E. Chang and B. Li,
IEEE International Conference on Acoustics, Speech and Signal Processing,
Orlando, May 2002.
- Indexing Multimedia Data in High-dimensional
and Dynamic Weighted Feature Spaces (Invited),
K. Goh and E. Chang,
The 6th Visual Database Conference, Australia, May 2002.
- Supporting Subjective Image Queries without
Seeding Requirements --- Proposing Test Queries for Benchathlon,
E. Chang and T. Cheng,
Internet Imaging III, pp.225-232, San Jose, January 2002.
- Spin Discriminant Analysis: Using a One
Dimensional Classifier for High-Dimensional Classification Problems,
H. You and E. Chang,
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pp.968-975, Hawaii, December 2001.
- Mining Image Features for Efficient Query
Processing,
B. Li, W. Lai, E. Chang and T. Cheng,
Proceedings of the first IEEE Data Mining Conference, pp.353-360, San
Jose, November 2001.
- SVM Binary Classifier Ensembles for Multi-Class
Image Classification,
K. Goh, E. Chang and T. Cheng,
Proceedings of ACM International Conference on Information and
Knowledge Management (CIKM), pp.395-402, Atlanta, November 2001.
- Support Vector
Machine Active Learning for Image Retrieval,
S. Tong and E. Chang,
Proceedings of ACM International Conference on Multimedia, pp.107-118,
Ottawa, October 2001.
- PBIR: A System
that Learns Subjective Image Query Concepts,
E. Chang, T. Cheng, W. Lai, C. Wu, C. Chang and Y. Wu,
Proceedings of ACM International Conference on Multimedia, pp.611-614,
Ottawa, October 2001.
- Learning Image Query Concepts via Intelligent
Sampling,
B. Li, E. Chang, and C.-S. Li,
Proceedings of IEEE International Conference on Multimedia, pp.1168-1171,
Tokyo, August 2001.
- PBIR - Perception-Based Image Retrieval, [Demo
Description]
E. Chang, T. Cheng and L. Chang,
Proceedings of ACM SIGMOD, Santa Barbara, May 2001.
Area Background
Traditional learning and relevance feedback techniques may not be suitable for
online query-concept learning for at least two reasons.
- Time and sample constraints. Traditional
learning methods such as decision trees and neural networks require a
large number of training instances (i.e., samples) and can take a long
time (more than a few seconds) to learn a concept. Online users, however,
are typically impatient and cannot be expected to wait or to provide a
great deal of feedback.
- Seeding constraint. All traditional relevance
feedback methods require users to provide good examples to seed a query.
However, finding good seeds is the job of the search engine itself, and
this circular requirement leaves the core problem---learning users' query
concepts---unsolved.
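The sample constraint above can be illustrated with a toy sketch. It is not the project's algorithm: it assumes a deliberately simple one-dimensional query concept (an item is relevant when its single feature value is at or above a hidden threshold) and a simulated user. The names `make_oracle` and `learn_threshold` are hypothetical and introduced only for this illustration. The point is that a learner that chooses which item to ask about can narrow down the concept with far fewer labels than one that asks about items at random.

```python
import random


def make_oracle(threshold):
    """Simulated user (an assumption): an item is relevant when x >= threshold."""
    return lambda x: x >= threshold


def learn_threshold(pool, oracle, budget, active=True, rng=None):
    """Estimate a 1-D threshold concept from at most `budget` labeled samples.

    Tracks the interval that must contain the boundary:
    lo is the largest value labeled irrelevant so far,
    hi is the smallest value labeled relevant so far.
    """
    rng = rng or random.Random(0)
    lo, hi = 0.0, 1.0
    unlabeled = list(pool)
    for _ in range(budget):
        if not unlabeled:
            break
        if active:
            # Ask about the item closest to the middle of the uncertain interval.
            mid = (lo + hi) / 2
            x = min(unlabeled, key=lambda v: abs(v - mid))
        else:
            # Baseline: ask about a randomly chosen item.
            x = rng.choice(unlabeled)
        unlabeled.remove(x)
        if oracle(x):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    # Best guess for the boundary is the middle of the remaining interval.
    return (lo + hi) / 2


if __name__ == "__main__":
    pool = [i / 1000 for i in range(1001)]
    oracle = make_oracle(0.6373)
    print("active:", learn_threshold(pool, oracle, budget=10, active=True))
    print("random:", learn_threshold(pool, oracle, budget=10, active=False))
```

With ten labels, the actively chosen queries roughly halve the uncertain interval each time, while random queries shrink it only when they happen to fall inside it. Real image query concepts are high-dimensional and subjective, so this only conveys the intuition behind selective sampling.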
Area References
- The Nature of Statistical Learning Theory, V.
Vapnik, Springer-Verlag, 1995.
- Support Vector Machine Active Learning with
Applications to Text Classification,
Simon Tong and Daphne Koller, Proceedings of the 17th International
Conference on Machine Learning, pp.401-412, June 2000.
- Image Retrieval: Current Techniques, Promising
Directions and Open Issues, Yong Rui and Thomas S. Huang and Shih-Fu
Chang, Journal of Visual Communication and Image Representation, March
1999.