BIB-VERSION:: CS-TR-v2.0 ID:: STAN//CS-TR-97-1598 ENTRY:: December 08, 1997 ORGANIZATION:: Stanford University, Department of Computer Science TITLE:: Query Planning and Optimization in Information Integration TYPE:: Technical Report AUTHOR:: Duschka, Oliver M. DATE:: December 1997 PAGES:: 100 ABSTRACT:: Information integration systems provide uniform user interfaces to varieties of different information sources. Our work focuses on query planning in such systems. Query planning is the task of transforming a user query, represented in the user's interface language and vocabulary, into queries that can be executed by the information sources. Every information source might require a different query language and might use different vocabularies. We show that query plans with a fixed number of database operations are insufficient to extract all information from the sources, if functional dependencies or limitations on binding patterns are present. Dependencies complicate query planning because they allow query plans that would otherwise be invalid. We present an algorithm that constructs query plans that are guaranteed to extract all available information in these more general cases. This algorithm is also able to handle datalog user queries. We examine further extensions of the languages allowed for user queries and for describing information sources: disjunction, recursion and negation in source descriptions, negation and inequality in user queries. For these more expressive cases, we determine the data complexity required of languages able to represent "best possible" query plans. NOTES:: [Adminitrivia V1/Prg/19971208] END:: STAN//CS-TR-97-1598