CS145 Lecture Notes (5) -- XML Queries: XPath and XQuery

=> In this class we will learn about XPath and XQuery, covering the most important features but ignoring some esoteric ones.

XML DTD and sample data for examples

   <!ELEMENT Bookstore (Book | Magazine)*>
   <!ELEMENT Book (Title, Authors, Remark?)>
   <!ATTLIST Book ISBN CDATA #REQUIRED
                  Price CDATA #REQUIRED
                  Edition CDATA #IMPLIED>
   <!ELEMENT Magazine (Title)>
   <!ELEMENT Title (#PCDATA)>
   <!ELEMENT Authors (Author+)>
   <!ELEMENT Remark (#PCDATA)>
   <!ELEMENT Author (First_Name, Last_Name)>
   <!ELEMENT First_Name (#PCDATA)>
   <!ELEMENT Last_Name (#PCDATA)>

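   <?xml version="1.0" standalone="no"?>
   <!DOCTYPE Bookstore SYSTEM "bookstore.dtd">
   <Bookstore>
      <Book ISBN="ISBN-0-13-035300-0" Price="65" Edition="2nd">
         <Title>A First Course in Database Systems</Title>
         <Authors>
            <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
            <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
         </Authors>
      </Book>
      <Book ISBN="ISBN-0-13-031995-3" Price="75">
         <Title>Database Systems: The Complete Book</Title>
         <Authors>
            <Author><First_Name>Hector</First_Name><Last_Name>Garcia-Molina</Last_Name></Author>
            <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
            <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
         </Authors>
         <Remark>
         Amazon.com says: Buy this book bundled with "A First Course,"
         it's a great deal!
         </Remark>
      </Book>
   </Bookstore>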


Think of XML as a tree (or directory) structure.

XPath specifies path expressions that match XML data by navigating down (and occasionally up or across) the tree and possibly evaluating conditions over data in the tree.

Some basic constructs (very incomplete list):

/ root element, or separator between steps in path
X matches element X
* matches any element
@X matches attribute X of the current element ("context node")
// matches all descendants of the current element, including self
[C] evaluates condition C on the current element
[N] picks the Nth matching element
Path1 | Path2 union of Path1 and Path2 results
contains(s1,s2) returns TRUE if string s1 contains string s2
name() returns tag of the current element
parent:: matches the parent of the current element, if there is one
following-sibling:: matches all later siblings of the current element
descendant:: matches all descendants of the current element
self:: matches the current element


Result of XPath Queries

XPath Examples

(Example: all book titles)
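One possible expression, assuming it is evaluated with the sample bookstore document as context; each step names a child element:

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book/Title
```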

(Example: all book or magazine titles)
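A sketch using the union operator, assuming the sample bookstore document is the context. Since Book and Magazine are the only children of Bookstore in the DTD, /Bookstore/*/Title should select the same nodes.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book/Title | /Bookstore/Magazine/Title
```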

(Example: all ISBN numbers)
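A possible expression, assuming the sample bookstore document is the context. The path ends at attribute nodes; some processors will not print bare attributes, in which case wrapping the last step as data(@ISBN) (XPath 2.0 / XQuery) returns the values instead.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book/@ISBN
```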

(Example: all books costing < 70)
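One way to write this, assuming the sample bookstore document is the context. The condition in brackets is tested on each Book, and the attribute value is compared as a number.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[@Price < 70]
```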

(Example: all ISBN numbers of books costing < 70)
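A sketch that filters first and then continues the path, assuming the sample bookstore document is the context:

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[@Price < 70]/@ISBN
```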

(Example: all books containing a remark)
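A possible expression, assuming the sample bookstore document is the context. A path used as a condition is true when it matches at least one node.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[Remark]
```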

(Example: all titles of books costing < 70 where "Ullman" is an author)
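One possible expression, assuming the sample bookstore document is the context. The = comparison is existential: it holds if any Last_Name under the book equals "Ullman".

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[@Price < 70 and Authors/Author/Last_Name = "Ullman"]/Title
```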

(Example: same query using //)
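A sketch of the // variant, assuming the sample bookstore document is the context. It matches Book elements at any depth and Last_Name elements at any depth below each book, so it relies less on the exact nesting.

```xquery
(: assumes the context is the sample Bookstore document :)
//Book[@Price < 70 and .//Last_Name = "Ullman"]/Title
```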

(Example: all second authors anywhere)
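A possible expression, assuming the sample bookstore document is the context. The position [2] is counted among the Author children of each Authors element, not across the whole document.

```xquery
(: assumes the context is the sample Bookstore document :)
//Authors/Author[2]
```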

(Example: all author last names anywhere)
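One way to write this, assuming the sample bookstore document is the context. Duplicates are kept, since each Last_Name element is a distinct node.

```xquery
(: assumes the context is the sample Bookstore document :)
//Author/Last_Name
```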

(Example: all books whose title contains one of its author's last names)
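A sketch, assuming the sample bookstore document is the context. contains() expects single strings, so the test is placed on each Last_Name and walks back up three levels (Author, Authors, Book) to reach the title.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[Authors/Author/Last_Name[contains(../../../Title, .)]]
```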

(Example: all magazines where there is a book of the same title)
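A possible expression, assuming the sample bookstore document is the context. The right side of = is the set of all book titles, and the comparison holds if the magazine title equals any one of them.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Magazine[Title = /Bookstore/Book/Title]
```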

(Example: all elements whose parent tag is not "Book")
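One possible expression, assuming the sample bookstore document is the context. For the top-level Bookstore element there is no parent element, name() then yields the empty string, and the element is included.

```xquery
(: assumes the context is the sample Bookstore document :)
//*[name(parent::*) != "Book"]
```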

(Example: all books where there is a different book of the same title)
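A sketch, assuming the sample bookstore document is the context. Comparing against siblings keeps a book from matching itself; preceding-sibling:: is the mirror image of following-sibling:: and is not in the list of constructs above.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[Title = following-sibling::Book/Title
                or Title = preceding-sibling::Book/Title]
```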

For the next example, modify the DTD so that Book allows Remark* in place of Remark? (zero or more remarks rather than at most one).

(Example: all books where all Remarks include "great")
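A possible expression, assuming the sample bookstore document is the context. XPath 1.0 has no "for all", so the condition is phrased as "there is no Remark that fails to contain great". Books with no Remark at all also qualify.

```xquery
(: assumes the context is the sample Bookstore document :)
/Bookstore/Book[not(Remark[not(contains(., "great"))])]
```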



Queries and Results as XML

Suppose we want:

To do so, all queries are wrapped in:
<Result> { ... the query here ... } </Result>

XQuery Examples

(Example: all titles of books costing < 70 where "Ullman" is an author)
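A minimal sketch as a for/where/return query. The file name bookstore.xml is an assumption; the notes only name bookstore.dtd.

```xquery
(: assumes the sample data is stored in a file named bookstore.xml :)
<Result>
{ for $b in doc("bookstore.xml")/Bookstore/Book
  where $b/@Price < 70 and $b/Authors/Author/Last_Name = "Ullman"
  return $b/Title }
</Result>
```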

(Example: all author Last_Name's of books related to databases)
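One way to write this, taking "related to databases" to mean that the title contains the string "Database" (an assumption), and assuming the data file is named bookstore.xml. An author of several matching books appears once per book.

```xquery
(: assumes bookstore.xml, and that "related to databases" = title contains "Database" :)
<Result>
{ for $b in doc("bookstore.xml")/Bookstore/Book
  where contains($b/Title, "Database")
  return $b/Authors/Author/Last_Name }
</Result>
```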

(Example: average price of all database books)
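A sketch using the built-in avg() function over the Price attributes, assuming the data file is named bookstore.xml and that a database book is one whose title contains "Database":

```xquery
(: assumes bookstore.xml; "database book" = title contains "Database" :)
<Result>
{ avg(doc("bookstore.xml")/Bookstore/Book[contains(Title, "Database")]/@Price) }
</Result>
```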

(Example: all database books priced above average over all books)
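A possible query, assuming the data file is named bookstore.xml and that a database book is one whose title contains "Database". A let clause computes the overall average once, and the for clause then compares each book against it.

```xquery
(: assumes bookstore.xml; "database book" = title contains "Database" :)
<Result>
{ let $avg := avg(doc("bookstore.xml")/Bookstore/Book/@Price)
  for $b in doc("bookstore.xml")/Bookstore/Book
  where contains($b/Title, "Database") and $b/@Price > $avg
  return $b }
</Result>
```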

(Example: titles and prices of all books, sorted by price)
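A sketch with order by, assuming the data file is named bookstore.xml. The Price attribute is converted to a number so that the sort is numeric rather than alphabetical; the output Book and Price elements are an illustrative result shape, not part of the DTD.

```xquery
(: assumes bookstore.xml; the output element layout is illustrative :)
<Result>
{ for $b in doc("bookstore.xml")/Bookstore/Book
  order by xs:decimal($b/@Price)
  return <Book> { $b/Title } <Price> { data($b/@Price) } </Price> </Book> }
</Result>
```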

(Example: all book titles where all remarks include "great")
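One possible query, assuming the data file is named bookstore.xml. XQuery has a direct universal quantifier, every ... satisfies, which is also true for books with no remarks.

```xquery
(: assumes the sample data is stored in a file named bookstore.xml :)
<Result>
{ for $b in doc("bookstore.xml")/Bookstore/Book
  where every $r in $b/Remark satisfies contains($r, "great")
  return $b/Title }
</Result>
```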

(Example: all book pairs with at least one author last name in common)
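A sketch of a self-join, assuming the data file is named bookstore.xml. The title comparison keeps a book from pairing with itself and lists each pair only once (so two books with identical titles would be missed); BookPair, Title1 and Title2 are hypothetical output element names.

```xquery
(: assumes bookstore.xml; BookPair, Title1, Title2 are hypothetical names :)
<Result>
{ for $b1 in doc("bookstore.xml")/Bookstore/Book,
      $b2 in doc("bookstore.xml")/Bookstore/Book
  where $b1/Authors/Author/Last_Name = $b2/Authors/Author/Last_Name
    and $b1/Title < $b2/Title
  return
    <BookPair>
      <Title1> { data($b1/Title) } </Title1>
      <Title2> { data($b2/Title) } </Title2>
    </BookPair> }
</Result>
```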

(Example: all title-author pairs)
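A possible query, assuming the data file is named bookstore.xml. The second variable ranges over the authors of the current book, so each book produces one result per author; TitleAuthor is a hypothetical output element name.

```xquery
(: assumes bookstore.xml; TitleAuthor is a hypothetical name :)
<Result>
{ for $b in doc("bookstore.xml")/Bookstore/Book,
      $a in $b/Authors/Author
  return <TitleAuthor> { $b/Title } { $a } </TitleAuthor> }
</Result>
```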

(Example: all book titles for each author)
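A sketch of grouping, assuming the data file is named bookstore.xml. Authors are identified by last name only (a simplification), distinct-values() removes repeated names, and AuthorBooks is a hypothetical output element name.

```xquery
(: assumes bookstore.xml; authors are grouped by last name only :)
<Result>
{ for $ln in distinct-values(doc("bookstore.xml")//Author/Last_Name)
  return
    <AuthorBooks>
      <Last_Name> { $ln } </Last_Name>
      { doc("bookstore.xml")/Bookstore/Book[Authors/Author/Last_Name = $ln]/Title }
    </AuthorBooks> }
</Result>
```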