DAML (OntoAgents) Homework Assignment 2:
DAML Queries/Life Cycle

  1. For formulating queries we use a query language variant of Frame-Logic (F-Logic) currently under development at Stanford DB and Karlsruhe AIFB. The language is suitable not only for evaluating queries but also for formulating axioms. It is a simplified variant of F-Logic with special support for RDF features (e.g. namespace declarations). We assume the following namespace definitions throughout the queries.


    1. Give me all publications of the researcher with the last name "Studer".

      FORALL Pub <- EXISTS ResID ResID[

    2. The previous query works on a single, unspecified input RDF model and returns variable substitutions. Often it is convenient to specify exactly which RDF models the query should work on (a database or RDF store might contain several RDF models). Please note that an RDF model is not the same as an RDF namespace: there is no one-to-one relationship between RDF models and namespaces. It is also often convenient if the result of an RDF query is itself an RDF model. This model can in turn be stored in a database or used by any tool that handles RDF models. In the following we modify the previous query to fulfill these requirements:

      FORALL Pub,ResID Result(ResID[sw:publications->Pub])<-
      uses (AIFBModel union StanfordDBModel).

    3. Which researchers are cooperating with other researchers?

      FORALL X, Y<-
      X[rdf:type->sw:AcademicStaff; sw:cooperateWith->Y[rdf:type->sw:AcademicStaff]].

    4. On which projects do the PhD students work who are supervised by a professor with the email address "studer@aifb.uni-karlsruhe.de"?

      FORALL ProjID <- EXISTS PhdID, ResID
      and ResID[rdf:type->sw:FullProfessor;sw:email->"studer@aifb.uni-karlsruhe.de"].

    5. Give me the organization that finances a project dealing with the research topic "ontology articulation", as well as all persons in this project who work on that topic.

      FORALL OrgID, ProjID, MemID <-
      and ProjID[rdf:type->sw:Project;sw:isAbout->"Ontology Articulation"; sw:member->MemID].

    6. Find me the name of any project that has members with the homepage 'http://www-db.stanford.edu/~stefan/' and tell me who they work with.

      FORALL ProjID,MemID,OrgID<-
      ProjID[rdf:type->sw:Project; sw:member->MemID]
      and MemID[rdf:type->sw:AcademicStaff; sw:homepage->"http://www-db.stanford.edu/~stefan/";affiliation->OrgID].
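
The way queries such as 2 and 3 above operate on RDF models can be illustrated with a minimal Python sketch. It is not the F-Logic variant itself and not the actual query engine: it assumes that an RDF model is simply a set of (subject, predicate, object) triples, and all resource names and triples below are made up for illustration.

```python
# Hypothetical in-memory stand-ins for two RDF models; every triple is invented.
RDF_TYPE = "rdf:type"

aifb_model = {
    ("aifb:studer", RDF_TYPE, "sw:AcademicStaff"),
    ("aifb:studer", "sw:cooperateWith", "db:decker"),
    ("aifb:studer", "sw:publications", "aifb:pub1"),
}
stanford_db_model = {
    ("db:decker", RDF_TYPE, "sw:AcademicStaff"),
    ("db:decker", "sw:publications", "db:pub7"),
}


def union(*models):
    """Merge several RDF models (sets of triples) into one model."""
    merged = set()
    for model in models:
        merged |= model
    return merged


def match(model, s=None, p=None, o=None):
    """Yield the triples that agree with every non-None position of the pattern."""
    for (ts, tp, to) in model:
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o):
            yield (ts, tp, to)


def cooperating_staff(model):
    """Pairs (X, Y) where both are sw:AcademicStaff and X sw:cooperateWith Y."""
    staff = {s for (s, _, _) in match(model, p=RDF_TYPE, o="sw:AcademicStaff")}
    return sorted(
        (x, y)
        for (x, _, y) in match(model, p="sw:cooperateWith")
        if x in staff and y in staff
    )


def publications_model(model):
    """Return the result as an RDF model again: all sw:publications triples."""
    return set(match(model, p="sw:publications"))


merged = union(aifb_model, stanford_db_model)
pairs = cooperating_staff(merged)
result_model = publications_model(merged)
```

In this toy data the cooperation is only found on the union of both models, because the type statement for the second researcher lives in the other model. This is one reason why stating the input models explicitly, as in query 2, can matter.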

3. Task: Describe how you would expect these queries to be implemented. Identify the major DAML software components and sketch the control and data flow among them. Your solution may address some or all of the following topics (query language, dynamic retrieval, crawling, caching, translation, inference, scalability, consistency, security). Consider how this could be accomplished if some of the DAML content were sensitive information stored on multiple WWW sites protected by passwords and/or certificates. You may or may not have access to all of the data.

The overall agent infrastructure requires an information food chain: every part of the food chain provides information that enables the existence of the next part. The food chain starts with the construction of an ontology, preferably with the OnTo-Agents Ontology Construction Tool. The ontology defines the terms that can be used for annotating information in webpages, using the DAML language. The proposed OnTo-Agents Webpage Annotation Tool provides means to browse the ontology and to select appropriate terms of the ontology to mark up sections of a webpage. The webpage annotation process creates a set of annotated webpages, which are available to an OnTo-Agent to achieve its tasks. The OnTo-Agent itself needs several sub-components, specifically the OnTo-Agents Inference System for the evaluation of rules, queries and general inferences, and the OnTo-Agents Ontology Articulation Toolkit for mediation among information obtained from different ontologies. The data from the annotations can be used to construct additional websites, such as a Community Web Portal that presents a community of interest to the outside world in a concise manner. Finally, information-seeking users can give specific retrieval tasks to an OnTo-Agent, or they can query a Community Web Portal for immediate access to the information.
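
The data flow along this food chain can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: the ontology is reduced to a set of allowed terms, an annotated page is a dictionary, and every function and resource name is hypothetical rather than part of the OnTo-Agents tools.

```python
# All names below are hypothetical illustrations, not the OnTo-Agents tools.
ONTOLOGY = {"sw:Project", "sw:AcademicStaff", "sw:member", "rdf:type"}


def annotate(page_url, statements, ontology=ONTOLOGY):
    """Annotation step: accept only statements whose terms the ontology defines."""
    for (_, predicate, obj) in statements:
        if predicate not in ontology:
            raise ValueError(f"unknown term: {predicate}")
        if predicate == "rdf:type" and obj not in ontology:
            raise ValueError(f"unknown class: {obj}")
    return {"url": page_url, "statements": list(statements)}


def collect(annotated_pages):
    """Agent step: gather the statements of all annotated pages into one fact set."""
    facts = set()
    for page in annotated_pages:
        facts.update(page["statements"])
    return facts


def portal_listing(facts, cls):
    """Portal step: list every resource annotated as an instance of cls."""
    return sorted(s for (s, p, o) in facts if p == "rdf:type" and o == cls)


pages = [
    annotate("http://example.org/a", [("ex:ontoagents", "rdf:type", "sw:Project")]),
    annotate(
        "http://example.org/b",
        [
            ("ex:alice", "rdf:type", "sw:AcademicStaff"),
            ("ex:ontoagents", "sw:member", "ex:alice"),
        ],
    ),
]
facts = collect(pages)
projects = portal_listing(facts, "sw:Project")
```

Each step consumes only what the previous step produced, which is the dependency the food chain describes: without ontology terms there is nothing to annotate with, and without annotated pages the agent and the portal have no facts to work on.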

The query processor itself needs to be scalable and to deal with millions, maybe billions, of simple statements. Database technology provides this scalable infrastructure, but not for free: query optimizations have to be analyzed and implemented. Scalable retrieval technology should be based on well-investigated deductive database technology, which provides special optimization strategies for typical queries. On top of the database technology it is necessary to implement a query language that allows graph navigation in large RDF graphs and that supports special RDF features. The trade-off between caching data and retrieving it at query time remains to be investigated. In particular, the semantics of query answering with retrieval at runtime remains largely open.
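
As a rough illustration of graph navigation over indexed statements, the following Python sketch keeps a (subject, predicate) index so that following an edge does not require scanning all triples. It is a toy under stated assumptions, not the planned database-backed engine; the class, the property names in the example (sw:supervises, sw:worksAtProject) and all resources are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """Toy triple store indexed by (subject, predicate) for fast edge lookup."""

    def __init__(self):
        self._sp = defaultdict(set)

    def add(self, s, p, o):
        """Store one statement under its (subject, predicate) key."""
        self._sp[(s, p)].add(o)

    def objects(self, s, p):
        """All objects o such that the statement (s, p, o) is stored."""
        # .get avoids creating empty index entries for missing keys.
        return self._sp.get((s, p), set())

    def navigate(self, start, path):
        """Follow a sequence of predicates from start; return the reachable nodes."""
        frontier = {start}
        for predicate in path:
            frontier = {o for node in frontier for o in self.objects(node, predicate)}
        return frontier


store = TripleIndex()
# Hypothetical statements, invented for this example.
store.add("ex:prof1", "sw:supervises", "ex:phd1")
store.add("ex:prof1", "sw:supervises", "ex:phd2")
store.add("ex:phd1", "sw:worksAtProject", "ex:projA")
store.add("ex:phd2", "sw:worksAtProject", "ex:projB")
store.add("ex:phd3", "sw:worksAtProject", "ex:projC")

reachable = store.navigate("ex:prof1", ["sw:supervises", "sw:worksAtProject"])
```

Each navigation step costs time proportional to the size of the current frontier rather than to the size of the whole store. A real system would need further indexes (for example by predicate and object) and the query optimizations mentioned above.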

4. Task: Extra Credit: Develop software to implement any or all of 3
We have developed an RDF/DAML-Crawler.

The specialized query and transformation language for RDF and DAML is under development. The inference core will be based on 1) the Java core of SiLRI and 2) XSB.

5. tool wishlist

8. discussion of lessons learned and insights


By Stefan Decker, Siegfried Handschuh