Trio

Prototype Source Code


Overview

This page contains information about the latest (as of February '09) open-source release of the Stanford Trio prototype. More information about the Trio project can be found on the Trio homepage.

This release contains the complete source code and binaries of the Trio prototype, implemented as a Python-based, combined client-and-server architecture, together with precompiled C library extensions for the PostgreSQL database system that use PostgreSQL's Server Programming Interface (SPI). The Trio client can be used through a browser interface (TrioExplorer), through a command-line interface (trioplus), through direct API calls when linked from another Python script, or by running scripts as an external command-line call. This page also includes installation instructions and brief documentation for these interfaces, describing the most important functionality and design decisions of the Trio prototype. The Trio source code is released under the BSD license.

We intend to keep this page up to date with future extensions to the Trio prototype as the system evolves and more features are implemented.


Download


Installation

There is a fairly comprehensive installation manual for the TrioExplorer and trioplus user interfaces available here.


TriQL - The Trio Query Language

Jennifer's comprehensive TriQL manual is available here.

An up-to-date list of the currently implemented subset of TriQL is available here.


Example Data and Queries

The Trio source package comes with two semantically interesting example data sets and queries, which we continue to use for various demonstration purposes.

  • The notorious Trio example, using a fictitious crime solving scenario over the following toy crime dataset and queries.
  • A data integration scenario that fuses IMDB movie information with user ratings derived from the Netflix challenge. For this collection we introduce uncertainty on movie titles, directors, and production years based on approximate matching between movie titles from IMDB and Netflix, as well as synthetic confidence distributions for the actual ratings derived from Netflix, using the following movie dataset and queries.

Both collections ship with the open-source package and can be run in either TrioExplorer or trioplus.


Documentation

  • TrioExplorer

    The TrioExplorer web browser interface is intended to be mostly self-explanatory. After installing and logging in, simply click on the 'Samples' tab, and you should be able to upload and execute the two example collections and some TriQL queries with just a few mouse clicks.

    Brief online help for TrioExplorer and trioplus is available here.

  • trioplus

    For convenient integration and application development, Trio also comes with the trioplus command-line interface.
    • Confidence Computation

      • view table sawperson;
      • view table sawperson compute confidences;
    • Lineage Tracing

      • set aids on;
      • view table sawperson;
      • explain lineage sawperson 1;
  • API

    The Trio API can be integrated into other Python scripts via direct API calls by importing the modules triodb.py and xtyple.py from the Trio main directory. The latter also contains data structures for representing tuple alternatives and their lineage. The Trio API is designed to work similarly to the Python DB-API interface available from the PyGreSQL module for Python.
    • TrioCnx

      • TrioCnx(pgdb)

        Constructor that creates a new Trio connection from a given connection pgdb, opened through PyGreSQL's Python DB-API interface.

      • cursor()

        Returns a new TrioCursor object for the current connection.

      • commit()

        Commits the current transaction.

      • rollback()

        Performs a rollback for the current transaction.

      • close()

        Closes the Trio connection (and the underlying pgdb connection).

    • TrioCursor

      • execute(triql)

        Executes a TriQL statement triql for this cursor object.

      • fetchone()

        Fetches a single XTuple object from the current cursor position.

      • fetchall()

        Fetches and returns a list of all XTuple objects beginning from the current cursor position.

    • XTuple

      • len()

        Returns the number of Alternative objects contained in this XTuple object.

      • getAlternative(idx)

        Returns the Alternative object at the designated index idx.

      • getConfidence()

        Returns the confidence value (if any) of this XTuple object as the sum of its Alternative objects' confidence values.

      • getQuestionMark()

        Returns whether this XTuple object has a question mark or not.

    • Alternative

      • getLineage()

        Returns the immediate lineage of this alternative as a list of (source-table, source-aid) pairs.

      • traceLineage()

        Performs a transitive lineage traversal for this alternative back to the base data.

      • getConfidence()

        Returns the confidence value (if any) of this alternative.

      • computeConfidence()

        Computes the confidence value of this alternative based on the traceLineage() function.
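
The methods listed above can be combined into a short round trip: open a cursor, execute a TriQL statement, walk the returned x-tuples and their alternatives, then commit. The following is a minimal sketch rather than code from the Trio distribution. The classes prefixed with an underscore are hypothetical stand-ins that only mimic the documented method names, so that the example runs without a Trio installation. The source tables "saw" and "drives", the query text, and all confidence values are made up for illustration. The sketch also assumes that len() is called as a method on the x-tuple, as listed above; if the real class supports Python's built-in len() instead, that call would need to change.

```python
# Sketch only: the "_"-prefixed classes are hypothetical stand-ins, not Trio code.

class _Alt:
    """Stand-in for Alternative: holds a confidence and immediate lineage."""
    def __init__(self, confidence, lineage):
        self._confidence = confidence
        self._lineage = lineage

    def getConfidence(self):
        return self._confidence

    def getLineage(self):
        return list(self._lineage)  # (source-table, source-aid) pairs


class _XTuple:
    """Stand-in for XTuple: a list of alternatives plus a question-mark flag."""
    def __init__(self, alternatives, question_mark):
        self._alternatives = alternatives
        self._question_mark = question_mark

    def len(self):
        return len(self._alternatives)

    def getAlternative(self, idx):
        return self._alternatives[idx]

    def getConfidence(self):
        # Documented behaviour: sum of the alternatives' confidence values.
        return sum(a.getConfidence() for a in self._alternatives)

    def getQuestionMark(self):
        return self._question_mark


class _Cursor:
    """Stand-in for TrioCursor: ignores the statement, returns canned rows."""
    def __init__(self, rows):
        self._rows = rows

    def execute(self, triql):
        self.last_statement = triql

    def fetchall(self):
        return list(self._rows)


class _Cnx:
    """Stand-in for TrioCnx: records which transaction calls were made."""
    def __init__(self, rows):
        self._rows = rows
        self.log = []

    def cursor(self):
        return _Cursor(self._rows)

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")

    def close(self):
        self.log.append("close")


def summarize(cnx, triql):
    """Run one TriQL statement on a TrioCnx-like object and summarize the result."""
    cur = cnx.cursor()
    try:
        cur.execute(triql)
        summary = []
        for xt in cur.fetchall():
            alts = [xt.getAlternative(i) for i in range(xt.len())]
            summary.append({
                "confidence": xt.getConfidence(),
                "question_mark": xt.getQuestionMark(),
                "alternatives": [(a.getConfidence(), a.getLineage()) for a in alts],
            })
    except Exception:
        cnx.rollback()  # undo the current transaction on any failure
        raise
    cnx.commit()
    return summary


demo_rows = [
    _XTuple([_Alt(0.5, [("saw", 1), ("drives", 1)]),
             _Alt(0.25, [("saw", 2), ("drives", 2)])], True),
    _XTuple([_Alt(1.0, [])], False),
]
demo_cnx = _Cnx(demo_rows)
result = summarize(demo_cnx, "SELECT * FROM sawperson")
demo_cnx.close()
for row in result:
    print(row)
```

With a real installation, the stand-in connection would presumably be replaced by a TrioCnx object constructed from a PyGreSQL connection, and summarize() would stay unchanged because it relies only on the documented method names.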