Plan0 Zero Latency Technologies.

Summary of a brief planning meeting held at Stanford University, Friday 19 Dec. 1997.

Participants: Dorothea Beringer (Stanford CSD), Steve Dawson (SRI), David Dill (CSD), Hector Garcia-Molina (CSD), Pat Lincoln (SRI), Michael Rys (CSD), PierAngela Samarati (SRI), Jeffrey Ullman (CSD), Gio Wiederhold (convener, reporter, CSD), xxx (SRI)

Objective: The objective of this program is to invent, integrate, validate, and develop for insertion into practice technologies that push the performance envelope of information systems towards instant responses. This will require a research effort that is orthogonal to the current emphasis, where most effort has gone towards high precision and perfect consistency. Those goals should not be abandoned, but the alternatives should be made available where necessary or even essential. The tradeoff is illustrated in a graph with axes of quality of service and response time, which shows the current tradeoff space and future attainable spaces as hyperbolas; see http://www-db.stanford.edu/pub/gio/slides/zero.html.

Past efforts to improve information systems performance have focused on getting more perfect results, as required in commercial enterprises such as banking, airline reservations, and computer-aided design. As databases continue to grow and become more distributed, maintaining consistency, up-to-dateness, and comprehensiveness at reasonable performance will continue to be a challenge.

New directions: A program is needed that focuses primarily on performance, to better serve mission-critical decision-support tasks. In that arena, having the best possible information when the decision has to be made is crucial, while having better information too late is useless. A central focus must be the reduction of latency: the time needed to bring adequate information into the view of the decision maker. Latency is broadly composed of three elements:
1. time needed to locate and access the source data
2. time to process the source data into relevant information by selection, integration, abstraction, and preparation for visualization
3. time needed for transmission of the results to the destination, over links of differing bandwidth
These elements do not occur in strict sequence. Their components are listed below and on http://www-db.stanford.edu/pub/gio/slides/zero.html.

In large information systems there will be multiple intermediate waypoints where data may be processed. These points will be connected by links of various capabilities. While links in CONUS will have gigabit capabilities, albeit still with latencies on the order of fractions of a second, links into the field will have lower bandwidth and asymmetric performance, and will be beset with failures and dropouts.

Novel technology: There are two underlying concepts for moving towards and beyond the zero latency objective:
1. Prediction: use information from analysis, doctrine, experience, and training to provide models of information requirements for mission-critical tasks
2. Tradeoff: determine the optimal balance of currency and correctness versus relevance
These concepts are related, and both require a model of the customers' needs. Such models can be used to drive currently available methods, and to prompt the development of new methods that will bring information to the decision-maker in a just-in-time paradigm:
* Pre-indexing: material of potentially high relevance and need is indexed according to the customers' needs. This includes indexing of image and map features.
* Materialized views: selected data are preprocessed and pre-positioned to be rapidly available. Multiple views will cover alternate situations. Specialized, in-the-field data warehouses may be dynamically created to deal with expected and evolving situations.
* Correlation: source data will be scanned for correlation prior to requests, so that related data will become available prior to need.
* Preprocessing: integration will be predetermined and preprocessed to the optimal extent, leaving some links for access to rapidly changing data.
* Exception identification: focus on selecting, processing, and transmitting unexpected events and trends, rather than transmitting baselines of current information that is changing slowly or as expected.
* Real-time technology: the parameters for rate-monotonic scheduling may be adjusted dynamically to deal with changing requirements.
* Cost/benefit models: making the tradeoff between the value of the needed response time and the value of data quality explicit will allow adjustment of the parameters that drive predictive and pre-processing technologies.

TABLE 1. Contributors towards latency, and candidate technologies to deal with them

Contributor                                  Candidate technology

Locate & Access
1. Find sources                              Relevance meta-data
2. Find index                                Move towards customer
3. Select                                    Learn and pre-position
4. Process index                             Customer relevant combinations
5. Get base data                             Materialized views
6. Get related / linked data                 Materialized views
7. Filter                                    New algorithms?
8. Join, project                             Materialized computed views
9. Package to move forwards                  Field data-warehouses

Process for Relevance
1. Match related data                        Materialized integrated views
2. Adjust for temporal relevance & match     Mediation
3. Adjust and prune for spatial match        Materialized integrated views
4. Omit redundant material                   Materialized integrated views
5. Summarize with links                      Hybrid result generation
6. Identify unexpected facts & trends        Model-based analysis
7. Prepare to present                        Model-based conversions

Transmit Where Needed
1. Determine routing                         Dynamic maps?
2. Set up routing
3. Adjust for bandwidth
4. Buffer for differences in capability
5. Buffer for backup
6. Acknowledge receipt
7. Validate content
8. Recover from errors                       Mixed strategy, forward best
9. Format for presentation                   Share transmit/content needs
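
The cost/benefit tradeoff named under Novel technology can be illustrated with a minimal sketch. It assumes that each way of answering a request comes with an estimated latency and an estimated quality, and that information arriving after the decision deadline has no value. The Candidate class, the best_in_time function, and all numbers are hypothetical, chosen only for illustration; none of them come from the meeting.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One way of answering a request (all fields are illustrative estimates)."""
    name: str
    latency_s: float  # estimated seconds until the result reaches the decision maker
    quality: float    # estimated quality of the result, 0.0 (useless) to 1.0 (perfect)


def best_in_time(candidates: Sequence[Candidate],
                 deadline_s: float) -> Optional[Candidate]:
    """Return the highest-quality candidate that arrives by the deadline.

    Results that would arrive after the deadline are treated as worthless.
    Ties on quality are broken in favour of the lower latency.
    """
    on_time = [c for c in candidates if c.latency_s <= deadline_s]
    if not on_time:
        return None
    return max(on_time, key=lambda c: (c.quality, -c.latency_s))


# Hypothetical options, ordered from most pre-positioned to least.
candidates = [
    Candidate("field warehouse view", latency_s=0.5, quality=0.6),
    Candidate("materialized integrated view", latency_s=5.0, quality=0.8),
    Candidate("live query against sources", latency_s=120.0, quality=0.95),
]

for deadline in (1.0, 30.0, 600.0):
    choice = best_in_time(candidates, deadline)
    print(deadline, choice.name if choice else "nothing available in time")
```

With these assumed numbers, a tight deadline selects the pre-positioned field view, and only a generous deadline justifies the live query. A fuller model would replace the hard cutoff with a value function that decays with response time, which is the kind of parameter the cost/benefit models are meant to make explicit.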