Text from Salasin Slides
Courtesy of Dan Siroker
Universal Data Access
Phase 1: Zero Latency
Phase 2: Negative latency (Speculative retrieval)
The Foundation of Information Superiority
John Salasin
Challenges
Data is everywhere
At multiple levels of granularity
Processing is ubiquitous
With varying data requirements
Speed of light is fixed
Needed: Access to universal data with latency approaching zero cycles
Challenges
(Washington Post -- Dec 9, 1997)
US provides Support to Diplomatic Operations (SDO) in Bosnia -- Embassy
staff is safe!! Why is this news?
Combined CIA, DIA, NSA, State Department team required (in trailer)
Merged satellite photos, message intercepts, HUMINT(?)
Standard procedure -- Support to Military Ops (SMO) at CINC level -- all-source units expensive to set up
Zero Latency Objective -- Provide needed information on-time, to the
individual soldier (or diplomat)
Strategic Assessment 1996
National Defense University
Between 1992 and 1995, expanded peacekeeping operations were undertaken in
Cambodia (UNTAC), Bosnia (UNPROFOR), Somalia (UNITAF), and Haiti (MNF)
Initially, all carried primarily humanitarian and peace-building objectives
The dominant issue and key determinant of success became the use of military power and its relationship to other activities.
Special Intelligence Requirements. Special operations require special
intelligence. This may sometimes mean very fine-grained intelligence about a
difficult target.
Preemption, however, requires good intelligence, which is often not available.
Challenges -- Today's Situation!
Dimensions of the Problem
Technical Issues
Optimizing latency is different from optimizing throughput
Concern is network and location / organization, not computation
Pipelined processors reduce average time per instruction, but not time to
send an instruction through the pipeline.
Latency reduction based on predicting data needs and determining when to
do a prefetch.
Need to make effective use of information about what the program is LIKELY
to do.
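To make the prefetch idea concrete, here is a minimal sketch (Python; the fetch and predict hooks are hypothetical stand-ins, not from the slides): a predictor guesses the next data item from the access history and a background thread fetches it, so a correct guess hides the round-trip latency.

    import threading

    class SpeculativePrefetcher:
        """Guess the next data need and fetch it before it is asked for."""

        def __init__(self, fetch, predict):
            self.fetch = fetch        # blocking retrieval, e.g. a network read
            self.predict = predict    # guesses the next key from the history
            self.history = []
            self.cache = {}

        def _prefetch(self, key):
            self.cache[key] = self.fetch(key)

        def get(self, key):
            self.history.append(key)
            if key in self.cache:
                value = self.cache.pop(key)   # hit: latency already hidden
            else:
                value = self.fetch(key)       # miss: pay the full round trip
            guess = self.predict(self.history)
            if guess is not None and guess not in self.cache:
                threading.Thread(target=self._prefetch, args=(guess,)).start()
            return value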
Zero Latency Program
Goal: Implement optimal data access approach(es)
Dynamically
Automatically
Cheaply (standard solutions reduce custom programming)
Approach:
Use maximum information about application, e.g.:
Process discovery
Formal process description
Access / computation history
Use maximum information about system status, e.g.:
Data and process location
QoS / load / capacity information
Integrate technologies
Common representations across spectrum of applications (different granularities)
Prediction of data needs, speculative computation opportunities
Scheduling information transmission
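As one illustration of integrating the two information sources, a toy planner (hypothetical interface, not the program's actual design) might rank predicted needs from the application model and admit them against a capacity budget taken from system status:

    def schedule_prefetch(candidates, capacity, threshold=0.5):
        """Admit predicted data needs against current system capacity.

        candidates: (key, probability, size) triples from the application model.
        capacity: bytes the QoS/load status says we can move speculatively.
        """
        plan, budget = [], capacity
        for key, prob, size in sorted(candidates, key=lambda c: -c[1]):
            if prob >= threshold and size <= budget:
                plan.append(key)
                budget -= size
        return plan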
Contributing Technologies
Query/Process partitioning
Contributing Technologies
Pre-caching (predictive access)
Process (automated and manual) discovery, modeling and analysis
support prediction of data needs
enable negative latency
Learning models to improve prediction
Intelligent data wrappers
Agent-based
Driven by semantics of data content and use
What's new: Extend "find me one like" to "find me one like what I need for the most likely next step(s) in the process"
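A rough sketch of such a wrapper (Python; the next_steps and query_for hooks are assumed stand-ins for a learned process model):

    class PredictiveWrapper:
        """Data wrapper that pre-stages results for likely next workflow steps."""

        def __init__(self, source, next_steps, query_for):
            self.source = source            # underlying data store
            self.next_steps = next_steps    # step -> [(next step, probability)]
            self.query_for = query_for      # step -> the query that step will issue
            self.staged = {}

        def on_step(self, step):
            # "Find me one like what I need for the most likely next step(s)."
            for nxt, prob in self.next_steps(step):
                if prob > 0.5:
                    q = self.query_for(nxt)
                    self.staged[q] = self.source.run(q)

        def run(self, query):
            if query in self.staged:
                return self.staged.pop(query)   # already fetched speculatively
            return self.source.run(query)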
Contributing Technologies
Compiler- and hardware-level improvements
Memory manages and manipulates data
Tools and languages for dynamically optimizing cache placement
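By way of illustration only, the placement decision such a tool might compute can be as simple as a greedy assignment of hot items to fast tiers (Python, hypothetical interface):

    def place(access_counts, tiers):
        """Greedily pin the hottest items into the fastest cache tiers.

        access_counts: {key: recent access count}
        tiers: [(tier name, item capacity)], fastest first.
        """
        placement = {}
        ranked = sorted(access_counts, key=access_counts.get, reverse=True)
        for name, capacity in tiers:
            for key in ranked[:capacity]:
                placement[key] = name
            ranked = ranked[capacity:]
        return placement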
Contributing Technologies
Biological sensor systems react to change, rather than steady state.
Use rate of change of data to moderate resources used for predictive access:
rate of change of input along dimensions of concern [dynamic prioritization]?
rate of change of output (e.g., target information, logistics support
plan) [dynamic reflective prioritization]?
expected change in user behavior?
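One way the rate-of-change idea could look in code (the scaling formula is an assumption, for illustration): speculative-access effort grows with how quickly recent samples are moving, so steady-state inputs consume few resources.

    def prefetch_budget(samples, base_budget, window=5):
        """Scale speculative-access effort with how fast an input is changing.

        samples: recent readings along a dimension of concern.
        Steady state earns almost no budget, mimicking sensors that
        react to change rather than to level.
        """
        recent = samples[-window:]
        if len(recent) < 2:
            return base_budget
        rate = sum(abs(b - a) for a, b in zip(recent, recent[1:])) / (len(recent) - 1)
        scale = rate / (abs(recent[-1]) + 1e-9)   # relative rate of change
        return base_budget * min(1.0, scale)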
Other Approaches / Concerns?
COTS encapsulated data (wrapping on demand)
Data is often the most persistent part of a system
Resides in COTS components (e.g., design tools, GISs, program files)
We will always need to interpret new data (e.g., real-time sensor
readings) in context of legacy data (e.g., characteristics of enemy weapons
systems)
Zero latency ==> need to extract specific information from legacy systems
based on changes in new data.
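A minimal sketch of change-driven extraction (Python; the legacy and query_for hooks and the numeric-threshold test are illustrative assumptions):

    class ChangeDrivenExtractor:
        """Query wrapped legacy sources only when new data actually changes."""

        def __init__(self, legacy, query_for, threshold):
            self.legacy = legacy          # wrapped COTS/legacy component
            self.query_for = query_for    # maps a new reading to a legacy query
            self.threshold = threshold
            self.last = None

        def on_reading(self, reading):
            if self.last is None or abs(reading - self.last) >= self.threshold:
                self.last = reading
                return self.legacy.run(self.query_for(reading))
            return None                   # steady state: no legacy access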
Zero Latency System View
Tasks
Application-driven Access Planning
Formal workflow description and monitoring
Requires semantic awareness -- what data is used where and when
Mapping access / computation history to workflow steps
Analysis of workflow models to infer application goals / plans
Development and testing of workflow-based predictive models
Use task requirements to determine transaction type
Major Milestones
YR1: Demonstrate ability to represent workflow and required resources and to use conditional probabilities to predict data needs (sketched below)
YR2: Complete model to predict data needs and to specify transaction
characteristics
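The YR1 conditional-probability prediction could be as simple as a first-order model over observed step transitions; the sketch below (Python, with a hypothetical needs mapping) is one possible reading, not the program's actual design.

    from collections import defaultdict

    class WorkflowPredictor:
        """First-order conditional-probability model over workflow steps."""

        def __init__(self, needs):
            self.counts = defaultdict(lambda: defaultdict(int))
            self.needs = needs            # step -> data items that step uses

        def observe(self, prev_step, step):
            # Mapping access/computation history onto workflow transitions.
            self.counts[prev_step][step] += 1

        def predict_data(self, step):
            # Return (data items, P(next step | current step)), most likely first.
            nxt = self.counts[step]
            total = sum(nxt.values())
            ranked = sorted(nxt, key=nxt.get, reverse=True)
            return [(self.needs[s], nxt[s] / total) for s in ranked]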
Tasks
System Capability-driven Access Planning
Integrate information about system status to improve planning.
Data and process locations
QoS / load / capacity information
PMS models
Define triggers based on rate of change of data (and a language to specify conditions/actions; sketched below)
Integrate with Application-driven Access Planning
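The condition/action language itself is unspecified in the slides; as a placeholder, a trigger might pair a rate-of-change condition with a re-planning action (Python, all names hypothetical):

    # One possible shape for a trigger rule: a condition over system/data
    # status and an action that re-plans speculative access.
    TRIGGERS = [
        {
            "when": lambda status: status["track_feed.rate"] > 5.0,
            "do": lambda planner: planner.replan("track_feed"),
        },
    ]

    def evaluate(triggers, status, planner):
        for rule in triggers:
            if rule["when"](status):
                rule["do"](planner)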
Major Milestones:
YR1: Select and modify system models (e.g., from Quorum); define needed triggering capabilities based on analysis of real scenarios
YR2: Integrate with Application-driven Access Planning
Tasks
Integration
Demonstrate and evaluate with respect to "Challenge Problems" in areas of:
logistics (more predictable)
battlespace understanding (more unpredictable, but based on doctrine)
crisis management (highly unpredictable)
Evaluation along multiple dimensions:
Access speedup
Ease of insertion (interoperable with existing systems, ease of specifying
rules, triggers, etc.)
Milestones:
YR2: Specify demonstrations
YR3: Conduct demonstration and evaluation