Text from Salasin Slides

Courtesy of Dan Siroker

Universal Data Access

Phase 1: Zero Latency
Phase 2: Negative Latency (speculative retrieval)
The Foundation of Information Superiority

John Salasin

Challenges
• Data is everywhere
– At multiple levels of granularity
• Processing is ubiquitous
– With varying data requirements
• Speed of light is fixed
• Needed: Access to universal data with latency approaching zero cycles
Challenges
(Washington Post -- Dec 9, 1997)
• US provides Support to Diplomatic Operations (SDO) in Bosnia -- Embassy
staff is safe!! Why is this news?
– Combined CIA, DIA, NSA, State Department team required (in trailer)
– Merged satellite photos, message intercepts, HUMINT(?)
– Standard procedure -- Support to Military Ops (SMO) at CINC level --
all-source units expensive to set up
• Zero Latency Objective -- Provide needed information on-time, to the
individual soldier (or diplomat)

Strategic Assessment 1996

National Defense University
Between 1992 and 1995, expanded peacekeeping operations were undertaken in
Cambodia (UNTAC), Bosnia (UNPROFOR), Somalia (UNITAF), and Haiti (MNF)
Initially, all carried primarily humanitarian and peace-building objectives…
The … dominant issue and key determinant of success became the use of
military power and its relationship to other activities.
Special Intelligence Requirements. Special operations require special
intelligence. This may sometimes mean very fine-grained intelligence about a
difficult target.
Preemption requires good intelligence, however, which is often not
available.
Challenges -- Today’s Situation!
Dimensions of the Problem

Technical Issues

• Optimizing latency is different from optimizing throughput
– Concern is network and location / organization, not computation
– Pipelined processors reduce average time per instruction, but not time to
send an instruction through the pipeline.
– Latency reduction based on predicting data needs and determining when to
do a prefetch.
– Need to make effective use of information about what the program is LIKELY
to do.
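
A minimal sketch of the last two bullets, assuming a hypothetical fetch_fn (slow, network-bound retrieval) and predict_next (guesses the next key from the current one); none of these names come from the slides. The point is that a correct prediction turns a full round-trip into a cache hit:

```python
import threading

class PrefetchingStore:
    def __init__(self, fetch_fn, predict_next):
        self.fetch_fn = fetch_fn          # slow, network-bound fetch (assumption)
        self.predict_next = predict_next  # guesses the next key from the current one
        self.cache = {}
        self.pending = {}

    def _prefetch(self, key):
        # Start fetching a predicted key in the background.
        if key is None or key in self.cache or key in self.pending:
            return
        t = threading.Thread(
            target=lambda k=key: self.cache.setdefault(k, self.fetch_fn(k)))
        self.pending[key] = t
        t.start()

    def get(self, key):
        # If the earlier prediction was right, the data is already here: near-zero latency.
        if key not in self.cache:
            t = self.pending.pop(key, None)
            if t is not None:
                t.join()                             # prefetch already in flight
            else:
                self.cache[key] = self.fetch_fn(key) # miss: pay the full latency
        self._prefetch(self.predict_next(key))       # speculate on the LIKELY next need
        return self.cache[key]
```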
Zero Latency Program
Goal: Implement optimal data access approach(es)
– Dynamically
– Automatically
– Cheaply (standard solutions reduce custom programming)
Approach:
– Use maximum information about application, e.g.:
• Process discovery
• Formal process description
• Access / computation history
– Use maximum information about system status, e.g.:
• Data and process location
• QoS / load / capacity information
– Integrate technologies
• Common representations across spectrum of applications (different granularities)
• Prediction of data needs, speculative computation opportunities
• Scheduling information transmission
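
A hedged sketch of the integration idea above: combine what the application is likely to need next (from its process model or access history) with current system status to decide which speculative transfers to schedule now. The thresholds, field names, and function are illustrative assumptions, not the program design:

```python
def plan_access(predicted_needs, system_status, load_threshold=0.8):
    """predicted_needs: list of (data_item, probability) from the application model.
    system_status: dict mapping data_item -> {"load": 0..1, "local": bool}.
    Returns the items worth prefetching now, most valuable first."""
    plan = []
    for item, prob in sorted(predicted_needs, key=lambda x: -x[1]):
        status = system_status.get(item, {"load": 1.0, "local": False})
        if status["local"]:
            continue                      # already cheap to reach, skip
        if status["load"] < load_threshold and prob > 0.25:
            plan.append(item)             # likely need plus spare capacity: speculate
    return plan
```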

Contributing Technologies

(Figure slides: Query/Process partitioning)
Contributing Technologies
• Pre-caching (predictive access)
– Process (automated and manual) discovery, modeling and analysis
• support prediction of data needs
• enable negative latency
– Learning models to improve prediction
– Intelligent data wrappers
• Agent-based
• Driven by semantics of data content and use
– What’s new: Extend “find me one like” to “find me one like what I need for
the most likely next step(s) in the process”
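
A toy illustration of that extension: blend the current query with a profile of what the most likely next workflow step needs, then rank candidates against the blend. The vectors, the blending weight, and the dot-product scoring are assumptions made for illustration only:

```python
def rank_for_next_step(items, query_vec, next_step_profile, w_next=0.5):
    """items: list of (item_id, feature_vector); all vectors are equal-length lists.
    next_step_profile: feature vector describing data useful for the predicted next step."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    # Shift the query toward what the next step will need.
    blended = [q * (1 - w_next) + p * w_next
               for q, p in zip(query_vec, next_step_profile)]
    return sorted(items, key=lambda it: -dot(blended, it[1]))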

Contributing Technologies

• Compiler- and hardware-level improvements
– Memory manages and manipulates data
– Tools and languages for dynamically optimizing cache placement
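
A rough sketch of dynamically optimized cache placement, under an assumed cost model: place the copy at the node that minimizes expected access latency, weighted by how often each site is predicted to read the data.

```python
def best_cache_node(nodes, latency, predicted_reads):
    """nodes: candidate hosts; latency[(node, site)]: one-way delay estimate;
    predicted_reads[site]: expected number of accesses from each site."""
    def expected_cost(node):
        return sum(predicted_reads[s] * latency[(node, s)] for s in predicted_reads)
    return min(nodes, key=expected_cost)
```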
Contributing Technologies
• Biological sensor systems react to change, rather than steady state.
– Use rate of change of data to moderate resources used for predictive access:
• rate of change of input along dimensions of concern [dynamic prioritization]?
• rate of change of output (e.g., target information, logistics support plan) [dynamic reflective prioritization]?
• expected change in user behavior?
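
A minimal sketch of that moderation idea: scale the amount of speculative work with how fast the monitored inputs are changing, and spend nothing in steady state. The thresholds and scaling factor are placeholder assumptions:

```python
def prefetch_budget(prev, curr, dt, max_budget=10):
    """prev, curr: dicts of monitored values along the dimensions of concern;
    dt: seconds between samples. Returns how many speculative fetches to issue."""
    rate = sum(abs(curr[k] - prev[k]) for k in curr if k in prev) / max(dt, 1e-9)
    if rate < 0.01:
        return 0                                   # steady state: no speculative work
    return min(max_budget, int(round(rate * 5)))   # faster change -> more effort
```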

Other Approaches / Concerns?

• COTS encapsulated data (wrapping on demand)
– Data is often most persistent part of system
– Resides in COTS components (e.g., design tools, GISs, program files)
– We will always need to interpret “new” data (e.g., real-time sensor
readings) in context of legacy data (e.g., characteristics of enemy weapons systems)
– Zero latency ==> need to extract specific information from legacy systems based on changes in new data.
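
A hypothetical sketch of wrapping on demand: when new data changes, the wrapper extracts only the legacy records needed to interpret it, rather than exporting the COTS store wholesale. The LegacyStore interface and link_fn mapping are assumptions:

```python
class LegacyWrapper:
    def __init__(self, legacy_store, link_fn):
        self.legacy_store = legacy_store   # COTS component (GIS, design tool, ...)
        self.link_fn = link_fn             # maps a new observation -> legacy query
        self.context_cache = {}

    def on_new_data(self, observation):
        """Triggered by a change in new data; fetch just the legacy context for it."""
        query = self.link_fn(observation)            # e.g., weapon type seen by a sensor
        if query not in self.context_cache:
            self.context_cache[query] = self.legacy_store.lookup(query)
        return observation, self.context_cache[query]
```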

Zero Latency System View

Tasks
Application-driven Access Planning
• Formal workflow description and monitoring
– Requires semantic awareness -- what data is used, where, and when
• Mapping access / computation history to workflow steps
• Analysis of workflow models to infer application goals / plans
• Development and testing of workflow-based predictive models
• Use task requirements to determine transaction type
• Major Milestones
– YR1: Demonstrate ability to represent workflow and required resources and to use conditional probabilities to predict data needs (sketch below)
– YR2: Complete model to predict data needs and to specify transaction characteristics
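
A small sketch of the YR1 representation above: workflow steps with conditional transition probabilities plus the data each step requires, used to predict what to stage next. Step names, probabilities, and data items are invented for illustration, not program data:

```python
workflow = {
    # step: [(next_step, P(next_step | step)), ...]
    "receive_tasking": [("assess_area", 0.7), ("request_imagery", 0.3)],
    "assess_area":     [("request_imagery", 0.9), ("brief_commander", 0.1)],
    "request_imagery": [("brief_commander", 1.0)],
    "brief_commander": [],
}
step_data_needs = {
    "assess_area":     ["terrain_db", "recent_sigint"],
    "request_imagery": ["satellite_catalog"],
    "brief_commander": ["fused_picture"],
}

def predict_data_needs(current_step, min_prob=0.2):
    """Return (data_item, probability) pairs for the likely next steps."""
    needs = {}
    for nxt, p in workflow.get(current_step, []):
        if p >= min_prob:
            for item in step_data_needs.get(nxt, []):
                needs[item] = max(needs.get(item, 0.0), p)
    return sorted(needs.items(), key=lambda kv: -kv[1])

# predict_data_needs("receive_tasking")
# -> [("terrain_db", 0.7), ("recent_sigint", 0.7), ("satellite_catalog", 0.3)]
```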

Tasks

System Capability-driven Access Planning
• Integrate information about system status to improve planning.
– Data and process locations
– QoS / load / capacity information
– PMS models
• Define triggers based on rate of change of data (and a language to specify conditions/actions; see the sketch below)
• Integrate with Application-driven Access Planning
• Major Milestones:
– YR1: Select and modify system models (e.g., from Quorum); define needed triggering capabilities based on analysis of real scenarios
– YR2: Integrate with Application-driven Access Planning
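
An illustrative sketch, not the program's actual trigger language: condition/action rules that fire on the rate of change of a monitored data item, checked against system load so speculation backs off when capacity is scarce. The rule structure and thresholds are assumptions:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trigger:
    item: str                      # monitored data item
    min_rate: float                # fire when |d(item)/dt| exceeds this
    action: Callable[[str], None]  # e.g., schedule a speculative transfer

def evaluate(triggers, rates, system_load, load_ceiling=0.9):
    """rates: item -> observed rate of change; skip actions when the system is saturated."""
    for t in triggers:
        if system_load < load_ceiling and abs(rates.get(t.item, 0.0)) >= t.min_rate:
            t.action(t.item)

# Example use:
# evaluate([Trigger("unit_positions", 0.5, lambda i: print("prefetch maps near", i))],
#          rates={"unit_positions": 1.2}, system_load=0.4)
```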

Tasks

Integration
• Demonstrate and evaluate with respect to "Challenge Problems" in areas of:
– logistics (more predictable)
– battlespace understanding (less predictable, but based on doctrine)
– crisis management (highly unpredictable)
• Evaluation along multiple dimensions:
– Access speedup
– Ease of insertion (interoperability with existing systems, ease of specifying rules, triggers, etc.)
• Milestones:
– YR2: Specify demonstrations
– YR3: Conduct demonstration and evaluation