Bioinformatics: Converting Data to Knowledge
Bio-Information
Loops of Data and Knowledge
Volume and Variety
Quantities
Diversity → Heterogeneity
Scope differences
Heterogeneity inhibits Integration
Heterogeneity among domains is natural
Required precision = F(volume)
Inconsistency causes errors, while results need precision
Broad array of relatable sources
Quality of data verified through publication
Projects requiring manual curation are domain specific
Data integration in Literature
Means to achieve precision in text
Integration makes Semantic Mismatches visible
Shared Knowledge Base
Complex Relationships
PharmGKB
Consistency: global or partial?
Stanford Infolab SKC project (Scalable Knowledge Composition)
Exploit Domain-specific Expertise
SKC grounded definition
Sample Operation: INTERSECTION
An Ontology Algebra
INTERSECTION support
Other Basic Operations
Tools to create articulations
Continue from initial point
Candidate Match Nexus
Using the Match Nexus
Features of an algebra
Knowledge Composition
Support Domain Specialization
Summary: Scalable Knowledge Composition
Many Other Tasks at/near Stanford
Provenance of derived data
The People Problem
Up-to-dateness
Privacy requires Ethics
Email: gio@cs.stanford.edu
Home Page: www-db.stanford.edu/people/gio.html