Provenance of derived data
Ensure that derived results carry a proper history
[Peter Buneman, UPenn] K2 integration tool
Integrated databases often don’t indicate the original sources
E.g., SwissProt does not distinguish inferred data from observed data.
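One way to picture the missing provenance is a record that carries its evidence type and its sources alongside the value. The sketch below is illustrative only and is not how K2 or SwissProt represent data: the `Annotation` class, the `Evidence` enum, the `lineage` helper, and all identifiers and sample values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    OBSERVED = "observed"   # measured directly in an experiment
    INFERRED = "inferred"   # derived, e.g. by sequence similarity


@dataclass(frozen=True)
class Annotation:
    entry_id: str             # hypothetical identifier, not a real accession
    value: str                # the annotated fact
    evidence: Evidence        # observed or inferred
    source: str               # original database or publication
    derived_from: tuple = ()  # entry_ids this result was computed from


def lineage(annotation, index):
    """Return the entry_ids of all ancestors of an annotation, nearest first."""
    seen = []
    queue = list(annotation.derived_from)
    while queue:
        parent_id = queue.pop(0)
        if parent_id in seen:
            continue
        seen.append(parent_id)
        parent = index.get(parent_id)
        if parent is not None:
            queue.extend(parent.derived_from)
    return seen


# Made-up sample data: one observed fact and one result inferred from it.
observed = Annotation("A1", "kinase activity", Evidence.OBSERVED,
                      "lab assay (hypothetical)")
inferred = Annotation("B7", "kinase activity", Evidence.INFERRED,
                      "similarity search (hypothetical)", ("A1",))
index = {a.entry_id: a for a in (observed, inferred)}

print(inferred.evidence.value, lineage(inferred, index))
```

With such a record, an integrated database could answer both "was this observed or inferred?" and "which original entries does it rest on?" without going back to the source databases.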
[William Gelbart, Harvard University] FlyBase
FlyBase also collects data such as exons and their mutations, and transposon insertion sites.
Science is moving from hunter-gathering to harvesting: toward an agronomical society.
Classical genomics is being superseded by studies of the expression and interaction of gene products and of gene perturbation.
[Peter Karp, SRI International, Bioinformatics Research Group] EcoCyc
EcoCyc links proteins to 150 metabolic pathways in E. coli.
Databases are supplanting journals. They are re-analyzable. Results in journals are not.
An estimated 500 public databases now exist for bioinformatics. Not all of them have APIs or use real DBMSs, and they differ in data models and units of measurement, which leads to semantic problems.
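The unit mismatch can be illustrated with a small sketch: two sources report the same quantity in different units, and the values only become comparable after conversion to a common unit. The source names, field names, and values below are invented for illustration and do not come from any real database.

```python
import math

# Conversion factors to molar; only the units used in this sketch.
TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def to_molar(value, unit):
    """Convert a concentration to molar, rejecting units we do not know."""
    try:
        return value * TO_MOLAR[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None


# Hypothetical records for the same measurement from two databases.
records = [
    {"source": "db_a", "concentration": 2.5, "unit": "mM"},
    {"source": "db_b", "concentration": 2500.0, "unit": "uM"},
]

normalized = [to_molar(r["concentration"], r["unit"]) for r in records]

# Compare with a tolerance, since the conversion uses floating point.
same = math.isclose(normalized[0], normalized[1], rel_tol=1e-9)
print(same)
```

Raising on an unknown unit rather than guessing keeps a silent semantic error from turning into a wrong merged value.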