Informatics Case Study:
From Data Management and Integration to
Marker-Based Diagnostics Models
Bob Stanley
VP, Chief Technology Officer
IO Informatics, Inc.
Emeryville, CA
www.io-informatics.com
Beyond Genome, 2006
NIST / Advanced Technology Program
2002: Icoria awarded a 5-year, $11.7M grant for Target Assessment Technology
using Systems Biology
2005: Icoria and IO Informatics form a 2-year joint venture for data integration
using Intelligent Multidimensional Object (IMO) Technology
Major Milestones: Coherent Data
Data management - deployment, scale, testing
Sentient Data Management and Suite
Data-driven efficiency - workflow, process management
Sentient Process Manager
Associative networks – diagnostics modeling, screening
Sentient Knowledge Explorer
A Clinical Data, Inc. Company
Data / Information Management:
Integration and Scale Testing Phase
Requirement:
“My users must be able to [simply] drag into a folder or right-click
on any file or set of files to enter data into the [‘iPool’] database.”
Director, Icoria division of CLDA
Files - applications output
Instrument output - images, meta-data
LIMS - database records
Database query results
Web database query results
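A minimal sketch of the "drag into a folder" requirement, assuming a simple file catalog is enough to illustrate the idea. This is not the Sentient or iPool implementation; the `catalog` table, its columns, and the `ingest_folder` function are hypothetical names chosen for this example, and SQLite stands in for whatever database is actually used.

```python
import hashlib
import sqlite3
from pathlib import Path


def ingest_folder(folder: Path, conn: sqlite3.Connection) -> int:
    """Register every file in `folder` in a catalog table.

    Returns the number of newly registered files. Files whose content
    is already in the catalog (same SHA-256 hash) are skipped.
    """
    # Hypothetical schema: one row per distinct file content.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS catalog ("
        "sha256 TEXT PRIMARY KEY, name TEXT, suffix TEXT, size_bytes INTEGER)"
    )
    before = conn.execute("SELECT COUNT(*) FROM catalog").fetchone()[0]

    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        # INSERT OR IGNORE leaves existing rows untouched on a hash collision.
        conn.execute(
            "INSERT OR IGNORE INTO catalog VALUES (?, ?, ?, ?)",
            (digest, path.name, path.suffix.lower(), len(data)),
        )
    conn.commit()

    after = conn.execute("SELECT COUNT(*) FROM catalog").fetchone()[0]
    return after - before
```

Keying the catalog on a content hash means the same file can be dropped in twice without creating a duplicate record, which is one simple way to support the integrity checking mentioned later in the deck.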
“Getting started” Architecture
Our Goals:
Avoid heavy time investment for initial roll-out phase
Short runway to roll-out with tangible value
Initial roll-out requirements:
Lightweight scale-up installation, existing apps integrated, scale tested
Refine use cases / demonstrate value for user roles prior to rollout
Result:
Different Data, One View
User experience:
Easy import, unified curation, unified views, annotation, queries, reports,
auditing, integrity checking
Scale-up Efficiency
Our Goals:
Scale-up based on approved requirements, use cases and reviewed prototypes
Maximize use of existing IT and applications
Process Management
Requirement:
“We need a system for process management that can take data and workflows
from each of the many groups associated with Icoria into account within a larger
project framework.”
Chief Science Officer, Icoria
Knowledge Explorer:
Prototyping and Testing
Requirements gathering / prototyping phase
Interview Icoria users, customers, science advisers, peers
Iterate and refine existing Sentient products and roadmap via use cases
User Requirements:
Knowledge Explorer
Example Requirement:
“We’d like to create associative networks to model and screen for diagnostic markers. This
should represent genes, metabolites, tissue data, compounds and clinical endpoints associated
by - for example - common identifiers, foldchange, strength of correlation and reference to
external knowledge-bases”
Director, Systems Biology, Icoria
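The requirement above can be sketched as a small typed graph, assuming each association carries a kind (for example correlation or fold change) and a numeric weight. The class names, node labels, and values below are invented for illustration; they are not Icoria data and not the Knowledge Explorer data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    source: str
    target: str
    kind: str      # e.g. "correlation", "fold_change", "shared_identifier"
    weight: float  # correlation coefficient, fold change, etc.


class AssociativeNetwork:
    """Undirected network of entities linked by typed, weighted associations."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source, target, kind, weight):
        edge = Association(source, target, kind, weight)
        # Index the edge under both endpoints so lookups work in either direction.
        self._edges[source].append(edge)
        self._edges[target].append(edge)

    def neighbors(self, node, kind=None, min_weight=0.0):
        """Return (neighbor, kind, weight) tuples, strongest association first."""
        hits = []
        for edge in self._edges.get(node, []):
            if kind is not None and edge.kind != kind:
                continue
            if abs(edge.weight) < min_weight:
                continue
            other = edge.target if edge.source == node else edge.source
            hits.append((other, edge.kind, edge.weight))
        return sorted(hits, key=lambda h: -abs(h[2]))


# Hypothetical entities and values, for illustration only.
net = AssociativeNetwork()
net.add("gene:G1", "metabolite:M1", "correlation", 0.92)
net.add("metabolite:M1", "endpoint:E1", "correlation", -0.81)
net.add("metabolite:M1", "compound:C1", "fold_change", 2.4)
net.add("gene:G2", "metabolite:M1", "correlation", 0.35)

strong = net.neighbors("metabolite:M1", kind="correlation", min_weight=0.8)
```

Screening for candidate markers then reduces to filtering neighbors by association kind and strength; here `strong` keeps the two correlations with absolute value of at least 0.8 and drops the weak one.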
Prototype - networks of related knowledge created to
validate liver toxicity and cancer metabolic markers
Application - Use Case:
Knowledge Explorer
Applied to:
Liver toxicity – NIEHS / Icoria Compendium study - metabolic marker and study
data in Oracle, supporting gene expression and tissue data accessed via Sentient
Cancer markers – “Cancer Study” - BCP markers with supporting data
Immediate value to Icoria researchers and clients and ultimately to the point of
care
Contextualized visualization and comparison of markers delivers
understanding, validation, sharing, screening
Applications include result reporting; case, study and disease stratification;
adverse event reporting
Resulting Prototype:
Knowledge Explorer
Refinement:
Icoria, peers, Scientific Advisors and Partners review
Implementation - IO and semantic methods (IMO, RDF, OWL)
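To illustrate what the semantic methods named above involve, here is a minimal sketch of marker associations expressed as RDF-style subject-predicate-object triples and serialized as N-Triples. It uses plain Python rather than an RDF library. The `http://example.org/marker-study/` namespace and the terms `Metabolite`, `correlatesWith`, and `foldChange` are hypothetical, and this is not IO Informatics' IMO implementation.

```python
EX = "http://example.org/marker-study/"  # hypothetical namespace
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def uri(local_name):
    """Wrap a local name in the example namespace as an N-Triples IRI."""
    return f"<{EX}{local_name}>"


def literal(value, datatype=None):
    """Format a value as an N-Triples literal, optionally typed."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{text}"^^<{datatype}>'
    return f'"{text}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical statements about an illustrative metabolite M1.
triples = [
    (uri("M1"), RDF_TYPE, uri("Metabolite")),
    (uri("M1"), uri("correlatesWith"), uri("E1")),
    (uri("M1"), uri("foldChange"), literal(2.4, XSD_DOUBLE)),
]

document = to_ntriples(triples)
about_m1 = match(triples, s=uri("M1"), p=uri("correlatesWith"))
```

Because every statement has the same three-part shape, data from different sources can be merged by concatenating triple lists and queried with one pattern-matching function, which is the property that makes this style attractive for integrating heterogeneous data.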
Summary
Major “Coherent Data” Milestones:
Current use – integrated data management and query functions
Users are now able to run their own queries on formerly inaccessible
data
Rolling out – unified workflow, process management functions
Data-driven alerts for project actions and deadlines become
possible
Prototyping and refining - knowledge explorer / modeling functions
Unified diagnostics modeling, viewing and screening,
using data from diverse sources, becomes possible
Keys to Success? We’ve focused on:
Low-impact, targeted communication and implementation
Immediate, signed-off, practical benefits – and growing from
there!
Acknowledgements
IO Informatics would like to thank the following:
Icoria Division of Clinical Data, Inc. (CLDA)
Cogenics Division of CLDA
Icoria
National Institute of Standards and
Technology (NIST ATP)
Contributors include:
Imran Shah, Principal Investigator, Icoria, CLDA
Kevin Lutz, Grant Administrator, Cogenics, CLDA
Paul Dibello, Associate Director, Software Engineering, Cogenics, CLDA
Tim Hall, Project Manager, NIST
Erik Puskar, Business Manager, NIST
Erich Gombocz, Chief Science Officer, IO Informatics
Chuck Rockey, Software Project Manager, IO Informatics
Mike Travers, Principal Engineer, Knowledge Systems, IO Informatics
Tom Colatsky, Maureen McBride, Omid Omidvar, Alan Higgins, Pat Hurban,
Max Fedor, Hongkang Mei, Judong Shen and others who have contributed.