LRI-AMBIT: a predictive tool dressed in IUCLID, quality and power
LRI-AMBIT: a predictive tool dressed in IUCLID, quality and power built-in
FACTS ABOUT QSAR MODELS AND READ ACROSS
Mario Negri Institute, Milano, April 29, 2015
Dr Bruno Hubesch, Cefic LRI Programme Mgr
[email protected]

Talk outline
1. What is LRI? (in brief)
2. AMBIT-IUCLID:
   A. the tool
   B. the power
   C. policy impact

What is LRI? – A Global Programme
• Commitment demonstration to RC and GPS
• ICCA-LRI: A global effort under ICCA
• ACC
• Cefic
• JCIA

Executing the long-term refocus of the LRI
Key questions (< 10y)
• Omics: 21st Century Toxicology; link information at molecular level to health impact; interpretation of results until we know all; 3Rs for animal testing
• Predictive tools for health impact: pragmatic approaches to reduce complexity, but still robust in predicting health effects
• Combination effects: identify combination effects scenarios of concern
• Eco-systems approach: new concept for ecosystems approach with population relevance
• « Real life Exposure »: predictive, validated exposure scenarios for environmental stressors
• Comparative assessment: impact of health and environmental stressors
• Benefit – risks approaches: societal drivers for public acceptance for innovation

CEFIC LRI PROJECT EEM9.3-IC
A. THE TOOL: AMBIT-IUCLID
Linking LRI AMBIT Chemoinformatic System with the IUCLID Substance Database to Support Read across of Substance Endpoint data & Category Formation
Sep 2013 – Sep 2015
NINA JELIAZKOVA
IdeaConsult Ltd., Sofia, Bulgaria
www.ideaconsult.net
http://ambit.sourceforge.net

AMBIT
CEFIC LRI-EEM9.3 “Building blocks for a future (Q)SAR decision support system: databases, applicability domain, similarity assessment and structure conversions” (2004, 2008)
Part of the LRI Toolbox (http://www.cefic-lri.org/lritoolbox/ambit)
AMBIT REST web services
OpenTox Application Programming Interface (API) since 2009
Dataset web services (import / export)
Chemical search [exact, similar, substructure] and
Prediction tools e.g.
Cramer rules, Protein binding, pKa
Data analysis tools e.g. regression, classification, clustering, descriptor calculation
Web Applications using AMBIT REST web services
New 2014: Embeddable JS widgets
meta data, data pooling, structure QA, structure optimisation, tautomers generation etc.
Open source
Multiplatform (Windows, Unix, Mac)
AMBIT CEFIC LRI-EEM9.3-IC (2013)
2011, 3:18 doi:10.1186/1758-2946-3-18

MAJOR GOALS OF THE CEFIC LRI PROJECT EEM9.3-IC
- Enhancing the predictive power of in-silico tools like LRI AMBIT
  Use of substance data which are already available in a company IUCLID database. In addition, CEFIC LRI is checking how to receive the data from the ECHA Dissemination site for ca. 7000 substances. Other reliable (quality) data will be integrated as well.
- Support read across and category formation workflows
  Avoiding animal tests and enhancing the justification for read across and category formation as requested by authorities
- Minimize overall testing and resource costs
  Using available studies for other substances as well + WoE
IDEACONSULT LTD.

PROJECT OBJECTIVES
1. Implement web services allowing communication of IUCLID and AMBIT for all data elements important to support read across and category formation within AMBIT.
2. Establish filters to select the most relevant and most reliable data to be accessed by AMBIT, e.g. only data from key studies, only data with Klimisch code 1 or 2, and excluding data from UVCB substances that have no information on composition or structural identity, etc.
3. Enable AMBIT to handle substance IDs, including structures, composition of mono- and multi-constituent substances, and, in special cases of UVCBs, addressing constituents, impurities and additives in the way IUCLID is structured. This would help avoid the disadvantages of other chemoinformatic systems where in some cases multi-constituent substances are erroneously linked to a single structure.
4.
Add specific search functions to AMBIT
   a) to allow tracing of identified toxicants back to one or more substances in IUCLID
   b) to allow tracing of identified toxicants back to one or more company product(s) using a company product number integrated in AMBIT & IUCLID.
5. Improve methodologies and user interface in AMBIT to support the user in establishing a read across or category formation strategy using the (filtered) data from IUCLID and to help assess the results.
6. Improve the linking of other (high) quality databases & other chemoinformatic systems with AMBIT, e.g. DEREK, METEOR, etc.
7. Analyse how the data in the ECHA REACH Dissemination web site can be accessed from AMBIT to check whether data could be made available for read across and category formation.
8. Ensure that AMBIT is compatible with the OpenTox Framework to allow flexible development using publicly available methodology instead of following proprietary approaches (upgrades)

PROJECT OVERVIEW
Data transfer: Company IUCLID DB / ECHA IUCLID DB as major data source (search for data possible, but not for structures)
CEFIC LRI AMBIT Chemoinformatics System, supporting read across & category formation

AMBIT FUNCTIONS
Assigning structures
- to constituents, impurities …
Search structures & data
- exact, similar, substructure
- combined with data search
Read across / category formation
- workflows supporting the user
Prediction tools
- Cramer rules, protein binding etc.
Data analysis tools
- regressions, clustering etc.
Data management
- flexible import/export of data
Data exchange
- manual or automated via web services

SUBSTANCE IDENTITY
STRUCTURES OF CONSTITUENTS, IMPURITIES, ADDITIVES
“Only a limited number of tools are capable to provide easily accessible data on substance identity, composition together with chemical structures and high quality and detailed endpoint data”
All examples are for illustration only. The data is fictitious.
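The filter described in objective 2 and the composition handling in objective 3 can be pictured with a small Python sketch. This is not AMBIT or IUCLID code: the class names, field names and the `is_reliable` helper are hypothetical, the data is fictitious, and the UVCB rule is read here as "drop UVCB records where no component carries a structure", which is one possible interpretation of the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One entry of a substance composition (hypothetical structure)."""
    name: str
    role: str                      # "constituent", "impurity" or "additive"
    smiles: Optional[str] = None   # None when no structure is known


@dataclass
class Substance:
    name: str
    substance_type: str            # "mono-constituent", "multi-constituent" or "UVCB"
    composition: List[Component] = field(default_factory=list)


@dataclass
class StudyRecord:
    substance: Substance
    endpoint: str
    klimisch_code: int             # 1 and 2 are the reliable codes named on the slide
    key_study: bool


def is_reliable(record: StudyRecord, max_klimisch: int = 2) -> bool:
    """Keep key studies with Klimisch code 1..max_klimisch.

    Assumption: a UVCB is excluded when none of its components has a structure.
    """
    substance = record.substance
    has_structure = any(c.smiles for c in substance.composition)
    if substance.substance_type == "UVCB" and not has_structure:
        return False
    return record.key_study and 1 <= record.klimisch_code <= max_klimisch


def select_reliable(records: List[StudyRecord]) -> List[StudyRecord]:
    return [r for r in records if is_reliable(r)]


# Fictitious data, for illustration only.
example_a = Substance("Example A", "mono-constituent",
                      [Component("ethanol", "constituent", "CCO")])
example_b = Substance("Example B", "UVCB")  # no composition information
records = [
    StudyRecord(example_a, "Biodegradation", 1, True),
    StudyRecord(example_a, "Acute toxicity", 3, True),     # Klimisch code too high
    StudyRecord(example_a, "Water solubility", 2, False),  # not a key study
    StudyRecord(example_b, "Biodegradation", 1, True),     # UVCB without structures
]
print([r.endpoint for r in select_reliable(records)])
```

Keeping the composition as a list of components, each with its own role and optional structure, is what avoids tying a multi-constituent substance to a single structure.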

ENDPOINTS DATA IN IUCLID 5
Data retrieval: “Manually”
1. Query substance ID
2. Select substance
3. Select endpoint
4. Select EP record
5. Scroll to data point of interest
Using the IUCLID Query Tool
Example: searching a substance and a specific endpoint, e.g. biodegradation
Filling of query blocks required

SUBSTANCE ENDPOINT RECORDS
IUCLID 5.5 XML schema
- large and complex data model: thousands of data fields, different documents linked by UUIDs, different data types and deeply nested data structures;
- includes 126 OECD Harmonized Templates (OHT) for endpoint study records (physicochemical, environmental, health hazard and toxicity endpoints) and 79 templates defining endpoint summaries;
- the specific templates and fields imported are based on recommendations by the Clariant CompTox team

AMBIT: 43 SUPPORTED IUCLID5 ENDPOINTS

AMBIT: ENDPOINT STUDY RECORDS - PHYSICOCHEMICAL PROPERTIES
The endpoint study records are displayed in tabs, grouping physicochemical properties, environmental fate, ecotoxicity and toxicity endpoints

AMBIT: ENDPOINT STUDY RECORDS - ENVIRONMENTAL FATE

AMBIT: ENDPOINT STUDY RECORDS - ECOTOXICITY

AMBIT: ENDPOINT STUDY RECORDS - TOXICITY
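To make the "documents linked by UUIDs" and "displayed in tabs" points concrete, here is a minimal Python sketch. It does not use the real IUCLID 5.5 XML schema: the dictionaries, the `owner` and `category` keys and the identifiers are all hypothetical stand-ins, and the data is fictitious.

```python
from collections import defaultdict

# Hypothetical, heavily simplified documents; real IUCLID documents are
# deeply nested XML with thousands of possible fields.
documents = [
    {"uuid": "sub-001", "type": "Substance", "name": "Example substance A"},
    {"uuid": "esr-101", "type": "EndpointStudyRecord", "owner": "sub-001",
     "category": "environmental fate", "endpoint": "Biodegradation"},
    {"uuid": "esr-102", "type": "EndpointStudyRecord", "owner": "sub-001",
     "category": "physicochemical properties", "endpoint": "Water solubility"},
    {"uuid": "esr-103", "type": "EndpointStudyRecord", "owner": "sub-001",
     "category": "environmental fate", "endpoint": "Hydrolysis"},
    {"uuid": "esr-201", "type": "EndpointStudyRecord", "owner": "sub-002",
     "category": "toxicity", "endpoint": "Acute toxicity"},
]


def index_by_uuid(docs):
    """Build a lookup table so links between documents can be resolved."""
    return {d["uuid"]: d for d in docs}


def study_records_by_tab(docs, substance_uuid):
    """Group the endpoint study records of one substance by display tab."""
    tabs = defaultdict(list)
    for d in docs:
        if d["type"] == "EndpointStudyRecord" and d.get("owner") == substance_uuid:
            tabs[d["category"]].append(d["endpoint"])
    return dict(tabs)


lookup = index_by_uuid(documents)
print(lookup["sub-001"]["name"])
print(study_records_by_tab(documents, "sub-001"))
```

The manual route on the slide walks this same chain by hand (substance, endpoint, record, data point); an importer resolves the links once and presents the grouped result.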

TRANSFERRING DATA FROM IUCLID5 TO AMBIT
- IUCLID has a so-called Bulk Export Assistant (manual interaction required)
- IUCLID can exchange data via web service ((semi-)automatically), with automatic update of data as well

AMBIT COMMUNICATION
Other Databases & Tools

AMBIT Workflows on Read across & Category formation
Proposed by Clariant CompTox team; may be an iterative process
- Creating assessment and defining initial target
- Searching for other targets and related structures
- Selecting relevant structures
- Selecting data sources
- Selecting endpoints
- Data gathering
- Gap filling
- Evaluation
- Doc link
- Report generation using predefined templates
- Assessment storage
- Finalizing assessment dataset

READ ACROSS AND CATEGORY WORKFLOW
5 MAIN STEPS: Assessment identifier, Collect structures, Endpoint data used, Assessment details, Report
7 SUB STEPS, including: Search substance(s), List of collected structures, Selection of endpoints, Initial matrix, Working matrix, Final matrix
The assessment workflow is organized in five main tabs. Some of the main tabs contain sub-tabs. The workflow goes from left to right in the main tabs as well as the sub-tabs. In principle, no step should be skipped.

READ ACROSS / CATEGORY FORMATION WORKFLOW
Currently under testing by Clariant CompTox team
Next steps:
- Report format to be defined
- Automated data filling from estimation tools or other databases
- Selection of reliable external databases and loading into AMBIT
- Adding new prediction tools
- Others?
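The left-to-right rule for the five main tabs can be sketched as a tiny state holder in Python. This is an illustration only, not how AMBIT implements it: the `AssessmentWorkflow` class is hypothetical, and since the slides say the process may be iterative, the real tool may well allow going back to earlier steps, which this sketch does not model.

```python
MAIN_STEPS = (
    "Assessment identifier",
    "Collect structures",
    "Endpoint data used",
    "Assessment details",
    "Report",
)


class AssessmentWorkflow:
    """Tracks progress through the main steps and refuses to skip any."""

    def __init__(self):
        self._completed = 0  # number of main steps finished so far

    @property
    def current_step(self):
        """Name of the next step to complete, or None when all are done."""
        if self._completed < len(MAIN_STEPS):
            return MAIN_STEPS[self._completed]
        return None

    @property
    def finished(self):
        return self._completed == len(MAIN_STEPS)

    def complete(self, step):
        """Mark `step` as done; only the current step is accepted."""
        if step != self.current_step:
            raise ValueError(
                f"cannot complete {step!r}; current step is {self.current_step!r}"
            )
        self._completed += 1


workflow = AssessmentWorkflow()
workflow.complete("Assessment identifier")
try:
    workflow.complete("Report")  # skipping three steps is rejected
except ValueError as error:
    print(error)
print(workflow.current_step)
```

Sub-tabs could be handled the same way, with one such sequence nested inside each main step.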

AMBIT SUMMARY
REQUIREMENTS:
Server:
• Java 1.6/1.7
• Tomcat 5.5, 6.x, 7.x
• MySQL 5.5/5.6
Clients:
• Modern JavaScript enabled browser
• Desktop applications
• Workflow engines
The clients communicate via REST web service API (Application Programming Interface)
UPDATES:
• Substances and compositions
• Endpoint records
• Read across / category formation workflow
• User management system to grant access rights via roles
CLARIANT HQ, DE

C. LRI tools impact?
Science for policy & innovation / R&D-acad. + regulatory

Hands-on session Sept. 23 at Cefic (Brussels)

ACKNOWLEDGEMENTS CEFIC LRI EEM9.3-IC
Project idea for LRI EEM9.3-IC: Volker Koch, Clariant
Project input: Clariant CompTox Team
- Udo Jensch (Toxicologist)
- Volker Koch (Ecotoxicologist)
- Qiang Li (Toxicologist)
- Joachim Schneider-Reigl (Ecotoxicologist)
Project implementation: IdeaConsult Ltd.

Bruno Hubesch
[email protected]
+32 2 676 74 92