COPO - TGAC Documentation

COPO: Collaborative Open Plant Omics
Rob Davey
Data Infrastructure and Algorithms Group Leader
[email protected]
@froggleston
Toni Etuk
Acknowledgements
Oxford eResearch Centre
Susanna Sansone
Alejandra Gonzalez-Beltran
Philippe Rocca-Serra
Alfie Abdul-Rahman
Felix Shaw
Warwick
Jim Beynon
Katherine Denby
Ruth Bastow
EMBL-EBI
Paul Kersey
TGAC
Vicky Schneider
Tanya Dickie
Emily Angiolini
Matt Drew
COPO
What prevents plant scientists from openly depositing their data and metadata?
● Lack of interoperability between:
  ● metadata annotation services
  ● data repository services
  ● data analysis services
  ● data publishing services
● Researchers might not:
  ● be aware that the services exist
  ● have the expertise to use them
  ● see the value in properly describing their data
COPO
These barriers are 5% technical and 95% cultural.
COPO
● It's not because these services don't exist!
● Clearly, barriers exist between the scientist and the service
● Infrastructure can help by:
  ● wiring existing services together
  ● improving access to services
  ● facilitating collaboration
  ● raising the profile of the benefits of open science
COPO
● Recently awarded 3-year £1.7m BBSRC BBR grant
  ● TGAC, Univ. Oxford, Univ. Warwick, EMBL-EBI
  ● Supported by GARNet, iPlant, Eagle Genomics
● Empower bioscience plant researchers to:
  1. Enable standards-compliant data collection, curation and integration
  2. Enhance access to data analysis and visualisation pipelines
  3. Facilitate data sharing and publication to promote reuse
● Train plant researchers in best practice for data sharing and producing Research Objects
COPO
● Searching COPO for any supported accessions or DOIs will produce the full history of what was carried out
● Linked to other research outputs that used the same data, analysis or metadata
[Diagram: Study A (Researcher B, Institute C) with its metadata fragments, raw data, analysis software and outputs. Caption: TRANSPARENCY]
COPO
[Diagram: Study X (Researcher Y, Institute C) is shown alongside Study A (Researcher B, Institute C); each has its own raw data, analysis software and outputs. Caption: TRANSPARENCY]
COPO
● Build graphs of interconnected data, analyses and outputs
● Searches hitting any part of the graph will allow retrieval of the rest
  ● Including any citations, data or text
[Diagram: a graph of many studies (Study A, Researcher B, Institute C; Study X, Researcher Y, Institute Z; Researchers L, M, N) connected through raw data, analysis software and outputs. Caption: TRANSPARENCY]
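The idea that a search hitting any part of the graph retrieves the rest can be illustrated with a breadth-first traversal. This is only a minimal sketch, not COPO's implementation: the node names, the `LINKS` list and the functions `build_graph` and `retrieve_connected` are hypothetical, and the graph is assumed to be undirected.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def retrieve_connected(graph, hit):
    """Return every node reachable from the search hit (breadth-first)."""
    if hit not in graph:
        return set()
    seen = {hit}
    queue = deque([hit])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical records, loosely named after the labels on the slide.
LINKS = [
    ("Study A", "Raw data 1"),
    ("Raw data 1", "Analysis 1"),
    ("Analysis 1", "Outputs 1"),
    ("Study X", "Raw data 1"),  # Study X reuses Study A's raw data
    ("Study X", "Analysis 2"),
    ("Analysis 2", "Outputs 2"),
    ("Study Q", "Raw data 9"),  # an unrelated study
]

graph = build_graph(LINKS)
related = retrieve_connected(graph, "Outputs 2")
print(sorted(related))
```

In this sketch a search that hits "Outputs 2" also returns Study A, because the two studies share a raw dataset, while the unrelated Study Q is left out.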
COPO
Addressing the technical issues
● Single point of entry to deposit datasets into suitable repositories
● Wizard systems to guide users through metadata annotation and submission pathways
● Reduce the friction of getting research into citable form
  ● Data citations should be first-class resources, like papers
  ● Make linked data available to platforms like F1000Research
● Greater access to a wider, deeper range of better understood research
COPO
Addressing the cultural issues
● Build interconnected graphs of research objects
  ● Use metadata to find related data, analyses and outputs
  ● “Give me other wheat datasets that are similar to mine”
  ● “How many studies have used my data in their analyses?”
  ● “How many people have run my analysis workflow?”
  ● “How many times has my data been cited?”
● Makes research data valuable
  ● Transparent
  ● Findable
  ● Reusable
  ● Citable
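Two of the questions above can be sketched as simple queries over metadata and usage links. This is an illustration under stated assumptions, not how COPO answers them: the dataset and study identifiers, the metadata terms, and the functions `jaccard`, `similar_datasets` and `studies_using` are all hypothetical, and Jaccard overlap of metadata terms stands in for whatever similarity measure a real system would use.

```python
def jaccard(a, b):
    """Overlap between two sets of metadata terms, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical dataset identifiers and metadata terms.
DATASETS = {
    "ds-mine": {"wheat", "rna-seq", "leaf", "drought"},
    "ds-2": {"wheat", "rna-seq", "root", "drought"},
    "ds-3": {"wheat", "genome-assembly"},
    "ds-4": {"arabidopsis", "rna-seq", "leaf"},
}

# Hypothetical (study, dataset) usage links.
USES = [
    ("study-1", "ds-mine"),
    ("study-2", "ds-mine"),
    ("study-2", "ds-2"),
    ("study-3", "ds-4"),
]


def similar_datasets(datasets, target, required_term):
    """Rank other datasets carrying required_term by similarity to target."""
    mine = datasets[target]
    scored = [
        (jaccard(mine, terms), name)
        for name, terms in datasets.items()
        if name != target and required_term in terms
    ]
    return [name for score, name in sorted(scored, reverse=True)]


def studies_using(uses, dataset):
    """Return the distinct studies that used a dataset."""
    return {study for study, used in uses if used == dataset}


# "Give me other wheat datasets that are similar to mine"
ranked = similar_datasets(DATASETS, "ds-mine", "wheat")
# "How many studies have used my data in their analyses?"
used_by = studies_using(USES, "ds-mine")
print(ranked)
print(len(used_by))
```

The same pattern would extend to the workflow and citation questions by counting links of a different type.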
COPO
Thank you
We have a poster!
Please come and talk to us!
Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Felix Shaw, Toni Etuk, myself
COPO
@copo_project