In this post, Dr David Johnson gives his reflection on an ISA specification hackathon held in July 2015, in advance of joining the ISA team at Oxford as a research software engineer.
Last week I joined my prospective colleagues at the Oxford University e-Research Centre (OeRC), along with some of their collaborators, to thrash out an evolution of the ISA (Investigation/Study/Assay) metadata tracking framework. I will be joining the ISA development team at Oxford from September this year, a new phase in my career that I am very much looking forward to.
ISA consists of a model specification that describes its key concepts and structure, together with implementations of that specification developed by the ISA team. The framework aims to facilitate standards-compliant collection, curation, management and reuse of datasets in the life sciences. The first version of the specification, a Release Candidate from 2008, is implemented as the ISA-Tab (tabular) format: a table-based format familiar to many in the life sciences, where data is routinely stored and manipulated in spreadsheets. More recently, it has also become possible to convert ISA to RDF via linkedISA.
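To make the Investigation/Study/Assay nesting a little more concrete, here is a minimal sketch in Python of how such a hierarchy might be modelled and flattened into a tab-delimited table. The class names, field names and column headers are my own illustrative assumptions; they are not the actual ISA-Tab headers or file layout, and this is not code from the ISA tools.

```python
import csv
import io
from dataclasses import dataclass, field


# NOTE: all names below are hypothetical, chosen only to illustrate the nesting.
@dataclass
class Assay:
    identifier: str
    measurement: str  # what the assay measures


@dataclass
class Study:
    identifier: str
    title: str
    assays: list[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    title: str
    studies: list[Study] = field(default_factory=list)


def to_table(investigation: Investigation) -> str:
    """Flatten the nested hierarchy into one tab-delimited row per assay."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Illustrative column names, not the real ISA-Tab headers.
    writer.writerow(["investigation", "study", "assay", "measurement"])
    for study in investigation.studies:
        for assay in study.assays:
            writer.writerow(
                [investigation.identifier, study.identifier,
                 assay.identifier, assay.measurement]
            )
    return buffer.getvalue()


example = Investigation(
    identifier="INV1",
    title="Example investigation",
    studies=[
        Study(
            identifier="STU1",
            title="Example study",
            assays=[
                Assay("ASY1", "transcription profiling"),
                Assay("ASY2", "metabolite profiling"),
            ],
        )
    ],
)

print(to_table(example))
```

The point of the sketch is only that a nested model can be projected onto rows and columns, which is what makes a spreadsheet-friendly format workable for people who already live in tables.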
While I have yet to officially join the ISA team (I am currently on a short sabbatical after leaving a research post at Imperial College London), I was invited to attend a 3-day workshop in Oxford to review and amend the ISA specification towards a version 2.0 release. The workshop, an ELIXIR UK event, was billed as the “ISA as a FAIR research object” Hack-the-Spec event. We were joined by representatives from The Genome Analysis Centre, the European Bioinformatics Institute, Leiden, Manchester and Birmingham universities, and even a group visiting from my hometown of Hong Kong, from the GigaScience journal that was launched by Beijing Genomics Institute in 2012. We were also joined online by a number of researchers dialling in via Google Hangouts from various sites in Europe.
As a workshop report will come out in due course, I won’t get into the detail of the outcomes, but broadly the discussions focused on:
- Evolving ISA to enable FAIR (Findable, Accessible, Interoperable, Reusable) research objects
- Fixing ambiguities, missing structures and elements in ISA 1.0
- Enabling integration of standard identification schemes such as ORCID
- Redefining the spec to define the ‘core’ ISA elements and separating out domain specific ‘extensions’
- Specifying conventions, mechanisms, and best practices for developing extensions to this new ‘ISA core’.
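As a rough illustration of two of these points, the sketch below keeps a small set of ‘core’ fields at the top level of a record, places domain-specific fields under named extensions, and checks an ORCID iD's check digit before accepting it. The field names, the `extensions` key and the `build_record` helper are hypothetical and do not reflect anything agreed at the workshop; the check-digit routine follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import json

# Hypothetical set of core field names, for illustration only.
CORE_FIELDS = {"identifier", "title", "contact_orcid"}


def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    body = chars[:15]
    if len(chars) != 16 or not (body.isascii() and body.isdigit()):
        return False
    total = 0
    for ch in body:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


def build_record(core: dict, extensions: dict) -> dict:
    """Keep core fields at the top level; nest each extension under its own name."""
    unknown = set(core) - CORE_FIELDS
    if unknown:
        raise ValueError(f"not core fields: {sorted(unknown)}")
    orcid = core.get("contact_orcid")
    if orcid is not None and not orcid_checksum_ok(orcid):
        raise ValueError(f"invalid ORCID iD: {orcid}")
    return {**core, "extensions": extensions}


record = build_record(
    core={
        "identifier": "STU1",
        "title": "Example study",
        # ORCID's well-known sample iD, used here purely as test data.
        "contact_orcid": "0000-0002-1825-0097",
    },
    # A made-up domain extension with made-up fields.
    extensions={"example_domain": {"instrument": "hypothetical sequencer"}},
)

print(json.dumps(record, indent=2))
```

Keeping extensions in their own namespace means a tool that only understands the core can still read the record and ignore the rest, which is one plausible reading of why separating core from extensions helps interoperability.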
What was clear was that various parts of the user community see plenty of scope for evolving ISA. Now that the core ISA specification is being abstracted out, we need contributions from a diverse range of exemplar projects to ensure that the core is truly interoperable. To this end, we are encouraging communities to share their ISA templates along with exemplar experiments, and to start building a repository of extensions on the ISA commons website. In the meantime, the ISA team will be formalising the ISA core and developing new reference implementations in tabular and JSON formats, along with supporting tools. We hope to present a draft specification to the community in the fall of 2015.
Apart from the 3 days of discussions fuelled by much coffee and cake, we also found some time in the evenings to get out, enjoy the sunshine and visit a couple of Oxford’s wonderful restaurants…
One of my key takeaways from the workshop, apart from getting a crash course in the ISA spec that I will be working on in the coming months, is the importance of going through the community engagement process when developing a data specification. As with engineering software, we need to make sure we are building the right thing. Soliciting feedback is not a vanity exercise or even a political exercise, but an essential part of a carefully managed process to ensure we evolve the specification to fulfil the changing needs of the people who matter: the user community.