4th International SWAT4LS Workshop - Semantic web applications and tools for life sciences

Scientific Programme

Overview

Tuesday 6th December 2011 (Hackathon evening)

Location: To be defined (close to other venues)

18.00- Hackathon evening. The hackathon will start with an informal evening to discuss and organize the groups and activities for the next day's hackathon.

Wednesday 7th December 2011 (Hackathon day)

Location: University of London Union (to be confirmed)

9.00-18.30 Hackathon. Co-organized with the Open Knowledge Foundation Open Science Working Group and JISC. The hackathon, co-organized by JISC, SWAT4LS and OKFN, will focus on the integration and elaboration of disease-specific information. It will focus on a small number of cases and will include gathering information from several sources (including open literature), annotating it, and producing mashups and visualizations. Participants in the hackathon should bring their own laptop computer. More details on the hackathon activities are available at the DevCSI website.

Thursday 8th December 2011 (Tutorial day)

Location: University of London Union (to be confirmed)

9.00-18.30 Tutorials
Tutorials are organized in two parallel tracks (A, B); there are no restrictions on which tutorials to attend.

  • 09:00-13:00 – track A
    The Semantic Web: A Deep Dive Tutorial, J Phil Brooks, Lilly Research Laboratories, Lilly Systems Biology. This tutorial will combine instructor-led training with a hands-on lab to provide a deep dive into many of the key fundamentals of the Semantic Web. Concepts covered in the session will include RDF, OWL, inferencing, the use of ontologies to synchronize vocabularies, relational to semantic mapping, semantic integration, and others. Attendees will perform hands-on lab exercises using the TopBraid Composer tool to reinforce their learning.
  • 09:00-13:00 – track B
    Building Semantic Web Services with SADI, Luke McCarthy, Providence Heart + Lung Institute at St. Paul’s Hospital, University of British Columbia. SADI (Semantic Automated Discovery and Integration) is a set of standards-compliant best practices that simplify the discovery and interoperability of Semantic Web Services. This tutorial introduces SADI and what is required of a SADI web service before walking through the creation and deployment of a simple service in Java and/or Perl using the SADI Protégé plugin. A scheme for modelling SAWSDL services in a SADI-compliant way and an overview of compatible client software will also be presented.
    For more information, visit http://sadiframework.org or the Google Code site (http://sadi.googlecode.com).
  • 14:00-15:00 – track A
    NCBO Web Services and Development of Semantic Applications, Trish Whetzel, NCBO. Ontologies provide domain knowledge to drive data integration and information retrieval. The National Center for Biomedical Ontology provides a suite of ontology-based Web services for the development of semantically aware software applications. This presentation will provide an overview of these Web services and their use within software applications.
  • 15:00-15:30 – track A
    Using SPARQL with UniProt RDF, Jerven Bolleman, Swiss Institute of Bioinformatics. Basic introduction to SPARQL 1.1 using UniProt as a data source.
    Includes examples of how SPARQL can be used to check the validity of your data, as well as a simple example of how one can combine querying data with a BLAST result.
  • 16:00-16:30 – track A
    Perennial identification, Nick Juty, EBI. We will describe Identifiers.org: an annotation and cross-referencing framework which provides perennial, unambiguous and resolvable identifiers. The system is built on the information stored in the MIRIAM Registry, which catalogues a community-developed shared list of dataset namespaces.
    More information: http://www.identifiers.org/, http://www.ebi.ac.uk/miriam/
  • 16:30-17:30 – track A
    Integration of the scientific literature into the Semantic Web: facts from biomedical data resources, Dietrich Rebholz-Schuhmann, EBI. The scientific literature is the primary resource for relevant and innovative information. The integration of the literature with the other data resources in the biomedical research community generates overhead that can be avoided through the use of Semantic Web technology generating openly accessible data. Projects such as SESL and CALBC have produced a significant amount of data that is ready for exploitation.
    The tutorial will teach different approaches to integrating the scientific literature with the content from biomedical databases and will discuss the inferences that can be achieved. Furthermore, the tutorial will point to the resources that are ready for use and enable integration of the literature at your discretion.
    A good understanding of Semantic Web technology, ontologies, OWL and the existing biomedical data resources is advantageous to easily follow the tutorial.
  • 14:00-15:30 – track B
    LS4LS – Linked Services for Life Sciences, Barry Norton, Ontotext, Sebastian Speiser, KIT (Karlsruhe Institute of Technology), Ruben Verborgh, Multimedia Lab – IBBT / ELIS, Ghent University.
    Linked Services is the generic term for the combination of Linked Data and service technology. A major interest here is in the coexistence and interaction of Linked Data principles and best practice with the principles of REST. This tutorial is a continuation of previous editions held at ISWC 2010, and at ESWC 2011. The LS4LS tutorial at the Workshop Semantic Web Applications and Tools for Life Sciences (SWAT4LS) will focus on using Linked Services in Life Science applications (More information is available at http://linkedservices.org/wiki/SWAT4LS_2011_Tutorial).
  • 16:00-17:30 – track B
    Generating ontology content with semantic spreadsheets and OPPL scripts, Simon Jupp, University of Manchester, and Mikel Egaña Aranguren, Technical University of Madrid. This tutorial will introduce the Ontology Pre-Processing Language (OPPL) and demonstrate the construction of a simple ontology using Populous, semantic spreadsheets and OPPL. Populous is a tool for building OWL ontologies from spreadsheet-based templates. Populous can be used to add complex patterns of axioms, in a consistent manner, to a growing ontology. The patterns are captured in a template that can be constrained to use terms from existing ontologies. These templates allow easy filling of the pattern, especially by domain experts who may be less familiar with OWL or authoring tools like Protégé. Populous supports the transformation of a populated template into OWL axioms using design patterns expressed in OPPL. Populous is available from http://www.populous.org.uk.
  • 18:00-19:30 – track B
    SWObjects, Helena Deus, DERI. Life sciences databases come in many shapes and sizes. A large number of them already exist in the machine-readable, interoperable format known as RDF (Resource Description Framework). Many of them, however, still exist in their original relational formats. As the popularity of linked life sciences increases, so does the need to support increasingly complex queries and to enable proper linking between datasets. SWObjects is an engine supporting SPARQL 1.1 that offers multiple SPARQL service calls in a single query, the use of maps to transform heterogeneous data into a single common representation, and serialization of SPARQL queries into SQL. In this tutorial, query federation and transformation will be used to query and integrate multiple life sciences and health care datasets.

The tutorial program is still provisional and subject to minor adjustments

Friday 9th December 2011 (Workshop day)

Location: Wellcome Trust Collection Conference Center

The program is provisional and subject to minor variations

08.30-08.50 Arrival and registration
08.50-09.05 Welcome and introduction
09.05-09.50 Keynote
Wendy Hall: Towards a Smarter Web
09.50-10.30 Talks

10.30-11.00 Coffee Break
Held in the poster and demo room
11.00-12.00 Talks

12.00-12.40 Industry session

12.40-13.40 Lunch
Held in the poster and demo room
Short visits to the Wellcome Collection (tentative)
13.40-14.00 Talk

14.00-14.45 Short talks

14.45-15.00 Highlight posters

15.00-15.30 Keynote
Douglas Kell: Semantic approaches in Biotechnology and Biological Sciences
15.30-16.30 Poster and demo session
Includes Coffee break
16.30-17.00 Short talks

17.00-17.45 Keynote
Tetsuro Toyoda: Future Biology is Semantic Information Science
17.45-18.15 Panel discussion
To be defined.
Chairing: Dietrich Rebholz-Schuhmann
18.15-18.35 Wrap-up and conclusions

Keynotes

NCBO Keynote: Tetsuro Toyoda: Future Biology is Semantic Information Science

Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 7.5 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists’ Networking System or “Virtual Laboratory Centre”, which provides numerous virtual labs for semantic-web-based data management and collaboration for various purposes. SciNetS.org is designed to discover novel relationships between data, because an automatic intelligent agent generates for each data item a summary displaying the linked information via the Semantic Web. We demonstrate that we successfully used the SciNetS interface across 26 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable content such as ontology and LOD tree viewers.

Tetsuro Toyoda was born in Tokyo, Japan, in 1968. He graduated from the Faculty of Pharmaceutical Sciences at the University of Tokyo in 1992, and obtained his PhD in 1997 from the same university. He started as a researcher at the Institute of Medical Molecular Design in 1997, and joined RIKEN as team leader in the Genomic Sciences Center in 2001. He became director of the RIKEN Bioinformatics and Systems Engineering Division when it was established in 2008. His expertise is in bioinformatics and computer-aided rational design of biomolecules, including rational database-supported drug design based on protein structural information and rational genome design in synthetic biology for biomass engineering. He promotes Japan’s database integration projects as a member of several national database committees.

Prof Douglas Kell: Semantic approaches in Biotechnology and Biological Sciences

The rise of “big data” and their attendant metadata in biology requires new methods for their optimal exploitation, for which semantic approaches offer both potential and real benefits. I illustrate this with a number of examples, including the principled description of biochemical networks [1] and a semantic approach to linking the biochemical literature and its underpinning data [2-4].

[1] Herrgård MJ, 31 others and Kell DB: A consensus yeast metabolic network obtained from a community approach to systems biology. Nature Biotechnol 2008; 26:1155-1160.
[2] Attwood TK, Kell DB, McDermott P, Marsh J, Pettifer S, Thorne D: Utopia Documents: linking scholarly literature with research data. Bioinformatics 2010; 26:i568-574.
[3] Attwood TK, Kell DB, McDermott P, Marsh J, Pettifer SR, Thorne D: Calling International Rescue: knowledge lost in literature and data landslide! Biochem J 2009; 424:317-333.
[4] Pettifer SR, Thorne D, McDermott P, Marsh J, Villéger A, Kell DB, Attwood TK: Visualising biological data: a semantic approach to tool and database integration. BMC Bioinformatics 2009; 10:S19.

Before taking up the post of BBSRC Chief Executive in October 2008, Douglas Kell was the Director of the Manchester Centre for Integrative Systems Biology based in the Manchester Interdisciplinary Biocentre (MIB). His research covers a broad range of topics from analytical chemistry to systems biology, usually coupled to biochemical and data modelling. He continues to work at the MIB one day a week. He joined UMIST in 2002, which merged with the Victoria University of Manchester to form The University of Manchester in 2004. He was, from 1997 until 2002, Director of Research of the Institute of Biological Sciences at the University of Aberystwyth. He held a personal Chair with the University of Aberystwyth for 10 years until 2002. He is a Director of Aber Instruments, which won the Queen’s Award for Export Achievement in 1998. Prof Kell was a member of BBSRC Council 2001-2006, member of Strategy Board 2005-2006, member of the Bioscience for Industry Strategy Panel 2007-present and chaired the review of BBSRC’s Bioenergy research in 2006. Prof Kell was born in London in 1953. He is married with three children.

Prof K Wendy Hall: Towards a Smarter Web

In this talk, we will reflect on the evolution of the Web. We will do this by analyzing the reasons why it became the first truly ubiquitous hypertext system against all competitors, and then by looking both at the way it has evolved from a network of linked documents to a system that facilitates social networking on a scale previously unimaginable, and at how it will evolve in the future as a network of linked data and beyond. The study of the Web—its evolution and its impact on society, on business, and on government—is referred to as Web science. We consider some of the major challenges of Web science and discuss possible Web worlds of the future.

Wendy Hall, DBE, FRS, FREng is Professor of Computer Science at the University of Southampton, UK, and Dean of the Faculty of Physical and Applied Sciences. She was Head of the School of Electronics and Computer Science (ECS) from 2002 to 2007.
One of the first computer scientists to undertake serious research in multimedia and hypermedia, she has been at its forefront ever since. The influence of her work has been significant in many areas including digital libraries, the development of the Semantic Web, and the emerging research discipline of Web Science. She has published over 400 papers and is frequently invited to speak at high-profile conferences and events.
With Tim Berners-Lee and Nigel Shadbolt she co-founded the Web Science Research Initiative in 2006 and she is currently a Director of the Web Science Trust which has a global mission to support the development of research, education and thought leadership in Web Science.
In addition to playing a prominent role in the development of her subject, she also helps shape science and engineering policy and education.
She became a Dame Commander of the Order of the British Empire in the 2009 UK New Year’s Honours list, and was elected a Fellow of the Royal Society in June 2009.
She was President of the Association for Computing Machinery (ACM) from 2008 to 2010, the first person from outside North America to hold this position. Other significant posts she has held include Senior Vice President of the Royal Academy of Engineering, member of the Prime Minister’s Council for Science and Technology, founding member of the European Research Council, member of the EPSRC Council, President of the British Computer Society and EPSRC Senior Research Fellow. She was appointed as Chair of the European Commission’s ISTAG in July 2010.


Accepted posters and demos

This is only a partial list and more posters will be confirmed at the workshop

Posters

In addition to the highlight posters

  • A Translational Model for Representing Research Articles
    Alexander Garcia and Leyla Jael García Castro
  • Systematic identification and correction of spelling errors in the Foundational Model of Anatomy
    Phil Gooch
  • Applying ontologies and exploring nanopublishing in a genome-wide association study database
    Tim Beck, Gudmundur Thorisson and Anthony Brookes
  • BioSPARQL: Ontology-based smart building of SPARQL queries for biological Linked Open Data
    Norio Kobayashi and Tetsuro Toyoda
  • BioLOD.org: Ontology-based integration of Biological Linked Open Data
    Koro Nishikata and Tetsuro Toyoda
  • LogMap 2.0: towards logic-based, scalable and interactive ontology matching
    Ernesto Jimenez-Ruiz, Bernardo Cuenca Grau and Yujiao Zhou
  • Planteome Annotation Wiki: A Semantic Application for the Community Curation of Plant Genotypes and Phenotypes
    Justin Preece, Justin Elser and Pankaj Jaiswal
  • Supporting Nanopublication Provenance: PMID2DOI Converter
    Christine Chichester, Kees Burger, Hailiang Mei and Barend Mons
  • Towards decentralized and cooperative repositories of distributed ontologies
    Gayo Diallo
  • Workflow forever: Semantic Web Semantic models and tools for preserving and digitally publishing computational experiments
    Kristina Hettne, Khalid Belhajjame, Marco Roos, Stian Soiland-Reyes, Matthew Gamble, Oscar Corcho, Graham Klyne and Sean Bechhofer
  • Genome Design with the Semantic Web
    Robert Sidney Cox III and Tetsuro Toyoda

Demos


In addition to the highlight “demos”

  • CLI-mate: An Interface Generator for Command Line
    Zuotian Tatum, Johan Den Dunnen and Jeroen F.J. Laros

Industry session

Selventa: Introduction to the Biological Expression Language (BEL) and the BEL Framework

The Biological Expression Language (BEL) and the BEL Framework together form an emerging open-platform technology specifically designed to overcome many of the challenges associated with capturing, integrating, and storing knowledge within an organization, and with sharing that knowledge across the organization and between business partners.
BEL is intended as a knowledge capture and interchange medium, supporting the operation of systems that integrate knowledge derived from independent efforts. The language is designed to be use-neutral, facilitating the storage and use of structured knowledge for inference by applications through a knowledge assembly process that can create computable biological networks.
The BEL Framework provides mechanisms for:

  • knowledge capture (BEL) and management
  • integration of knowledge from multiple, disparate knowledge streams
  • knowledge representation and standardization in an open, use-neutral format, using standard and non-standard vocabularies and ontologies
  • creating customizable, computable biological networks from captured knowledge
  • quickly enabling knowledge-aware applications using standardized application programming interfaces (APIs) across all major development platforms

Ontotext: Effectively managing large scale RDF warehouses

Ontotext is a developer of core semantic technology and text mining solutions. Our mission is to develop and evangelize open, skillfully engineered tools which considerably reduce the cost of implementing and using semantic technologies, compared with the existing expensive black boxes with unpredictable performance and academic prototypes that never reach industrial strength.

In this talk we will present the Linked Life Data (LLD) service. LLD is a very robust and highly efficient RDF warehouse solution, based on OWLIM. It offers free access to over 25 integrated data sources and handles more than 100K SPARQL query requests per month, resulting in GBs of network transfer. Although it is demanding to support such a service, an even bigger challenge is to implement and control the data transformation, loading, testing and deployment processes. We will share years of experience and demonstrate in-house tools for maintaining large-scale RDF warehouses.