Announcing WormBase ParaSite release 17

Have you been waiting for our next release? The wait is finally over! Despite being understaffed and underfunded, WormBase ParaSite launches its new release 17 with an exciting list of new/updated genomes and new features:

  • Integration of AlphaFold 3D protein structures for 8 species.
  • Addition of 11 new genome assemblies of which 6 are new species.
  • Annotation updates for 2 genomes.
  • Gene-phenotype associations are now available in our FTP directory.
  • Improvements in the way external gene synonyms are integrated and displayed.
  • Deployment of WebApollo instances for more species to further facilitate community curation.

New Species

Angiostrongylus vasorum – a clinically important parasitic nematode living in the arteries and heart of several canid species, including domestic dogs (from Tayrov et al., 2021).
Cercopithifilaria johnstoni – a filarial nematode transmitted by hard ticks that infects a broad range of native Australian murid and marsupial hosts (from McCann et al., 2021).
Fasciolopsis buski – a large fluke that infects the small intestine of humans and pigs in East/Southeast Asia (from Choi et al., 2020).
Gyrodactylus bullatarudis – a monogenean parasite of the guppy fish (from Konczal et al., 2020).
Halicephalobus spNKZ332 – a small, parthenogenic clade IV nematode isolated from termites in Japan (from Ragsdale et al., 2019).
Heterodera schachtii – also known as the beet cyst nematode, a plant-pathogenic parasite that can infect more than 200 plants, including the model plant Arabidopsis thaliana (from Siddique et al., 2021).


Assembly Updates

Schistosoma mansoni
We are proud to present the best ever S. mansoni assembly created until the next one! S. mansoni (blood fluke) is one of the three major infectious agents responsible for schistosomiasis, a chronic debilitating disease found throughout Africa and South America. Its previous assembly (v7) was substantially upgraded by incorporating Hi-C data and further PacBio analysis to resolve repeats, leading to the new, near-complete chromosomal assembly (v9) presented in this release! More information can be found in this pre-print by Buddenborg et al., 2021.

Fasciola hepatica
The sheep liver fluke, or common liver fluke, is a parasite that infects humans, cows and sheep, causing fascioliasis. The previous assembly and annotation, which were submitted in 2013, have been drastically improved in this release, as described by McNulty et al., 2017.

Heterodera glycines
The soybean cyst nematode (SCN), Heterodera glycines, is a plant-parasitic nematode that infects soybean roots. The previous assembly and annotation, which were submitted in 2013, have been drastically improved in this release, as described by Masonbrink et al., 2021.


Annotation updates


AlphaFold 3D protein structures are now browsable from WormBase ParaSite

For the first time in WormBase ParaSite, you can browse 3D protein structures visualised in a user-friendly and fun-to-explore viewer! Users can now explore the 3D protein models of their favourite genes in 8 WBPS species:

Species                      Predicted structures   Links
Brugia malayi                8,743                  WormBase ParaSite example
Caenorhabditis elegans       19,694                 WormBase ParaSite example
Dracunculus medinensis       10,834                 WormBase ParaSite example
Onchocerca volvulus          12,047                 WormBase ParaSite example
Schistosoma mansoni          13,865                 WormBase ParaSite example
Strongyloides stercoralis    12,613                 WormBase ParaSite example
Trichuris trichiura          9,564                  WormBase ParaSite example
Wuchereria bancrofti         12,721                 WormBase ParaSite example

Table 1. Number of structural predictions for complete proteomes of parasitic worms in AlphaFold DB v.2.1.2 and WormBase ParaSite 17.

Want to know more about how to visualise AlphaFold protein structures in WormBase ParaSite?
The 3D protein structure for a protein of interest is available from its transcript summary page: in the left-side “Transcript-based displays” menu, select “AlphaFold predicted model”. For a step-by-step tutorial click here. You can then view the shiny new interactive 3D AlphaFold structure for your protein of interest:

AlphaFold predicted model page for S. mansoni transcript Smp_170450.1 in WormBase ParaSite 17

“What functionalities does the 3D protein structure viewer offer?”

The central panel (viewer) colours the model from regions of high confidence (blue) to low confidence (orange), with its protein sequence displayed above. It’s very simple to use: just click and drag with your mouse pointer to rotate the structure, and scroll to zoom in and out! You can rapidly zoom in on a specific residue by clicking on it in the protein sequence above the model. The right-hand panel lets you highlight one or more exons and protein features (Gene3D, PROSITE, Pfam, etc.), which are toggled by clicking on the eye icon.
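
If you prefer numbers to colours: AlphaFold model files store the per-residue confidence score (pLDDT, from 0 to 100) in the B-factor column of the PDB file. The short Python sketch below is only an illustration and makes two assumptions: that you have downloaded a model in PDB format from the AlphaFold database, and that the viewer follows the AlphaFold DB colour bands (dark blue, light blue, yellow, orange). The function names are hypothetical and are not part of any WormBase ParaSite tool.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score to the confidence band used by AlphaFold DB."""
    if score > 90:
        return "very high"   # shown in dark blue
    if score > 70:
        return "confident"   # shown in light blue
    if score > 50:
        return "low"         # shown in yellow
    return "very low"        # shown in orange


def residue_plddt(pdb_text: str) -> dict:
    """Return {residue number: pLDDT} read from the C-alpha ATOM records.

    Uses the fixed-width PDB layout: atom name in columns 13-16,
    residue number in columns 23-26, B-factor in columns 61-66.
    Assumes a single-chain model, as in AlphaFold DB files.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores
```

For example, reading a downloaded model into a string and calling `residue_plddt` on it gives one score per residue, which `plddt_band` then turns into the same bands that the viewer colours.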

“More species please?”

As AlphaFold plans to expand its database in 2022 to cover the proteomes of more species, as well as a much larger proportion of all catalogued proteins, we anticipate that more parasitic worms will make it into the AlphaFold database. WormBase ParaSite will then be able to enable the “AlphaFold predicted model” viewer for these species. You can monitor the list of species present in the AlphaFold database here.


Phenotypes available on the FTP

“Is there any way to export gene-phenotype associations from WormBase ParaSite?”

In our previous release 16 we were happy to announce the import of over 350,000 C. elegans and S. mansoni gene-phenotype associations from our sister site, WormBase (C. elegans example). These associations were also propagated between orthologs to all our hosted species (H. polygyrus example). In this release we also made these gene-phenotype associations available in our FTP directory.

Gene-phenotype associations have been deposited as GAF version 2.1 files in our FTP directory. For each species you will find up to 2 different GAF files:

  1. <SPECIES>.<BIOPROJECT>.WBPS17.orthology-inferred_phenotypes.gaf.gz (Example): This file contains species-specific, orthology-inferred gene-phenotype associations from C. elegans or S. mansoni (you can find which one in the 8th column). A file like this is available for every species in WormBase ParaSite.
  2. <SPECIES>.<BIOPROJECT>.WBPS17.phenotypes.gaf.gz (Example): This file contains original gene-phenotype associations for the species of interest. For the moment we only host original gene-phenotype association data for C. elegans and S. mansoni, so this file is available only for these 2 species.

Sometimes GAF files are hard to interpret. For this reason, in the header of these files, we have included very useful column descriptions and general information. Enjoy!
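
As a starting point, here is a minimal Python sketch for reading one of these files after you have downloaded it from the FTP directory. It assumes only what the GAF 2.1 format defines: tab-separated columns and header lines that start with "!". The column labels are our own shorthand for the GAF 2.1 fields (column 5 is labelled `term_id` on the assumption that it holds the phenotype term), the function names are hypothetical, and the species and BioProject values in the comments are placeholders.

```python
import gzip
from collections import Counter

# Shorthand labels for the 17 GAF 2.1 columns, in order.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "term_id",
    "db_reference", "evidence_code", "with_or_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def gaf_filename(species, bioproject, inferred=True, release="WBPS17"):
    """Build a file name following the pattern described above."""
    suffix = "orthology-inferred_phenotypes" if inferred else "phenotypes"
    return f"{species}.{bioproject}.{release}.{suffix}.gaf.gz"


def read_gaf(path):
    """Yield one dict per association, skipping '!' header lines."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("!") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            yield dict(zip(GAF_COLUMNS, fields))


def count_sources(path):
    """Count associations per value of the 8th column (with_or_from)."""
    return Counter(row["with_or_from"] for row in read_gaf(path))


# Example (placeholder species and BioProject, file downloaded beforehand):
# path = gaf_filename("some_species", "PRJXX00000")
# print(count_sources(path).most_common(5))
```

Counting the values in the 8th column is a quick way to see which source the orthology-inferred associations in a file come from. The header of each file remains the authoritative description of the columns.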

Curated gene synonyms

“Gene synonyms are really useful, but is there any way to export them?”

In our release 16 we announced the import of literature-curated gene name synonyms for Strongyloides stercoralis. These synonyms are searchable (via the top-right search box) and appear in the new “Synonyms” line of the gene page. In this release we also made these synonyms exportable via WormBase ParaSite Biomart!

To export curated gene synonyms for S. stercoralis, first navigate to the WormBase ParaSite Biomart and set your Query Filters. Then enable “Curated Gene Synonym ID” under the “EXTERNAL DATABASE REFERENCES AND ID CONVERSION” section of the Output Attributes, as shown here:

Then click Results and you will get a list of your selected genes and their curated synonyms:

We need your help on this!
At the moment, we have only imported externally curated gene synonyms for S. stercoralis, but we would love to import synonyms for other species too. For this we need your help! If you have (or someone you know has) curated gene synonyms for any of the genomes we host in WormBase ParaSite, please contact us (parasite-help@wormbase.org) and we will be able to integrate them.

Community Annotation

“I would like to manually curate gene models on a genome I have previously submitted (or I am about to submit) to WormBase ParaSite.”

We supplement our in-house gene curation platform by hosting Web Apollo instances for an increasing number of genomes. Web Apollo is an instantaneous, collaborative genomic annotation editor available on the web.
Users can ask us to deploy a Web Apollo instance for their genome of interest. We would be happy to provide the relevant training! Please feel free to contact us (parasite-help@wormbase.org) to make such a request.
