Tutorial

Parsnp download instructions

To further demonstrate the functionality of Parsnp, we have prepared two small tutorial datasets. The first is a MERS coronavirus outbreak dataset of 49 isolates. The second is a selected set of 31 Streptococcus pneumoniae genomes. Both should run in a few minutes on a modestly equipped laptop.

  1. 49 MERS Coronavirus genomes

    • Download genomes:

    • Run parsnp with default parameters

      parsnp -g ./mers_virus/ref/England1.gbk -d ./mers_virus/genomes -c
      
    • Command-line output

    ../../_images/run_mers.cmd1.png
    • Visualize with Gingr GGR
    ../../_images/run_mers.gingr1.png
    • Configure parameters

      • 95% of the reference is covered by the alignment. Coverage falls short of 100% mainly because of a 1 kbp unaligned region between 26 kbp and 27 kbp.

      • To force alignment across large collinear regions, use the -C option, which sets the maximum distance between two collinear MUMs:

        parsnp -g ./mers_virus/ref/England1.gbk -d ./mers_virus/genomes -C 1000 -c
        
    • Visualize again with Gingr GGR

      • After adjusting the -C parameter, this region is aligned, which raises the reference coverage to 97%.
      ../../_images/run_mers.gingr2.png
    • Zoom in with Gingr for nucleotide view of region

      • On closer inspection, the culprit was a large stretch of N’s in Jeddah isolate C7569.
      ../../_images/run_mers.gingr3.png
    • Inspect Output:
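
A minimal sketch for inspecting a finished run: the function below reports which result files are present in a Parsnp output directory. The file names parsnp.xmfa, parsnp.ggr, and parsnp.tree, and the function name report_parsnp_outputs, are assumptions made for illustration and are not taken from this tutorial; adjust the list to whatever your run actually wrote.

```shell
# Report which expected result files exist (and are non-empty) in a
# Parsnp output directory passed as the first argument.
# ASSUMPTION: the three file names below are typical Parsnp outputs;
# edit the list if your version writes different names.
report_parsnp_outputs() {
    local outdir="$1" name missing=0
    for name in parsnp.xmfa parsnp.ggr parsnp.tree; do
        if [ -s "$outdir/$name" ]; then
            echo "found: $name"
        else
            echo "missing or empty: $name"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero when at least one file is absent or empty.
    [ "$missing" -eq 0 ]
}
```

For example, call `report_parsnp_outputs <output dir>` after either MERS run, where `<output dir>` is a placeholder for the directory that run wrote to.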

  2. 31 Streptococcus pneumoniae genomes

    • Download genomes:

    • Run parsnp

      parsnp -r ./strep31/NC_011900.fna -d ./strep31 -p <num threads>
      
    • Command-line output:

      ../../_images/run1.png
    • Force inclusion of all genomes (-c)

      parsnp -r ./strep31/NC_011900.fna -d ./strep31 -p <num threads> -c
      
    • Command-line output:

      ../../_images/run2.png
    • Visualize with Gingr GGR

      ../../_images/run1.gingr.png
    • Enable recombination detection/filter (-x)

      parsnp -r ./strep31/NC_011900.fna -d ./strep31 -p <num threads> -c -x
      
    • Re-visualize with Gingr GGR

      • Bootstrap values improve after running the recombination filter; columns with filtered SNPs are shown in the image:
      ../../_images/run2.gingr.png
    • Inspect Output:
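
One quick check on the output, sketched under stated assumptions: after forcing inclusion with -c, the resulting tree should have as many leaves as genomes that entered the analysis. The function below counts leaves in a Newick tree file by counting commas. The function name count_newick_leaves is hypothetical, the tree file name and location depend on your run, and the comma trick assumes that no leaf label contains a comma.

```shell
# Count the leaves of a Newick tree stored in the file given as the
# first argument. A tree with N leaves contains N - 1 commas, so the
# leaf count is the comma count plus one.
# ASSUMPTION: leaf labels themselves contain no commas.
count_newick_leaves() {
    local tree_file="$1" commas
    # Keep only commas, count them, and strip padding that some wc
    # implementations add around the number.
    commas=$(tr -cd ',' < "$tree_file" | wc -c | tr -d ' ')
    echo $((commas + 1))
}
```

Run it as `count_newick_leaves <tree file>`, where `<tree file>` is a placeholder for the tree written by your run, and compare the number with the genome count you expect for the dataset.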