mtArchitect

Description:

mtArchitect is a pipeline that combines existing tools to reconstruct the complete mitogenome sequence from whole-genome sequencing FASTQ files.

Dependencies:

BWA (0.7.15) https://sourceforge.net/projects/bio-bwa/files/

SAMtools (1.3.1) http://sourceforge.net/projects/samtools/files/samtools/

Bam2fastq (1.1.0) https://gsl.hudsonalpha.org/information/software/bam2fastq

Tabix (0.2.5) http://sourceforge.net/projects/samtools/files/tabix/

BCFtools (1.3.1) http://www.htslib.org/download/

Picardtools (1.119) http://sourceforge.net/projects/picard/files/picard-tools/

bgzip (htslib 1.3.1) http://www.htslib.org/download/

VCFtools (0.1.13) https://sourceforge.net/projects/vcftools/

Hapsembler (2.21) http://compbio.cs.toronto.edu/hapsembler/hapsembler.html

Blast (2.4.0) ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

Please note that the pipeline was developed using the versions in parentheses. Because commands and options may vary between versions, compatibility problems may arise with earlier versions.
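Before editing the tool paths, it can help to see which dependencies are already visible on your PATH. The following is a minimal sketch, not part of mtArchitect: check_tools is a hypothetical helper, and the executable names in the example call are assumptions that depend on how each tool was installed (Picard, for instance, is usually distributed as jar files and will not be found this way).

```shell
#!/bin/sh
# Hypothetical helper (not part of mtArchitect).
# Prints one "missing: NAME" line for every executable that is not on PATH
# and returns non-zero if at least one is absent.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Example call; the executable names are assumptions, adjust them to your system:
# check_tools bwa samtools tabix bcftools bgzip vcftools blastn
```

Tools installed outside PATH are reported as missing here even though they can still be used by setting their full paths in mtArchitect.sh.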

Other requirements:

A mitochondrial genome reference is used in the reconstruction. This reference can be up to 12% divergent from the sample without diminishing the quality of the resulting sequence. In addition, the fourth step of the pipeline maps reads to the nuclear genome to remove those that map preferentially to it, so a nuclear assembly (with the mitochondrial genome included) is also required.
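To confirm that a nuclear assembly actually includes the mitochondrial sequence, you can look for its name among the FASTA headers. The sketch below is only an illustration: has_contig is a hypothetical helper, and the contig names in the example (chrM, MT) are common conventions rather than names required by mtArchitect.

```shell
#!/bin/sh
# Hypothetical helper (not part of mtArchitect).
# Succeeds if FASTA file $1 contains a sequence whose name, i.e. the first
# word of a ">" header line, is exactly $2.
has_contig() {
    fasta=$1
    name=$2
    awk -v name="$name" '
        /^>/ { id = substr($1, 2); if (id == name) { found = 1; exit } }
        END  { exit(found ? 0 : 1) }
    ' "$fasta"
}

# Example call; the file name and contig names are assumptions:
# has_contig nuclear_assembly.fa chrM || has_contig nuclear_assembly.fa MT \
#     || echo "no mitochondrial contig found"
```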

Note that the pipeline has been designed for Linux environments and adapted for Mac.

Scripts:

Download the scripts from here

Usage:

Download the scripts folder and uncompress it. Open mtArchitect.sh and edit the “Variables to set” and the “Paths to tools” according to your system.

Once you have adapted the main script, run:

./mtArchitect.sh sample_1.fastq sample_2.fastq sampleName

The final sequence will be in FINALseq/sampleName/sampleName.fa
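When processing several samples, a small wrapper can run the pipeline and confirm that the final FASTA was written. This is a sketch under stated assumptions: run_sample and the MTARCHITECT variable are hypothetical names, and the reads are assumed to be named <sample>_1.fastq and <sample>_2.fastq, with the same prefix used as the sample name.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of mtArchitect).
# MTARCHITECT is an assumed override for the location of the main script.
MTARCHITECT=${MTARCHITECT:-./mtArchitect.sh}

# Run the pipeline for one sample prefix, then check that
# FINALseq/<sample>/<sample>.fa exists and is not empty.
run_sample() {
    sample=$1
    "$MTARCHITECT" "${sample}_1.fastq" "${sample}_2.fastq" "$sample" || return 1
    final="FINALseq/$sample/$sample.fa"
    if [ -s "$final" ]; then
        echo "ok: $final"
    else
        echo "no output: $final" >&2
        return 1
    fi
}

# Example loop; the sample names are placeholders:
# for s in sampleA sampleB; do run_sample "$s"; done
```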

The same folder also contains sampleName.bam, which you can inspect in IGV to check for problems in the resulting sequence: load sampleName.fa through Genomes/Load Genome from File and sampleName.bam through File/Load from File.
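IGV needs an index file next to a BAM before it can display it. Whether mtArchitect already writes one is not stated here, so the sketch below only creates the index when none is found. ensure_bam_index and the SAMTOOLS variable are hypothetical names; "samtools index" is the standard SAMtools command for this.

```shell
#!/bin/sh
# Hypothetical helper (not part of mtArchitect).
# SAMTOOLS is an assumed override for the samtools executable.
SAMTOOLS=${SAMTOOLS:-samtools}

# Create an index for BAM file $1 unless <name>.bam.bai or <name>.bai exists.
ensure_bam_index() {
    bam=$1
    if [ -e "$bam.bai" ] || [ -e "${bam%.bam}.bai" ]; then
        echo "index present: $bam"
    else
        "$SAMTOOLS" index "$bam" && echo "indexed: $bam"
    fi
}

# Example call, using the output location described above:
# ensure_bam_index FINALseq/sampleName/sampleName.bam
```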

Updates:

12/01/2016 – mtArchitect.sh as well as the R scripts have been debugged and updated.

Contact: irene.lobon@ibe.upf-csic.es

Citation: Lobon, Irene, et al. “Demographic history of the genus Pan inferred from whole mitochondrial genome reconstructions.” Genome Biology and Evolution 8.6 (2016): 2020-2030.