Poster Presentation Australian Microbial Ecology 2022

Facilitating genome-resolved metagenome analysis with Metaphor (#141)

Vinícius Werneck Salazar 1, Babak Shaban 2, Maria Quiroga 2, Robert Turnbull 2, Edoardo Tescari 2, Vanessa Rossetto Marcelino 3, Heroen Verbruggen 4, Kim-Anh Le Cao 1
  1. School of Mathematics and Statistics, Melbourne Integrative Genomics, The University of Melbourne, Melbourne, Victoria, Australia
  2. Melbourne Data Analytics Platform, The University of Melbourne, Melbourne, Victoria, Australia
  3. Department of Molecular and Translational Science, Hudson Institute of Medical Research, Melbourne, Victoria, Australia
  4. School of Biosciences, The University of Melbourne, Melbourne, Victoria, Australia

The reconstruction of metagenome-assembled genomes (MAGs) from environmental and host-associated microbiomes is a powerful technique for obtaining taxonomic and functional information about uncultivated microbes. However, the multiple bioinformatic steps required to obtain MAGs can be cumbersome and computationally inefficient for large data sets. Here we present Metaphor, a user-friendly workflow designed to facilitate quality control, assembly and binning of metagenomes. Metaphor incorporates best practices in Snakemake workflow development, optimizing computational efficiency and enabling large-scale data analyses. The pipeline offers flexible options for individual or pooled assembly and binning, and supports multiple binning algorithms followed by a binning refinement step. Metaphor also returns gene- and contig-level coverage estimates, functional and taxonomic annotation, performance benchmarks (runtime and memory usage per computational task), and a variety of tabular and graphical reports generated by each of its six modules: quality control, assembly, annotation, binning, read mapping, and postprocessing. The final outputs of Metaphor are designed to be easily imported into downstream applications for further statistical and bioinformatic analyses. We showcase Metaphor with metagenome datasets from the CAMI challenge, an initiative for the performance evaluation of metagenomics software, and demonstrate how to fine-tune Metaphor for different datasets.