Metagenomic Binning Tools Compared: MetaBAT2 vs MaxBin2 vs CONCOCT

Introduction

Shotgun metagenomics allows researchers to sequence all genetic material in an environmental sample. However, after assembly, the resulting dataset contains thousands of contigs from multiple organisms.

To reconstruct individual microbial genomes, these contigs must be grouped into bins. This process is known as metagenomic binning.

In this article, we compare three widely used metagenomic binning tools (MetaBAT2, MaxBin2, and CONCOCT) and explain how to choose the right approach for recovering high-quality metagenome-assembled genomes (MAGs).

If you are new to metagenomics workflows, see our guide: Metagenome Assembly Pipeline.


What Is Metagenomic Binning?

Metagenomic binning is the process of grouping assembled contigs into clusters that represent individual genomes.

These clusters, known as bins, can be refined into metagenome-assembled genomes (MAGs).

Figure: Genome binning groups contigs into metagenome-assembled genomes.

Binning relies on multiple signals:

  • sequence composition (GC content, k-mers)
  • coverage patterns across samples
  • phylogenetic markers
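The composition signal in the list above can be made concrete with a short sketch. It computes GC content and canonical tetranucleotide frequencies for one contig. The function names are illustrative and do not come from any binning tool.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def canonical(kmer: str) -> str:
    """Collapse a k-mer and its reverse complement into a single key."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def tetranucleotide_freq(seq: str, k: int = 4) -> dict:
    """Normalized canonical k-mer frequencies; k-mers with ambiguous bases are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    # Fixed key order so every contig yields a vector of the same length.
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    return {key: (counts[key] / total if total else 0.0) for key in keys}


profile = tetranucleotide_freq("ACGTACGT")
```

For k = 4 this yields a 136-dimensional vector per contig, because each tetranucleotide is merged with its reverse complement.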

Main Metagenomic Binning Tools

MetaBAT2

MetaBAT2 is one of the most widely used binning tools.

It uses probabilistic distances based on tetranucleotide frequency and coverage depth to group contigs.
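To illustrate the idea of combining the two signals, here is a toy distance between contigs. It is not MetaBAT2's probabilistic model: the Euclidean composition distance, the log-depth difference, and the `weight` parameter are all simplifying assumptions made for this sketch.

```python
import math


def composition_distance(tnf_a, tnf_b):
    """Euclidean distance between two equal-length k-mer frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(tnf_a, tnf_b)))


def coverage_distance(depth_a, depth_b, pseudocount=1.0):
    """Absolute difference of log-transformed mean depths."""
    return abs(math.log(depth_a + pseudocount) - math.log(depth_b + pseudocount))


def combined_distance(tnf_a, depth_a, tnf_b, depth_b, weight=0.5):
    """Weighted mix of both distances (hypothetical weighting, not MetaBAT2's)."""
    return (weight * composition_distance(tnf_a, tnf_b)
            + (1 - weight) * coverage_distance(depth_a, depth_b))


# Toy two-element "frequency vectors" keep the example short.
d_near = combined_distance([0.50, 0.50], 20.0, [0.48, 0.52], 22.0)
d_far = combined_distance([0.50, 0.50], 20.0, [0.20, 0.80], 200.0)
```

Contigs with similar composition and similar depth get a small distance and are candidates for the same bin.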

Advantages:

  • high accuracy
  • fast execution
  • works well with complex communities

Limitations:

  • requires sufficient sequencing depth

MaxBin2

MaxBin2 uses an Expectation-Maximization algorithm over tetranucleotide frequencies and coverage, and uses single-copy marker genes to estimate the number of bins and seed them.
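For readers unfamiliar with Expectation-Maximization, the following is a generic sketch of the technique on a single feature, log-transformed coverage, with two bins. It is not MaxBin2's actual model, and the starting values and toy depths are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_bins(values, n_iter=50, min_var=1e-3):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    means = [min(values), max(values)]  # deterministic starting points
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each bin for each contig.
        resp = []
        for x in values:
            likes = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            resp.append([l / total for l in likes] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate each bin from its weighted members.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)  # floor avoids a collapsed component
            weights[k] = nk / len(values)
    return means, variances, weights, resp


depths = [9, 10, 11, 10, 95, 100, 105, 98]  # toy mean depths per contig
means, variances, weights, resp = em_two_bins([math.log(d) for d in depths])
assignment = [0 if r[0] >= r[1] else 1 for r in resp]
```

The E-step assigns each contig a soft membership in every bin, and the M-step updates the bin parameters from those memberships. The two steps repeat until the estimates stabilize.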

Advantages:

  • robust for low-abundance genomes
  • uses marker genes for improved classification

Limitations:

  • slower than MetaBAT2
  • may produce more fragmented bins

CONCOCT

CONCOCT clusters contigs based on coverage across multiple samples and sequence composition.
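The multi-sample coverage signal can be illustrated with a plain Pearson correlation between contig coverage profiles. This only shows why co-varying coverage is informative. CONCOCT itself combines coverage and composition vectors, reduces them with PCA, and fits a Gaussian mixture model. The contig names and depth values below are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-variation signal
    return cov / math.sqrt(var_x * var_y)


# Mean depth of each contig in four samples (hypothetical values).
coverage = {
    "contig_1": [12.0, 40.0, 5.0, 22.0],
    "contig_2": [11.0, 43.0, 6.0, 20.0],
    "contig_3": [30.0, 4.0, 28.0, 9.0],
}

same_genome = pearson(coverage["contig_1"], coverage["contig_2"])
other_genome = pearson(coverage["contig_1"], coverage["contig_3"])
```

Contigs from the same genome rise and fall together across samples, so `contig_1` and `contig_2` correlate strongly while `contig_3` does not. With only one sample this signal collapses to a single number per contig.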

Advantages:

  • effective for multi-sample datasets
  • captures strain-level variation

Limitations:

  • works best with multiple samples
  • more complex setup

Comparison of Binning Tools

Tool     | Best Use Case           | Strength                  | Limitation
MetaBAT2 | General-purpose binning | Fast and accurate         | Needs good coverage
MaxBin2  | Low-abundance genomes   | Marker gene support       | Slower
CONCOCT  | Multi-sample studies    | Coverage-based clustering | Complex workflow

Combining Multiple Binning Tools

In practice, many researchers combine multiple binning tools to improve genome recovery.

Tools such as DAS Tool integrate results from different binning methods to produce higher-quality bins.

This approach often improves completeness while reducing contamination.
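The idea of picking the best non-redundant bins from several binners can be sketched as a greedy selection. This is a simplified illustration, not the scoring or refinement procedure of any specific tool. The score formula, the `min_score` cutoff, and the bin names and values are all assumptions.

```python
def bin_score(completeness, contamination, penalty=5.0):
    """Hypothetical score: reward completeness, penalize contamination."""
    return completeness - penalty * contamination


def select_bins(candidates, min_score=50.0):
    """Greedily keep the best-scoring bins that share no contigs."""
    ranked = sorted(
        candidates,
        key=lambda b: bin_score(b["completeness"], b["contamination"]),
        reverse=True,
    )
    chosen, used = [], set()
    for b in ranked:
        if bin_score(b["completeness"], b["contamination"]) < min_score:
            break  # remaining bins score even lower
        if used.isdisjoint(b["contigs"]):
            chosen.append(b["name"])
            used |= b["contigs"]
    return chosen


candidates = [
    {"name": "metabat.1", "contigs": {"c1", "c2", "c3"}, "completeness": 92.0, "contamination": 2.0},
    {"name": "maxbin.1", "contigs": {"c1", "c2"}, "completeness": 70.0, "contamination": 1.0},
    {"name": "concoct.4", "contigs": {"c4", "c5"}, "completeness": 80.0, "contamination": 3.0},
    {"name": "maxbin.2", "contigs": {"c4", "c5", "c6"}, "completeness": 85.0, "contamination": 9.0},
]

selected = select_bins(candidates)
```

Here `maxbin.1` is dropped because it overlaps the better-scoring `metabat.1`, and `maxbin.2` falls below the cutoff because of its contamination.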

Figure: Metagenome-assembled genomes reconstructed from environmental sequencing data.


Quality Assessment of Bins

After binning, genome quality must be evaluated.

Key metrics include:

  • completeness
  • contamination
  • strain heterogeneity
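As a rough illustration of the first two metrics, completeness and contamination can be estimated from counts of single-copy marker genes. This sketch is a simplification of what dedicated tools do, and the marker names are placeholders. The tier thresholds follow the MIMAG completeness and contamination cutoffs only and ignore its rRNA and tRNA criteria. Strain heterogeneity is not covered here.

```python
def bin_quality(marker_counts, marker_set):
    """Percent completeness and contamination from single-copy marker counts."""
    n = len(marker_set)
    present = sum(1 for m in marker_set if marker_counts.get(m, 0) >= 1)
    extra = sum(max(marker_counts.get(m, 0) - 1, 0) for m in marker_set)
    return 100.0 * present / n, 100.0 * extra / n


def quality_tier(completeness, contamination):
    """Tier by completeness and contamination only (rRNA/tRNA checks omitted)."""
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"


# Placeholder marker names; real marker sets are lineage-specific.
marker_set = [f"marker_{i}" for i in range(1, 21)]
counts = {m: 1 for m in marker_set[:19]}  # 19 of 20 markers found
counts["marker_1"] = 2                    # one marker found twice

completeness, contamination = bin_quality(counts, marker_set)
tier = quality_tier(completeness, contamination)
```

A missing marker lowers completeness, while an extra copy of a marker that should occur once raises contamination.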

Common tools include CheckM, which estimates all three metrics from single-copy marker genes.


How to Choose the Right Binning Tool

The choice of binning tool depends on your dataset:

  • Single sample: MetaBAT2
  • Low-abundance genomes: MaxBin2
  • Multiple samples: CONCOCT
  • Best results: combine tools with DAS Tool

For complex microbiomes, combining multiple approaches is often the best strategy.


Final Thoughts

Metagenomic binning is a critical step in reconstructing microbial genomes from shotgun sequencing data.

Choosing the right binning tools and combining methods when appropriate can significantly improve the quality of recovered MAGs.

If you need support with metagenomics data analysis and genome reconstruction, explore our Metagenomics Services.

Rubén Javier López

Founder and Bioinformatician, PhD in Microbiology

Rubén holds a PhD in microbiology from the University of Bergen (Norway). He is proficient in bacterial metagenomics, genomics and transcriptomics. He has hands-on experience and data analysis expertise in Illumina, Nanopore and PacBio sequencing technologies and has collaborated with scientists and labs all over the world. He has also worked with biomedical research groups, analyzing microbiome and mycobiome data.

Areas of Expertise: Microbiology, Extremophiles, NGS, Microbial Genomics, Transcriptomics, Differential Gene Expression, Metagenomics, Microbiome studies.
