How to Assemble a Bacterial Genome from Nanopore Reads


Learn how to assemble a bacterial genome from Oxford Nanopore sequencing reads. This guide explains the complete genome assembly workflow using long-read sequencing data.

Introduction

Long-read sequencing technologies have transformed microbial genomics by enabling the assembly of complete bacterial genomes from a single sequencing experiment.

Among these technologies, Oxford Nanopore sequencing has become widely used due to its ability to generate ultra-long reads capable of spanning repetitive genomic regions.

In this guide we explain how to assemble a bacterial genome from Nanopore reads, including the major steps involved in long-read assembly, polishing, and quality assessment.

If you need help analyzing microbial sequencing datasets, explore our Microbial Genomics Services.


Why Nanopore Sequencing Is Useful for Bacterial Genome Assembly

Traditional short-read sequencing technologies often produce fragmented genome assemblies because repeats longer than the reads, such as rRNA operons and insertion sequences, cannot be resolved.

Oxford Nanopore sequencing generates long reads that can span these regions, allowing the reconstruction of near-complete bacterial chromosomes.

Advantages of Nanopore sequencing include:

  • long read lengths, often tens of kilobases
  • ability to resolve repetitive genomic regions
  • improved assembly contiguity
  • portable sequencing devices

Overview of a Nanopore Genome Assembly Pipeline

A typical Nanopore genome assembly pipeline includes the following steps:

  1. quality control of raw reads
  2. genome assembly
  3. assembly polishing
  4. genome circularization
  5. assembly quality assessment
  6. genome annotation

Figure: Bacterial genome assembly workflow using Oxford Nanopore sequencing reads


Step 1: Quality Control of Nanopore Reads

Raw Nanopore reads are typically stored in FASTQ format after basecalling.

Before assembly, sequencing reads should be evaluated to identify potential problems such as:

  • low-quality reads
  • adapter contamination
  • very short fragments
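
To make these checks concrete, here is a minimal Python sketch that reads FASTQ records, computes read length and mean quality, and drops reads that fail either cutoff. The function names and the thresholds (`min_length`, `min_quality`) are our own illustrative assumptions, not recommended values, and the parser assumes simple four-line FASTQ records with Phred+33 quality encoding. Dedicated QC tools do considerably more than this.

```python
import io
import math


def read_fastq(handle):
    """Yield (name, sequence, quality_string) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield header[1:], seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score, averaged through error probabilities.

    Phred scores are logarithmic, so we average the error probabilities
    and convert the result back to a Phred score.
    """
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_length=1000, min_quality=10.0):
    """Keep reads that pass both cutoffs (illustrative defaults only)."""
    return [
        r for r in records
        if len(r[1]) >= min_length and mean_quality(r[2]) >= min_quality
    ]


# Tiny in-memory example; a real run would open a FASTQ file instead.
example = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACG\n+\nIII\n"
    "@r3\nACGTACGT\n+\n!!!!!!!!\n"
)
kept = filter_reads(read_fastq(example), min_length=5, min_quality=10.0)
print([name for name, _, _ in kept])
```

In this toy input, one read is too short and one has very low quality, so only the first read passes.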

Common quality control tools include:


Step 2: Genome Assembly

The next step is assembling long sequencing reads into contiguous genomic sequences.

Several tools are optimized for long-read genome assembly.

Popular assemblers include:

These tools construct assembly graphs that connect overlapping reads to reconstruct the bacterial chromosome.
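
The idea of an overlap graph can be illustrated with a toy Python sketch. It assumes error-free reads and exact suffix-to-prefix matches, which real Nanopore data never provide; production assemblers rely on approximate overlaps and far more elaborate graph structures. All function names here are hypothetical.

```python
def overlap_length(a, b, min_overlap=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def build_overlap_graph(reads, min_overlap=3):
    """Map each read index to {other_index: overlap_length} edges."""
    graph = {i: {} for i in range(len(reads))}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                size = overlap_length(a, b, min_overlap)
                if size:
                    graph[i][j] = size
    return graph


def walk_path(reads, graph, start):
    """Greedily follow the longest unused overlap and merge reads into a contig."""
    contig, current, seen = reads[start], start, {start}
    while True:
        candidates = {j: s for j, s in graph[current].items() if j not in seen}
        if not candidates:
            return contig
        nxt = max(candidates, key=candidates.get)
        contig += reads[nxt][candidates[nxt]:]  # append only the non-overlapping part
        seen.add(nxt)
        current = nxt


reads = ["ATGCGTAC", "GTACCTGA", "CTGAAGGT"]
graph = build_overlap_graph(reads)
print(graph)
print(walk_path(reads, graph, start=0))
```

Each edge records how many bases two reads share, and walking the edges stitches the reads into one longer sequence. Repeats create ambiguous edges in this graph, which is where long reads help: a read that spans the whole repeat removes the ambiguity.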

Figure: Genome assembly graph showing overlaps between long sequencing reads


Step 3: Assembly Polishing

Nanopore reads have higher raw error rates than short-read technologies. As a result, genome assemblies must be polished to correct sequencing errors.

Polishing improves base-level accuracy and gene prediction quality.
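
As a rough illustration of the principle, the Python sketch below corrects a draft sequence by majority vote over reads that are assumed to be already aligned to it. This is a deliberate simplification: it handles substitutions only, uses `.` to mark positions a read does not cover, and the function name `polish` is hypothetical. Real polishers also correct insertions and deletions and use much more sophisticated models.

```python
from collections import Counter


def polish(draft, aligned_reads):
    """Replace each draft base with the strict-majority base among covering reads.

    `aligned_reads` are strings of the same length as `draft`;
    '.' marks positions that a read does not cover.
    """
    polished = []
    for pos, draft_base in enumerate(draft):
        column = [read[pos] for read in aligned_reads if read[pos] != "."]
        if not column:
            polished.append(draft_base)  # no coverage: keep the draft base
            continue
        base, count = Counter(column).most_common(1)[0]
        # Only overrule the draft when more than half of the reads agree.
        polished.append(base if count > len(column) / 2 else draft_base)
    return "".join(polished)


draft = "ACGTTCGA"  # position 4 carries an error in this toy example
reads = [
    "ACGTACGA",
    "ACGTAC..",
    "..GTACGA",
]
print(polish(draft, reads))
```

Because individual read errors are largely random, the correct base tends to win the vote once coverage is high enough, which is why polishing depends on adequate sequencing depth.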

Common polishing tools include:

Multiple rounds of polishing are often performed to achieve optimal accuracy. If you need support with long-read assemblies, see our Microbial Genomics Services.


Step 4: Genome Circularization

Many bacterial genomes consist of circular chromosomes.

After assembly, the contig representing the chromosome may contain overlapping ends that should be trimmed to create a properly circularized genome.

Tools such as Unicycler or Circlator can assist with this step.
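
The trimming logic can be sketched in a few lines of Python. This toy version, with the hypothetical name `trim_circular_overlap`, only detects an exact repeat of the contig start at the contig end; dedicated tools align the ends and tolerate sequencing errors, so treat this as an illustration of the idea rather than a replacement for them.

```python
def trim_circular_overlap(contig, min_overlap=5):
    """Remove the trailing copy of a sequence found at both ends of a contig.

    Returns (trimmed_contig, overlap_length); the overlap is 0 if no
    exact end overlap of at least `min_overlap` bases is found.
    """
    for size in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:size] == contig[-size:]:
            return contig[:-size], size
    return contig, 0


genome = "ATGCCGTAAGCTTGACC"
# Simulate an assembler that read past the origin and repeated the first 6 bases.
contig = genome + genome[:6]
trimmed, overlap = trim_circular_overlap(contig)
print(overlap, trimmed == genome)
```

Without this step, the duplicated end would inflate the genome size and could produce duplicated or truncated gene calls at the contig boundaries.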


Step 5: Assembly Quality Assessment

Once the genome assembly is complete, its quality must be evaluated.

Key metrics include:

  • genome completeness
  • contig count
  • N50 statistics
  • contamination levels
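
Contig count and N50 are simple enough to compute by hand, as the Python sketch below shows. N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The helper names are our own. Completeness and contamination need marker-gene based tools and are not covered here.

```python
def n50(lengths):
    """Contig length at which the sorted cumulative sum reaches 50% of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def assembly_stats(contigs):
    """Basic contiguity statistics for a list of contig sequences."""
    lengths = [len(c) for c in contigs]
    return {
        "contig_count": len(lengths),
        "total_length": sum(lengths),
        "largest_contig": max(lengths, default=0),
        "n50": n50(lengths),
    }


print(n50([80, 70, 50, 40, 30, 20, 10]))
print(assembly_stats(["A" * 100, "C" * 40, "G" * 10]))
```

For a finished bacterial genome with a single circular chromosome and no plasmids, the contig count is one and the N50 equals the chromosome length.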

Common evaluation tools include:


Step 6: Genome Annotation

After assembling the genome, the next step is identifying genes and functional elements.

Genome annotation tools predict coding sequences, RNA genes, and functional pathways.
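
To give a feel for what predicting coding sequences involves, here is a deliberately naive open reading frame (ORF) scan in Python. It only looks at the forward strand, only accepts ATG as a start codon, and applies a hypothetical `min_codons` cutoff; real gene predictors scan both strands, consider alternative start codons, and score candidates statistically. It is a sketch of the concept, not of how any particular annotation tool works.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end) pairs of forward-strand ORFs, with end exclusive.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon, and its length in codons includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)


seq = "CCATGAAACCCTAGGG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

Base-level errors left over from assembly can introduce false stop codons or frameshifts in a scan like this, which is one reason polishing has a direct effect on gene prediction quality.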

Common annotation tools include:

If you are unfamiliar with genome annotation workflows, see our detailed guide: What Is Genome Annotation?


Final Thoughts

Oxford Nanopore sequencing enables the assembly of high-quality bacterial genomes with fewer contigs and improved structural accuracy compared with short-read-only assemblies.

By combining long-read assembly, polishing, and quality assessment, researchers can reconstruct near-complete microbial genomes suitable for comparative genomics, functional analysis, and evolutionary studies.

If you need assistance assembling microbial genomes from Nanopore or hybrid sequencing data, explore our Microbial Genomics Services.


Rubén Javier López

Founder and Bioinformatician PhD in Microbiology

Rubén holds a PhD in microbiology from the University of Bergen (Norway). He is proficient in bacterial metagenomics, genomics and transcriptomics. He has hands-on experience and data analysis expertise in Illumina, Nanopore and PacBio sequencing technologies and has collaborated with scientists and labs all over the world. He has also worked with biomedical research groups, analyzing microbiome and mycobiome data.

Areas of Expertise: Microbiology, Extremophiles, NGS, Microbial Genomics, Transcriptomics, Differential Gene Expression, Metagenomics, Microbiome studies.


