How to Interpret Differential Gene Expression Results

Volcano plot showing differentially expressed genes with log2 fold change on the x-axis and statistical significance on the y-axis.

Differential gene expression analysis is one of the most common outputs of RNA-seq experiments.

After running tools such as DESeq2, edgeR or limma-voom, researchers often receive a table containing gene IDs, expression values, log2 fold changes, p-values and adjusted p-values.

At first glance, this table may look straightforward.

Genes with low adjusted p-values are “significant”. Genes with positive log2 fold change are “upregulated”. Genes with negative log2 fold change are “downregulated”.

But interpretation is more subtle than that.

A differential expression result is not just a list of significant genes. It is a statistical summary of an experiment, shaped by sample quality, metadata, biological variability, sequencing depth, annotation quality and the model used for testing.

In this article, we explain how to interpret differential gene expression results and avoid common mistakes when moving from statistical output to biological insight.

For a broader workflow overview, see: RNA-Seq Data Analysis Pipeline: From FASTQ Files to Differential Gene Expression


What does a differential expression result table contain?

A typical RNA-seq differential expression result table includes columns such as:

  • gene ID;
  • gene name or annotation;
  • base mean or average expression;
  • log2 fold change;
  • standard error;
  • test statistic;
  • p-value;
  • adjusted p-value;
  • functional annotation.

The exact column names depend on the software used.

For example, DESeq2 result tables commonly include baseMean, log2FoldChange, lfcSE, stat, pvalue and padj.

These values should not be interpreted independently. A good interpretation considers expression level, effect size, statistical support and biological context together.
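As a rough illustration of working with such a table, the sketch below parses a small DESeq2-style result table with the Python standard library. The gene IDs and numbers are made up for illustration, and the `read_results` helper is a hypothetical name, not part of any package.

```python
import csv
import io

# Made-up rows in the DESeq2 column layout; gene IDs and values are illustrative only.
RESULTS_CSV = """gene_id,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
geneA,1520.4,2.10,0.25,8.40,4.5e-17,1.2e-14
geneB,3.2,4.80,2.10,2.29,0.022,NA
geneC,845.0,-0.15,0.12,-1.25,0.21,0.48
"""

def read_results(text):
    """Parse a result table into dicts, converting numbers and mapping NA to None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"gene_id": record["gene_id"]}
        for column, value in record.items():
            if column != "gene_id":
                row[column] = None if value == "NA" else float(value)
        rows.append(row)
    return rows

results = read_results(RESULTS_CSV)
for row in results:
    # Print expression level, effect size and statistical support side by side.
    print(row["gene_id"], row["baseMean"], row["log2FoldChange"], row["padj"])
```

Missing values are kept as `None` rather than dropped. DESeq2 can report NA in padj, for example for genes removed by independent filtering, and such genes are worth seeing rather than silently losing.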


Start with the biological question

Before interpreting any gene list, return to the experimental question.

For example:

  • Which genes change after antibiotic exposure?
  • Which pathways respond to nutrient limitation?
  • How does a mutant differ from the wild type?
  • Which functions are induced during host interaction?
  • How does expression change over time?

The result table should be interpreted in relation to that question.

A common mistake is to sort genes by adjusted p-value, pick the top 20 and build a story around them. That can produce misleading conclusions if the original hypothesis, contrast and experimental design are not considered.

Ask:

  • What comparison was tested?
  • What is the reference condition?
  • Were batches modeled?
  • Were samples paired?
  • Were enough biological replicates included?
  • Is the contrast biologically meaningful?

If the statistical contrast is wrong, the interpretation will also be wrong.


Interpreting log2 fold change

Log2 fold change measures the estimated expression difference between conditions.

A positive log2 fold change usually means higher expression in the tested condition relative to the reference.

A negative log2 fold change usually means lower expression.

For example:

  • log2 fold change = 1 means 2-fold higher expression;
  • log2 fold change = 2 means 4-fold higher expression;
  • log2 fold change = -1 means 2-fold lower expression (half the reference level);
  • log2 fold change = -2 means 4-fold lower expression (a quarter of the reference level).
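The conversion behind these examples is simply 2 raised to the log2 fold change. A minimal sketch, with hypothetical helper names:

```python
def fold_change(log2_fc):
    """Convert a log2 fold change to a linear ratio (tested / reference)."""
    return 2.0 ** log2_fc

def describe(log2_fc):
    """Describe the change as a fold increase or a fold decrease."""
    ratio = fold_change(log2_fc)
    if ratio >= 1:
        return f"{ratio:.2f}-fold higher"
    return f"{1 / ratio:.2f}-fold lower"

for value in (1, 2, -1, -2):
    print(value, "->", describe(value))
```

The same function makes it easy to see that fractional values are smaller than they may look: a log2 fold change of 0.5 is a ratio of about 1.41, not 1.5.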

However, fold change alone is not enough.

A gene with a large fold change but very low expression may be unstable. A gene with a modest fold change but consistent expression across many samples may be biologically meaningful.

Practical interpretation

When evaluating log2 fold change, check:

  • Is the gene expressed at a meaningful level?
  • Is the direction of change consistent with the biology?
  • Is the fold change supported by adjusted p-value?
  • Is the gene annotation reliable?
  • Is the result driven by one outlier sample?
  • Does the gene belong to a relevant pathway or operon?

In microbial transcriptomics, neighboring genes in the same operon or pathway may provide useful context. A single gene changing alone may be interesting, but coordinated expression changes across related genes are often more interpretable.


Interpreting adjusted p-values

RNA-seq differential expression tests thousands of genes.

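If thousands of tests are performed, some genes will have low raw p-values by chance: with 10,000 genes and no true differences, about 500 would still be expected to fall below a raw p-value of 0.05. Adjusted p-values correct for this multiple-testing problem, typically by controlling the false discovery rate.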

In many workflows, genes are considered statistically significant if they pass a threshold such as:

  • adjusted p-value < 0.05;
  • adjusted p-value < 0.1.

The threshold should be selected and reported transparently.
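To make the adjustment less of a black box, here is a minimal sketch of the Benjamini-Hochberg procedure, which is the default adjustment in DESeq2 and a common choice elsewhere. The article does not state which method a given workflow uses, so treat this as one illustrative option. The p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.6]  # made-up raw p-values
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw={p:<6} adjusted={q:.4f}")
```

In this toy example four raw p-values are below 0.05, but only two adjusted p-values are. Real tools also handle missing values and filtering, which this sketch ignores.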

However, adjusted p-value is not the same as biological importance.

A gene can be statistically significant but biologically minor.

Another gene can have a strong biological effect but insufficient statistical support because of low sample size, high variability or low expression.

Practical interpretation

Use adjusted p-values to assess statistical reliability, but combine them with:

  • log2 fold change;
  • expression level;
  • replicate consistency;
  • annotation;
  • pathway context;
  • known biology.

Avoid using raw p-values as the main criterion for interpretation.
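One way to combine these criteria is a simple filter that requires statistical support, a minimum effect size and a minimum expression level at the same time. The thresholds below (padj < 0.05, absolute log2 fold change of at least 1, baseMean of at least 10) are assumptions chosen for illustration, not recommendations, and the rows are made up.

```python
def is_candidate(row, padj_max=0.05, min_abs_lfc=1.0, min_base_mean=10.0):
    """Apply statistical, effect-size and expression-level criteria together."""
    padj = row["padj"]
    if padj is None:  # no adjusted p-value available for this gene
        return False
    return (
        padj < padj_max
        and abs(row["log2FoldChange"]) >= min_abs_lfc
        and row["baseMean"] >= min_base_mean
    )

# Made-up rows; gene IDs and values are illustrative only.
rows = [
    {"gene_id": "geneA", "baseMean": 1520.4, "log2FoldChange": 2.1, "padj": 1.2e-14},
    {"gene_id": "geneB", "baseMean": 3.2, "log2FoldChange": 4.8, "padj": None},
    {"gene_id": "geneC", "baseMean": 845.0, "log2FoldChange": -0.15, "padj": 0.48},
    {"gene_id": "geneD", "baseMean": 9800.0, "log2FoldChange": 0.2, "padj": 0.001},
]
candidates = [row["gene_id"] for row in rows if is_candidate(row)]
print(candidates)  # ['geneA']
```

Here geneD is statistically significant but changes very little, and geneB has a large fold change at very low expression. A filter like this only shortlists genes; replicate consistency, annotation and pathway context still need a human look.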


Volcano plots: useful but limited

Volcano plots are one of the most common RNA-seq figures.

They usually show:

  • log2 fold change on the x-axis;
  • statistical significance on the y-axis;
  • highlighted genes passing chosen thresholds.
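The coordinates of each point are easy to compute by hand, which helps when checking what a plot is actually highlighting. The sketch below assumes the y-axis is -log10 of the adjusted p-value and uses illustrative thresholds; some plots use raw p-values instead, so check the axis label.

```python
import math

def volcano_point(log2_fc, padj, padj_max=0.05, min_abs_lfc=1.0):
    """Return (x, y, label) for one gene on a volcano plot."""
    # Guard against p-values that underflow to exactly zero.
    y = -math.log10(max(padj, 1e-300))
    if padj < padj_max and log2_fc >= min_abs_lfc:
        label = "up"
    elif padj < padj_max and log2_fc <= -min_abs_lfc:
        label = "down"
    else:
        label = "not highlighted"
    return log2_fc, y, label

# Made-up (log2 fold change, adjusted p-value) pairs.
for lfc, padj in [(2.0, 0.01), (-1.5, 0.001), (3.0, 0.2), (0.3, 1e-6)]:
    print(volcano_point(lfc, padj))
```

Note that the highlighted set changes with both thresholds, so two volcano plots of the same data can look quite different.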

Volcano plots are useful because they provide a quick overview of the differential expression landscape.

They help answer:

  • Are many genes changing?
  • Are changes mostly upregulated or downregulated?
  • Are there genes with both large effect and strong statistical support?
  • Are there outliers that need inspection?

But volcano plots can also mislead.

A volcano plot does not tell you whether the experiment was well designed, whether batch effects were handled, whether annotation is correct or whether the highlighted genes make biological sense.

Practical interpretation

Use volcano plots as summaries, not final conclusions.

For important genes, inspect:

  • normalized counts;
  • expression across individual samples;
  • annotation quality;
  • pathway context;
  • whether the gene is part of a broader biological pattern.

A volcano plot is the beginning of interpretation, not the end.
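When inspecting expression across individual samples, a quick leave-one-out check can show whether a single replicate drives the change. This is a crude descriptive sketch on made-up normalized counts, with an assumed pseudocount of 1. It is not how DESeq2 or edgeR estimate fold changes or detect outliers.

```python
import math
from statistics import mean

def log2_ratio(treated, control, pseudocount=1.0):
    """Log2 ratio of group means, with a pseudocount to avoid division by zero."""
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))

def leave_one_out(treated, control):
    """Recompute the log2 ratio with each treated sample removed in turn."""
    return [
        log2_ratio(treated[:i] + treated[i + 1:], control)
        for i in range(len(treated))
    ]

control = [100.0, 110.0, 90.0]
treated = [105.0, 95.0, 900.0]  # made-up counts; the third replicate is extreme

full = log2_ratio(treated, control)
loo = leave_one_out(treated, control)
print(f"all samples: {full:.2f}")
for i, value in enumerate(loo):
    print(f"without treated sample {i + 1}: {value:.2f}")
```

In this example the apparent increase disappears entirely once the third treated sample is left out, which is the kind of pattern that deserves a closer look before the gene is reported.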


MA plots and expression level

An MA plot shows log fold change (M, on the y-axis) against average expression (A, on the x-axis).

This is useful because low-expression genes often show more variable fold changes.

If many genes with extreme fold changes are lowly expressed, interpret them cautiously. Some may be real, but others may reflect low count instability.

MA plots help identify whether differential expression is concentrated among well-expressed genes or dominated by low-count features.
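The two coordinates can be sketched directly from group means. The version below uses the classic definition on log2 values with an assumed pseudocount of 1; tools such as DESeq2 plot the mean of normalized counts on the x-axis instead, so the exact axes depend on the software. The counts and the cutoffs in `needs_caution` are illustrative assumptions.

```python
import math

def ma_coordinates(treated_mean, control_mean, pseudocount=1.0):
    """Return (A, M): average log2 expression and log2 ratio."""
    t = math.log2(treated_mean + pseudocount)
    c = math.log2(control_mean + pseudocount)
    return (t + c) / 2, t - c

def needs_caution(a, m, min_a=3.0, min_abs_m=2.0):
    """Flag genes that combine low average expression with an extreme fold change."""
    return a < min_a and abs(m) >= min_abs_m

low = ma_coordinates(7, 0)         # a handful of reads in one condition only
high = ma_coordinates(2047, 1023)  # a well-expressed gene

print(low, needs_caution(*low))    # (1.5, 3.0) True
print(high, needs_caution(*high))  # (10.5, 1.0) False
```

The low-count gene shows the larger fold change, yet it rests on about seven reads, which is why it is flagged for a cautious reading.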

In DESeq2 workflows, log2 fold-change shrinkage (for example with lfcShrink) pulls noisy estimates from low-count genes toward zero. This can improve interpretability, especially for ranking genes and visualizing effect sizes.


PCA plots and sample-level interpretation

Before interpreting differentially expressed genes, inspect sample-level structure.

PCA plots and sample distance heatmaps can reveal whether samples cluster by:

  • biological condition;
  • batch;
  • time point;
  • patient or donor;
  • sequencing run;
  • outlier status.

If samples do not behave as expected, the differential expression table may be unreliable.

For example, if samples cluster by sequencing batch rather than condition, batch effects may dominate the result.

If one replicate is far away from all others, it may drive many apparent expression changes.

If controls and treatments do not separate at all, the biological effect may be weak, noisy or absent.

PCA does not prove differential expression, but it helps assess whether the dataset is coherent.
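A sample distance matrix, the input to the distance heatmaps mentioned above, can be sketched in a few lines. This example uses log2(count + 1) as a simple stand-in for the variance-stabilizing transformations used in real workflows, and made-up counts for four genes in four hypothetical samples.

```python
import math

def log_transform(counts):
    """Simple log2(count + 1) transform; an assumption standing in for VST or rlog."""
    return [math.log2(c + 1.0) for c in counts]

def distance_matrix(samples):
    """Euclidean distance between every pair of samples on transformed counts."""
    names = list(samples)
    logged = {name: log_transform(samples[name]) for name in names}
    return {a: {b: math.dist(logged[a], logged[b]) for b in names} for a in names}

def nearest_neighbour(distances, name):
    """Name of the sample closest to the given one."""
    others = {k: v for k, v in distances[name].items() if k != name}
    return min(others, key=others.get)

# Made-up counts; sample names are hypothetical.
samples = {
    "ctrl_1": [100, 50, 10, 500],
    "ctrl_2": [110, 45, 12, 480],
    "trt_1": [400, 50, 2, 490],
    "trt_2": [380, 55, 3, 510],
}
distances = distance_matrix(samples)
for name in samples:
    print(name, "is closest to", nearest_neighbour(distances, name))
```

If a sample's nearest neighbour belongs to another condition, or shares its batch rather than its condition, that is worth investigating before reading the gene list.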


Do not interpret genes without annotation context

A gene ID alone is rarely enough.

For biological interpretation, you need annotation.

Useful annotation layers may include:

  • gene product name;
  • functional category;
  • KEGG pathway;
  • COG category;
  • Gene Ontology terms;
  • Pfam domains;
  • enzyme commission number;
  • operon context;
  • antimicrobial resistance database hits;
  • secretion system or virulence annotations.

This is especially important in microbial RNA-seq, where many genes may be annotated as hypothetical proteins.

A result table full of hypothetical proteins is difficult to interpret unless additional functional annotation is performed.

If the organism is non-model, recently assembled or poorly annotated, genome annotation quality may strongly affect transcriptomics interpretation.


Look for patterns, not only individual genes

Individual genes can be important, but expression patterns are often more informative.

Look for coordinated changes in:

  • metabolic pathways;
  • transport systems;
  • stress response genes;
  • ribosomal proteins;
  • motility genes;
  • secretion systems;
  • virulence factors;
  • carbohydrate-active enzymes;
  • regulatory genes;
  • operons;
  • gene clusters.

For example, if several genes from the same pathway are upregulated, the biological interpretation is stronger than if only one isolated gene appears significant.

In bacterial transcriptomics, operon structure can be especially informative. Genes located together and transcribed together may show similar expression patterns.
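Coordinated changes can be summarized by grouping genes by operon and counting how many change significantly in each direction. The sketch below assumes an operon assignment is already available for each gene; the operon names, gene IDs and values are made up, and the 0.05 threshold is illustrative.

```python
from collections import defaultdict

def operon_summary(genes, padj_max=0.05):
    """Count significantly up- and downregulated genes per operon."""
    summary = defaultdict(lambda: {"up": 0, "down": 0, "total": 0})
    for gene in genes:
        entry = summary[gene["operon"]]
        entry["total"] += 1
        if gene["padj"] is not None and gene["padj"] < padj_max:
            direction = "up" if gene["log2FoldChange"] > 0 else "down"
            entry[direction] += 1
    return dict(summary)

# Made-up genes with hypothetical operon assignments.
genes = [
    {"gene_id": "g1", "operon": "operon_1", "log2FoldChange": 1.8, "padj": 0.001},
    {"gene_id": "g2", "operon": "operon_1", "log2FoldChange": 1.5, "padj": 0.004},
    {"gene_id": "g3", "operon": "operon_1", "log2FoldChange": 2.1, "padj": 0.0005},
    {"gene_id": "g4", "operon": "operon_2", "log2FoldChange": 2.4, "padj": 0.01},
    {"gene_id": "g5", "operon": "operon_2", "log2FoldChange": -0.1, "padj": 0.9},
    {"gene_id": "g6", "operon": "operon_2", "log2FoldChange": 0.2, "padj": 0.7},
]
summary = operon_summary(genes)
for operon, counts in summary.items():
    print(operon, counts)
```

All three genes of operon_1 move together, while only one gene of operon_2 changes. The first pattern is easier to interpret; the second may still be real, but it calls for a look at the individual counts and at the operon prediction itself.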


Functional enrichment and pathway analysis

Differential expression tables can be long and difficult to interpret manually.

Functional enrichment helps summarize whether certain biological processes, pathways or functional categories are overrepresented among differentially expressed genes.

Depending on the organism and annotation, this may include:

  • GO enrichment;
  • KEGG pathway enrichment;
  • COG category enrichment;
  • Reactome pathways;
  • custom functional groups;
  • manually curated gene sets.

For microbial datasets, KEGG, COG, eggNOG, Pfam and custom pathway annotations may be more useful than generic GO terms alone.

However, enrichment results also require caution.

They depend on:

  • annotation quality;
  • background gene set;
  • statistical method;
  • threshold used to define differentially expressed genes;
  • pathway database coverage;
  • organism-specific biology.

Enrichment analysis is useful, but it is not automatic biological truth.
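The dependence on the background gene set can be seen with a small over-representation test. The sketch below uses a one-sided hypergeometric test, one common choice among several statistical methods, with made-up numbers. For simplicity it assumes all 40 pathway genes are present in both backgrounds.

```python
from math import comb

def enrichment_pvalue(background, in_set, de_total, de_in_set):
    """P(overlap >= de_in_set) when de_total genes are drawn from the background."""
    upper = min(in_set, de_total)
    numerator = sum(
        comb(in_set, k) * comb(background - in_set, de_total - k)
        for k in range(de_in_set, upper + 1)
    )
    return numerator / comb(background, de_total)

# Made-up example: 200 differentially expressed genes, 10 of them in a 40-gene pathway.
p_all_genes = enrichment_pvalue(background=4000, in_set=40, de_total=200, de_in_set=10)
p_expressed = enrichment_pvalue(background=1000, in_set=40, de_total=200, de_in_set=10)

print(f"background = all 4000 annotated genes: p = {p_all_genes:.2e}")
print(f"background = 1000 expressed genes:     p = {p_expressed:.2e}")
```

With the whole genome as background, 10 pathway genes among 200 looks like a strong enrichment, because only 2 would be expected by chance. With a background of 1000 expressed genes, 8 would be expected, and the same overlap is unremarkable. The raw p-values from many pathways would also need multiple-testing correction.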


Beware of overinterpreting small datasets

Small RNA-seq datasets can produce unstable results.

With few biological replicates, it becomes harder to estimate variability. This affects statistical power, adjusted p-values and confidence in gene-level changes.

A dataset with two replicates per condition may still show patterns, but conclusions should be cautious.

A dataset with no biological replication should usually be considered exploratory.

In such cases, focus more on:

  • broad expression trends;
  • large and biologically plausible changes;
  • consistency with independent evidence;
  • transparent reporting of limitations.

Avoid making strong mechanistic claims from weakly powered experiments.


Common interpretation mistakes

Common mistakes include:

  • ranking genes by raw p-value;
  • ignoring adjusted p-values;
  • focusing only on fold change;
  • ignoring low expression;
  • overinterpreting volcano plots;
  • forgetting batch effects;
  • interpreting genes without checking annotation;
  • treating all significant genes as equally important;
  • ignoring sample-level QC;
  • failing to inspect individual gene counts;
  • using pathway analysis without checking the gene background;
  • turning exploratory results into strong claims.

Most interpretation errors happen when the result table is treated as a final answer rather than a statistical starting point.

For technical pitfalls before interpretation, see: Common DESeq2 Mistakes and How to Avoid Them


Practical checklist for interpreting differential expression

Before writing conclusions, check:

  1. Was the correct comparison tested?
  2. Is the reference condition correct?
  3. Were batch effects considered?
  4. Do PCA plots look reasonable?
  5. Are there outlier samples?
  6. Are genes filtered appropriately?
  7. Are adjusted p-values used?
  8. Are fold changes interpreted with expression level?
  9. Are key genes supported by normalized counts?
  10. Is annotation reliable?
  11. Are pathway-level patterns coherent?
  12. Are limitations clearly reported?

This checklist helps avoid overinterpretation and improves the quality of downstream biological conclusions.


Final thoughts

Differential gene expression analysis is not just a statistical procedure. It is an interpretation workflow.

A result table can identify candidate genes, but biological meaning comes from integrating statistics, expression levels, sample quality, annotation, pathways and experimental context.

Strong RNA-seq interpretation requires both computational care and biological judgment.

If you are working with RNA-seq data and need support with differential expression analysis, DESeq2 workflows, pathway interpretation or publication-ready figures, Tailoredomics offers transcriptomics services adapted to microbial and biological research projects.


FAQ

What does log2 fold change mean in RNA-seq?

Log2 fold change represents the estimated expression difference between conditions on a log2 scale. A value of 1 means 2-fold higher expression, while -1 means 2-fold lower expression (half the reference level).

Should I use p-values or adjusted p-values?

Adjusted p-values should usually be used for interpreting RNA-seq differential expression because thousands of genes are tested simultaneously.

Is a volcano plot enough to interpret RNA-seq results?

No. A volcano plot is useful for visualization, but it should be interpreted together with sample QC, normalized counts, annotation and pathway context.

Why are some large fold changes not significant?

Large fold changes may not be statistically significant if expression is low, variability is high, or sample size is limited.

What should I do after getting a list of differentially expressed genes?

Inspect sample quality, check key gene counts, annotate genes, look for pathway-level patterns and interpret results in relation to the biological question.

Rubén Javier López

Founder and Bioinformatician, PhD in Microbiology

Rubén holds a PhD in microbiology from the University of Bergen (Norway). He is proficient in bacterial metagenomics, genomics and transcriptomics. He has hands-on experience and data analysis expertise in Illumina, Nanopore and PacBio sequencing technologies and has collaborated with scientists and labs all over the world. He has also worked with biomedical research groups, analyzing microbiome and mycobiome data.

Areas of Expertise: Microbiology, Extremophiles, NGS, Microbial Genomics, Transcriptomics, Differential Gene Expression, Metagenomics, Microbiome studies.
