Inferentus Launching October 1, 2025!
Data analyses à la carte

Prokaryotic 16S metabarcoding

Code: MTBC16S01

Analysis of prokaryotic 16S rRNA gene short-read amplicon sequences, including OTU clustering, taxonomic identification and abundance profiling, tree construction, functional predictions, dissimilarity metrics, MDS, visualizations and screening for common pathogens. The client provides raw amplicon sequence data and sample metadata.

Log in to see availability and payment modalities.

▾ General introduction

Bacterial and archaeal 16S metabarcoding is a genetic technique for surveying microbial diversity in a variety of environments, ranging from soil, the ocean and groundwater to animal feces and the human gut. A gene called 16S rRNA gene, shared by all bacteria and archaea but varying slightly between species, is used as a marker gene to identify which microbial taxa are present in a sample and at which proportions.

In a single teaspoon of soil, for example, 16S metabarcoding can identify hundreds of distinct species, yielding a high-resolution microbial blueprint of each sample that can be used for statistical comparisons across treatments, space or time, for visualization of local biodiversity, or for the detection of potential pathogens. Due to its relatively low cost and practicality, 16S metabarcoding is used in thousands of microbiome studies worldwide, for example to examine the effects of climate on microbial processes in the ocean, to determine potential disease factors in humans, and to improve microbially driven wastewater treatment processes.

A typical 16S metabarcoding study proceeds in the following stages:

  • Collection of small amounts (<1 g) of material from each sample by the researcher.
  • Extraction of DNA from each sample using an in-house or commercial kit. This step is sometimes outsourced to an academic or commercial service provider.
  • Amplification of DNA fragments belonging to a specific region of the 16S rRNA gene using PCR, library preparation and sequencing of the amplified DNA. This step is commonly performed by an academic or commercial service provider. The most widespread technology is short read Illumina sequencing, which yields large numbers of sequences around 150-300 bp long.
  • Sequencing ultimately yields a separate set of DNA sequences for each sample, all covering the same region of the 16S rRNA gene, ranging from thousands to millions of sequences per sample. These data are commonly stored in fastq files, which are delivered by the sequencing service provider to the researcher.
  • Computational analysis of the sequences, including trimming and removal of poor quality (i.e., likely erroneous) sequences, clustering of similar sequences to reduce redundancy and identify species-like units called OTUs and strain-like units called ASVs, and estimation of the relative abundance of each OTU/ASV/taxon in each sample.
  • Statistical analysis, hypothesis testing and visualization of microbial community compositions. This step generally incorporates additional sample metadata, such as information about treatment groups, chemical measurements at each site, disease symptoms in human subjects, and so on.

We are eager to help you with your data analysis. Just upload your sequences and metadata, optionally configure the analysis to your preferences, and we can handle it from there.

▸ Overview of provided analysis
▸ Input requirements
▸ Examples of data products
▸ Examples of generated figures
▸ Used 3rd party resources
▸ Relevant publications
▸ Price and billing