E. coli is one of the most widely used expression hosts for the production of recombinant proteins. It is often the first system chosen because of its advantages over other systems, such as:

  • Inexpensive setup and running costs
  • High recombinant protein production levels 
  • Short timeline from cloning to protein recovery
  • Limited technical knowledge required for culturing
  • Scalability from small (1 mL) to very large (>10,000 L) culture volumes

However, bacterial expression systems have a limited capacity for post-translational modifications (PTMs) and disulphide bond formation. Many proteins, particularly those of eukaryotic origin, require some form of PTM to adopt their native conformation. Because the bacterial cytoplasm is a reducing environment, disulphide bonds generally form only when the protein is targeted to the oxidising periplasm. Some modern E. coli strains have been engineered to overcome some of these limitations (see below).

Bacterial Expression Hosts

A wide range of bacterial strains is available for recombinant protein production, each offering distinct advantages. The most common strains carry the DE3 lysogen, which encodes T7 RNA polymerase and so drives high-level expression of genes placed under the control of the T7 promoter in the vector. In these systems, recombinant protein expression is repressed by the host strain (and additionally by the vector when a pET vector is used) until induction with IPTG.

Another feature common to a range of host strains is pLysS, a plasmid carrying the gene for T7 phage lysozyme, which is an inhibitor of T7 RNA polymerase. This system gives tighter control over protein expression and is particularly useful for expressing toxic proteins, or when expression is observed prior to induction ('leaky' expression).

Some common E. coli expression strains are listed below.

Strain                 Features
BL21 (DE3)             Most common host strain; enables high-level recombinant protein expression
BL21 (DE3) pLysS       Enables high-level expression while suppressing basal T7 RNA polymerase activity before induction
BL21-CodonPlus         Improved expression of genes with codons rarely used in bacteria
Rosetta                Improved expression of genes with codons rarely used in bacteria
ArcticExpress          Contains a plasmid encoding chaperonins that aid folding and allow expression at low temperature (12°C)
Tuner                  Enables uniform uptake of IPTG into each cell; expression can be tuned by varying the IPTG concentration
SHuffle® T7 Express    Expresses DsbC to enable cytoplasmic disulphide bond formation

Optimising Expression in Bacteria

Generating soluble, active protein is a major challenge for recombinant protein production in E. coli. Over-expression of eukaryotic proteins often leads to the formation of insoluble aggregates known as inclusion bodies. Although inclusion bodies generally contain the target protein at high yield and purity, the material must be denatured and refolded to restore the native protein structure.

Several methods have been developed to combat the formation of inclusion bodies and drive soluble protein expression. Some of these are described briefly below.

Host strain selection

Selection of the right cell strain is critical for successful soluble protein production. Advances in host cell engineering have led to the development of several key capabilities that should be taken into consideration when choosing a strain for expression:

  • Rare codon usage: These strains carry an extra plasmid (e.g. pRARE) that encodes tRNAs for a number of rare codons. This allows more efficient expression of genes that contain these rare codons (generally those of eukaryotic origin)
  • Improved folding: These strains carry either mutations in genes that would otherwise prevent disulphide bond formation in the cytoplasm, or a chromosomal copy of a disulphide bond isomerase. This enables cytoplasmic production of proteins requiring disulphide bonds
  • Induction control: These strains contain a mutation in the lac permease, allowing homogeneous uptake of IPTG into all cells in the culture. This enables finer control over induction by varying the concentration of IPTG
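To judge whether a rare-codon strain is worth trying, it can help to count how many rare codons a gene actually contains. The sketch below is a minimal illustration, not a validated tool. It assumes the six codons usually quoted for the original pRARE plasmid (AGG, AGA, AUA, CUA, CCC and GGA); other plasmids and strains cover different sets. The function names are hypothetical.

```python
# Codons assumed to be covered by pRARE (assumption: the six-codon set of
# the original Rosetta plasmid; check the supplier's documentation).
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def find_rare_codons(cds: str) -> list[tuple[int, str]]:
    """Return (1-based codon position, codon) for each rare codon in an in-frame CDS."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    hits = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon in RARE_CODONS:
            hits.append((i // 3 + 1, codon))
    return hits


def rare_codon_fraction(cds: str) -> float:
    """Fraction of codons in the CDS that belong to the rare set."""
    n_codons = len(cds) // 3
    return len(find_rare_codons(cds)) / n_codons if n_codons else 0.0


# Usage: a short made-up sequence (ATG AGG CTG ATA AAA)
example = "ATGAGGCTGATAAAA"
print(find_rare_codons(example))
print(rare_codon_fraction(example))
```

A high fraction, or clusters of consecutive rare codons, would be one argument for a rare-codon strain or for codon optimisation.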

Growth temperature

Although most bacterial strains are typically cultured at 37°C, lowering the temperature (e.g. to 30°C or 15°C) often improves protein folding and increases soluble protein production.

Expression with a fusion tag

Fusion partners such as affinity or solubility tags can have a significant effect on the recovery of a protein. There are several solubility tags that are commonly used to improve soluble protein expression, including maltose binding protein (MBP), glutathione S-transferase (GST), thioredoxin (Trx) and small ubiquitin-like modifier (SUMO).


Co-expression with chaperonins and foldases

Co-expression of recombinant proteins with chaperonins and foldases can aid expression in several ways. Chaperonins such as cpn10 and cpn60 bind to and stabilise unfolded or partially folded proteins. Co-expression with foldases such as disulphide oxidoreductase (DsbA) or disulphide isomerase (DsbC) can aid in producing soluble protein when disulphide bond formation is required.

Codon optimisation

Codon optimisation is the process of modifying the codons in a gene sequence to match the codon usage bias of the host cell used for expression, without changing the encoded amino acid sequence. It is frequently applied to heterologous proteins expressed in E. coli, as codon usage varies heavily between prokaryotes and eukaryotes. The sequence is optimised in software and the resulting gene is then synthesised.
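As a rough illustration of the idea, the sketch below replaces each codon with the synonymous codon that has the highest frequency in a caller-supplied usage table. It is a deliberately naive "most frequent codon" approach under stated assumptions. The function names are hypothetical, and the usage values are placeholders chosen only to show the direction of the bias, not measured frequencies. Real optimisation software typically weighs additional factors.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    return "".join(GENETIC_CODE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimise_codons(cds: str, usage: dict[str, float]) -> str:
    """Swap each codon for the most frequent synonymous codon listed in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged.
    """
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Pick the highest-frequency codon for each amino acid in the table.
    best: dict[str, str] = {}
    for codon, freq in usage.items():
        aa = GENETIC_CODE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        recoded.append(best.get(GENETIC_CODE[codon], codon))
    return "".join(recoded)


# Placeholder values for illustration only; use a real host usage table.
toy_usage = {"CTG": 0.50, "CTA": 0.04, "CGT": 0.40, "AGG": 0.02,
             "AAA": 0.75, "AAG": 0.25}

original = "ATGCTAAGGAAGTAA"
optimised = optimise_codons(original, toy_usage)
print(optimised)
# The encoded protein must be unchanged by the recoding.
assert translate(original) == translate(optimised)
```

The final assertion is the key sanity check for any such tool: the nucleotide sequence changes, but the translated protein does not.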


Culture media

LB broth is the most commonly used medium for cloning and bacterial expression. Although this medium is simple and cheap to prepare, it does not support high-level biomass production. Other formulations, such as TB, 2xYT or chemically defined media, enable higher-density cultures in shake flasks and bioreactors.


Antibiotics

Antibiotics enable selection of recombinant clones and help prevent contamination during cloning and expression. However, in large-scale expression their use is often reduced or omitted because of the associated cost. Antibiotic selection pressure also increases the metabolic burden on the cell, which can reduce recombinant protein expression.