Rescue strategies
Even in the best of circumstances, it is unusual to generate a soluble version of any given protein on the first attempt. It is therefore important to have a series of alternative approaches. Here we provide various suggestions in the order in which we would usually apply them.
Changing expression conditions. Adjustment of the expression conditions seldom results in radical changes but, as some optimization can be done quite easily, it is worth the effort. The first step is to lower the temperature to slow down protein production. Different types of media can also be tested; rich media, such as Terrific Broth, 2×YT or ZYP5052 (auto-induction), often support good expression. Changing the E. coli strain can also improve expression of a soluble protein51.
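As a planning aid, the small screen described above can be laid out as a grid of conditions. The following is a minimal sketch, not a prescribed protocol: the media are those named in the text, whereas the temperatures and the strain labels are illustrative assumptions that should be replaced with whatever is available in the laboratory.

```python
from itertools import product

# Media named in the text; temperatures and strain labels are illustrative
# assumptions, not values given in the source.
MEDIA = ["Terrific Broth", "2xYT", "ZYP5052 (auto-induction)"]
TEMPERATURES_C = [37, 25, 18]        # assumed values, highest first
STRAINS = ["strain_A", "strain_B"]   # hypothetical placeholder names


def expression_screen(media=MEDIA, temperatures=TEMPERATURES_C, strains=STRAINS):
    """Return every (strain, medium, temperature) combination as a dict."""
    return [
        {"strain": s, "medium": m, "temperature_c": t}
        for s, m, t in product(strains, media, temperatures)
    ]


conditions = expression_screen()
print(len(conditions), "conditions to test")
for condition in conditions[:3]:
    print(condition)
```

A full grid grows quickly, so in practice one might vary the temperature first and test media or strains only for constructs that remain insoluble.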
Expression of more variants of the protein sequence. As described above, it is important to test the expression of a range of constructs to identify those that express a soluble derivative. We suggest expressing as many as 10 constructs in the initial attempts. If this proves unsuccessful, then it may be advisable to explore additional constructs, particularly if one has knowledge that a structurally related protein can be expressed in soluble form.
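One simple way to generate a set of constructs is to vary the N- and C-terminal boundaries around a predicted domain. The sketch below assumes this approach; the function name `design_constructs`, the offsets of 5 residues and the toy sequence are all hypothetical choices made for illustration, and only the cap of 10 constructs is taken from the text.

```python
from itertools import product


def design_constructs(sequence, start, end, offsets=(-5, 0, 5), max_constructs=10):
    """Enumerate truncation constructs around a predicted domain.

    start and end are 1-based, inclusive residue numbers of the predicted
    domain (an assumed input, e.g. from a domain prediction or homolog).
    Boundaries are shifted by each offset and clamped to the sequence ends.
    """
    n = len(sequence)
    constructs, seen = [], set()
    for d_start, d_end in product(offsets, offsets):
        s = max(1, start + d_start)
        e = min(n, end + d_end)
        if s >= e or (s, e) in seen:
            continue  # skip empty or duplicate boundary pairs
        seen.add((s, e))
        constructs.append(
            {"name": f"res{s}-{e}", "start": s, "end": e, "sequence": sequence[s - 1:e]}
        )
    return constructs[:max_constructs]


# Toy 100-residue placeholder, not a real protein sequence.
toy_sequence = "A" * 100
for construct in design_constructs(toy_sequence, start=20, end=80):
    print(construct["name"], len(construct["sequence"]))
```

Three offsets at each terminus give nine constructs, which stays within the suggested initial set; adding offsets widens the search when the first round fails.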
Alternate tags. Our consensus strategy is to append an N-terminal histidine tag to each construct. If the histidine-tagged recombinant protein does not express or is insoluble, then the probability that it will be expressed in an active form with another N-terminal fusion partner is reduced considerably. Our advice, therefore, is not to iteratively append different N-terminal fusions but to first try moving the histidine tag to the C terminus. Some proteins that are completely insoluble with an N-terminal histidine tag can be expressed in soluble form with a C-terminal histidine tag57.
Although we do not advise extensive sampling of other N-terminal fusions, this strategy can sometimes lead to production of soluble, stable fusion protein. If the aim is to study the function of the target protein, and the fusion protein is an acceptable reagent, then it may be an appropriate strategy. This approach has its caveats, however. In the absence of a robust and quantitative functional assay, one reasonably uses solubility as a proxy for function. Yet proteins that are soluble only with a larger tag can be ‘dragged’ into solution by the tag and revert to an insoluble form if the fusion partner is removed38–40. This indicates that the integrity of the recombinant protein within the fusion may be suspect. For example, wild-type GFP is mostly insoluble when expressed in E. coli at 37 °C but is largely expressed in the soluble fraction as an MBP fusion58. Nonetheless, bacterial colonies expressing the MBP-GFP fusions display only weak fluorescence, suggesting that the GFP is non-functional (G.S. Waldo; unpublished data). Accordingly, before any functional studies, considerable attention should be paid to whether a target protein appears to be soluble only because it is a passenger on a larger tag.
Coexpression of interacting proteins. Many proteins are obligate components of multiprotein assemblies and these often require an interacting protein for correct folding and stability21,59,60. Such proteins, and those with unstructured polypeptide chain segments, often cannot be expressed in E. coli in soluble form, but it has proven possible to improve the properties of these proteins by coexpressing the cognate interacting protein61–63. This strategy is only starting to be used in the large-scale projects, in those cases when entire families of interacting proteins are being studied.
Ligand supplementation. Many proteins can be stabilized by the binding of a small molecule, a principle that has found widespread application in generic screening for protein ligands64,65. This property can be exploited to increase the proportion of recombinant protein expressed in soluble form or to stabilize a protein during purification. If a sufficiently soluble, cell-permeable and avid ligand is available, one can use it to stabilize newly synthesized proteins and promote solubility66,67. Like coexpression, this concept has not yet been explored systematically.
Other expression hosts. If bacterial expression is unsuccessful to this point, other hosts should be considered. Common alternatives are the baculovirus expression system in insect cells68, the yeasts Pichia pastoris69 and Saccharomyces cerevisiae70, human cells71 and cell-free systems using prokaryotic or eukaryotic extracts72–76. These cell-free systems, which have been used extensively to generate thousands of purified proteins for structural studies77–79, can be used to produce proteins that are toxic to E. coli79 and can use PCR-amplified linear DNA fragments, without cloning into a vector, for screening and optimization.
All these other expression systems are reasonably simple to use, but they are somewhat more time-consuming to work with than are bacteria and require equipment less commonly found in a typical laboratory.
Coexpression of chaperones. Proper in vivo folding of a recombinant protein can be promoted by coexpression of molecular chaperones, which are typically produced from cotransformed plasmids carrying several chaperones with synergistic effects, such as the pG-Tf2 vector80, which combines GroEL-GroES81 and trigger factor82. In our hands, chaperones have been used successfully only in isolated cases, and we know of no large-scale study that has demonstrated broad efficacy.
Refolding. A commonly tried but only occasionally successful protocol to rescue insoluble protein is to denature the protein and try to refold it in vitro. The method can be successful83,84, particularly for extracellular proteins. However, even the most robust protocols refold only a small fraction of the input protein, and it is difficult to purify the refolded fraction. The best procedures use an activity assay to monitor refolding, and affinity reagents that select any refolded, active protein. We would advise using refolding as a last resort for intracellular proteins.