Random Mutagenesis by PCR

David Wilson and Tony Keefe, March 2000

This also appears in "Current Protocols in Molecular Biology"

 

Error-prone PCR (EP-PCR) is the method of choice for introducing random mutations into a defined segment of DNA that is too long to be chemically synthesized as a degenerate sequence (UNIT 8.2A). The 5' and 3' boundaries of the mutated region are defined by the choice of PCR primers. Accordingly, it is possible to mutagenize an entire gene or merely a segment of a gene. The average number of mutations per DNA fragment can be controlled by the number of EP-PCR doublings performed.

The EP-PCR technique described here is based on the protocol of Cadwell and Joyce (1992). This reference also discusses the history of the technique and compares it to other in vitro and in vivo mutagenic methods. EP-PCR takes advantage of the inherently low fidelity of Taq DNA polymerase, which may be further decreased by adding Mn2+, increasing the Mg2+ concentration, and using unequal dNTP concentrations.

STRATEGIC PLANNING

After choosing a region of DNA to randomly mutagenize, one must decide on the desired level of mutagenesis that is best suited to the project. If the mutation rate is too low, it may not be possible to find the potentially rare variants of interest. If the mutation rate is too high, nearly all of the resulting library of molecules will carry multiple mutations and may therefore be inactive. The desired extent of mutation depends on the type of activity one is attempting to generate and the number of library members that can be screened. A reasonable approach in many instances is to generate a library such that a few unmutagenized molecules will be present in the collection of screened clones.

The average number of mutations per template increases as a function of the number of doublings in the EP-PCR reaction (Table 1). It is the number of doublings that is the determining factor, not the number of EP-PCR cycles. Each cycle of EP-PCR generally increases the amount of DNA by a factor of 1.7 to 1.9 until the DNA concentration reaches a plateau and stops increasing altogether. The point at which this plateau occurs depends on the template and primer lengths and sequences, but is generally in the range of 5 to 50 ng/µL. It is not advisable to continue thermal cycling beyond the plateau point.
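The distinction between cycles and doublings can be made concrete with a small calculation. The sketch below assumes a constant fold-amplification per cycle and ignores the plateau; the function names are illustrative and not part of the published protocol.

```python
import math

def doublings_from_cycles(cycles, fold_per_cycle):
    """Doublings achieved after `cycles` cycles at a constant per-cycle amplification."""
    return cycles * math.log2(fold_per_cycle)

def cycles_for_doublings(doublings, fold_per_cycle):
    """Smallest whole number of cycles that gives at least `doublings` doublings."""
    return math.ceil(doublings / math.log2(fold_per_cycle))

# Compare the typical range of per-cycle amplification quoted in the text.
for fold in (1.7, 1.8, 1.9):
    print(fold,
          round(doublings_from_cycles(12, fold), 1),
          cycles_for_doublings(10, fold))
```

Under these assumptions, a reaction amplifying 1.8-fold per cycle needs 12 cycles to reach 10 doublings, whereas one amplifying 1.7-fold per cycle needs 14.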

Prior to performing the actual EP-PCR reaction, it is important to run a sample reaction to determine the amplification efficiency under the EP-PCR reaction conditions. This should be evaluated for two reasons. First, if the amplification per cycle is too low (<1.7-fold increase in product DNA concentration per cycle), DNA fragments that contain one or both of the primer-binding sites, but are shorter than the desired product, may have a strong selective advantage for amplification. These shorter fragments are produced by mis-priming during normal or error-prone PCR. After several cycles, these shorter sequences may "take over" the EP-PCR reaction. This can be an especially severe problem when many cycles (>15) are to be performed. Second, the amplification per cycle must be known in order to calculate the number of EP-PCR cycles necessary to achieve the desired number of doublings.

The amount of DNA amplification per EP-PCR cycle can be determined by diluting a known amount of the unmutagenized PCR product, amplifying it using the EP-PCR protocol, and periodically removing aliquots of the reaction for quantitation on an ethidium bromide-stained agarose gel (UNIT 2.7). The amplification per cycle should generally be >1.7-fold. The yield per cycle can be optimized by altering the annealing temperature. The extension time should also be at least 3 min. If this does not bring satisfactory results, longer or shorter primers may have to be used. Primer lengths of 20 to 40 nucleotides usually produce acceptable results.
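One way to turn such gel measurements into a per-cycle amplification factor is to fit a straight line to the logarithm of the DNA amount against cycle number. The sketch below is a minimal least-squares fit; the band intensities are hypothetical values chosen only for illustration, and the fit assumes all time points lie below the plateau.

```python
import math

def fold_per_cycle(cycles, amounts):
    """Least-squares slope of ln(amount) versus cycle number, returned as fold-change per cycle."""
    n = len(cycles)
    logs = [math.log(a) for a in amounts]
    mean_c = sum(cycles) / n
    mean_l = sum(logs) / n
    num = sum((c - mean_c) * (l - mean_l) for c, l in zip(cycles, logs))
    den = sum((c - mean_c) ** 2 for c in cycles)
    return math.exp(num / den)

# Hypothetical band intensities (arbitrary units) from aliquots removed at these cycles.
cycles = [4, 6, 8, 10]
amounts = [1.0, 3.2, 10.5, 34.0]
print(round(fold_per_cycle(cycles, amounts), 2))
```

A fitted value below about 1.7 would suggest that the annealing temperature, extension time, or primers need further optimization before the full EP-PCR is attempted.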

Table 1 outlines the average number of nucleotide substitutions per template as a function of the number of EP-PCR doublings and the length of the template. Table 2 shows what fraction of the resulting products will be completely free from mutation.
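The tabulated values can be approximated with a short calculation. The sketch below assumes a constant substitution rate of 0.00066 per nucleotide per doubling (inferred here from Table 1 as 0.0033 / 5) and independent mutation of each position; both are simplifying assumptions, and the function names are illustrative.

```python
# Assumed rate: substitutions per nucleotide position per doubling (0.0033 / 5 from Table 1).
RATE_PER_DOUBLING = 0.00066

def mean_mutations(length_bp, doublings, rate=RATE_PER_DOUBLING):
    """Average number of substitutions per template."""
    return length_bp * doublings * rate

def fraction_unmutated(length_bp, doublings, rate=RATE_PER_DOUBLING):
    """Fraction of templates with no substitution, assuming independent positions."""
    return (1.0 - doublings * rate) ** length_bp

print(round(mean_mutations(400, 10), 1))
print(round(fraction_unmutated(400, 10), 3))
```

Multiplying the unmutated fraction by the number of clones to be screened gives a rough estimate of how many unmutagenized molecules to expect in the screened collection.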

In most cases the mutagenized DNA of interest will encode a protein. The fraction of mutated amino acids will be higher than the fraction of mutated nucleotides by a factor of about 2.2. This is because a mutation in any of the three positions of a codon may result in an amino acid substitution. If the initial template is a random open reading frame (equal probability of each nucleotide at each position in each codon), mutation at the first position of a codon will cause an amino acid change 96% of the time; mutation at the second and third positions will cause amino acid changes 100% and 23% of the time, respectively (as calculated using the mutation frequencies in Table 5).
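The factor of about 2.2 follows from adding the three per-position probabilities quoted above. The sketch below shows the arithmetic; it assumes mutation rates low enough that two substitutions in the same codon can be neglected, and the names are illustrative.

```python
# Probability that a substitution at codon position 1, 2, or 3 changes the amino acid
# (values quoted in the text for a random open reading frame).
P_AA_CHANGE = (0.96, 1.00, 0.23)

def aa_rate_factor(p_change=P_AA_CHANGE):
    """Ratio of amino acid substitutions per codon to nucleotide substitutions per position."""
    return sum(p_change)

def aa_mutations_per_codon(nt_rate):
    """Approximate amino acid substitution rate per codon for a given per-nucleotide rate."""
    return nt_rate * aa_rate_factor()

print(round(aa_rate_factor(), 2))
```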

Table 1. The average number of mutations per DNA template as a function of template length and number of EP-PCR doublings.

EP-PCR      Mutations per
doublings   nucleotide position   100 bp   200 bp   400 bp   800 bp   1600 bp
5           0.0033                0.33     0.66     1.3      2.6      5.3
10          0.0066                0.66     1.3      2.6      5.3      11
20          0.013                 1.3      2.6      5.3      11       21
30          0.020                 2.0      4.0      7.9      16       32
50          0.033                 3.3      6.6      13       26       53

Table 2. The fraction of un-mutated DNA templates as a function of template length and number of EP-PCR doublings.

EP-PCR      Mutations per
doublings   nucleotide position   100 bp   200 bp    400 bp       800 bp         1600 bp
5           0.0033                0.72     0.52      0.27         0.071          0.0050
10          0.0066                0.52     0.27      0.071        0.0050         2.5 × 10^-5
20          0.013                 0.26     0.070     0.0049       2.4 × 10^-5    5.8 × 10^-10
30          0.020                 0.14     0.018     0.00033      1.1 × 10^-7    1.3 × 10^-14
50          0.033                 0.035    0.0012    1.5 × 10^-6  2.2 × 10^-12   4.8 × 10^-24

Table 3. The average number of amino acid mutations per open reading frame (ORF) as a function of ORF length and number of EP-PCR doublings.

EP-PCR      Mutations
doublings   per codon   33 aa   66 aa   133 aa   267 aa   533 aa
5           0.0076      0.25    0.50    1.0      2.0      4.0
10          0.015       0.50    1.0     2.0      4.0      8.0
20          0.030       1.0     2.0     4.0      8.1      16
30          0.045       1.5     3.0     6.0      12       24
50          0.076       2.5     5.0     10       20       40

Table 4. The fraction of ORFs encoding wild-type polypeptide as a function of ORF length and the number of EP-PCR doublings.

EP-PCR      Mutations
doublings   per codon   33 aa    66 aa     133 aa       267 aa         533 aa
5           0.0076      0.78     0.60      0.36         0.13           0.017
10          0.015       0.60     0.36      0.13         0.017          0.00030
20          0.030       0.36     0.13      0.017        0.00028        7.8 × 10^-8
30          0.045       0.21     0.045     0.0021       4.3 × 10^-6    1.8 × 10^-11
50          0.076       0.073    0.0053    2.8 × 10^-5  7.9 × 10^-10   6.3 × 10^-19

This EP-PCR protocol produces all types of substitution mutations, but the distribution is highly non-random. The results we obtained in one experiment are shown in Table 5. It should be noted that in the EP-PCR reaction, both top and bottom DNA strands are equally subject to mutagenesis, so mutations from G to A and from C to T, for example, are combined together.

Table 5. Observation of each type of substitution in a collection of 97 mutations generated by EP-PCR.

Type of mutation     Number of times observed
A->T and T->A        34
G->A and C->T        26
A->G and T->C        24
A->C and T->G        6
G->C and C->G        5
G->T and C->A        2
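The bias in Table 5 can be summarized as fractions of the 97 observed substitutions. The sketch below tallies transitions versus transversions and substitutions at A/T versus G/C positions; the grouping labels are illustrative, and the counts are taken directly from the table.

```python
# Counts from Table 5 (complementary substitutions are pooled, as in the table).
counts = {
    "A->T / T->A": 34,
    "G->A / C->T": 26,
    "A->G / T->C": 24,
    "A->C / T->G": 6,
    "G->C / C->G": 5,
    "G->T / C->A": 2,
}
TRANSITIONS = {"G->A / C->T", "A->G / T->C"}                # purine<->purine, pyrimidine<->pyrimidine
AT_ORIGIN = {"A->T / T->A", "A->G / T->C", "A->C / T->G"}   # original base was A or T

total = sum(counts.values())
transitions = sum(n for k, n in counts.items() if k in TRANSITIONS)
transversions = total - transitions
at_sites = sum(n for k, n in counts.items() if k in AT_ORIGIN)

for kind, n in counts.items():
    print(f"{kind}: {n / total:.1%}")
print(f"transitions: {transitions}, transversions: {transversions}")
print(f"substitutions at A/T positions: {at_sites / total:.1%}")
```

In this data set roughly two thirds of the substitutions arise at A or T positions, which is worth keeping in mind when interpreting which amino acid changes are likely to be under-represented in a library.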

 

 

BASIC PROTOCOL

Mutagenize a 400-bp DNA sequence for ten doublings to achieve a mutation rate of 0.66% per nucleotide position

Materials

PCR template, 400 bp in length

5' and 3' PCR primers, 100 µM each

dCTP and dTTP, 25 mM each, pH~7

dATP and dGTP, 5 mM each, pH~7

KCl, 2 M

MgCl2, 200 mM

Tris pH 8.3, 100 mM

MnCl2, 25 mM

Taq DNA polymerase, 5 U/µL

100- or 600-µL PCR tubes (Sarstedt)

Thermal Cycler (see UNIT 15)

TOPO T/A cloning kit (Invitrogen, Carlsbad, CA)

QIAprep kit (Qiagen)

1. Make up the following PCR reaction mixture on ice.

Reagent               Amount    Stock        Concentration in PCR reaction
Water                 51 µL
Tris pH 8.3           10 µL     100 mM       10 mM
KCl                   2.5 µL    2 M          50 mM
MgCl2                 3.5 µL    200 mM       7 mM
dCTP                  4 µL      25 mM        1 mM
dTTP                  4 µL      25 mM        1 mM
dATP                  4 µL      5 mM         0.2 mM
dGTP                  4 µL      5 mM         0.2 mM
5' primer             2 µL      100 µM       2 µM
3' primer             2 µL      100 µM       2 µM
Template DNA          10 µL     200 pg/µL    20 pg/µL
MnCl2                 2 µL      25 mM        0.5 mM
Taq DNA polymerase    1 µL      5 U/µL       0.05 U/µL
Total                 100 µL

The MnCl2 should not be added until immediately before the thermal cycling is initiated. The Taq DNA polymerase should not be added until the thermal cycling reaction has reached the first annealing step.
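The volumes in the reaction mixture follow from the usual dilution relationship (stock concentration × volume added = final concentration × total volume). The sketch below recomputes them for a 100-unit reaction volume so that the recipe can be rescaled; the data structure and function name are illustrative, and each reagent's stock and final concentrations are simply assumed to share the same unit.

```python
TOTAL_VOLUME = 100.0  # total reaction volume, in the same volume unit as the recipe

# (reagent, stock concentration, final concentration); stock and final share a unit per row.
REAGENTS = [
    ("Tris pH 8.3", 100.0, 10.0),
    ("KCl", 2000.0, 50.0),
    ("MgCl2", 200.0, 7.0),
    ("dCTP", 25.0, 1.0),
    ("dTTP", 25.0, 1.0),
    ("dATP", 5.0, 0.2),
    ("dGTP", 5.0, 0.2),
    ("5' primer", 100.0, 2.0),
    ("3' primer", 100.0, 2.0),
    ("Template DNA", 200.0, 20.0),
    ("MnCl2", 25.0, 0.5),
    ("Taq DNA polymerase", 5.0, 0.05),
]

def reaction_volumes(reagents, total_volume):
    """Volume of each stock to add, plus the water needed to reach the total volume."""
    vols = {name: final * total_volume / stock for name, stock, final in reagents}
    vols["Water"] = total_volume - sum(vols.values())
    return vols

vols = reaction_volumes(REAGENTS, TOTAL_VOLUME)
for name, v in vols.items():
    print(f"{name}: {v:.1f}")
```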

2. Place in the thermal cycler and perform about 12 PCR cycles (UNIT 15), or enough to obtain a 1000-fold (10 doublings) increase in the amount of PCR product relative to the input template. The cycling conditions will vary depending on the template and primers, but reasonable starting conditions are: 94°C for 1 min, 60°C for 1 min, 72°C for 3 min.

The annealing temperature should be kept >50°C if possible to avoid mis-priming, the frequency of which increases at the high divalent cation concentration used for EP-PCR. The 3-min extension time reduces the selective amplification of shorter, undesirable sequences produced by mis-priming.

3. Run an ethidium bromide-containing agarose gel to confirm the amount and correct molecular weight of the product (UNIT 2.7).

4. Clone and sequence a sample of the resulting PCR DNA to determine the frequency of mutations in the product. This can be done using the TOPO T/A cloning kit (Invitrogen) and the QIAprep kit (Qiagen). For information on DNA sequencing, see UNIT 7.

To achieve higher levels of mutagenesis, the template in the initial reaction must be diluted to a greater extent. Also, if more than ~15 cycles of EP-PCR are to be performed, a fresh aliquot of Taq polymerase should be added after the 15th cycle.
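The extra dilution required can be estimated from the number of doublings, since each doubling halves the starting concentration needed to finish at a given product concentration. The sketch below is a simple illustration with hypothetical function names; concentrations are in any single consistent unit, and the calculation assumes the reaction ends at or below the plateau.

```python
def final_concentration(start_conc, doublings):
    """Product concentration after the given number of doublings (no plateau assumed)."""
    return start_conc * 2 ** doublings

def starting_concentration(final_conc, doublings):
    """Template concentration needed so that `doublings` doublings end at `final_conc`."""
    return final_conc / 2 ** doublings

# Going from 10 to 20 doublings with the same end point requires 2**10-fold more dilution.
extra_dilution = starting_concentration(1.0, 10) / starting_concentration(1.0, 20)
print(extra_dilution)
```

For example, a template started at 20 concentration units ends at about 20,000 of the same units after 10 doublings; reaching 20 doublings with the same end point would require starting roughly 1000-fold more dilute.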

One problem that often occurs when attempting to achieve a large number of doublings is that PCR products smaller than the desired one "take over" the PCR reaction (see above). If this happens, first make sure that the EP-PCR conditions are optimized so that the DNA product increases at least 1.7-fold per cycle. This may require increasing the extension time to over 3 min, especially when the desired product is >1 kb. Another way to avoid conditions that selectively amplify shorter templates is to increase the denaturing time (up to 75 s). Also, the annealing temperature should be as high as possible to minimize the occurrence of mis-priming events. The highest annealing temperature that gives efficient amplification must be determined empirically. If these measures still fail to eliminate the problem, it may be necessary to perform a smaller number of cycles (using a higher starting template concentration), and then gel purify the full-length PCR product before continuing with more thermal cycling. In some cases it may be necessary to perform this gel purification step periodically (for example, every 8 cycles). Agarose gel purification (UNIT 2.7) is easy, sensitive, and convenient; however, PAGE (UNIT 2.7) can accomplish a higher degree of purification.

ALTERNATE PROTOCOL

MUTAGENIZING A LIBRARY OF SEQUENCES

Sometimes it is desirable to mutagenize an entire collection of sequences simultaneously. The basic protocol above is appropriate in cases where the starting template is a unique sequence, but the following modifications are recommended when the starting template is itself a library.

The protocol above calls for the EP-PCR reaction to be initiated with a very small amount of template, but this amount may be insufficient to preserve the initial library complexity. To avoid complexity loss before and during the amplification process, one can start with a comparatively large template concentration and perform only four EP-PCR cycles, and then transfer ~10% of the resulting material into a fresh EP-PCR reaction. These "serial transfers" are continued until the desired number of doublings is achieved. One additional advantage of this method is that the progress of the EP-PCR reaction can be monitored throughout the entire procedure, a luxury that is not possible using the standard protocol described above.

The following protocol will give approximately 50 EP-PCR doublings and results in mutations in about 3.5% of the nucleotide positions in the DNA template. However, the actual mutagenic rate may vary with conditions and template.

 

1. Make up the following EP-PCR reaction mixture on ice.

Reagent        Amount     Stock     Concentration in PCR reaction
Water          960 µL
Tris pH 8.3    150 µL     100 mM    10 mM
KCl            37.5 µL    2 M       50 mM
MgCl2          52.5 µL    200 mM    7 mM
dCTP           60 µL      25 mM     1 mM
dTTP           60 µL      25 mM     1 mM
dATP           60 µL      5 mM      0.2 mM
dGTP           60 µL      5 mM      0.2 mM
5' primer      30 µL      100 µM    2 µM
3' primer      30 µL      100 µM    2 µM
Total          1500 µL

 

2. Divide the EP-PCR reaction mixture into 16 aliquots (90 µL each); place in tubes suitable for 100-µL PCR reactions. These may be stored at 4°C for a few hours.

 

3. Add 7 µL of the DNA library (30 ng/µL) to tube 1 to give ~2 ng/µL. Place the tube in the thermal cycler; once it has reached the annealing temperature, add the following (and mix):

Reagent               Amount    Stock     Concentration in PCR reaction
MnCl2                 2 µL      25 mM     0.5 mM
Taq DNA polymerase    1 µL      5 U/µL    0.05 U/µL

The MnCl2 should not be added until immediately before the thermal cycling is initiated. The Taq DNA polymerase should not be added until the thermal cycling reaction has reached the annealing temperature for the first cycle.

4. Perform four cycles of EP-PCR amplification. During the final extension at 72°C, place the next tube containing the fresh EP-PCR mixture into the same PCR block. Before the final extension is complete, but after the next tube has reached the extension temperature, transfer 10 µL of EP-PCR reaction mixture from the first tube into the second, and then add the following to the second tube and mix:

Reagent               Amount    Stock     Concentration in PCR reaction
MnCl2                 2 µL      25 mM     0.5 mM
Taq DNA polymerase    1 µL      5 U/µL    0.05 U/µL

Remove the first EP-PCR reaction mixture from the block and store at 4°C.

5. Repeat step 4 fourteen times, so that all 16 tubes are used. Analyze the PCR reaction using agarose gel electrophoresis (UNIT 2.7) after every fourth transfer, and quantitate the bands in successive PCR amplifications.

The numbers given here for starting DNA template concentration and transfer volume may need to be modified in accordance with results from pilot EP-PCR reactions, which serve to determine the amplification efficiency (see above). The DNA amplification per EP-PCR cycle should not decrease to below 1.7, even for the fourth cycle. It is also important that the amount of DNA at the end of the four EP-PCR cycles is not increasing from transfer to transfer. If this does occur, reduce the transfer volume.
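The balance between amplification and transfer volume can be explored with a simple model before running a pilot reaction. The sketch below assumes a constant fold-amplification per cycle, no plateau, and a nominal 100 µL reaction volume (the small added volumes of MnCl2 and Taq are ignored); the function and parameter names are illustrative.

```python
import math

def serial_transfer(start_conc, fold_per_cycle, rounds=16, cycles_per_round=4,
                    transfer_ul=10.0, reaction_ul=100.0):
    """Return (end-of-round DNA concentrations, cumulative doublings) for serial-transfer EP-PCR."""
    conc = start_conc
    history = []
    for r in range(rounds):
        if r > 0:
            conc *= transfer_ul / reaction_ul      # dilution on transfer into a fresh tube
        conc *= fold_per_cycle ** cycles_per_round  # amplification during this round
        history.append(conc)
    doublings = rounds * cycles_per_round * math.log2(fold_per_cycle)
    return history, doublings

# A 10-fold gain per four cycles (about 1.78-fold per cycle) exactly offsets a 10% transfer.
history, doublings = serial_transfer(2.0, 10 ** 0.25)
print(round(doublings, 1))
```

In this model, a per-cycle amplification above about 1.78 makes the end-of-round DNA amount rise from transfer to transfer (calling for a smaller transfer volume), while a lower value makes it fall (calling for a larger one).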

Before the entire EP-PCR protocol is attempted, it is important to pilot the EP-PCR conditions to ensure that low molecular weight PCR products are not "taking over" the reaction, and that the amplification per cycle is at least 1.7. The optimal PCR amplification conditions may be different from normal PCR amplification performed upon the same library.

This serial transfer approach yields a succession of samples with increasing levels of mutagenesis. If one is uncertain about the optimal level of mutagenesis for a particular application, the samples from different stages of the EP-PCR procedure can be mixed prior to screening or selection.

 

TROUBLESHOOTING

Problem: No EP-PCR product observed
Explanation: EP-PCR conditions need optimizing
Solution: Optimize pilot EP-PCR reaction, especially with regard to annealing temperature. Use long extension times (at least 3 min). If this fails, test the primers, template and other reagents under "normal" PCR conditions to ensure that there has not been a primer design or synthesis error, or a degeneration in reagent quality.

Problem: Multiple EP-PCR products of incorrect lengths observed
Explanation: EP-PCR conditions need optimizing
Solution: Optimize pilot EP-PCR reaction, as above. If this fails to solve the problem, periodically gel-purify the product of correct length.

Problem: Brown precipitate observed in EP-PCR reaction mixture
Explanation: Manganese salts are precipitating out of solution
Solution: Add manganese to EP-PCR reaction mixture just prior to thermal cycling.

Problem: Successive transfers contain decreasing amounts of DNA when visualized on an agarose gel
Explanation: Transfer volume is too small
Solution: Increase transfer volume

Problem: Successive transfers contain increasing amounts of DNA when visualized on an agarose gel
Explanation: Transfer volume is too large, possibly because the efficiency of the PCR reaction is increasing as the most easily amplified sequences dominate the mixture
Solution: Decrease transfer volume

 

Literature Cited

Cadwell, R. C. and Joyce, G. F. 1992. Randomization of Genes by PCR Mutagenesis. PCR Meth. Appl. 2, 28-33.