AMPLIFYING THE POOL, ASSESSING AND PRESERVING ITS COMPLEXITY

Jack D. Pollard, Jr.

April 9, 1998


Following the chemical synthesis and isolation, the pool molecules exist only as single copies of single-stranded DNA. Typically, the pool is then amplified via PCR to give double-stranded DNA that is both transcriptionally competent and capable of being archived for later use. PCR can be used to exponentially amplify pieces of synthetic DNA flanked by two priming regions, and amplifications of a single molecule from pools of 1E16 molecules are routinely achieved. Final DNA yields of ~1 µg from 0.1 ml reaction volumes are typical, but the yield can vary from 0.1 to 10 µg. Very long and complex pools often require PCR amplification on the multiple-milliliter scale. Large scale PCR differs from conventional PCR in that it is typically conducted in water baths, rather than in a thermocycler, using 15 ml, 17x120 mm screw-capped thermostable tubes (Sarstedt 62.554.002) that accommodate larger volumes. Volumes as high as 2.5 liters have been amplified in a single PCR reaction with a circulating water bath setup. Because amplifying a pool is costly in both time and money, small scale reactions to test various PCR conditions should be done before attempting a larger amplification.

Initial pool complexity and PCR efficiency

Large random sequence pools are currently constructed by ligating several smaller pool modules, each of which is obtained from chemical oligonucleotide synthesis and PCR. Chemically synthesized oligonucleotides may contain a number of modifications, including deletions, insertions, incompletely deprotected bases, or backbone lesions, all of which affect PCR efficiency. Generally, less than 10 percent of synthetic pool molecules are extendable by PCR. Therefore, the amplification efficiency (average number of doublings per cycle), extendability, and hence complexity of the pool should be assessed before attempting to amplify a newly synthesized pool (pool complexity = number of synthetic pool molecules * fraction of those synthetic pool molecules that are extendable).

Overall PCR efficiency should be optimized to avoid as many biasing effects as possible. Amplification of a pool under conditions in which molecules are replicated with different efficiencies can bias the pool toward molecules with an amplification advantage over their slower amplifying and possibly more functional counterparts. Pool bias can also be introduced if the pool is amplified to concentrations that allow multimers to form via inter-molecule hybridization (over-PCR).

Determining the pool complexity via PCR

1. Determine the pool complexity by performing a 0.1-ml PCR reaction with 2 nM of synthetic pool oligonucleotide as the template. Use the PCR buffer with a working magnesium concentration of 1.5 mM and 2 µM primers. After 10 to 15 cycles of amplification, check the length and purity of the PCR DNA on a sieving agarose gel (Nusieve, TBE buffer) to ensure a sharp, discrete band of the proper size. A fuzzy band is an indication of over-PCR, which leads to pool bias. If the band is fuzzy, start with a fresh sample of the synthetic pool oligonucleotide and PCR for fewer cycles.

2. Dilute the double-stranded PCR DNA 1/128, and perform another 0.1 ml PCR reaction using identical reaction conditions, but monitor the progress of the reaction at each cycle past cycle 3 by removing an aliquot during the last 10 seconds of the extension step. After 7 cycles, assess the PCR efficiency by serially diluting a cycle 7 sample in 1/2 increments to a final dilution of 1/128. Electrophorese all the samples on a large agarose gel for easy comparison.

3. Calculate the average PCR efficiency by determining which dilution lane is the first to lack detectable DNA. For example, if the 1/64 dilution is the first to show no detectable DNA (implying that 6 doublings of the synthetic DNA took place, yielding 64 fold more DNA) and if 7 cycles of PCR were performed, then the average number of doublings per cycle is ~1.81.

    (average efficiency)^(number of theoretical doublings, i.e. PCR cycles) = fold increase in DNA

Therefore, from the example above: (~1.81)^7 = 64

4. The PCR efficiency up to a particular cycle may be determined by comparing earlier cycles of PCR to the cycle 7 dilution series. If the pool’s average number of doublings per cycle is less than 1.8, then a selection bias will be introduced toward molecules that amplify well. In that case the PCR conditions should be modulated to enhance PCR efficiency (see small-scale optimization below).

5. After optimizing the pool PCR conditions for >1.8 average number of doublings per cycle, determine the pool complexity by performing another 0.1 ml PCR reaction with 2 nM of synthetic pool oligonucleotide template using the optimized reaction conditions. After 7 or more cycles of PCR, calculate the amount of amplified DNA by running out an aliquot of the PCR reaction along with dsDNA mass markers (Gibco BRL). Calculate pool complexity as follows:

grams of starting extendable ssDNA = (grams of PCR DNA after N cycles of PCR) / (average number of doublings per cycle)^N

moles of starting extendable ssDNA = (grams of starting extendable ssDNA) / (~330 g/mol * (number of bases in the full-length product))

molecules of starting extendable ssDNA = (moles of starting extendable ssDNA) * (~6.02E23 molecules/mole)

fraction of extendable ssDNA = (molecules of starting extendable ssDNA) / (starting molecules)

pool complexity = (fraction of extendable ssDNA) * (number of synthetic pool molecules)
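The arithmetic in steps 3 and 5 can be sketched in a few lines of Python. This is only an illustration of the formulas above, not part of the original protocol; the function names and the example gel reading (21 ng of PCR DNA, a 100-base product) are hypothetical.

```python
AVOGADRO = 6.02e23               # molecules per mole, as used in the text
GRAMS_PER_MOLE_PER_BASE = 330.0  # approximate mass per base, as used in the text


def average_efficiency(first_blank_dilution, cycles):
    """Per-cycle fold amplification from a two-fold dilution series.

    first_blank_dilution is the denominator of the first dilution that
    shows no detectable DNA (64 for the 1/64 lane).
    """
    if first_blank_dilution < 1 or cycles < 1:
        raise ValueError("dilution and cycle count must be at least 1")
    return first_blank_dilution ** (1.0 / cycles)


def pool_complexity(grams_pcr_dna, efficiency, cycles, product_length_bases,
                    starting_molecules, synthetic_pool_molecules):
    """Apply the chain of formulas from step 5, in the same order."""
    grams_start = grams_pcr_dna / efficiency ** cycles
    moles_start = grams_start / (GRAMS_PER_MOLE_PER_BASE * product_length_bases)
    molecules_start = moles_start * AVOGADRO
    fraction_extendable = molecules_start / starting_molecules
    return fraction_extendable * synthetic_pool_molecules


# Worked example from step 3: the 1/64 lane is the first blank after 7 cycles.
efficiency = average_efficiency(64, 7)
print(round(efficiency, 2))  # 1.81

# Hypothetical test reaction: 2 nM template in 0.1 ml, 21 ng recovered.
starting = 2e-9 * 0.1e-3 * AVOGADRO
complexity = pool_complexity(
    grams_pcr_dna=2.1e-8,
    efficiency=efficiency,
    cycles=7,
    product_length_bases=100,
    starting_molecules=starting,
    synthetic_pool_molecules=1e15,
)
print(f"estimated pool complexity: {complexity:.1e} molecules")
```

The sketch follows the text's formulas literally, including the ~330 g/mol per base figure, so it inherits whatever approximations those formulas make.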

Determining the pool complexity via primer extension

Alternatively, the complexity of the pool can be determined by primer extension, which can be performed before or after the large-scale amplification. If it is done after the actual amplification, primer extension can only reveal the complexity of the already amplified pool, but if done prior to the eventual work-up of double-stranded DNA, primer extension can also be used to optimize the final PCR conditions by defining the optimal time for Taq polymerase extension and determining whether the synthesis yielded enough amplifiable single-stranded DNA to generate the desired pool complexity.

1. The 3’ primer must first be labeled with gamma-32P ATP using T4 polynucleotide kinase (available from a variety of commercial sources) in a labeling reaction according to the protocol provided with the enzyme. The labeled oligonucleotide is phenol:chloroform and chloroform extracted and precipitated with an equal volume of 4.0 M ammonium acetate, ensuring that most of the unincorporated label remains in the supernatant.

2. Incubate the labeled primer with an equimolar ratio of pool in an extension reaction under the identical conditions to be used in the final amplification. The primer and template DNA (with dNTPs) in buffer are denatured and annealed at the selected temperatures (usually 94° C for the denaturation and around 50° C for the annealing steps). Following the addition of Taq polymerase, the temperature is ramped to 72° C for a time ranging from 1 to 20 minutes. It is best to use several samples to test for the ideal extension time. Finally, the reaction is terminated by the addition of stop dye.

3. The samples are denatured and run on a denaturing acrylamide gel with appropriate radiolabeled size markers, and the gel is dried and exposed to a phosphorimager plate. The bands are quantified and the percent extension is calculated by dividing the amount of labeled (extended) template by the amount of labeled primer that went into the reaction.
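The percent-extension calculation in step 3 can be sketched as follows. The band intensities are made-up phosphorimager readings, and the helper names are hypothetical. Because primer and pool are mixed at an equimolar ratio, the sketch assumes that the fraction of primer extended approximates the fraction of pool molecules that are extendable.

```python
def percent_extension(extended_signal, input_primer_signal):
    """Percent of labeled primer that was extended on the pool template."""
    if input_primer_signal <= 0:
        raise ValueError("input primer signal must be positive")
    return 100.0 * extended_signal / input_primer_signal


def expected_complexity(percent_extended, synthetic_pool_molecules):
    """Pool complexity implied by the extendable fraction (an assumption)."""
    return synthetic_pool_molecules * percent_extended / 100.0


# Hypothetical readings for several extension times (minutes -> band signal),
# all loaded with the same amount of labeled primer.
INPUT_PRIMER_SIGNAL = 10000.0
extended_by_time = {1: 400.0, 5: 700.0, 10: 800.0, 20: 810.0}

for minutes in sorted(extended_by_time):
    pct = percent_extension(extended_by_time[minutes], INPUT_PRIMER_SIGNAL)
    estimate = expected_complexity(pct, 1e15)
    print(f"{minutes:>2} min: {pct:.1f}% extended, ~{estimate:.1e} molecules")
```

Comparing the rows shows where longer extension times stop improving the yield, which is the information step 2 asks for when testing several samples.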

Small scale PCR optimization

PCR efficiency should be optimized to balance the average number of doublings per cycle of PCR against the total volume in which the PCR will be performed. A pool of 1E15 molecules (~1.7E-9 moles) at a starting template concentration of 2 nM will require 0.85 liters of PCR to be amplified. Therefore, amplifying the pool at the highest template concentration that still gives a reasonable average number of doublings per cycle is greatly desirable. At least 8 copies' worth of pool DNA should be generated if the pool is to be archived and its complexity preserved (see the section below on preserving the pool and its complexity).
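The volume estimate above is a one-line calculation. The sketch below, with a hypothetical function name, reproduces it; it gives about 0.83 liters, which matches the 0.85 liters quoted in the text once the mole count is rounded to 1.7E-9.

```python
AVOGADRO = 6.02e23  # molecules per mole, as used in the text


def pcr_volume_liters(pool_molecules, template_molar):
    """Reaction volume needed to hold the whole pool at a given template concentration."""
    if template_molar <= 0:
        raise ValueError("template concentration must be positive")
    moles = pool_molecules / AVOGADRO
    return moles / template_molar


# Example from the text: 1E15 molecules at 2 nM template.
print(f"{pcr_volume_liters(1e15, 2e-9):.2f} liters")

# Doubling the template concentration halves the volume, which is why the
# highest workable template concentration is desirable.
print(f"{pcr_volume_liters(1e15, 4e-9):.2f} liters")
```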

1. Determine the number of DNA doublings per cycle for the current protocol. Typically, starting with 2 nM of synthetic pool oligonucleotide and ~2 µM primers, 3-4 doublings may be achieved in 6 cycles of the following standard PCR cycling program:

95° C denaturation for 2 min (ensures the melting of single-stranded DNA secondary structure)
55° C annealing for 1 min (temperature dependent on primer composition)
72° C extension for 3 min (keeps the primers annealed to ensure complete extension and adds many non-templated A’s for cloning purposes)

2. Use the manufacturer’s suggested units of Taq (2.5 units of BOEHRINGER MANNHEIM Taq) in a reaction containing 500 µM dNTPs and the suggested buffer with a working concentration of 1.5 mM magnesium.

Primer concentration, magnesium concentration, cycling conditions, and the amount of Taq are all parameters that can be varied during PCR.

3. Theoretically, PCR can proceed until the concentration of template approaches that of either the primers or the dNTPs. Therefore, use primer and dNTP concentrations well above those used for the amplification of small amounts of longer DNA fragments. Primer concentrations higher than 1 µM and up to 5 µM have been used; concentrations greater than 5 µM are generally not helpful. Start at 2.5 µM and scan both above and below that concentration in 0.5 µM increments. PCR reactions typically contain a vast excess of dNTPs; therefore, increasing their concentration above 200 µM each is generally not helpful.

Deoxy-NTPs chelate magnesium, thereby changing the effective optimal magnesium concentration. Moreover, dNTP concentrations greater than 200 µM each increase the error rate of the polymerase, and millimolar concentrations of dNTPs actually inhibit Taq DNA polymerase.

4. Primer annealing is also modulated by the magnesium concentration in the buffer, and the fidelity of Taq decreases with increasing magnesium concentration. Start at magnesium as supplied in the PCR buffer (usually 1.5 mM) and scan in 1 mM increments toward 5 mM as a maximal concentration.

5. Primer concentration, annealing temperature, and cycling conditions are dependent upon both primer sequence and length. The denaturation and extension temperatures are modulated by properties of Taq, which will extend (although inefficiently) at temperatures as low as 65° C. This may be useful for sequences with low primer melting temperatures. Extension temperatures greater than 72° C are possible, but remember that the primer must remain annealed in order for Taq to begin the extension. DNA denaturation at temperatures above 95° C is usually impractical since Taq’s half-life is greatly reduced at higher temperatures. The primer annealing temperature may be calculated with the tool at the following URL, which includes nucleotide composition, stacking energies according to Turner’s rules, and empirical data in its calculations:

http://paris.chem.yale.edu/extinct.html

An annealing temperature ~5° C below the calculated annealing temperature is a good place to begin the optimization. At a lower annealing temperature, the amplification is more efficient, but mispriming and secondary structure problems are more pronounced. At temperatures higher than the annealing temperature, the specificity is greater, but the overall yield of the reaction suffers. To determine the optimum annealing temperature for a given primer and magnesium concentration, scan both directions around the annealing temperature in 5-degree increments.

Increasing the amount of Taq in a reaction is generally not helpful except when the extension is conducted at temperatures outside Taq’s optimum temperature (70-75° C). Typically, 2.5 U of BOEHRINGER MANNHEIM Taq is used for a 0.1-ml reaction. To optimize the amount, scan in 2.5 unit increments. Also, too much Taq can be harmful to structured single-stranded nucleic acids (Lyamichev V. et al 1993).

It is best to begin with a single set of reaction conditions and explore many different variants relative to this one reference reaction. Then combine all advantageous alterations into a single reaction to identify any unintended side effects. There are other parameters which can be varied as well. Consult Current Protocols Unit 15.1 for more suggestions.
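One way to lay out the scan described above is to list every reaction that differs from the reference in exactly one parameter. The sketch below does that. The parameter names and helper are hypothetical, the reference values are taken from the standard conditions in this section, and the primer concentrations are assumed to be micromolar.

```python
# Reference reaction, taken from the standard conditions in this section.
# Primer units are assumed to be micromolar.
REFERENCE = {
    "primer_uM": 2.5,
    "magnesium_mM": 1.5,
    "annealing_C": 55.0,
    "taq_units": 2.5,
}

# Values to scan for each parameter, following the increments in the text.
SCANS = {
    "primer_uM": [1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5, 5.0],  # 0.5 steps
    "magnesium_mM": [2.5, 3.5, 4.5],                         # 1 mM steps toward 5 mM
    "annealing_C": [45.0, 50.0, 60.0, 65.0],                 # 5-degree steps
    "taq_units": [5.0, 7.5],                                 # 2.5-unit steps
}


def one_factor_variants(reference, scans):
    """Return (changed_parameter, conditions) pairs, one change per reaction."""
    variants = []
    for name, values in scans.items():
        for value in values:
            if value == reference[name]:
                continue  # skip the reference condition itself
            conditions = dict(reference)
            conditions[name] = value
            variants.append((name, conditions))
    return variants


variants = one_factor_variants(REFERENCE, SCANS)
print(f"{len(variants)} test reactions plus 1 reference reaction")
for changed, conditions in variants:
    print(changed, conditions)
```

Once the best value of each parameter is known, a final reaction combining them checks for the unintended side effects mentioned above.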

Large scale PCR planning

1. After determining the optimal PCR conditions on a small scale, prepare a batch of reagents for the large scale reaction. It is best to make and test the primers, dNTPs, and buffer on the small scale (0.1 ml) prior to attempting a larger amplification. If the trityl responses of primers 20 bases or shorter in length are reasonable, it is usually not necessary to gel purify them. Primers longer than 20 nucleotides should generally be gel purified.

2. If the total amount of PCR is approximately 100 ml, it is best to use a commercial PCR machine (Perkin Elmer) which can accommodate as much as 30 ml of reaction. However, with very large volumes, the water bath cycling method, which can accommodate liters of PCR in a single reaction, should be used.

Large scale PCR differs from conventional PCR in that it is typically conducted in water baths with 15-ml thermostable tubes to accommodate larger volumes. Construct floating racks from the tubes’ packing material by cutting off the bottom of the Styrofoam racks. Reinforce the rack by wrapping the edge in lab tape. Construct a temperature probe by placing a thermometer through the top of one of the Sarstedt tubes filled with 10 ml of water. Fill the other tubes with 10 ml of water as well and place them into the floating rack.

3. Set up a denaturing, annealing, and extension water bath according to the temperatures obtained in the previous optimization procedure. Calibrate the equilibration time for the baths, accounting for the heat capacity of the water, by cycling the loaded rack through the denaturing, annealing, and extension baths at intervals that allow the tubes to mimic their temperature course in the PCR machine. Record the times required to reach the desired temperatures, then add the time for each PCR step to determine a total time for that step in the cycle. Respective denaturing, annealing, and extension times of 5, 6, and 7 min are typical. Ramping temperature profiles are vastly different from those of the PCR machine, yielding more amplification artifacts. Therefore, it is best to test the ramping conditions on a 10 ml scale before conducting the final amplification.

4. Fill the rack with tubes of water, the sample reaction to amplify, and the temperature probe. TEST THE pH OF THE REACTION PRIOR TO ADDING THE TAQ. IT SHOULD BE AROUND 8. Prior to the addition of Taq, denature the sample for 30 minutes, and then add the Taq after the first annealing step to promote proper extension. Take aliquots at each cycle to monitor the progress of the reaction. Do not become alarmed if the solution becomes cloudy; the detergent in the buffer causes the turbidity. The final extension should be about 20 minutes long to ensure that all templates are completely double-stranded. PCR efficiency of 3-4 doublings in 5 cycles can usually be achieved.

Large scale amplification and workup

1. Assess the efficiency of the reaction and determine if modifications such as altered bath temperatures or a different amount of Taq are necessary. When all preliminary amplifications are in order, plan time for the large scale amplification. Since large scale reactions are quite expensive in terms of nucleotides and enzyme, preparedness and planning for the large scale amplification cannot be overemphasized. The amplification will probably consume an entire day. After the reaction, chelate the magnesium in the buffer by adding 1.1 molar equivalents of EDTA pH 8.0. The tubes can be left at 4° C overnight.

2. Butanol extract the samples to concentrate the reaction to a manageable volume. About 1/5 of the aqueous layer is extracted into the organic butanol layer for every volume of butanol used. If too much butanol is added and the aqueous layer is completely extracted into the butanol, simply add more water and reconcentrate. Concentrate about 10 to 20 fold depending on the initial reaction volume. Phenol:chloroform extraction, followed by 2 successive chloroform extractions, should yield DNA clean enough to precipitate easily. Be sure to save all of the organic layers in case of a mishap. It is best to use 50 ml Falcon tubes for these extractions since they are conveniently sized and have a small surface area. Alternatively, a Teflon extraction funnel may be useful since nucleic acids will not stick to its surface.

3. Precipitate the DNA in 13-ml spin tubes if possible; if larger tubes are required, prepare a set of the Beckman 250 ml high speed centrifugation jars. Wash the jars in 15 ml of 3 percent peroxide for 30 minutes and rinse with 3x100 ml of distilled water. This removes any residual DNases remaining from bacterial cell pelleting.

4. Resuspend the PCR DNA in TE buffer plus 50 mM of a salt such as potassium chloride. NEVER RESUSPEND A RANDOMIZED POOL IN WATER SINCE THE RANDOM SEGMENTS WILL DENATURE AND RENDER THE POOL TRANSCRIPTIONALLY INCOMPETENT. If the pool does become denatured, simply do another cycle of PCR. Quantitate the PCR DNA by using a DNA mass ladder and determine the overall PCR efficiency, number of pool molecules, and pool copies produced.
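Step 1 above calls for 1.1 molar equivalents of chelator relative to the magnesium in the reaction. The sketch below turns that into a stock volume. The 0.5 M stock concentration is an assumption (a common laboratory stock, not stated in the protocol), and the function name is hypothetical.

```python
def edta_stock_volume_ml(reaction_ml, magnesium_mM, stock_M=0.5, equivalents=1.1):
    """Volume of EDTA stock (ml) giving the requested molar excess over magnesium.

    stock_M defaults to 0.5 M, which is an assumed stock concentration.
    """
    if reaction_ml < 0 or magnesium_mM < 0 or stock_M <= 0:
        raise ValueError("volumes and concentrations must be non-negative")
    magnesium_mmol = reaction_ml * magnesium_mM / 1000.0  # ml * mmol/l -> mmol
    edta_mmol = equivalents * magnesium_mmol
    return edta_mmol / stock_M  # a 1 M solution holds 1 mmol per ml


# Example: a 1 liter reaction at the standard 1.5 mM magnesium.
print(f"{edta_stock_volume_ml(1000, 1.5):.1f} ml of 0.5 M EDTA, pH 8.0")
```

If the optimized reaction uses more magnesium than the standard buffer supplies, pass that working concentration rather than 1.5 mM.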

Preserving the pool and its complexity

The amount of DNA obtained from large scale amplification is often referred to in terms of the number of copies of the original starting synthetic pool’s complexity that were obtained. For example, if the starting pool had a complexity of 1E15 molecules and 8E15 total DNA molecules were recovered, then on average 8 copies of the original starting pool were obtained from the amplification. Since pools are often time intensive and monetarily expensive to synthesize, preserving their complexity is a key concern. Following large scale amplification, at least 4 copies of a pool should be stored at -80° C in TE with 50 mM of a salt such as potassium chloride. Archiving at least 4 copies' worth of pool DNA ensures the preservation of most of the pool’s complexity. The amount of pool complexity preserved can be calculated by the following equation:

percent of the pool complexity in a given sample = 100 * (1 - ((x - y)/x)^x)

where x is the total number of pool copies, and y is the number of pool copies archived.

Therefore, in the example given above, if 4 of the 8 copies of the pool that were generated through amplification are archived, then ~99.6 percent of the original starting pool’s complexity is preserved. Also, whenever manipulations of the pool are to be performed (ligation, transcription, biotinylation, etc.), be sure to use at least 4 copies of the pool to preserve its complexity during the manipulation.
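The preservation formula is easy to check numerically. The short sketch below, with a hypothetical function name, reproduces the ~99.6 percent figure for archiving 4 of 8 copies and can be used to compare other splits.

```python
def percent_complexity_preserved(total_copies, archived_copies):
    """Percent of pool complexity in a sample of y copies drawn from x copies.

    Implements 100 * (1 - ((x - y)/x)^x) from the text.
    """
    x, y = total_copies, archived_copies
    if x <= 0 or not 0 <= y <= x:
        raise ValueError("need x > 0 and 0 <= y <= x")
    return 100.0 * (1.0 - ((x - y) / x) ** x)


# Example from the text: archive 4 of the 8 copies generated.
print(round(percent_complexity_preserved(8, 4), 1))  # 99.6

# Compare how much is retained for smaller archived amounts.
for archived in (1, 2, 4):
    pct = percent_complexity_preserved(8, archived)
    print(f"{archived} of 8 copies: {pct:.1f}%")
```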

Reagents:

10X Boehringer Mannheim PCR Buffer
500 mM KCl
100 mM Tris-HCl, pH 8.3 at 20° C
15 mM MgCl2