Nucleic Acid Pools for in vitro Selection: Design, Complexity, and Purification

Jack Pollard



In vitro nucleic acid selection schemes utilize combinatorial nucleic acid synthesis chemistry to create a large number of different sequences (a pool) that may be capable of many varied functions. The pool may be designed for a specific purpose, such as a partially randomized (doped) selection to improve the function of an already known motif, or as a general lab tool to isolate new functional molecules. Following pool design, chemical nucleic acid synthesis on a commercial DNA synthesizer will yield a single-stranded DNA pool that may then be purified on a polyacrylamide gel. Subsequent amplification of the pool by PCR will yield double-stranded DNA competent for transcription or further manipulation.

Designing the pool

Pools of nucleic acids used for in vitro selections contain a randomized central core from which functional sequences can arise. This random region is flanked by constant sequences used as primer binding sites for enzymatic pool replication. Pools for RNA in vitro selection also have a transcriptional promoter region at their 5' end.

Typical Pool Design

[Figure: typical pool design, labeled with the T7 promoter and the Ban I, Sty I, and Ava I sites]

Since a pool is extremely expensive to synthesize, significant effort should be expended in designing it to be as generally useful as possible. There are many subtle parameters to consider that can greatly influence the outcome of a selection, such as the degree of randomization, the amount of sequence space sampled, pool length, and pool modularity.

Completely random or partially randomized (doped) pools

Current combinatorial or "irrational" nucleic acid design methodologies focus on the ability to create large pools of sequences from which useful molecules may be culled (Breaker, R. R. 1997 and Jaeger, L. 1997). Pools can be produced that have either a completely random distribution of nucleotides or a biased distribution of nucleotides centered about a wild-type sequence.

Partially randomized (doped) pool considerations

Pools biased toward a particular sequence can be useful for the elucidation of structural properties or the further optimization of a previously isolated motif by exploring the "sequence space" surrounding a given nucleic acid structure. The most important issues in the synthesis of a doped pool are the level of randomization (probability of mutation/position) and the coverage of mutational sequence space around the motif.

Generally, the optimal doping level is related to the number of critical interactions thought to be important to form the structure. For example, in vitro genetics has been used to uncover the critical structural interactions between the HIV Rev protein and its mRNA binding target, the rev-responsive element (RRE) (Bartel, D. P. et al., 1991). By creating an RRE pool doped to 35 percent mutation/position about the wild-type RRE sequence (66 nucleotides), Bartel et al. were able to map quickly and efficiently a 20-nucleotide core-binding site through base conservation and co-variation analysis. This library of ~1E13 molecules had an average of 23 mutations/template (0.35 probability of mutation/position * 66 positions ≈ 23 mutations), and less than 1 in 1E12 sequences were completely wild-type. This design reflects their desire to retain much of the wild-type RRE structure and at the same time to sample sequence space nearby to determine which interactions were the most critical. Therefore, when designing a doped pool, choose an average mutation level that corresponds to the number of positions thought to be required to form the structure of interest. If the mutation level is too low, pool redundancy will be increased, and the pool might not yield enough molecular diversity to be structurally or chemically useful. If the mutation level is too high, sequence space about the wild-type motif will only be sparsely sampled, and many of the highly mutated molecules will be non-functional because their sequences will have diverged too far from the wild-type.

Another consideration during doped pool planning is the amount of mutational sequence space that can be practically sampled around a given template sequence. A typical 1-µmole oligonucleotide synthesis of a 100-base template will yield a pool of ~1E16 molecules, of which 10 percent (1E15 molecules) is usually amplifiable. By first understanding the number of different mutation combinations that are possible and how these combinations are distributed throughout a pool, the mutation level can be set to make the best use of the molecular diversity available from the pool.

Regardless of the partial randomization or doping strategy used, the number of different mutation combinations for any given number of mutated bases in a single template is given by

3^n * {L!/[n!(L - n)!]}

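where n is the number of mutations/template and L is the template length. The factor 3^n arises because each mutated position can take any of the three non-wild-type bases.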

For example, in the case of the RRE pool discussed earlier, there are ~2.17E9 different 5-base mutations and ~1.25E16 different 10-base mutations in a template 66 bases in length.
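The counts quoted for the RRE pool can be reproduced with a few lines of Python. This is only an illustrative sketch of the formula above; the function name mutation_combinations is made up for this example.

```python
from math import comb

def mutation_combinations(n: int, length: int) -> int:
    """Number of distinct sequences carrying exactly n substitutions.

    comb(length, n) chooses which positions are mutated, and each mutated
    position can take any of the 3 non-wild-type bases (3 ** n).
    """
    return 3 ** n * comb(length, n)

# RRE example from the text: a 66-nucleotide template.
for n in (5, 10):
    print(f"{n}-base mutants of a 66-mer: {mutation_combinations(n, 66):.3g}")
```

Because Python integers have arbitrary precision, the same function can be used for longer templates without overflow.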

The distribution of combinations in a pool can also be determined. To calculate what fraction of a pool doped at a certain mutational frequency (doping level) carries a given number of mutations (for example, 5- or 10-base mutants), use the binomial probability distribution

P(n, L, f) = {L!/[n!(L - n)!]} * f^n * (1 - f)^(L - n)

where P is the fraction of the template population of nucleotide length L containing n mutations/template when f is the probability of mutation/position (doping level). If primarily single mutations are desired, then f should be chosen to maximize P for n = 1; if multiple mutations (e.g., doubles or triples in a single template) are necessary, then f should be correspondingly higher. If the doping strategy is optimized for n mutations, then "n-1" and "n+1" mutations will occur in roughly equal amounts and "n" mutations will occur most frequently (see figure).


Therefore, in the RRE example, for a pool of 1E13 molecules doped to 35 percent (f), few of the 5-base simultaneous mutants are included (1E13 * P(5, 66, 0.35) ≈ 1.82E6 molecules of a possible ~2.17E9 different molecules), but if the pool were doped to only 18 percent (f), all 5-base simultaneous mutants (n) would be included (1E13 * P(5, 66, 0.18) ≈ 9.3E10 molecules of a possible ~2.17E9 different molecules). Note that neither doping scheme can access all 10-base simultaneous mutants (n) with a pool of 1E13 molecules. Higher levels of mutation skew the mutant frequency distribution, allowing the sampling of some regions of sequence space at the exclusion of others. When constructing a doped pool, carefully consider the objectives of the selection and tailor the doping scheme accordingly, since it can greatly influence the outcome.
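The coverage comparison in the RRE example can be checked numerically. The sketch below simply evaluates the binomial expression for P and divides the resulting number of molecules by the number of possible mutants; the function names are illustrative, and the "copies per variant" figure is an average that assumes every mutant in a class is equally likely.

```python
from math import comb

def fraction_with_n_mutations(n: int, length: int, f: float) -> float:
    """Binomial fraction of a doped pool carrying exactly n mutations."""
    return comb(length, n) * f ** n * (1 - f) ** (length - n)

def mean_copies_per_variant(pool_size: float, n: int, length: int, f: float) -> float:
    """Average number of molecules per distinct n-base mutant in the pool."""
    molecules = pool_size * fraction_with_n_mutations(n, length, f)
    variants = 3 ** n * comb(length, n)
    return molecules / variants

pool = 1e13  # molecules in the RRE pool
for f in (0.35, 0.18):
    molecules = pool * fraction_with_n_mutations(5, 66, f)
    copies = mean_copies_per_variant(pool, 5, 66, f)
    print(f"f = {f}: {molecules:.3g} five-base mutants, {copies:.3g} copies per variant")
```

A value well below 1 copy per variant means that most mutants of that class are absent from the pool; a value well above 1 suggests the class is essentially fully represented.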


Random sequence pool considerations

Random sequence pools are used when no structural motif is known that performs the function of interest. Random pool sequence space is a vast landscape of possibilities of which only a vanishingly small fraction can be sampled by either nature or man. Assuming a 4-monomer repertoire from which pools can be constructed, there are ~1.6E60 unique individual sequences in the sequence space bounded by a 100-base template (4^100 ≈ 1.6E60); however, that amount of sequence space cannot be fully explored, since a pool containing every sequence would weigh many orders of magnitude more than the Earth. So not all of sequence space can be sampled in the laboratory, but it is interesting to note that modern methods of chemical nucleic acid synthesis do allow for the sampling of nearly as much information content as is covered by the Earth's biosphere of organisms. Consider that there are ~1E9 species in the biosphere, and there are ~1E5 genes/species. If each of these genes is composed of 1E3 bases, then there are ~1E17 bases of information in a biosphere. By comparing that number to the amount of base information in a typical 1-µmole synthesis of a 100-base molecule (1E15 molecules * ~1E2 bases/molecule ≈ 1E17 bases), it is easy to see that an entire biosphere's worth of information can be made in 6 hours and held in a single Eppendorf tube. Note, though, that it is the ordering of this array of information content that makes it biologically useful.

Since a typical 1-µmole oligonucleotide synthesis of a 100-base template will yield a pool of ~1E15 amplifiable molecules, only pools containing about 25 or fewer random positions (4^25 ≈ 1.1E15) can be synthesized in quantities that allow for complete sequence space sampling. For random sequences with more than 25 random positions, the critical parameter is not the fraction of sequence space sampled but the probability of finding a functional sequence. Neither the general distribution of functional nucleic acid sequences in sequence space nor the amount of sequence space that must be sampled to find a molecule capable of a given activity is known. When sampling a small fraction of sequence space, some functional sequences may be easily found experimentally, since they are inherently low in informational complexity and statistically more abundant: a simple structure may be encoded by many "reading frames" in a longer sequence. Alternatively, if a structure is highly complex and can be embedded in only a few reading frames, it will be harder to find experimentally when sampling a small fraction of sequence space. However, all these complex structures taken in aggregate may occupy a fraction of sequence space that is large enough to be experimentally sampled. Sampling the largest number of sequences possible is best, since it allows for the isolation of many simple solutions and possible access to complex ones as well. Active nucleic acid sequences are not extremely rare (recent reviews Fitzwater T., et al. 1996 and Gold L. et al. 1995). Functional nucleic acids are routinely isolated from pools of ~1E15 sequences with 90 random positions (a sampling of only ~6.5E-38 percent of all sequences bounded by 90 random nucleotides).
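The relationship between pool size and the length of random region that can be exhaustively sampled is easy to explore numerically. The following is a minimal sketch assuming a 4-base alphabet and one molecule per sequence; the function names are illustrative only.

```python
def sequence_space(n_random: int, alphabet: int = 4) -> int:
    """Number of distinct sequences with n_random fully random positions."""
    return alphabet ** n_random

def fraction_sampled(pool_size: float, n_random: int) -> float:
    """Upper bound on the fraction of sequence space present in the pool,
    assuming every molecule in the pool is a different sequence."""
    return min(1.0, pool_size / sequence_space(n_random))

def max_fully_sampled_length(pool_size: float) -> int:
    """Longest random region whose entire sequence space fits in the pool."""
    n = 0
    while sequence_space(n + 1) <= pool_size:
        n += 1
    return n

pool = 1e15  # amplifiable molecules from a typical synthesis (from the text)
print("longest fully sampled random region:", max_fully_sampled_length(pool))
for n in (25, 40, 90):
    print(f"N{n}: fraction of sequence space sampled <= {fraction_sampled(pool, n):.3g}")
```

This is a best-case bound: in practice, covering every sequence with reasonable probability requires several-fold more molecules than there are sequences.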


Pool Length

The optimal length of the random region for in vitro selection is an active area of research (Sabeti, P. C. et al., 1997), where the fundamental parameters remain to be defined. Generally, pools used for the in vitro selection of nucleic acid domains which bind ligands (aptamers) are shorter in length than those used to select nucleic acid catalysts (ribozymes). Pools with random regions of 20 to 60 nucleotides are typically used to isolate aptamers, and pools with random regions of 30 to 220 nucleotides are used to isolate ribozymes (recent reviews Fitzwater T., et al. 1996 and Gold L. et al. 1995). The synthesis of a pool is also a deciding factor in the length of the random region, since current DNA oligonucleotide synthesis chemistry is limited to sequences of less than 150 bases.

Short random regions have a sequence space sampling advantage because pools with up to 25 random positions can be synthesized in quantities that allow for a complete sampling of all possible sequences, but longer pools have more "reading frames" to create a given structure. Also, note that pools with a random region greater than 90 nucleotides can form self-aggregates that precipitate from solution upon prolonged incubation, and thereby require immobilization on a solid support prior to selection (Bartel, D. P. and Szostak, J. W., 1993 and Lorsch, J. R. and Szostak, J. W., 1994). Therefore, random region length should strike a balance between ease of construction and necessity of design.

Pool modularity and assembled pool complexity

Pools longer than 150 bases are typically synthesized in a modular fashion by ligating smaller ones together (Bartel, D. P. and Szostak, J. W., 1993). Therefore, the constant sequence flanking the randomized regions must be designed such that it can be replaced with a new module. Consult the following section on primer considerations for more design details of the constant region. Mutually-primed-and-templated synthesis is another method of linking together two pool modules into a much larger one. Segments of shorter DNAs can be stitched together via an overlapping region at the 3’ end of each template that functions as a primer for complementary strand synthesis (Current Protocols Unit 8.2). Also, consider adding tagging features to the pool that will allow for the determination of a sequence’s history and the monitoring of cross-contamination between pools. The tagging can be accomplished by the addition of a fixed sequence not in a primer region. Restriction sites can also provide an easy assay as long as they are compatible with the overall pool design.

Complexity of the assembled pool modules

After assembling the pool modules, the complexity of the new pool will need to be assessed. The upper bound of the complexity of the assembled pool is equal to the product of the complexities of the constituent modules (complexity of module A * complexity of module B = complexity of assembled pool). However, the practical complexity of the assembled pool is often equal to the total number of assembled molecules that are isolated, since when pool modules of large complexity (>1E8 molecules) are used, it is not possible to synthesize enough input module molecules to attain the upper bound of the assembled complexity.
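The two limits described above can be combined into a one-line estimate: the assembled complexity is capped both by the product of the module complexities and by the number of ligated molecules actually recovered. The sketch below is illustrative only; the function name and the example numbers are assumptions, not values from a real assembly.

```python
from math import prod

def assembled_complexity(module_complexities, molecules_recovered):
    """Estimate the complexity of a pool assembled from modules.

    The theoretical upper bound is the product of the module complexities;
    the practical value cannot exceed the number of assembled molecules
    that were actually isolated.
    """
    upper_bound = prod(module_complexities)
    return min(upper_bound, molecules_recovered)

# Hypothetical example: two modules of 1E8 sequences each, 1E13 ligated
# molecules recovered. The upper bound (1E16) is out of reach, so the
# recovered molecule count sets the complexity.
print(f"{assembled_complexity([1e8, 1e8], 1e13):.3g}")
```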

Primer considerations

Generally, the constant sequences at the 5’ and 3’ ends of a pool function as primer binding sites and can be any sequence or length. Primers of 20 nucleotides in length are often convenient because their specificity avoids mispriming (4^20 ≈ 1.1E12), their melting temperatures are convenient for PCR, and they can be easily synthesized in high yields that do not always require gel purification. Avoid priming sequences that interfere with other constant sequences in the pool, such as restriction sites, tagging sites, or aptamer sequences. Designing the primers to possess a 3’ clamp of 5’-WSS-3’ (IUB codes), such as ACC, ensures good extension by polymerases.
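A candidate primer can be screened for the 3’ clamp with a short script. This is a minimal sketch of the 5’-WSS-3’ rule only (W = A or T, S = G or C); the primer sequences shown are arbitrary examples, not recommended designs.

```python
# IUB degenerate codes used by the clamp rule.
WEAK = set("AT")    # W
STRONG = set("GC")  # S

def has_wss_clamp(primer: str) -> bool:
    """True if the last three bases (written 5' to 3') match W-S-S."""
    p = primer.upper()
    return len(p) >= 3 and p[-3] in WEAK and p[-2] in STRONG and p[-1] in STRONG

# Arbitrary example primers, written 5' to 3'.
for primer in ("GGATCCTAGCATGCAAGACC", "GGATCCTAGCATGCAAGAAT"):
    print(primer, has_wss_clamp(primer))

# Number of possible 20-mers, the basis of the specificity argument.
print(f"{4 ** 20:.2g}")
```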

It is helpful to design a restriction site into the constant sequence so that it can be replaced with a new priming sequence to recycle the random region of an old pool into a new pool. Also, pools with more than 100 random positions are typically synthesized in a modular fashion by the ligation of pools with smaller random regions (Bartel, D. P. and Szostak, J. W., 1993). Asymmetric restriction sites are very useful for this task since they minimize intra-pool dimerization by self-ligation. Ava I (C|YCGRG), Ban I (G|GYRCC), and Sty I (C|CWWGG) are good enzyme choices since they have asymmetric 6-base restriction sites, which will ensure infrequent cutting in the random regions. Also, these enzymes are cost-effective for digesting large amounts of DNA. If possible, each primer should have two restriction sites (see typical pool design figure).
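When checking a constant region or a sequenced pool member for unwanted sites, the degenerate recognition sequences can be expanded into regular expressions. The sketch below covers only the three enzymes named above; the helper names are illustrative. Because each of these degenerate sites is its own reverse complement, scanning one strand is sufficient.

```python
import re

# IUB/IUPAC codes needed for the three recognition sites.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]"}

SITES = {"Ava I": "CYCGRG", "Ban I": "GGYRCC", "Sty I": "CCWWGG"}

def site_pattern(site: str) -> "re.Pattern[str]":
    """Compile a degenerate site into a regex; the lookahead allows
    overlapping occurrences to be reported."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in site) + ")")

def find_sites(sequence: str) -> dict:
    """Map each enzyme to the 0-based start positions of its sites."""
    seq = sequence.upper()
    return {enzyme: [m.start() for m in site_pattern(site).finditer(seq)]
            for enzyme, site in SITES.items()}

# Arbitrary test sequence containing one Sty I site (CCAAGG).
print(find_sites("AACCAAGGTT"))
```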

T7 RNA polymerase promoter considerations

If an RNA pool is to be constructed, run-off RNA transcripts for in vitro selection are usually made with T7 RNA polymerase. Several promoters are known for T7 RNA polymerase (Milligan J.F., et al. 1987), but the following minimal sequence typically gives good yields:

[Figure: minimal T7 promoter sequence, positions -17 to -1]


Addition of a G and C residue at the -18 and -19 positions of the minimal promoter helps to close the DNA duplex and stabilize the 5’ end of the promoter region, thereby increasing transcriptional yields. Transcription initiation is optimal when there are stretches of purines in the +1 and +2 positions, with GG being the best initiator (Milligan J.F., et al. 1987). Transcriptional yields also increase if uridine does not appear in the transcript before position 6. A typical pool design incorporating all the elements discussed was given earlier.

Once the pool construction constraints have been considered, the remainder of the constant sequence must be chosen. The sequence can be chosen at random, but a codon matrix consisting of all 64 nucleotide triplets is a useful planning device to avoid complementarity and mispriming problems. Remove all codons already present in the pool design (T7 promoter, restriction sites, aptamers, etc.) from the matrix, as well as their complements, then add bases one at a time to the core design. For longer sequences it is impossible to design primers completely devoid of codon repeats, but it is critical to avoid complementarity at the 3’ ends of constant regions in order to minimize mispriming and the subsequent accumulation of amplification parasites. Minimizing intra- and inter-primer as well as intra-pool complementarity reduces secondary structural problems and mispriming that can interfere with proper pool amplification. Computer programs such as the Genetics Computer Group’s PRIME or the Whitehead’s PRIMER3 can be used in designing the constant regions. Once the final design is complete, the primers should be checked for complementarity or mispriming between themselves and other design elements using a program such as the Genetics Computer Group’s STEMLOOP.
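The codon matrix bookkeeping lends itself to a small script. The sketch below starts from all 64 triplets and removes every triplet found in the fixed design elements; "complements" is interpreted here as reverse complements, which is an assumption about the intended meaning. The function names and the example element are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def triplets(seq: str) -> set:
    """All overlapping 3-base windows in a sequence."""
    return {seq[i:i + 3] for i in range(len(seq) - 2)}

def available_triplets(fixed_elements) -> set:
    """Triplets still unused after removing those present in the fixed
    design elements and their reverse complements (assumed meaning of
    'complements')."""
    matrix = {"".join(p) for p in product("ACGT", repeat=3)}
    for element in fixed_elements:
        used = triplets(element.upper())
        matrix -= used
        matrix -= {reverse_complement(t) for t in used}
    return matrix

# Example with a single fixed element: one instance of a Sty I site.
remaining = available_triplets(["CCAAGG"])
print(len(remaining), "triplets remain:", sorted(remaining))
```

Candidate bases can then be appended to the constant region one at a time, accepting a base only if the triplet it completes is still in the remaining set.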

Incorporation of modified nucleosides into primers or other oligonucleotides

Depending on the selection scheme, it may be useful to incorporate a non-nucleoside or modified nucleotide molecular tag such as biotin into a primer. This can be extremely useful for testing models of structural interactions between enzymes and nucleic acids, blocking enzymatic action on a molecule, or selecting labeled molecules from a population of unlabeled ones. Chemical nucleic acid synthesis allows the incorporation of unnatural or modified bases and a variety of labeling moieties into an oligonucleotide. Modified backbone chemistries such as phosphorothioates, phosphoramidates, and phosphotriesters are also readily available. In general, the bases themselves can be obtained commercially and are often handled like any other phosphoramidite; consult the company that supplies the analog about any necessary modifications to synthesis programs or reagents. Usually, the only required adjustment is that the modified base be at a somewhat higher concentration than regular phosphoramidites to overcome problems associated with slow reactivity. Most of the methods used to increase the yield of long oligonucleotides and ribo-oligonucleotides can be applied to the synthesis of modified nucleic acids (Current Protocols Unit 2.11).

When synthesizing modified oligonucleotides, the compatibility of different chemistries, placement of modifications relative to other chemical groups, and 5’-3’ directionality are all factors to consider. Generally when end-labeling or modifying an oligonucleotide, a long flexible tether (often stretches of four deoxy-thymines) is added to allow greater accessibility to the sequence. Adding deoxy-thymines 5’ to the label (5’-TTTT-label-3’) can also aid in the size-separation of labeled from unlabeled molecules. Note that some tagging phosphoramidites allow for the enzymatic extension or kinasing of the modified oligonucleotide while others do not. Oligonucleotides may also be synthesized directly on and permanently attached to solid glass supports (Cohen, G. et al., 1997).

Terminal transferase can be used as an alternative method of modified base incorporation at the 3’ end of an oligonucleotide (Ratliff, 1982; Current Protocols Unit 3.6). This enzyme can tolerate a variety of substrates and has been used to add deoxy-nucleotide triphosphates derivatized at virtually every position (C-8 on adenine, any of the amino groups, C-5 on cytosine, O-6 on guanosine) to DNA. It also functions (although less well) with RNA bases. It can use any DNA oligonucleotide that is at least 2 bases long [d(pXpX)] and contains a free 3' hydroxyl as a primer. A potential problem in preparing homogeneous polynucleotides using terminal transferase is that the number of bases added to the 3' end of the template is statistically random (with the exception of molecules such as cordycepin, which act as chain terminators due to the absence of a 3’ hydroxyl). To isolate a single species, the sample can be gel purified (Current Protocols Unit 2.12). Polynucleotide phosphorylase may also be used to incorporate modified bases at the 3’ end of an oligonucleotide (Gillam S. and Smith, M., 1980).

A more controlled means of introducing modified nucleotides relies on T4 RNA ligase and substrates of the form A(5')ppX (where X can be virtually any molecule, including, for example, ribose or amino acids, in a pyrophosphate linkage with adenosine) (Uhlenbeck and Gumport, 1982). The minimal template for reactions of this form is a trinucleoside containing a free 3' hydroxyl. RNA reacts much better than DNA, and single-stranded molecules are better templates than double-stranded ones. Since the reaction requires 3' hydroxyl groups, substrates of the form A(5')ppXp will undergo only a single round of addition, unlike the similar reaction with terminal transferase. In some cases, the compound A(5')ppXp can be generated directly from pXp and ATP by RNA ligase, although the substrate requirements for X are much more strict than for the ligation reaction. Thus, while virtually any dinucleotide of the form A(5')ppX can be added to an oligonucleotide, only a few compounds (primarily sterically "small" derivatives of natural bases) can be used by the enzyme to form A(5')ppXp from pXp.

T4 RNA ligase can catalyze the ligation of single-stranded oligonucleotides in the presence of ATP and various analogs (Kinoshita, Y. et al., 1997). Templates prepared by terminal transferase or by T4 RNA ligase containing modified nucleotides (or other adducts) at their 3' termini may be able to act as substrates in this reaction. This would allow modified nucleotides to be introduced into the middle of a longer chain. However, the efficiency of the enzyme with a given 3' hydroxyl donor is highly substrate-dependent and must be determined empirically.


After pool design is complete, the pool is usually chemically synthesized, although pools have been constructed from genomic DNA (Singer B.S. et al. 1997). Modern DNA synthesizers utilize phosphite triester chemistry and can routinely produce usable amounts of DNA up to 150 nucleotides in length. Side reactions such as branching and depurination increase with repeated synthesis cycles and result in this size limit. Since coupling efficiencies at each step of a chemical synthesis are ~98 percent, the typical yield of a 100-base 1-µmole synthesis is 1E16 different molecules, of which ~10 percent are suitable for amplification. Several strategies can enhance the yield for syntheses of sequences longer than 100 bases (see Current Protocols Unit 2.11). If a pool of sequences longer than ~150 nucleotides is needed, it can be made in modules which can then be ligated following PCR (Bartel, D. P. and Szostak, J. W., 1993). This procedure can also increase the complexity of the resulting pool through combinatorial mixing. Mutually-primed-and-templated synthesis is also an option for long pool syntheses when its limitations are compatible with the pool design (Current Protocols Unit 8.2). There is no need to synthesize segments like the T7 promoter for RNA pools since they can be added later by PCR. Note that prior to amplification of a synthesized pool, sequences exist only as single copies and can be easily lost. Special care should be taken to wash and elute filters, tubes, and tips repeatedly.

Most synthesizers can be programmed for in-line degenerate mixing of bases, useful for the randomization of a few positions. A potential problem with this method is incomplete mixing. Because of the extremely fast reaction of the activated phosphoramidite with the free 5' hydroxyl, the sequence will be skewed toward whichever phosphoramidite first enters the column. Therefore, while in-line mixing will generate all base substitutions at a given position, the distribution of these substitutions may not be uniform. For either doped or degenerate pools that are to contain a statistically random distribution of nucleotides, it is better to manually mix the phosphoramidites and use this mixture for the degenerate positions. A true random distribution is obtained by mixing a 3:3:2:2.4 molar ratio of A:C:G:T phosphoramidites to compensate for the faster coupling times of G and T phosphoramidites (Bartel, D. P., personal communication).

Pools biased toward a particular sequence can be useful for the elucidation of structural properties or the further optimization of a previously isolated sequence (see discussion above on doped pool considerations). Doping can be accomplished by using phosphoramidite mixtures adjusted for the proper level of partial randomization for a given nucleotide. For example, if adenosine is to be doped to 10 percent mutation/adenosine, then a molar ratio of 33.43:1.50:1.00:1.21 of A:C:G:T phosphoramidites should be used.

Since problems inherent to chemical oligonucleotide synthesis (such as depurination and branching) leave only 10 percent of the synthesized molecules amenable to amplification, any pool should be sequenced prior to its use to confirm that all design elements are properly arranged. This is especially important given that the insertion and deletion rates for chemical nucleic acid synthesis are non-negligible. The rate of insertions (presumably due to DMT cleavage via tetrazole) has been measured to be as high as 0.4 percent per position, and the rate of deletions (presumably due to incomplete capping) has been found to be as high as 0.5 percent per position (Keefe, A. and Wilson, D., personal communication). Therefore, assuming a random, non-correlated mechanism of action, an oligonucleotide 100 bases in length will only be the intended sequence about 40 percent of the time.
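The 40 percent figure follows from compounding the per-position error rates over the length of the oligonucleotide. A minimal sketch, assuming insertions and deletions are independent and occur at the same rate at every position (the function name is illustrative):

```python
def fraction_intended_length(length: int,
                             insertion_rate: float = 0.004,
                             deletion_rate: float = 0.005) -> float:
    """Fraction of molecules with no insertion or deletion at any position,
    assuming independent, uncorrelated errors at each position."""
    per_position_ok = (1 - insertion_rate) * (1 - deletion_rate)
    return per_position_ok ** length

for n in (50, 100, 150):
    print(f"{n}-mer: {fraction_intended_length(n):.2f}")
```

With the worst-case rates quoted above, the error-free fraction drops quickly with length, which is one reason to sequence a sample of the pool before use.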


After the pool is synthesized and prior to amplification, it should be purified. The high resolution and capacity of denaturing polyacrylamide gels make them ideal for oligonucleotide pool purification. Depending on the percentage of matrix used, denaturing polyacrylamide gels can resolve oligonucleotides <300 bases in length (see Table 1). Urea disrupts hydrogen bonding between bases and thus allows oligonucleotides to be resolved almost exclusively on the basis of molecular weight, with only a minimal impact by secondary structure. However, it should be noted that oligonucleotides of equivalent length but different sequence still migrate slightly differently (Applied Biosystems, 1984). Thus, a pool will appear as a broader band than a homogeneous sequence.


1. After standard DNA oligonucleotide deprotection and filtration from the CPG, lyophilize the remaining ammonium hydroxide solution. Resuspend the lyophilized pellet in enough loading buffer to load approximately 20 percent of a 1-µmole synthesis of a 150-mer oligonucleotide per 2 cm x 2 cm x 1.6 mm well. This will give sharp bands with good resolution. Up to 2-fold more material may be added, but the resolution will decrease.

2. To allow for good separation of near-full-length and non-full-length products, pick an acrylamide concentration that allows the single-stranded nucleic acid to migrate approximately one-half to three-fourths of the way through the gel when the loading dye reaches the bottom of the gel (Table 1). Since many of the N-1, N-2, etc. products are still extendable by polymerases, it is wise to collect as many pool molecules near the proper length as possible. The longer the oligonucleotide, the less full-length product obtained. The desired band is generally the darkest one on the gel (excluding material that runs at the dye front); except under conditions of incomplete deprotection, it should also be the slowest migrating band. Any lighter bands containing partially protected oligonucleotides will migrate considerably above the major fully deprotected band. If the stepwise efficiency of the synthesis is low, a smear may be seen instead of a clear band. Generously cut out the top of the smear. Unnecessarily long UV exposure will damage the oligonucleotide and should be avoided. Unpolymerized acrylamide absorbs strongly at 211 nm, which can cause shadowing at the edges and wells of the gel.

3. Following electrophoresis and excision of the DNA from the gel, chop the gel slabs into fine particles by forcing the gel through a small-bore syringe to aid the diffusion of the oligonucleotide from the matrix. Place the crushed gel slab in a 15 ml spin tube capable of withstanding temperature extremes. Add 3 ml of TE for every 0.5 ml of gel slab and place the sample at -80 °C for 30 minutes or until it is frozen solid. Quickly thaw the tube in a hot water bath and let it soak at 90 °C for 5 minutes. Elute the DNA on a rotary shaker overnight at room temperature. This freeze-rapid thaw approach (Chen Z. and Ruffner, D. E., 1996) greatly decreases elution time and increases yield by allowing ice crystals to break apart the acrylamide matrix. Typically, 80 percent of a 20-mer oligonucleotide is recovered after 3 hours of rotary shaking, making this technique comparable to electroelution (Current Protocols Unit 2.7).

Since elution is a diffusion-controlled process, higher elution volumes or serial elutions from the same gel slice can increase recoveries. Note that longer oligonucleotides will diffuse from the gel more slowly than shorter sequences. Samples of especially long synthetic DNAs and RNAs that are particularly resistant to elution with aqueous buffers may be eluted easily in 6 volumes of formamide (>5 h at room temperature), followed by a brief elution with an aqueous buffer (approximately 1 h). Isoamyl-alcohol concentration brings the extracts to a convenient precipitation volume.

4. Following elution, concentrate the sample by extracting against an equal volume of n-butanol. Remove the upper butanol layer and repeat until the aqueous volume is convenient for precipitation. About 1/5 of the aqueous layer is extracted into the organic butanol layer for every volume of butanol used. If too much butanol is added and the water is completely extracted into the butanol, simply add more water and repeat the concentration. Precipitate the sample by adjusting the salt to 0.3 M and adding 3 volumes of ethanol. Wash the pellet as usual and lyophilize to dryness. Resuspend the synthetic pool in TE buffer to protect against any damage through nuclease contamination or drastic pH changes.
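The number of butanol extractions needed in step 4 can be estimated ahead of time. The sketch below assumes, as a rough reading of the rule of thumb above, that each extraction against an equal volume of n-butanol removes about 1/5 of the current aqueous volume; the actual loss varies, so treat the result as a planning estimate only. The function name and example volumes are illustrative.

```python
def butanol_extractions_needed(start_volume_ml: float,
                               target_volume_ml: float,
                               fraction_removed: float = 0.2) -> int:
    """Count equal-volume butanol extractions needed to bring the aqueous
    phase down to the target volume, assuming each extraction removes a
    fixed fraction (about 1/5) of the remaining aqueous volume."""
    volume = start_volume_ml
    extractions = 0
    while volume > target_volume_ml:
        volume *= 1 - fraction_removed
        extractions += 1
    return extractions

# Example: 3 ml of TE eluate concentrated to 0.4 ml or less before
# ethanol precipitation.
print(butanol_extractions_needed(3.0, 0.4))
```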

When dealing with pools, especially ones that contain commonly used priming motifs or tagging sequences, it is essential to avoid cross-contamination with other pools and other generations of the same pool. Always use aerosol barrier tips when pipetting pools directly. In general, aerosol tips are recommended for all applications. In the purification of the initial pool synthesis, gel plates should be free of contamination from other pools or primers. If possible, it is best to have a set of gel plates devoted exclusively to pool purification to prevent primer contamination of pools. In a selection involving gel purification, rotating through 2 or 3 sets of gel plates lowers the background noise of cross-generational pool contamination.


Loading buffer recipe

8M Urea
2 mM Tris, pH 7.5
20 mM EDTA

TE Buffer

10 mM Tris*Cl, pH 7.4

1 mM EDTA, pH 8.0

Table 1. Concentrations of Acrylamide Giving Optimum Resolution of DNA Fragments Using Denaturing PAGE (a)

Acrylamide (%)   Fragment sizes        Migration of bromophenol   Migration of xylene
                 separated (bases)     blue (bases)               cyanol (bases)
30               2 to 8                6                          20
20               8 to 25               8                          28
10               25 to 35              12                         55
8                35 to 45              19                         75
6                45 to 70              26                         105
5                70 to 300             35                         130
4                100 to 500            ~50                        ~230

(a) Data from Maniatis et al., 1975.
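Table 1 can be encoded as a small lookup to shortlist gel percentages for a pool of a given length. This is only a convenience sketch: the values are transcribed from the table, the function name is illustrative, and the final choice should still follow the migration guideline in step 2 of the purification procedure.

```python
# (acrylamide %, smallest fragment, largest fragment,
#  bromophenol blue position, xylene cyanol position), all in bases.
TABLE_1 = [
    (30, 2, 8, 6, 20),
    (20, 8, 25, 8, 28),
    (10, 25, 35, 12, 55),
    (8, 35, 45, 19, 75),
    (6, 45, 70, 26, 105),
    (5, 70, 300, 35, 130),
    (4, 100, 500, 50, 230),  # dye positions are approximate in the table
]

def gels_for_length(n_bases: int) -> list:
    """Acrylamide percentages whose resolving range includes n_bases."""
    return [pct for pct, low, high, _bpb, _xc in TABLE_1
            if low <= n_bases <= high]

for length in (20, 60, 100, 150):
    print(f"{length}-mer: candidate gel percentages {gels_for_length(length)}")
```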


Sabeti, P. C., Unrau, P. J., & Bartel, D. P. Accessing rare activities from random RNA sequences: the importance of the length of molecules in the starting pool. Chem. Biol. 1997 4: 767-774.

Lorsch, J. R., & Szostak, J. W. In vitro evolution of new ribozymes with polynucleotide kinase activity. Nature. 1994 371: 31-36.

Bartel, D. P., & Szostak, J. W. Isolation of new ribozymes from a large pool of random sequences. Science. 1993 261: 1411-1418.

Bartel, D. P., Zapp, M. L., Green, M. R., & Szostak, J. W. HIV-1 Rev regulation involves recognition of non-Watson-Crick base pairs in viral RNA. Cell. 1991 67: 529-536.

Chen, Z., & Ruffner, D. E. Modified crush-and-soak method for recovering oligodeoxynucleotides from polyacrylamide gel. Biotechniques. 1996 21: 820-822.

Cohen, G., Deutsch, J., Fineberg, J., & Levine, A. Covalent attachment of DNA oligonucleotides to glass. Nucleic Acids Res. 1997 24: 911-912.

Gillam, S., & Smith, M. Use of E. coli polynucleotide phosphorylase for the synthesis of oligodeoxyribonucleotides of defined sequence. Methods Enzymol. 1980 65: 687-701.

Kinoshita, Y., Nishigaki, K., & Husimi, Y. Fluorescence-, isotope- or biotin-labeling of the 5'-end of single-stranded DNA/RNA using T4 RNA ligase. Nucleic Acids Res. 1997 25: 3747-3748.

Fitzwater, T., & Polisky, B. A SELEX primer. Methods Enzymol. 1996 267: 275-301.

Gold, L., Polisky, B., Uhlenbeck, O., & Yarus, M. Diversity of oligonucleotide functions. Annu. Rev. Biochem. 1995 64: 763-797.

Singer, B. S., Shtatland, T., Brown, D., & Gold, L. Libraries for genomic SELEX. Nucleic Acids Res. 1997 25: 781-786.

Lyamichev, V., Brow, M. A., & Dahlberg, J. E. Structure-specific endonucleolytic cleavage of nucleic acids by eubacterial DNA polymerases. Science. 1993 260: 778-783.

Applied Biosystems. Evaluation and Purification of Synthetic Oligonucleotides. User Bulletin, Issue #13. 1984.

Milligan, J. F., Groebe, D. R., Witherell, G. W., & Uhlenbeck, O. C. Oligoribonucleotide synthesis using T7 RNA polymerase and synthetic DNA templates. Nucleic Acids Res. 1987 15: 8783-8798.

Maniatis, T., Jeffrey, A., & deSande, H. V. Chain length determination of small double- and single-stranded DNA molecules by polyacrylamide gel electrophoresis. Biochemistry. 1975 14: 3787-3794.

Jaeger, L. The New World of ribozymes. Curr. Opin. Struct. Biol. 1997 7: 324-335.

Breaker, R. R. In vitro selection of catalytic polynucleotides. Chem. Rev. 1997 97: 371-390.