Restriction Analysis to determine Pool Complexity

Jonathan Urbach

April 7, 1998


In general, pool complexity can be estimated by performing a restriction digest on the pool mixture at a given round of a selection, then running the labeled mixture out on a gel and counting the bands that appear. The number of bands will be a function of the frequency with which the enzyme's recognition site occurs in random DNA as well as pool length and complexity.

The frequency with which a restriction site occurs in random DNA is a function of the length of the recognition site: longer recognition sites occur less frequently than shorter ones. A restriction site n bases long will occur on average once in 4^n bases (4^n = 1/(cut frequency)), so a pool containing c sequences of length l will have lc/4^n cuts.

The relationship between the number of cuts and the number of gel bands depends on the method used to label the DNA: whether the DNA is labeled at one end or both, and whether it is labeled before or after the restriction digest. Under conditions in which one DNA fragment is labeled per cut, lc/4^n bands will be visible on the gel. So a pool of length 100 containing 1,000 sequences should have about 24 six-base restriction sites (100 x 1,000/4^6 = 24.4) and, depending on the labeling method, should show approximately 24 bands on a gel (in addition to the full-length uncut band). It follows that complexity c = (number of bands) x 4^n/(length). If digestion of a pool of length 500 with an 8-cutter gives 40 bands, then the complexity is estimated to be on the order of 5,000 (40 x 4^8/500 = 5,243). Since the number of cut & labeled fragments is not always equal to the number of cuts (as in the fill-in method), a more general expression for the complexity is c = (number of bands)/((length) x (c&l frequency)), in which (c&l frequency) = (fraction of cut sites labeled)/4^n.

Limitations of this technique arise from the fact that extremely rare sequences do not show up on a gel and are therefore not counted directly. Furthermore, when the number of bands expected is significantly more than the number of random bases in a pool, what shows up is a smear rather than a set of discrete bands. It therefore becomes difficult to measure complexities greater than about 2 x 4^n with this method (because the number of bands becomes greater than twice the length of the random region).
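The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formulas in the text; the function and parameter names are made up for the example and are not part of the original protocol.

```python
def expected_cuts(length, complexity, site_len):
    """Expected number of restriction sites in the pool: l*c / 4^n."""
    return length * complexity / 4 ** site_len


def complexity_from_bands(bands, length, site_len, fraction_labeled=1.0):
    """Estimate complexity: c = bands / (length * c&l frequency),
    where c&l frequency = (fraction of cut sites labeled) / 4^n."""
    cl_frequency = fraction_labeled / 4 ** site_len
    return bands / (length * cl_frequency)


def max_measurable_complexity(site_len):
    """Rough ceiling of about 2 * 4^n, beyond which bands merge into a smear."""
    return 2 * 4 ** site_len


# Pool of length 100 with 1,000 sequences, cut with a 6-cutter
print(round(expected_cuts(100, 1000, 6), 1))     # 24.4

# 40 bands from a pool of length 500 cut with an 8-cutter
print(round(complexity_from_bands(40, 500, 8)))  # 5243

# Approximate upper limit for a 6-cutter
print(max_measurable_complexity(6))              # 8192
```

Passing fraction_labeled=0.5 reproduces the case in which only half of the cut sites yield a labeled fragment, as in the fill-in method.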

Label & Cut: Bartel Style Experiment

There are two versions of the complexity measurement experiment. The first was developed by Dave Bartel(1) and involves labeling one end of purified PCR DNA, then cutting with a restriction enzyme or a cocktail of restriction enzymes. Double-stranded PCR DNA may be labeled selectively at one end by several different methods. The DNA may be PCR amplified with a single radiolabeled primer, then purified. Alternatively, the DNA may be amplified with one primer blocked at the 5' end with a non-phosphorylatable group such as Bio-TEG. David Bartel used a cocktail containing three 4-cutter restriction enzymes to analyze the complexity of his n=220 pool in the various cycles of his ligase selection. Under these conditions, the number of observed bands is approximately equal to the complexity. The gel of his experiment is shown in Figure 1.
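One way to see why the band count approaches the complexity in this setup is to estimate the fraction of pool molecules that contain at least one site for the cocktail. The sketch below is an interpretation, not part of the original protocol. It assumes that the three enzymes recognize distinct 4-base sites, that sites occur independently at each position of the 220-base random region, and that each end-labeled molecule contributes a single band (the fragment running from the labeled end to the nearest cut).

```python
def fraction_with_site(random_len, site_len, n_enzymes=1):
    """Approximate fraction of random sequences containing at least one
    recognition site, treating each start position as independent."""
    per_position = n_enzymes / 4 ** site_len
    positions = random_len - site_len + 1
    return 1 - (1 - per_position) ** positions


# Three 4-cutters on a pool with 220 random positions
print(round(fraction_with_site(220, 4, 3), 2))  # roughly 0.92
```

Under these assumptions roughly nine out of ten sequences are cut at least once, so nearly every distinct sequence gives its own band.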

Restriction & Fill-In Method (J. Urbach Method) (2)

The other type of experiment is a variation of the Bartel Style experiment. In this technique, PCR DNA is purified on a gel, then cut with a restriction enzyme that leaves a recessed 3' end. Finally, the cut fragments are labeled specifically by filling in the recessed 3' end using Taq polymerase with a single α-labeled radioactive dNTP and the other three unlabeled dNTPs. If an asymmetric restriction site is used, then only one of the restriction fragments is labeled by the fill-in. This technique has the advantage that only cut DNA is labeled. This gives somewhat better sensitivity and allows the use of rarer cutters, since full length DNA will not overwhelm the rest of the bands.

The first important factor in getting this type of experiment to work is the purity of the PCR DNA. DNA used in restriction analysis of complexity should be purified on a native gel, preferably acrylamide, prior to digestion. Care should be taken to make sure that the DNA does not denature, so DNA pellets should be resuspended in buffer containing at least 25 mM salt (NaCl, KCl). Secondly, DNA should not be frozen and thawed repeatedly, since this tends to produce a higher background on gels with control DNA that has not been incubated with enzyme.

The second important factor is the labeling procedure. Klenow enzyme is inappropriate for this type of experiment because it has 3'->5' exonuclease activity, as does MMuLV Reverse Transcriptase. This leads to nonspecific labeling of uncut PCR DNA strands. Taq polymerase, however, works well. For best results, it is advisable to use only extremely small amounts of Taq polymerase for this experiment, 0.36 units/100 µL rather than the 2.5 units/100 µL commonly used in PCR reactions, since at higher concentrations Taq also labels nonspecifically and gives n+1 bands on gels. Labeling times with Taq polymerase are short, 5 minutes.

Lastly, the restriction analysis experiments are typically run on a 10% denaturing acrylamide gel at 74 watts to which hot buffer (65°C) has been added to the top well prior to loading. This is helpful in avoiding gel artifacts due to incomplete denaturation of the labeled PCR DNA. (See figure 2)


In this experiment, two sets of reactions are done using two different symmetric restriction sites. The first enzyme, Dde I, has a recognition site of CTNAG, which should give a cut capable of being labeled with dCTP-α-32P every 512 random bases (c&l frequency = 1/512). The second enzyme, Bsu 36 I, has the recognition site CCTNAGG, which should give a cut capable of being labeled with dCTP-α-32P every 8192 random bases (c&l frequency = 1/8192).
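The two c&l frequencies can be reproduced by enumerating the four possible bases at the N position and asking whether the fill-in of either recessed 3' end incorporates a C. The sketch below assumes the commonly documented cleavage positions C^TNAG for Dde I and CC^TNAGG for Bsu 36 I, which are not stated in this protocol, and it handles only patterns with a single N. All names are illustrative.

```python
from fractions import Fraction

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def cl_frequency(pattern, cut_after, label="C"):
    """Cut & labeled frequency for a symmetric site containing one N.

    pattern   -- recognition site, e.g. "CTNAG"
    cut_after -- number of bases 5' of the cut on each strand (assumed)
    label     -- base supplied as the radioactive dNTP
    """
    specified = sum(1 for b in pattern if b != "N")
    labeled_ends = 0
    for n in "ACGT":
        site = pattern.replace("N", n)
        # Each strand's recessed 3' end is extended across the overhang.
        for strand in (site, revcomp(site)):
            fill = strand[cut_after:len(strand) - cut_after]
            if label in fill:
                labeled_ends += 1
    fraction_labeled = Fraction(labeled_ends, 4)
    return fraction_labeled / 4 ** specified


print(cl_frequency("CTNAG", 1))    # 1/512
print(cl_frequency("CCTNAGG", 2))  # 1/8192
```

In both cases the fill-in adds T, N, A, so a labeled C is incorporated on one fragment only when N is C or G, which is why half of the cut sites are labeled.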


9 x 15 µL samples of 100 nM DNA, in 50 mM NaCl

Restriction Reaction Control Mixture:

10 µL NEB Buffer #3
40 µL H2O
50 µL ----> 5 µL in each sample tube

Dde I Reaction Mixture:

10 µL NEB Buffer #3
37.5 µL H2O
2.5 µL Dde I (50 units)
50 µL ----> 5 µL in each sample tube

Bsu 36 I Reaction Mixture:

10 µL NEB Buffer #3
35 µL H2O
5 µL Bsu 36 I (50 units)
50 µL ----> 5 µL in each sample tube

Labeling Reaction Mixture

75 µL 10x Taq Buffer (500 mM KCl, 100 mM Tris HCl pH 8.3, 0.5% Tween)
120 µL 25 mM MgCl2
3.75 µL 10 mM dATP
3.75 µL 10 mM dGTP
3.75 µL 10 mM TTP
5 µL dCTP-α-32P (10 µCi)
0.75 µL Taq (5 units/µL)
538 µL H2O
750 µL--------> 25 µL in each tube
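
As a quick check on the recipe above, the following sketch totals the component volumes and works out the Taq and unlabeled dNTP concentrations after 25 µL of labeling mixture is added to a 10 µL digest. The variable names are illustrative, and the calculation ignores anything contributed by the digest buffer.

```python
# Labeling Reaction Mixture volumes in microliters, as listed in the recipe
labeling_mix_ul = {
    "10x Taq Buffer": 75.0,
    "25 mM MgCl2": 120.0,
    "10 mM dATP": 3.75,
    "10 mM dGTP": 3.75,
    "10 mM TTP": 3.75,
    "dCTP-alpha-32P": 5.0,
    "Taq (5 units/uL)": 0.75,
    "H2O": 538.0,
}

total_ul = sum(labeling_mix_ul.values())
print(total_ul)  # 750.0

aliquot_ul = 25.0                      # labeling mix added per tube
digest_ul = 10.0                       # 5 uL DNA + 5 uL reaction mixture
final_ul = aliquot_ul + digest_ul      # 35 uL per labeling reaction

# Taq: 0.75 uL at 5 units/uL in the whole mix, scaled to one tube
taq_units_per_tube = 0.75 * 5.0 * aliquot_ul / total_ul
taq_units_per_100ul = taq_units_per_tube / final_ul * 100.0
print(round(taq_units_per_100ul, 2))   # 0.36

# Each unlabeled dNTP: 10 mM stock, converted to micromolar
dntp_uM_in_mix = 10_000.0 * 3.75 / total_ul
dntp_uM_final = dntp_uM_in_mix * aliquot_ul / final_ul
print(round(dntp_uM_in_mix, 1), round(dntp_uM_final, 1))  # 50.0 35.7
```

The Taq figure agrees with the 0.36 units/100 µL quoted for the labeling step.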

Restriction Digest & Labeling Procedure:

5 µL of each DNA sample, containing 0.5 pmol DNA, is added to each of three 0.5 mL PCR tubes: one containing 5 µL Control Mixture, one containing 5 µL Dde I Reaction Mixture, and one containing 5 µL Bsu 36 I Reaction Mixture. To each tube is added 1 drop of mineral oil. Reactions are mixed by vortex, spun down in a centrifuge, and incubated 1 hour at 37°C.
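
The per-tube quantities and the total amounts of each mixture consumed can be checked with a short calculation. This is a sketch based only on the numbers in this protocol; the variable names are illustrative.

```python
N_SAMPLES = 9
CONDITIONS = ("Control", "Dde I", "Bsu 36 I")

# DNA per tube: 5 uL of a 100 nM stock (nM * uL gives fmol)
dna_pmol_per_tube = 100.0 * 5.0 / 1000.0
print(dna_pmol_per_tube)                       # 0.5

# Each 15 uL sample is split across the three conditions
dna_ul_per_sample = 5.0 * len(CONDITIONS)
print(dna_ul_per_sample)                       # 15.0

# Enzyme per tube: 50 units in a 50 uL mixture, 5 uL used per tube
enzyme_units_per_tube = 50.0 * 5.0 / 50.0
print(enzyme_units_per_tube)                   # 5.0

# Volumes consumed versus volumes prepared
tubes = N_SAMPLES * len(CONDITIONS)
reaction_mix_ul_needed = N_SAMPLES * 5.0       # per reaction mixture, 50 uL made
labeling_mix_ul_needed = tubes * 25.0          # 750 uL made
print(tubes, reaction_mix_ul_needed, labeling_mix_ul_needed)  # 27 45.0 675.0
```

Each mixture is therefore prepared with about 10% excess over what the 27 tubes require.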

To each tube is added 25 µL Labeling Reaction Mixture. Reactions are again mixed by vortex, pulsed in a centrifuge, and incubated 5 minutes at 72°C. Reactions are stopped by addition of 35 µL formamide(3) loading buffer. Labeling mixtures are then heated to 95°C to denature and run on a 10% sequencing gel which has had hot buffer (65°C) added to the top well prior to loading.

An example of the resulting restriction pattern is shown in figure 3.

(1) D. P. Bartel and J. W. Szostak, Science 261, 1411 (1993)
(2) J. M. Urbach and J. W. Szostak, unpublished results.
(3) Formamide Loading Buffer:
80% Formamide in Water
10 mM EDTA, pH 8.0
Bromphenol Blue and/or Xylene Cyanol, as desired.