The strategy for determining Affymetrix GeneChip annotation in the Sheen Lab

We ran BLAST searches of the Affymetrix target sequences from the NetAffx web site against the TIGR databases. The strategy used to determine the identity of the NetAffx sequences is outlined below. A new list of matching GeneChip IDs and AGI names can be downloaded here. Detailed information for each gene can also be obtained by searching our 8K Affymetrix annotation database.

 

The strategy for determining the AGI name of the NetAffx sequences:

The BLAST searches started with 8246 target sequences (one probe set, 13434_at, has no target sequence in the NetAffx database file) and used an E-value cutoff of 0.000001.

1. BLAST search of each target sequence (8246) against TIGR cds (ATH1.cds) using the criterion of >97% match to the whole query length

found 5479 genes

2. BLAST search of each remaining target sequence (2767) against TIGR UTR sequences using the criterion of >97% match to the whole query length

found 547 genes

3. BLAST search of each remaining target sequence (2220) against TIGR cds using the criteria of 50b/>98% match

found 1725 genes

4. BLAST search of each remaining target sequence (495) against TIGR UTR sequences using the criteria of 30b/>98% match

found 170 genes

In total, the identity of 7921 of the 8246 target sequences has been confirmed.

A list of the 7921 probe sets with AGI names can be found in Affy_allagi.xls.
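The four-step search above is a tiered cascade: each step is run only on the target sequences left unassigned by the previous one. The Python sketch below illustrates that logic under several assumptions that are not stated in the original description: the `Hit` record and the `annotate` function are hypothetical names, ">97% match to the whole query length" is read as identical bases divided by query length, "50b/>98%" is read as an alignment of at least 50 bases with more than 98% identity, and the hit with the lowest E-value is taken when several pass. The probe names and AGI names in the toy data are made up.

```python
from dataclasses import dataclass

E_CUTOFF = 1e-6  # E-value cutoff quoted in the text


@dataclass
class Hit:
    """One BLAST hit (hypothetical record, not a real parser output)."""
    query: str       # probe set ID
    subject: str     # matched database entry, e.g. an AGI name
    identity: float  # percent identity over the aligned region
    align_len: int   # alignment length in bases
    evalue: float


def full_length(hit, query_len):
    # Assumed reading of ">97% match to the whole query length":
    # identical bases make up more than 97% of the query.
    return hit.identity / 100 * hit.align_len / query_len > 0.97


def partial(min_len, min_identity=98.0):
    # Assumed reading of e.g. "50b/>98%": at least min_len aligned
    # bases with more than min_identity percent identity.
    def check(hit, query_len):
        return hit.align_len >= min_len and hit.identity > min_identity
    return check


def annotate(query_lens, hits_by_db, tiers):
    """Apply each (database, criterion) tier to the still-unassigned queries."""
    assigned = {}
    remaining = list(query_lens)
    for db, criterion in tiers:
        still_unassigned = []
        for q in remaining:
            good = [h for h in hits_by_db[db].get(q, [])
                    if h.evalue <= E_CUTOFF and criterion(h, query_lens[q])]
            if good:
                # Assumption: keep the hit with the lowest E-value.
                assigned[q] = min(good, key=lambda h: h.evalue).subject
            else:
                still_unassigned.append(q)
        remaining = still_unassigned
    return assigned, remaining


# Steps 1-4 of the strategy, in order.
TIERS = [("cds", full_length), ("utr", full_length),
         ("cds", partial(50)), ("utr", partial(30))]

# Toy data with made-up probe and gene names.
query_lens = {"probe_a": 200, "probe_b": 200, "probe_c": 200}
hits_by_db = {
    "cds": {
        "probe_a": [Hit("probe_a", "AT1G00001", 99.5, 200, 1e-100)],
        "probe_b": [Hit("probe_b", "AT1G00002", 99.0, 60, 1e-20)],
    },
    "utr": {},
}

assigned, remaining = annotate(query_lens, hits_by_db, TIERS)
print(assigned)
print(remaining)
```

In this toy run, probe_a is assigned at step 1, probe_b only at step 3 (its alignment is too short for the full-length criterion), and probe_c stays unassigned, which corresponds to the sequences carried forward to the genome searches.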

 

5. BLAST search of the remaining target sequences (325) against Salk genome sequences using the criteria of 50b/>98% match found hits in intergenic regions for 161 target sequences.

6. BLAST search of the remaining target sequences (164) against Salk genome sequences using the criteria of 30b/>98% match found hits for another 23 target sequences.

In summary, the last two BLAST searches found matches for 184 target sequences. There are still 122 target sequences that have matches below our criteria and 19 without any hits at all against the whole Arabidopsis genome sequence.

A list of the remaining 325 probe sets, either with hits (meeting or falling below our criteria) or without hits, can be found in Affy_ATH_325.xls.
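The counts quoted at each step can be cross-checked with a short tally. The Python sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
start = 8246                               # target sequences searched
annotated_steps = [5479, 547, 1725, 170]   # steps 1-4 (TIGR cds and UTR)
genome_steps = [161, 23]                   # steps 5-6 (genome sequence)

remaining = start
for step, found in enumerate(annotated_steps + genome_steps, start=1):
    remaining -= found
    print(f"step {step}: found {found:5d}, remaining {remaining:5d}")

annotated = sum(annotated_steps)           # probe sets with AGI names
unannotated = start - annotated            # probe sets passed to steps 5-6
genome_hits = sum(genome_steps)            # matches from the last two searches
unresolved = unannotated - genome_hits     # below criteria, or no hit at all

print(annotated, unannotated, genome_hits, unresolved)
```

The totals agree with the text: 7921 annotated probe sets, 325 passed on to the genome searches, 184 matched there, and 141 unresolved (122 below the criteria plus 19 without any hit).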