
Integrated Arabidopsis Gene Functional Annotation


To facilitate global gene expression analysis and to extract biological insight from massive amounts of data (e.g., data generated with Affymetrix GeneChips and other microarrays), we have initiated an effort to classify Arabidopsis genes by related function. KEGG and TAIR (including GO, AraCyc, and the gene family databases) provide substantial information, and TIGR and MIPS offer systematic, updated Arabidopsis gene annotations. Even so, a community effort is still needed to provide more "biologically oriented" information, so that individual researchers can effectively analyze expression patterns across the whole plant genome using free or commercially available programs. For instance, numerous published research articles, reviews, and websites contain rich and useful information, but no single person or lab can acquire all of it efficiently and use it effectively. Although computer programs can collect and organize this information systematically to some extent, its quality and accuracy will be highest if more researchers participate in contributing, evaluating, and editing it as an open resource. The ultimate goal of our small effort is to encourage plant scientists working on diverse areas and on the metabolic and regulatory pathways of Arabidopsis to share their knowledge of the gene functions they have become familiar with through their own work. We believe that this organized information on Arabidopsis gene functions will be applicable to other plant species and useful for comparative plant genome analyses as more plant genome sequences are completed.


The master gene list, which integrates the information collected from all sources in a tab-delimited format, and an introductory Word document can be found here.

Family/Source | Number of Genes | Contributor/Reference
I. Transcription Regulators (TRs) [a] | 2863 [b] |
A) Public Databases | 3114 [c] |
AtTFDB | 1360 |
ChromDB | 289 |
Mendel's List | 1465 |
B) Literature | Not counted |
II. Regulatory Pathways | 558 |
BR Signaling and Biosynthesis | 74 | Matthew Willmann
C2H4 Signaling | 50 | Sang-Dong Yoo
Circadian Clock | 17 | Elena Baena Gonzalez and Jen Sheen
Cytokinin | 32 | Jen Sheen and Yanxia Liu
Flowering | 40 | Wan-Ling Chiu
GA | 33 | Bernard Lam
Auxin | 156 | Shu-Hua Cheng
JA | 33 | Satoru Mita
Protein Degradation | 36 | Young-Hee Cho
Defence | 29 | Senthil Ramu
Sugar | 19 | Jen Sheen and Yanxia Liu
ABA | 38 | Jen Sheen
III. All Gene Families | 20878 [b] |
A) Public Databases | 30020 [c] |
TAIR AraCyc | 1872 |
TAIR Gene Ontology (GO) | 18687 |
TAIR Gene Families | 4620 [d] |
Seed Genes | 219 |
Cell Wall Genes | |
CellWallGenomics | 629 |
Glycosylphosphatidylinositol (GPI)-Anchored Proteins | 248 | Georg H.H. Borner et al. Plant Physiology, June 2003, Vol. 132, pp. 568-577
Carbohydrate-Active enZYmes (CAZy) | 934 |
PlantsT | 1009 |
Protein Kinases | 989 |
Protein Phosphatases | 131 |
Lipid Genes | 682 |
B) Literature and Web Sites | 2327 [c] |
Mutant Genes | 620 | Meinke et al. Plant Physiology, February 2003, Vol. 131, pp. 409-418
Stress Consortium | 203 [e] |
NBS-LRR | 206 | Blake C. Meyers et al. The Plant Cell, Vol. 15, 809-834, April 2003
F-box | 694 | Jennifer M. Gagne et al. PNAS, August 20, 2002, vol. 99, no. 17, 11519-11524
Nitrogen Metabolism, Organic Nitrogen Metabolism | 149 |
Acyl-activating enzymes (AAE) | 63 | Shockey et al. Plant Physiology, June 2003, Vol. 132, pp. 1065-1076
Aldehyde Dehydrogenase (ALDH) | 14 |
Glucosidase | 46 |
β-galactosidase | 18 |
Nitrogen Metabolism, Inorganic Nitrogen Metabolism | 41 |
Plastidic phosphate translocator (pPT) | 35 | Silke Knappe et al. Plant Physiology, March 2003, Vol. 131, pp. 1178-1190
P-Type ATPase | |
Autoinhibited Calcium ATPase (ACA) | 7 | Ivan Baxter et al. Plant Physiology, June 2003, Vol. 132, pp. 618-628
Endoplasmic reticulum (ER)-type Calcium ATPase (ECA) | 3 | Ivan Baxter et al. Plant Physiology, June 2003, Vol. 132, pp. 618-628
SAC Domain-Containing Proteins | 9 | Ruiqin Zhong et al. Plant Physiology, June 2003, Vol. 132, pp. 544-555
Small GTPase | 93 | Vanessa Vernoud et al. Plant Physiology, March 2003, Vol. 131, pp. 1191-1208
Subtilisin-like serine proteases | 58 |
Sugar Transporters | 68 | ?
Note: We have collected information for 20878 unique genes. All gene lists use the AGI number as the index, although in some cases AGI numbers are not available.
a. Please refer to the TRs summary document for complete listing of resources and references.
b. These numbers represent unique genes.
c. These numbers do not represent unique genes.
d. We used "gene_family_tab.062803", obtained from TAIR's FTP site and published on June 28, 2003. The list contains a total of 4670 genes, 245 of which lack AGI numbers. For those genes, we used the gene name, GenBank locus, or GenBank BAC locus, in that order, as the index. The resulting number of unique genes from the TAIR gene families database is 4620. TAIR has recently updated its Gene Families database and added more families, which are not included in our integrated list.
e. Stress Consortium has 318 plant stress genes. We found AGI numbers for 203 Arabidopsis genes.
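
The indexing rule in note d (AGI number first, then gene name, GenBank locus, or GenBank BAC locus, in that order) and the distinction between unique and non-unique counts in notes b and c can be illustrated with a short Python sketch. This is only an illustration, not the procedure actually used to build the master list: the column names (agi, gene_name, genbank_locus, genbank_bac_locus), the sample rows, and the case-insensitive handling of AGI numbers are all assumptions made for the example, and the real TAIR file layout may differ.

```python
import csv
import io

# Hypothetical column names, listed in the fallback order described in note d.
FALLBACK_ORDER = ("agi", "gene_name", "genbank_locus", "genbank_bac_locus")


def pick_index(row):
    """Return the first non-empty identifier in the fallback order, or None."""
    for column in FALLBACK_ORDER:
        value = (row.get(column) or "").strip()
        if value:
            # Assumption: AGI numbers are compared case-insensitively.
            return value.upper() if column == "agi" else value
    return None


def count_genes(rows):
    """Return (number of indexed entries, number of unique indices)."""
    indices = [idx for idx in map(pick_index, rows) if idx is not None]
    return len(indices), len(set(indices))


# Made-up tab-delimited sample: one gene listed twice, two genes without AGI numbers.
SAMPLE = (
    "agi\tgene_name\tgenbank_locus\tgenbank_bac_locus\n"
    "At1g01010\tGENE_A\t\t\n"
    "AT1G01010\tGENE_A\t\t\n"
    "\tGENE_B\tLOCUS_1\t\n"
    "\t\t\tBAC_1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
total, unique = count_genes(rows)
print(total, unique)
```

In this sample, total counts every indexed entry (as in the numbers marked c), while unique counts each index once (as in the numbers marked b).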