Annotating Gene Lists

Gene lists on the PhenoGen website can contain identifiers from any source and can be translated using a tool called iDecoder. This tool translates identifiers to and from the following identifier types:

The data used to translate these values comes from information downloaded from the following organizations:

Your uploaded gene lists can be a mixture of many ID types, but all identifiers in a gene list must be for the same organism. For example, you can upload a gene list that contains an Affymetrix probe set ID, an official gene symbol, an Entrez Gene ID, and a RefSeq Gene ID, and iDecoder translates them into the identifier types appropriate for the selected tool.

iDecoder is the program underlying the annotation tools on the PhenoGen website. It maps gene identifiers between databases, including identifiers that are linked only indirectly. For instance, if database 1 contains entry A and database 2 contains entry B, and both A and B refer to entry C in database 3 but not to each other, iDecoder identifies that A and B are related. This approach uncovers equivalent IDs that are not recorded directly in any single database.
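The indirect matching described above can be pictured as following shared cross-references until no new identifiers turn up. The sketch below is only an illustration of that idea, not iDecoder's actual implementation, which this help page does not describe. The function name `related_ids` and the identifiers `db1:A`, `db2:B`, and `db3:C` are hypothetical.

```python
from collections import defaultdict


def related_ids(links, start):
    """Return every identifier reachable from `start` via shared cross-references.

    `links` is an iterable of (identifier, identifier) pairs, each meaning
    "these two entries refer to one another" in some database.
    """
    # Build an undirected graph of cross-references.
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    # Depth-first traversal collects everything connected to `start`.
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)

    seen.discard(start)
    return seen


# Hypothetical cross-references: A (database 1) and B (database 2) each
# refer to C (database 3), but not to each other.
links = [("db1:A", "db3:C"), ("db2:B", "db3:C")]
print(sorted(related_ids(links, "db1:A")))  # ['db2:B', 'db3:C']
```

Starting from A, the traversal reaches C directly and B through C, which mirrors the example in the text.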

There are two levels of annotation available on the PhenoGen website:

Basic Annotation

Basic annotation displays links to the most popular databases for each of the identifiers in a gene list. In addition to general annotation, the Basic Annotation tool also provides information on expression QTL (eQTL), based on mouse or rat data. A QTL column displays in the Basic Annotation table and a PhenoGen eQTL link displays when the gene list entry matches either a probe ID or gene symbol in the eQTL data.

Expression QTL (eQTL)

The purpose of expression QTL analysis is to identify the location in the genome that controls the transcription level of a gene. eQTLs are calculated using traditional QTL techniques, where the quantitative trait of interest is the expression level of a gene as measured by microarray analysis. On the PhenoGen website, eQTLs have been calculated for both mouse and rat data. When you run Basic Annotation on a gene list for either of these species, the eQTLs are reported. See "Expression QTL Derivation".

In the expression QTL table, the physical location of the probe set is shown, along with the location of the marker that has the maximum LOD score for that transcript. The location of the marker with the maximum LOD score indicates the region of transcriptional control. For this analysis, if the physical location of a gene is near the location of transcriptional control, the gene is considered to be cis (locally)-regulated. Otherwise, the gene is considered to be trans (distally)-regulated. The physical locations of the probe sets were obtained by using the BLAT software (http://www.kentinformatics.com) to map them onto the NCBI m37 mouse genome assembly and the RGSC v3.4 rat genome assembly, both obtained from the UCSC Genome Browser (http://genome.ucsc.edu).
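The cis/trans rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used by the PhenoGen website: the help text does not define how close "near" is, so the 5 Mb window `CIS_WINDOW_BP` is an assumed value, and the marker names, positions, and LOD scores are invented for the example.

```python
from dataclasses import dataclass

# Assumed window size; the help text does not say how "near" is defined.
CIS_WINDOW_BP = 5_000_000


@dataclass
class Locus:
    chromosome: str
    position_bp: int


def peak_marker(lod_by_marker):
    """Return the name of the marker with the maximum LOD score."""
    return max(lod_by_marker, key=lod_by_marker.get)


def classify_regulation(probe, marker, window_bp=CIS_WINDOW_BP):
    """Label a transcript "cis" if its peak marker lies near the probe set, else "trans"."""
    same_chromosome = probe.chromosome == marker.chromosome
    if same_chromosome and abs(probe.position_bp - marker.position_bp) <= window_bp:
        return "cis"
    return "trans"


# Hypothetical probe set location, LOD scores, and marker locations.
probe = Locus("chr1", 45_200_000)
lods = {"marker_a": 2.1, "marker_b": 7.8}
marker_loci = {
    "marker_a": Locus("chr1", 44_900_000),
    "marker_b": Locus("chr7", 12_300_000),
}

best = peak_marker(lods)
print(best, classify_regulation(probe, marker_loci[best]))  # marker_b trans
```

Here the marker with the maximum LOD score sits on a different chromosome from the probe set, so the transcript is labeled trans-regulated.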

Allen Brain Atlas

You can obtain the regional expression pattern of a given gene in the brain of a C57BL/6J mouse by clicking the link provided in the Allen Brain Atlas column in the Basic Annotation table. These links are available only for mouse gene lists. The Allen Brain Atlas (Lein et al., 2007) is an open-access database of gene expression in C57BL/6J mouse brain tissue. This database was created by the Allen Institute for Brain Science (Seattle, Washington) and contains genome-wide RNA expression data obtained using high-throughput in situ hybridization. In addition to the expression data, the Atlas provides a number of tools for analyzing and visualizing the in situ images. Click the Instructions link at the top of the Allen Brain Atlas column to see basic instructions for viewing images on the Allen Brain Atlas. Comprehensive help documentation is available online at: http://www.brain-map.org.

References
  1. Allen Brain Atlas [Internet]. Seattle (WA): Allen Institute for Brain Science. © 2004-2007. Available from: http://www.brain-map.org.
  2. Lein ES et al. (2007). Genome-wide atlas of gene expression in the adult mouse brain. Nature 445:168-176, doi:10.1038/nature05453.
  3. Markram H (2007). Industrializing neuroscience. Nature 445:160-161.

More Annotation

The More Annotation options let you select a customized set of databases for annotating a given gene list. After you select a gene list, you can perform annotation and choose one or more of the available databases to obtain further annotation. You can also download the resulting information. See "Using More Annotation Options" for details.

See Also

Performing Annotation

Using More Annotation Options