NIA Mouse Gene Index Ver. 3
Version 3 of the NIA Mouse Gene Index differs from earlier versions in 3 respects

The NIA Mouse Gene Index ver. 3 was assembled from sequences aligned to the genome (October 2003 release) using our new All-Alignment-Assembly (AAA) algorithm. The start and end sites of each intron were examined for a splicing consensus. We accepted the canonical consensus (GT-AG) as well as the 2 major non-canonical ones (GC-AG and AT-AC), which are well validated (Burset et al. 2000).
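The intron check described above can be illustrated with a minimal Python sketch. It is not the AAA algorithm itself. The function name `classify_intron`, the 0-based half-open coordinates, and the restriction to the forward strand are assumptions made for this example. Only the three donor/acceptor pairs come from the text.

```python
# Donor/acceptor dinucleotide pairs named in the text.
SPLICE_CONSENSUS = {
    ("GT", "AG"): "canonical",
    ("GC", "AG"): "non-canonical",
    ("AT", "AC"): "non-canonical",
}


def classify_intron(genome_seq, intron_start, intron_end):
    """Classify the intron genome_seq[intron_start:intron_end].

    Coordinates are 0-based and half-open (an assumption of this sketch),
    and the sequence is read on the forward strand only.
    Returns "canonical", "non-canonical", or None if no consensus matches.
    """
    if intron_end - intron_start < 4:
        return None  # too short to hold both a donor and an acceptor site
    intron = genome_seq[intron_start:intron_end].upper()
    donor, acceptor = intron[:2], intron[-2:]
    return SPLICE_CONSENSUS.get((donor, acceptor))
```

For a transcript aligned to the reverse strand, the intron sequence would have to be reverse-complemented before the lookup. That step is left out here to keep the sketch short.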
The Gene Index has 145,083 U-clusters (transcription loci) and 218,812 transcripts. A U-cluster was classified as a gene if it met at least one of three criteria: an ORF >= 100 aa; multiple exons separated by an intron with a splice site consensus; or a gene symbol annotated by RefSeq, GenBank, or Ensembl for some member alignment. Among the 43,069 genes, 27,316 were protein coding (ORF >= 100 aa or known function), 6,717 were non-coding genes or gene fragments with ORF < 100 aa, 959 had high repeat content (>90%), 1,842 were gene models from Ensembl and RefSeq-XM with no EST or mRNA support in our assembly, and 6,235 were gene duplications and/or pseudogenes.
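The gene criteria above amount to a logical OR of three conditions, which the following Python sketch makes explicit. The `UCluster` class and its field names are hypothetical and do not reflect the actual data model of the Gene Index. The category counts are copied from the text, and the final sum shows that they add up to the stated total of 43,069 genes.

```python
from dataclasses import dataclass

MIN_ORF_AA = 100  # ORF length threshold stated in the text


@dataclass
class UCluster:
    # Hypothetical summary of one U-cluster; field names are assumptions.
    longest_orf_aa: int          # length of the longest ORF, in amino acids
    has_consensus_intron: bool   # multiple exons split by a consensus intron
    has_gene_symbol: bool        # symbol from RefSeq, GenBank, or Ensembl


def is_gene(cluster):
    """Return True if the U-cluster meets at least one gene criterion."""
    return (
        cluster.longest_orf_aa >= MIN_ORF_AA
        or cluster.has_consensus_intron
        or cluster.has_gene_symbol
    )


# Gene categories and counts as reported in the text.
GENE_CATEGORY_COUNTS = {
    "protein coding": 27_316,
    "non-coding or fragment (ORF < 100 aa)": 6_717,
    "high repeat content (>90%)": 959,
    "gene model without EST/mRNA support": 1_842,
    "duplication and/or pseudogene": 6_235,
}

TOTAL_GENES = sum(GENE_CATEGORY_COUNTS.values())
```

For example, `is_gene(UCluster(40, True, False))` is true because a single consensus intron is enough, even though the ORF is shorter than 100 aa.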
Major data sets can be downloaded from here.
Transcript view
provides information on a particular transcript. A link to the U-cluster
returns to the genomic view. At the top there is a plot of the transcript
and its open reading frame (ORF). The character "M" or "L" at the start of the ORF indicates
that the first amino acid is methionine or leucine, respectively. A green bar next to "M" indicates
the presence of a Kozak consensus. Below the transcript, the members
of the transcript are plotted on a white background: RefSeq, Ensembl, RIKEN,
and NIA clusters. Individual ESTs are plotted on a
gray background (NIA library name is indicated on the right). At the
bottom of the page there are lists of protein domains and a list of GO-terms
associated with the gene symbol. Click on the transcript sequence to get to
the sequence view.
Sequence view
provides information on the nucleotide and protein sequence of a transcript.
In addition, it lists protein domains, GO-terms, and repeat regions.
There are links to several sequence analysis tools: BLAST, BLAT, ORF finder.