NIA Mouse Gene Index, mm8
Field Annotations in Output Tables
All tables are tab-separated text files. Some fields contain comma-separated lists.
File: U-annotation.txt - Gene annotations
U-cluster name
Annotation
Symbols, comma separated
File: U-members1.txt - Members and attributes
U-cluster name
Chromosome
Strand
Start position (starts from 0)
End position
Members, comma separated
Intronic
Sum of all exon lengths
Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
Number of exons
ORF length, aa
Primary gene=1, pseudogene/copy=2
Percent repeats
Potentially wrong strand
Number of introns with correct splice consensus
Is gene (or gene candidate)
Protein-coding gene
Gene symbol
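
The "Gene member sources, binary coded" field can be unpacked with a bit test. The following is a minimal sketch that assumes the field holds the sum (bitwise OR) of the values listed above; the names SOURCE_FLAGS and decode_sources are hypothetical and not part of the gene index.

```python
# Values taken from the field description; the bit-flag interpretation
# (code = sum of the sources present) is an assumption.
SOURCE_FLAGS = {
    1: "NIA",
    2: "Riken",
    4: "Ensembl",
    8: "RefseqXM",
    16: "RefseqNM",
    32: "GenBank",
    64: "dbEST",
}


def decode_sources(code):
    """Return the names of all sources whose bit is set in `code`."""
    return [name for bit, name in SOURCE_FLAGS.items() if code & bit]


if __name__ == "__main__":
    # 21 = 1 + 4 + 16
    print(decode_sources(21))
```

Under this assumption a value of 21 would mean that the cluster has members from NIA, Ensembl, and RefseqNM.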
File: U-genes.txt - List of genes with attributes
Gene name
Chromosome
Strand
Start position (starts from 0)
End position
Intronic
Sum of all exon lengths
Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
Number of exons
ORF length, aa
Primary gene=1, pseudogene/copy=2
Percent repeats
Number of introns with correct splice consensus
Protein-coding gene
Gene symbol
File: U-exons1.txt - Exon scheme
U-cluster name
Chromosome
Strand
Number of exons
Exon Sizes (comma separated)
Exon start positions (comma separated)
Number of exon forms
Sizes of exon forms (comma separated)
Start positions of exon forms (comma separated)
Exon number for each exon form (comma separated)
Number of all possible introns (connections of exon forms)
Intron starts (comma separated)
Intron ends (comma separated)
Intron status (comma separated): proper splice sites=4, bad splice sites=2, retained intron=1
ORF segment lengths, split into exons (comma separated)
ORF segment starts in the genome (comma separated)
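
The paired size and start lists in U-exons1.txt can be combined into intervals. The sketch below assumes that the two lists have the same length and that a trailing comma may be present; the helper names parse_int_list and exon_intervals are hypothetical. Each end is computed as start + size, in whatever coordinate convention the start positions use.

```python
def parse_int_list(field):
    """Split a comma-separated field into integers, ignoring empty items."""
    return [int(x) for x in field.split(",") if x.strip()]


def exon_intervals(sizes_field, starts_field):
    """Pair exon starts with sizes and return (start, start + size) tuples."""
    sizes = parse_int_list(sizes_field)
    starts = parse_int_list(starts_field)
    if len(sizes) != len(starts):
        raise ValueError("exon sizes and exon starts differ in length")
    return [(start, start + size) for start, size in zip(starts, sizes)]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(exon_intervals("100,50,", "1000,2000,"))
```

The same pairing could be applied to the exon-form columns and to the ORF segment columns, which follow the same sizes-plus-starts layout.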
File: T-major.txt - Major transcript for each gene
U-cluster name
Major transcript (mostly with the longest ORF)
U-cluster score
Transcript score
File: U-wrong-strand.txt - Genes potentially on the wrong strand
U-cluster name with potentially wrong strand
Better supported U-cluster with which the previous U-cluster overlaps
File: T-annotation.txt - Transcript annotations
Transcript name
Annotation
Symbols, comma separated
Source of annotation
File: T-members1.txt - Transcript members and attributes
Transcript name
Chromosome
Strand
Start position (starts from 0)
End position
Members, comma separated
Intronic
Lengths
Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
Number of exons
ORF length, aa
Primary gene=1, pseudogene/copy=2
Percent repeats
Potentially wrong strand
Number of introns with correct splice consensus
File: T-psl.txt - Transcript genome alignment (BLAT-psl)
n_match
n_mismatch
repeat_match
N-count
QgapCount
QgapBases
TgapCount
TgapBases
strand
Qname
Qsize
Qstart (start from 0)
Qend
Tname
Tsize
Tstart (start from 0)
Tend
blockCount
blockSizes
qStarts (start from 0)
tStarts (start from 0)
Intron status (1=splice sites, 0=no splice sites, -9999=non intron gap, 2=clone-link gap)
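
A row of T-psl.txt can be read into a dictionary keyed by the field names above. This is a sketch, not the tool used to produce the files. It assumes that the columns appear in the listed order, that the list fields may end with a comma (as in standard BLAT psl output), and that the intron status column is a comma-separated list of integers; the function names are hypothetical.

```python
PSL_FIELDS = [
    "n_match", "n_mismatch", "repeat_match", "N-count",
    "QgapCount", "QgapBases", "TgapCount", "TgapBases",
    "strand", "Qname", "Qsize", "Qstart", "Qend",
    "Tname", "Tsize", "Tstart", "Tend",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
TEXT_FIELDS = {"strand", "Qname", "Tname"}
LIST_FIELDS = {"blockSizes", "qStarts", "tStarts"}

# Labels copied from the intron status field description.
INTRON_STATUS = {
    1: "splice sites",
    0: "no splice sites",
    -9999: "non intron gap",
    2: "clone-link gap",
}


def int_list(field):
    """Split a comma-separated field into integers, ignoring empty items."""
    return [int(x) for x in field.split(",") if x.strip()]


def parse_tpsl_line(line):
    """Parse one tab-separated row into a dict of typed values."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(PSL_FIELDS):
        raise ValueError("too few columns for a psl row")
    rec = {}
    for name, value in zip(PSL_FIELDS, cols):
        if name in TEXT_FIELDS:
            rec[name] = value
        elif name in LIST_FIELDS:
            rec[name] = int_list(value)
        else:
            rec[name] = int(value)
    # The extra last column is specific to T-psl.txt (assumed list of ints).
    extra = cols[len(PSL_FIELDS)] if len(cols) > len(PSL_FIELDS) else ""
    rec["intron_status"] = int_list(extra)
    return rec


def target_blocks(rec):
    """Return genome blocks as (start, start + size) tuples."""
    return [(s, s + n) for s, n in zip(rec["tStarts"], rec["blockSizes"])]
```

Because the intron status column is optional in this parser, the same function could also read T-psl-member.txt, which lists the same 21 fields without it.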
File: T-psl-member.txt - Transcript member alignment (BLAT-psl)
n_match
n_mismatch
repeat_match
N-count
QgapCount
QgapBases
TgapCount
TgapBases
strand
Qname
Qsize
Qstart (start from 0)
Qend
Tname
Tsize
Tstart (start from 0)
Tend
blockCount
blockSizes
qStarts (start from 0)
tStarts (start from 0)
File: Torf-param.txt - Transcript ORF attributes
Transcript name
ORF start, bp (start from 0)
ORF end, bp
ORF break point, bp (if the ORF consists of 2 overlapping parts, then this is the end of 2 parts)
ORF length, aa
First amino acid
Kozak consensus: 0=weak or undetermined, 1=adequate, 2=strong, 3=optimal
File: T-repeat.txt - Transcript sequence repeats
Transcript name
Transcript length
Total repeat length
Repeat block lengths (comma separated)
Repeat block starts (start from 0, comma separated)
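
The repeat block lists can be turned into intervals and a repeat percentage. This is a sketch under the assumption that block starts are 0-based offsets within the transcript and that the percentage is the total block length divided by the transcript length; the function name repeat_summary is hypothetical, and the result is not guaranteed to match the "Percent repeats" field in other tables.

```python
def repeat_summary(transcript_length, block_lengths_field, block_starts_field):
    """Return (intervals, total repeat length, percent of transcript)."""
    lengths = [int(x) for x in block_lengths_field.split(",") if x.strip()]
    starts = [int(x) for x in block_starts_field.split(",") if x.strip()]
    if len(lengths) != len(starts):
        raise ValueError("repeat block lengths and starts differ in length")
    intervals = [(s, s + n) for s, n in zip(starts, lengths)]
    total = sum(lengths)
    percent = 100.0 * total / transcript_length if transcript_length else 0.0
    return intervals, total, percent


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(repeat_summary(1000, "100,150", "0,500"))
```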
File: ATSlist.txt - Alternative transcription/splicing (ATS)
Transcript name
Chromosome
Strand
Start of ATS unit
End of ATS unit
Number of exons
Position class (not important)
logLength class (not important)
ATS Type
Insert in the main transcript
ATS code
Contains introns without good donor-acceptor motifs (0/1)
in ORF (0/1)
Main block starts (start from 0, comma separated)
Main block lengths (comma separated)
Alternative block starts (start from 0, comma separated)
Alternative block lengths (comma separated)
Correct main transcript (=1 if the main transcript has more supporting sequences than the alternative transcript, otherwise =0)
Number of supporting sequences
File: delete.txt - Alignments deleted because of interference with other genes
Alignment names (sequence name: copy number)
File: truncated.txt - Alignments truncated because of interference with other genes
n_match
n_mismatch
repeat_match
N-count
QgapCount
QgapBases
TgapCount
TgapBases
strand
Qname
Qsize
Qstart (start from 0)
Qend
Tname
Tsize
Tstart (start from 0)
Tend
blockCount
blockSizes
qStarts (start from 0)
tStarts (start from 0)
First block included (start from 0)
Last block included
File: Uold-new.txt - Correspondence between U-clusters in ver.4 and ver.5 (multiple to one)
U-cluster in gene index ver. 4
U-cluster in gene index ver. 5
File: Ukeys.txt - Correspondence between U-clusters in the previous and current runs of ver. 5
U-cluster in the previous run of ver. 5 (before 07/14/2005)
U-cluster in the current ver. 5 (07/14/2005)
File: Tkeys.txt - Correspondence between transcripts in the previous and current runs of ver. 5
Transcript name in the previous run of ver. 5 (before 07/14/2005)
Transcript name in the current ver. 5 (07/14/2005)