Field Annotations in Output Tables
All tables are tab-separated text files. Some fields contain comma-separated lists.
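The sketch below shows one way to read such a table in Python and split a list field. It assumes the files have no header row and no quoted fields, which this description does not state; the function names and the sample row are hypothetical.

```python
import csv
import io

def read_table(handle):
    """Yield each non-empty row of a tab-separated table as a list of strings."""
    # QUOTE_NONE: treat quote characters as ordinary text (assumption: no quoting is used).
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row

def split_list(field):
    """Split a comma-separated field, dropping empty items such as a trailing comma."""
    return [item for item in field.split(",") if item]

# Hypothetical U-annotation.txt row: U-cluster name, annotation, symbols.
sample = io.StringIO("U000001\texample annotation\tSymA,SymB\n")
rows = list(read_table(sample))
symbols = split_list(rows[0][2])
```

For a real file, pass a handle opened with `open(path, newline="")` instead of the `io.StringIO` sample.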
File: U-annotation.txt - Gene annotations
- U-cluster name
- Annotation
- Symbols, comma separated
File: U-members1.txt - Members and attributes
- U-cluster name
- Chromosome
- Strand
- Start position (starts from 0)
- End position
- Members, comma separated
- Intronic
- Sum of all exon lengths
- Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
- Number of exons
- ORF length, aa
- Primary gene=1, pseudogene/copy=2
- Percent repeats
- Potentially wrong strand
- Number of introns with correct splice consensus
- Is gene (or gene candidate)
- Protein-coding gene
- Gene symbol
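The "Gene member sources" field can be unpacked as in the following sketch. It assumes that "binary coded" means the listed values are added together as bit flags, so that, for example, 21 = 1 + 4 + 16. The names `SOURCE_FLAGS` and `decode_sources` are illustrative only.

```python
# Flag values as listed for the "Gene member sources" field.
SOURCE_FLAGS = {
    1: "NIA",
    2: "Riken",
    4: "Ensembl",
    8: "RefseqXM",
    16: "RefseqNM",
    32: "GenBank",
    64: "dbEST",
}

def decode_sources(code):
    """Return the source names whose bit is set in the integer code."""
    # Assumption: the field is a sum of the flag values above.
    return [name for bit, name in SOURCE_FLAGS.items() if code & bit]

example = decode_sources(21)  # 21 = 1 + 4 + 16
```

The same decoding would apply to the identically described field in the other tables.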
File: U-genes.txt - List of genes with attributes
- Gene name
- Chromosome
- Strand
- Start position (starts from 0)
- End position
- Intronic
- Sum of all exon lengths
- Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
- Number of exons
- ORF length, aa
- Primary gene=1, pseudogene/copy=2
- Percent repeats
- Number of introns with correct splice consensus
- Protein-coding gene
- Gene symbol
File: U-exons1.txt - Exon scheme
- U-cluster name
- Chromosome
- Strand
- Number of exons
- Exon Sizes (comma separated)
- Exon start positions (comma separated)
- Number of exon forms
- Sizes of exon forms (comma separated)
- Start positions of exon forms (comma separated)
- Exon number for each exon form (comma separated)
- Number of all possible introns (connections of exon forms)
- Intron starts (comma separated)
- Intron ends (comma separated)
- Intron status (comma separated) proper splice sites=4; bad splice sites=2; retained intron=1.
- ORF segment lengths, split into exons (comma separated)
- ORF segment starts in the genome (comma separated)
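As an illustration of how the paired list fields fit together, the sketch below combines the exon sizes and exon start positions into intervals. It assumes that the two lists are in the same order and that each exon ends at start + size (a half-open interval, in whatever coordinate system the start positions use); neither point is stated here, and the function names and sample values are hypothetical.

```python
def parse_ints(field):
    """Parse a comma-separated list of integers, ignoring empty items."""
    return [int(item) for item in field.split(",") if item]

def exon_intervals(sizes_field, starts_field):
    """Pair exon starts with exon sizes and return (start, end) tuples."""
    sizes = parse_ints(sizes_field)
    starts = parse_ints(starts_field)
    if len(sizes) != len(starts):
        raise ValueError("exon sizes and exon starts differ in length")
    # Assumption: end = start + size (end position is exclusive).
    return [(start, start + size) for start, size in zip(starts, sizes)]

# Intron status codes as listed for this table.
INTRON_STATUS = {4: "proper splice sites", 2: "bad splice sites", 1: "retained intron"}

# Hypothetical values for the two exon fields of one row.
exons = exon_intervals("100,50", "1000,2000")
```

The exon form and intron fields could be unpacked with `parse_ints` in the same way.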
File: T-major.txt - Major transcript for each gene
- U-cluster name
- Major transcript (mostly with the longest ORF)
- U-cluster score
- Transcript score
File: U-wrong-strand.txt - Genes with potentially wrong strand
- U-cluster name with potentially wrong strand
- Better supported U-cluster with which the previous U-cluster overlaps
File: T-annotation.txt - Transcript annotations
- Transcript name
- Annotation
- Symbols, comma separated
- Source of annotation
File: T-members1.txt - Transcript members and attributes
- Transcript name
- Chromosome
- Strand
- Start position (starts from 0)
- End position
- Members, comma separated
- Intronic
- Lengths
- Gene member sources, binary coded (NIA=1,Riken=2,Ensembl=4,RefseqXM=8,RefseqNM=16,GenBank=32,dbEST=64)
- Number of exons
- ORF length, aa
- Primary gene=1, pseudogene/copy=2
- Percent repeats
- Potentially wrong strand
- Number of introns with correct splice consensus
File: T-psl.txt - Transcript genome alignment (BLAT-psl)
- n_match
- n_mismatch
- repeat_match
- N-count
- QgapCount
- QgapBases
- TgapCount
- TgapBases
- strand
- Qname
- Qsize
- Qstart (start from 0)
- Qend
- Tname
- Tsize
- Tstart (start from 0)
- Tend
- blockCount
- blockSizes
- qStarts (start from 0)
- tStarts (start from 0)
- Intron status (1=splice sites, 0=no splice sites, -9999=non intron gap, 2=clone-link gap)
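The following sketch shows how the block fields of one alignment row might be turned into genome intervals. It assumes the columns appear in the order listed above (so blockCount, blockSizes and tStarts are at 0-based positions 17, 18 and 20) and that the lists may end with a trailing comma, as BLAT psl output usually does. The function names and the sample row are hypothetical.

```python
def parse_ints(field):
    """Parse a comma-separated list of integers; a trailing comma is tolerated."""
    return [int(item) for item in field.split(",") if item]

def target_blocks(fields):
    """Return (start, end) genome intervals for each aligned block of one row."""
    block_count = int(fields[17])
    sizes = parse_ints(fields[18])
    t_starts = parse_ints(fields[20])
    if not (len(sizes) == len(t_starts) == block_count):
        raise ValueError("blockCount does not match the block lists")
    # Starts are 0-based, so end = start + size is exclusive.
    return [(start, start + size) for start, size in zip(t_starts, sizes)]

def target_gaps(blocks):
    """Return the genome gaps between consecutive blocks."""
    return [(prev[1], nxt[0]) for prev, nxt in zip(blocks, blocks[1:])]

# Hypothetical row, already split on tabs (first 21 columns only).
row = ["950", "5", "0", "0", "0", "0", "2", "4000", "+", "T000001", "955",
       "0", "955", "chr1", "200000", "1000", "5955", "3",
       "300,400,255,", "0,300,700,", "1000,3300,5700,"]
blocks = target_blocks(row)
gaps = target_gaps(blocks)
```

The same parsing should apply to the other tables that share these psl columns.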
File: T-psl-member.txt - Transcript member alignment (BLAT-psl)
- n_match
- n_mismatch
- repeat_match
- N-count
- QgapCount
- QgapBases
- TgapCount
- TgapBases
- strand
- Qname
- Qsize
- Qstart (start from 0)
- Qend
- Tname
- Tsize
- Tstart (start from 0)
- Tend
- blockCount
- blockSizes
- qStarts (start from 0)
- tStarts (start from 0)
File: Torf-param.txt - Transcript ORF attributes
- Transcript name
- ORF start, bp (start from 0)
- ORF end, bp
- ORF break point, bp (if the ORF consists of 2 overlapping parts, then this is the end of 2 parts)
- ORF length, aa
- First amino acid
- Kozak consensus: 0=weak or undetermined, 1=adequate, 2=strong, 3=optimal
-
File: T-repeat.txt - Transcript sequence repeats
- Transcript name
- Transcript length
- Total repeat length
- Repeat block lengths (comma separated)
- Repeat block starts (start from 0, comma separated)
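A row of this table could be summarized as in the sketch below, which pairs the repeat block starts with their lengths and derives a repeat percentage from the total repeat length and the transcript length. The half-open interval convention, the percentage formula and all names are assumptions made for illustration; the formula is not claimed to be how the "Percent repeats" field in other tables was computed.

```python
def parse_ints(field):
    """Parse a comma-separated list of integers, ignoring empty items."""
    return [int(item) for item in field.split(",") if item]

def repeat_summary(fields):
    """Summarize one T-repeat.txt row that has already been split on tabs."""
    name = fields[0]
    transcript_length = int(fields[1])
    total_repeat = int(fields[2])
    lengths = parse_ints(fields[3])
    starts = parse_ints(fields[4])
    # Assumption: each block covers [start, start + length) in transcript coordinates.
    blocks = [(start, start + length) for start, length in zip(starts, lengths)]
    percent = 100.0 * total_repeat / transcript_length if transcript_length else 0.0
    return {"name": name, "blocks": blocks, "percent": percent}

# Hypothetical row: name, transcript length, total repeat length, block lengths, block starts.
summary = repeat_summary(["T000001", "2000", "150", "100,50", "200,900"])
```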
File: delete.txt - Alignments deleted because of interference with other genes
- Alignment names (sequence name: copy number)
File: truncated.txt - Alignments truncated because of interference with other genes
- n_match
- n_mismatch
- repeat_match
- N-count
- QgapCount
- QgapBases
- TgapCount
- TgapBases
- strand
- Qname
- Qsize
- Qstart (start from 0)
- Qend
- Tname
- Tsize
- Tstart (start from 0)
- Tend
- blockCount
- blockSizes
- qStarts (start from 0)
- tStarts (start from 0)
- First block included (start from 0)
- Last block included