Source:
NIA
RefSeq(NM)
RefSeq(XM)
Ensembl
GenBank
dbEST
| Other options:
intronic status
gene status
primary vs. copy
gene symbol
Protein-coding genes = ORF ≥100 aa or a known function (excluding gene copies, repeats, and gene models; see below)
Gene candidates = multi-exon or ORF ≥100 aa but not protein-coding genes; include:
(a) gene copies (=pseudogenes and possibly duplicated functional genes)
(b) repeats = sequences with >90% repeat content
(c) gene models = sequences with no EST/mRNA support
(d) non-coding genes = multiple exons but not protein-coding (see above)
Non-genes = single-exon genes with ORF <100 aa.
Primary U-clusters = U-clusters that are not gene copies and have >30% of first alignments
Good symbols = non-numeric gene symbols (i.e., excluding symbols such as 4932411A10Rik or LOC385574)
|
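The gene-status and symbol definitions above can be read as a small decision procedure. The following is a minimal Python sketch of one possible reading, not the index's actual implementation. The `Transcript` record, its field names, the order in which the rules are checked, and the regular expression used for "numeric" symbols are all assumptions made for illustration; only the thresholds (ORF ≥100 aa, >90% repeat) and the two example symbols come from the definitions.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Transcript:
    """Hypothetical record; field names are illustrative, not from the index."""
    exon_count: int
    orf_aa: int                    # ORF length in amino acids
    known_function: bool = False
    is_gene_copy: bool = False     # pseudogene or possible duplicated gene
    repeat_fraction: float = 0.0   # fraction of sequence that is repeat
    has_est_or_mrna: bool = True   # EST/mRNA support


def gene_status(t: Transcript) -> Tuple[str, Optional[str]]:
    """Return (status, candidate subtype or None) per the definitions above."""
    long_orf = t.orf_aa >= 100
    multi_exon = t.exon_count > 1
    is_repeat = t.repeat_fraction > 0.90
    is_model = not t.has_est_or_mrna
    excluded = t.is_gene_copy or is_repeat or is_model

    # Protein-coding: long ORF or known function, unless copy/repeat/model.
    if (long_orf or t.known_function) and not excluded:
        return ("protein-coding", None)

    # Gene candidate: multi-exon or long ORF, but not protein-coding.
    if multi_exon or long_orf:
        if t.is_gene_copy:
            return ("gene candidate", "gene copy")
        if is_repeat:
            return ("gene candidate", "repeat")
        if is_model:
            return ("gene candidate", "gene model")
        return ("gene candidate", "non-coding gene")

    # Everything else: single exon with ORF <100 aa.
    return ("non-gene", None)


# Assumed pattern for "numeric" symbols, modeled only on the two examples
# given above (RIKEN clone-style names and LOC identifiers).
_NUMERIC_SYMBOL = re.compile(r"\d+[A-Z]\d+Rik|LOC\d+")


def is_good_symbol(symbol: str) -> bool:
    """True if the symbol does not look like a clone ID or LOC identifier."""
    return _NUMERIC_SYMBOL.fullmatch(symbol) is None
```

For example, `gene_status(Transcript(exon_count=4, orf_aa=60))` falls through to the non-coding gene subtype, and `is_good_symbol("LOC385574")` is false. Cases the definitions leave open (such as a single-exon gene copy with a short ORF) are resolved here by rule order, which is a choice of this sketch.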