
Data

ThaleMine integrates data from a large number of sources into a single data warehouse.

This page lists the data included in the current release. It is manually curated, and its contents are not indexed in our keyword search.

More data sets will be added in future releases. Please contact us if there are any particular data you would like to see included.


| Data Category | Data | Source | PubMed |
|---|---|---|---|
| Genome | TAIR10 Genome assembly (5 chromosomes plus chloroplast and mitochondrial assemblies) | NCBI - Release TAIR10 (2018/04/06) | Arabidopsis Genome Initiative - PubMed: 11130711 |
| Genome | Araport11 GFF3 data from TAIR | TAIR - Release Araport11 (2016/06/17) | Cheng et al., 2016 - PubMed: 27862469 |
| Proteins | High-quality, manually annotated, non-redundant protein sequence database | Swiss-Prot - Release 2023_03 | UniProt Consortium - PubMed: 17142230 |
| Proteins | Computationally analysed records, enriched with automatic annotation | TrEMBL - Release 2023_03 | |
| Proteins | Protein family and domain assignments to proteins | InterPro - Release v95.0 | Mitchell et al., 2019 - PubMed: 30398656 |
| Homology | Orthologue relationships based on the inferred speciation and gene duplication events in the phylogenetic tree | Panther - Release 17.0 | Mi et al. - PubMed: 23193289 |
| Homology | Paralogue relationships based on the inferred speciation and gene duplication events in the phylogenetic tree | Panther - Release 17.0 | Mi et al. - PubMed: 23193289 |
| Homology | Phytozome Homologs generated with InParanoid | Phytozome - realtime | Goodstein et al. - PubMed: 22110026 |
| Curation | Manually curated TAIR functional descriptions | TAIR - Release 20220630 | Huala et al. - PubMed: 11125061 |
| Curation | Manually curated TAIR gene aliases | TAIR - Release 20220630 | Huala et al. - PubMed: 11125061 |
| Gene Ontology | GO annotations from Gene Ontology | Gene Ontology - Release 2023-06-11 | Berardini et al., 2004 - PubMed: 15173566; Gene Ontology Consortium - PubMed: 10802651 |
| Gene Ontology | Several electronic and manual GO annotation methods utilized by UniProt | UniProt - Release 2023_03 | UniProt Consortium - PubMed: 17142230 |
| Interactions | Curated set of genetic and physical interactions for Arabidopsis thaliana | BioGRID - Release 4.4.233 | Chatr-Aryamontri et al., 2014 - PubMed: 25428363 |
| Interactions | Curated binary and complex protein-protein interactions for Arabidopsis thaliana | IntAct - Downloaded 20230706 | Kerrien et al., 2012 - PubMed: 22121220 |
| Expression | Electronic Fluorescent Pictograph (eFP) Visualization paints gene expression information from one of the AtGenExpress data sets or other compendia for a desired gene onto a diagrammatic representation of Arabidopsis thaliana plants | BAR eFP Webservice - realtime | Winter et al., 2007 - PubMed: 17684564; Brady et al., 2009 - PubMed: 19401381 |
| Co-Expression | Co-regulated gene relationships deduced from microarray and RNA-seq data via ATTED-II web services | ATTED-II Co-expression - realtime | Obayashi et al., 2014 - PubMed: 24334350 |
| Publications | Curated associations between publications and genes from UniProt | UniProt - Release 2023_03 | UniProt Consortium - PubMed: 17142230 |
| Publications | Publications from InterPro | InterPro - Release v95.0 | Mitchell et al., 2019 - PubMed: 30398656 |
| Publications | Publications from NCBI | NCBI - Downloaded 20230706 | Maglott et al., 2007 - PubMed: 17148475 |
| GeneRIF | Concise phrase describing gene function and publication associated with NCBI Gene records | NCBI - Downloaded 20230706 | Maglott et al., 2007 - PubMed: 17148475 |