ThaleMine integrates data from a large number of sources into a single data warehouse.

This page lists the data included in the current release. It is manually curated, and its contents are not indexed by our keyword search.

More data sets will be added in future releases. Please contact us if there are particular data you would like to see included.

Data Category | Data | Source | PubMed

| TAIR10 Genome assembly (5 chromosomes plus chloroplast and mitochondrial assemblies) | NCBI - Release TAIR10 (2018/04/06) | Arabidopsis Genome Initiative - PubMed: 11130711
| Araport11 GFF3 data from TAIR | TAIR - Release Araport11 (2016/06/17) | Cheng et al., 2016 - PubMed: 27862469

| High-quality, manually annotated, non-redundant protein sequence database | Swiss-Prot - Release 2019_11 | UniProt Consortium - PubMed: 17142230
| Computationally analysed records, enriched with automatic annotation | TrEMBL - Release 2019_11 |
| Protein family and domain assignments to proteins | InterPro - Release v76.0 | Mitchell et al., 2019 - PubMed: 30398656

| Orthologue and paralogue relationships based on the inferred speciation and gene duplication events in the phylogenetic tree | Panther - Release 14.1 | Mi et al - PubMed: 23193289
| Phytozome homologs generated with InParanoid | Phytozome - realtime | Goodstein et al - PubMed: 22110026

Gene Ontology | GO annotations from Gene Ontology | Gene Ontology - Release 2020-01-01 | Berardini et al., 2004 - PubMed: 15173566; Gene Ontology Consortium - PubMed: 10802651
Gene Ontology | Several electronic and manual GO annotation methods utilized by UniProt | UniProt - Release 2019_11 | UniProt Consortium - PubMed: 17142230

| Curated set of genetic and physical interactions for Arabidopsis thaliana | BioGRID - Release 3.5.180 | Chatr-Aryamontri et al., 2014 - PubMed: 25428363
| Curated binary and complex protein-protein interactions for Arabidopsis thaliana | IntAct - Downloaded 20200121 | Kerrien et al., 2012 - PubMed: 22121220

| Electronic Fluorescent Pictograph (eFP) visualization, which paints gene expression information from one of the AtGenExpress data sets or other compendia for a desired gene onto a diagrammatic representation of Arabidopsis thaliana plants | BAR eFP Webservice - realtime | Winter et al., 2007 - PubMed: 17684564; Brady et al., 2009 - PubMed: 19401381

| Co-regulated gene relationships deduced from microarray and RNA-seq data via ATTED-II web services | ATTED-II Co-expression - realtime | Obayashi et al., 2014 - PubMed: 24334350

| Curated associations between publications and genes from UniProt | UniProt - Release 2019_11 | UniProt Consortium - PubMed: 17142230
| Publications from InterPro | InterPro - Release v77.0 | Mitchell et al., 2019 - PubMed: 30398656
| Publications from NCBI | NCBI - Downloaded 20200121 | Maglott et al., 2007 - PubMed: 17148475

| Concise phrase describing gene function and publication associated with NCBI Gene records | NCBI - Downloaded 20200121 | Maglott et al., 2007 - PubMed: 17148475
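
The PubMed identifiers listed above can be turned into links to the corresponding PubMed records. The following is a minimal sketch, not part of ThaleMine itself: the function name `pubmed_links` is hypothetical, and it assumes that each citation uses the "PubMed: <id>" pattern shown in the table (with or without a space after the colon).

```python
import re

# Matches the "PubMed: <id>" pattern used in the table above.
# Assumption: identifiers are plain digits; the space after the colon is optional.
PMID_PATTERN = re.compile(r"PubMed:\s*(\d+)")


def pubmed_links(citation):
    """Return one PubMed URL per identifier found in a citation string."""
    return [
        "https://pubmed.ncbi.nlm.nih.gov/{}/".format(pmid)
        for pmid in PMID_PATTERN.findall(citation)
    ]


# A citation cell that carries two references, as in the Gene Ontology entry.
cell = (
    "Berardini et al., 2004 - PubMed: 15173566 "
    "Gene Ontology Consortium - PubMed:10802651"
)
for url in pubmed_links(cell):
    print(url)
```

Entries without a citation, such as the TrEMBL row, simply yield an empty list.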