Enrichment Map Gene Sets
EnrichmentMap is a Cytoscape plugin developed in the Baderlab to help visualize, navigate, and analyze functional enrichment results generated by programs such as Gene Set Enrichment Analysis (GSEA), BiNGO, or DAVID. Some enrichment programs, such as GSEA, allow users to search against their own gene set database. Because annotation (gene set) sources are regularly updated as new information is discovered, we set up an automated system that updates our gene set collections, so that we are always using the most up-to-date annotations.
If you use these gene sets, please cite our Enrichment Map paper (http://www.ncbi.nlm.nih.gov/pubmed/21085593).
Summary
- Gene set files can be downloaded from: http://download.baderlab.org/EM_Genesets/
- Enrichment Map Gene Sets are a set of gene set files in GMT format (compatible with GSEA), updated monthly from the original source locations and available with:
  - Entrez gene ids
  - UniProt accessions
  - Gene symbols
- The GMT file format contains one gene set per line. Each line contains:
  - Name (tab) Description (tab) Gene (tab) Gene (tab) ...
- In our format:
  - Name = Gene Set Name | Gene Set Source | Gene Set Source identifier
    - Example --> ATP-dependent protein binding|GO|GO:0043008 OR arginine biosynthesis IV|HUMANCYC|ARGININE-SYN4-PWY
  - Description = Gene Set Name
    - Example --> ATP-dependent protein binding OR arginine biosynthesis IV
  - Gene = identified by one of the three possible identifiers (Entrez gene id, UniProt accession, or gene symbol)
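The line layout described above can be read with a few lines of Python. This is a minimal sketch, not the Baderlab's own parser: the function name `parse_gmt_line` is hypothetical, and the gene ids in the example line are placeholders rather than real members of the pathway.

```python
def parse_gmt_line(line):
    """Split one GMT line into name parts, description, and gene list."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError("a GMT line needs at least a name and a description")
    name, description, genes = fields[0], fields[1], fields[2:]
    # Split from the right so a '|' inside the gene set name itself is kept.
    parts = name.rsplit("|", 2)
    if len(parts) == 3:
        set_name, source, source_id = parts
    else:
        # Not in the Name|Source|Identifier layout; keep the raw name.
        set_name, source, source_id = name, None, None
    return {
        "name": set_name,
        "source": source,
        "source_id": source_id,
        "description": description,
        "genes": [g for g in genes if g],  # drop empty trailing fields
    }


# Placeholder gene ids (ID1..ID3), used only to show the tab-separated layout.
example = (
    "arginine biosynthesis IV|HUMANCYC|ARGININE-SYN4-PWY"
    "\targinine biosynthesis IV\tID1\tID2\tID3"
)
record = parse_gmt_line(example)
print(record["name"], record["source"], record["source_id"], record["genes"])
```

Splitting the name from the right is a design choice made here so that a gene set name containing a `|` character would still parse; whether such names occur in the released files is not stated on this page.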
 
Current Stats
- Human - http://download.baderlab.org/EM_Genesets/current_release/Human/Summary_Geneset_Counts.txt 
- Mouse - http://download.baderlab.org/EM_Genesets/current_release/Mouse/Summary_Geneset_Counts.txt 
Sources
- Human 
| Source | File Origin | File Type | ID extracted | Frequency source is updated | Number of pathways |
| KEGG (1) | KEGG ftp site (July 2011) | GMT | Symbol | static as of July 1, 2011 | 236 |
| Msigdb - c2 (2) (other + Biocarta) | manual download from Msigdb | GMT | Entrez gene | sporadically | Biocarta - 217; Other - 47 |
| NCI (3) | scripted download of zipped release from website | BioPAX | Entrez gene | sporadically | 219 pathways |
| Institute of Bioinformatics (IOB) | received directly from IOB - static (July 2011) | BioPAX | Entrez gene | sporadically | 35 pathways - 10 are the same as CellMap, 1 is the same as NetPath |
| NetPath (4) [also from IOB] | scripted download of files numbered 1-25 | BioPAX | Entrez gene | static | 25 pathways - 12 are cancer pathways (10 are CellMap), 13 are immunity pathways |
| HumanCyc (5) | scripted download of zipped release from password protected website | BioPAX | UniProt | updated periodically | 249 pathways |
| Reactome (6) | scripted download of zipped release from website | BioPAX | UniProt | updated release | 1117 pathways (release 37) |
| GO (7) | scripted download from EBI ftp site (human) | GAF | UniProt | released once a month | 13,034 no GO IEA; 15,181 with GO IEA |
| Msigdb - c3 (2), specialty GMTs (miRs, transcription factors) | manual download from Msigdb | GMT | Entrez gene | sporadically | 221 miRs; 616 TFs |
- Mouse 
| Source | File Origin | File Type | ID extracted | Frequency source is updated | Number of pathways |
| Reactome (6) | scripted download of zipped release from website | BioPAX | UniProt | updated release | 946 pathways (release 37) |
| GO (7) | scripted download from MGI ftp site (mouse) | GAF | MGI | released once a month | 14,563 no GO IEA; 15,041 with GO IEA |
| KEGG (1) | translated from Human using Homologene | GMT | Entrez gene | static as of July 1, 2011 | 236 |
| Msigdb - c2 (2) (other + Biocarta) | translated from Human using Homologene | GMT | Entrez gene | sporadically | total 880: Kegg - 186, Reactome - 430, Biocarta - 217, Other - 47 |
| NCI (3) | translated from Human using Homologene | GMT | Entrez gene | sporadically | 219 pathways |
| Institute of Bioinformatics (IOB) | translated from Human using Homologene | GMT | Entrez gene | sporadically | 35 pathways - 10 are the same as CellMap, 1 is the same as NetPath |
| NetPath (4) [also from IOB] | translated from Human using Homologene | GMT | Entrez gene | static | 25 pathways - 12 are cancer pathways (10 are CellMap), 13 are immunity pathways |
| HumanCyc (5) | translated from Human using Homologene | GMT | Entrez gene | updated periodically | 249 pathways |
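Several mouse sets above are produced by translating human gene sets through Homologene. The page does not describe how that translation is implemented, so the following is only an illustrative sketch: the function name, the shape of the `human_to_mouse` mapping, and the ids `H1`, `M1`, and so on are all hypothetical.

```python
def translate_gene_set(human_genes, human_to_mouse):
    """Map human gene ids to mouse ids, dropping genes with no known homolog."""
    mouse_genes = []
    seen = set()
    for gene in human_genes:
        # A human gene may map to zero, one, or several mouse genes.
        for mouse_gene in human_to_mouse.get(gene, []):
            if mouse_gene not in seen:
                seen.add(mouse_gene)
                mouse_genes.append(mouse_gene)
    return mouse_genes


# Hypothetical homology table; real ids would come from a Homologene release.
human_to_mouse = {"H1": ["M1"], "H2": ["M2", "M3"], "H3": ["M1"]}
mouse_set = translate_gene_set(["H1", "H2", "H3", "H4"], human_to_mouse)
print(mouse_set)
```

In this sketch, genes without a homolog are silently dropped and duplicate mouse ids are collapsed, so a translated set can be smaller than its human original.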
Specialty Gene Sets
- The bulk of our gene sets are groupings of similar biological processes, pathways, and functional annotations, but there are a few additional collections that we do not group with them. They include:
  - miRs - sets consisting of all the targets for a given microRNA.
    - miR gene sets are retrieved from the Msigdb c3 collection.
  - Transcription Factors - sets consisting of all the targets for a given transcription factor.
    - TF gene sets are retrieved from the Msigdb c3 collection.
  - Disease Phenotype - sets consisting of all known proteins associated with the given disease.
    - Disease phenotype gene sets are retrieved from the Human Phenotype Ontology. Genes associated with a particular disease are annotated to it. In addition, in the same style as the Gene Ontology, the relationships between diseases are stored, creating an ontology of diseases. Annotations are up-propagated to related disease terms.
  - Drug Targets - sets consisting of all the known or predicted targets for a given drug.
    - Drug target information is retrieved from STITCH, a resource that amalgamates many different databases. A score is attached to each protein-chemical interaction. For the purpose of our gene sets, we only include protein-chemical interactions that have a combined score greater than 900.
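The STITCH rule above (keep only protein-chemical interactions with a combined score greater than 900) amounts to a simple threshold filter. The sketch below assumes the interactions have already been loaded as (chemical, protein, combined score) tuples; that in-memory shape, the function name, and the ids are hypothetical and do not reflect the actual STITCH file layout.

```python
def high_confidence_targets(interactions, threshold=900):
    """Group proteins by chemical, keeping only scores strictly above threshold."""
    targets = {}
    for chemical, protein, score in interactions:
        if score > threshold:  # "greater than 900", so 900 itself is excluded
            targets.setdefault(chemical, set()).add(protein)
    return targets


# Hypothetical interactions: (chemical id, protein id, combined score).
interactions = [
    ("drugA", "P1", 950),
    ("drugA", "P2", 900),
    ("drugA", "P3", 999),
    ("drugB", "P1", 400),
]
drug_sets = high_confidence_targets(interactions)
print(drug_sets)
```

Each key of the result would correspond to one drug gene set; a drug with no interaction above the threshold (here `drugB`) produces no set at all.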
 
 
File Structure
< > denotes a directory
- <Release> - directory is named according to the date the sets were updated.
  - <Species>
    - <Identifier> - (either Entrez gene, UniProt, Gene symbol)
      - <GO>
        - BP = biological process
        - MF = molecular function
        - CC = cellular component
        - All = BP + MF + CC
        - no_GO_IEA - indicates that the file excludes GO annotations with evidence codes 'IEA' (inferred from electronic annotation), 'ND' (no biological data available), and 'RCA' (inferred from reviewed computational analysis)
        - with_GO_IEA - indicates that the file includes GO annotations with evidence codes 'IEA' (inferred from electronic annotation), 'ND' (no biological data available), and 'RCA' (inferred from reviewed computational analysis)
      - <Pathways>
      - <miRs>
      - <TF>
      - <Disease phenotypes>
- In each <Identifier> directory there are amalgamated gene set files:
  - AllPathways - contains all pathway sources in the Pathways directory
  - GOPathways - contains all GO (MF, BP, CC) and all pathway sources in the Pathways directory.
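The difference between the no_GO_IEA and with_GO_IEA files comes down to filtering annotations by evidence code. The following sketch shows one way such a filter could look; it is not the Baderlab pipeline. It assumes GAF 2.x tab-separated input, where column 2 is the gene product id, column 5 the GO id, and column 7 the evidence code, and the sample rows are made up.

```python
EXCLUDED_EVIDENCE = {"IEA", "ND", "RCA"}


def filter_gaf_lines(lines, excluded=EXCLUDED_EVIDENCE):
    """Return (gene product id, GO id, evidence code) for annotations to keep."""
    kept = []
    for line in lines:
        if line.startswith("!"):  # GAF header and comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:  # skip malformed or truncated rows
            continue
        evidence = cols[6]
        if evidence not in excluded:
            kept.append((cols[1], cols[4], evidence))
    return kept


# Made-up GAF rows, truncated after the evidence code column.
sample = [
    "!gaf-version: 2.0",
    "UniProtKB\tP00001\tGENE1\t\tGO:0000001\tPMID:1\tIDA",
    "UniProtKB\tP00002\tGENE2\t\tGO:0000001\tGO_REF:1\tIEA",
    "UniProtKB\tP00003\tGENE3\t\tGO:0000002\tPMID:2\tRCA",
    "UniProtKB\tP00004\tGENE4\t\tGO:0000002\tPMID:3\tTAS",
]
no_iea = filter_gaf_lines(sample)
with_iea = filter_gaf_lines(sample, excluded=set())
print(len(no_iea), len(with_iea))
```

Passing an empty `excluded` set keeps every annotation, which corresponds to the with_GO_IEA variant in this sketch.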
 
Creating customized Gene Sets
- Download the gene set files you would like to include in your customized set and concatenate them.
 For example, to combine Human_IOB_Entrezgene.gmt and Human_NetPath_Entrezgene.gmt, you can use the following Linux command:
cat Human_IOB_Entrezgene.gmt Human_NetPath_Entrezgene.gmt > MyCustomizedSet.gmt
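On systems without `cat`, or when you want to guard against the same gene set name appearing twice, the concatenation can also be scripted. This is a sketch under stated assumptions: `merge_gmt` is a hypothetical helper, and skipping repeated names is a choice made here, not something the `cat` command above does. The demonstration writes two small temporary files with placeholder content instead of using the real downloads.

```python
import os
import tempfile


def merge_gmt(paths, out_path):
    """Concatenate GMT files, keeping the first line seen for each set name."""
    seen = set()
    written = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # ignore blank lines
                    name = line.split("\t", 1)[0]
                    if name in seen:
                        continue  # same set name already written
                    seen.add(name)
                    out.write(line + "\n")
                    written += 1
    return written


# Demonstration with placeholder files standing in for the downloaded GMTs.
workdir = tempfile.mkdtemp()
first = os.path.join(workdir, "first.gmt")
second = os.path.join(workdir, "second.gmt")
merged = os.path.join(workdir, "MyCustomizedSet.gmt")
with open(first, "w") as f:
    f.write("setA|SRC1|1\tsetA\tG1\tG2\n")
with open(second, "w") as f:
    f.write("setB|SRC2|2\tsetB\tG3\n")
    f.write("setA|SRC1|1\tsetA\tG1\tG2\n")
count = merge_gmt([first, second], merged)
print(count)
```

For the files named above, the call would be `merge_gmt(["Human_IOB_Entrezgene.gmt", "Human_NetPath_Entrezgene.gmt"], "MyCustomizedSet.gmt")`.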
References
1. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2011 Nov 10. PMID: 22080510. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/22080510
2. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005 Oct 25;102(43):15545-50. PMID: 16199517. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/16199517
3. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T, Buetow KH. PID: the Pathway Interaction Database. Nucleic Acids Res. 2009 Jan;37(Database issue):D674-9. PMID: 18832364. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/18832364
4. Kandasamy K, et al. NetPath: a public resource of curated signal transduction pathways. Genome Biol. 2010 Jan 12;11(1):R3. PMID: 20067622. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/20067622
5. Romero P, Wagg J, Green ML, Kaiser D, Krummenacker M, Karp PD. Computational prediction of human metabolic pathways from the complete human genome. Genome Biol. 2005;6(1):R2. Epub 2004 Dec 22. PMID: 15642094. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/15642094
6. Croft D, O'Kelly G, Wu G, Haw R, Gillespie M, Matthews L, Caudy M, Garapati P, Gopinath G, Jassal B, Jupe S, Kalatskaya I, Mahajan S, May B, Ndegwa N, Schmidt E, Shamovsky V, Yung C, Birney E, Hermjakob H, D'Eustachio P, Stein L. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2011 Jan;39(Database issue):D691-7. PMID: 21067998. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/21067998
7. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000 May;25(1):25-9. PMID: 10802651. PubMed: http://www.ncbi.nlm.nih.gov/pubmed/10802651
