Pathway and Network Analysis Service
Cancer Stem Cell program
Veronique Voisin
- Located on the TMDT 8th floor on Tuesdays
Introduction about the service:
- The Pathway and Network Analysis Service is freely available to all Cancer Stem Cell program members. High-throughput genomic experiments (e.g. gene expression, protein expression, molecular interactions, large-scale genetic screens and other omics data) lead to the identification of large gene lists. The interpretation of results and the formulation of consistent biological hypotheses from these gene lists are challenging. Computational approaches can aid interpretation by relating the gene lists to knowledge about the biological system. To help researchers interpret their results, we are developing a new consulting and analysis service for pathway and network analysis. Analysis will be conducted in close collaboration with researchers on each project (Cancer Stem Cell research program) to ensure correct input data and effective interpretation of results.
Information about Pathway and Network Analysis
- Suggested readings:
  - GSEA:
    - Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Proc Natl Acad Sci U S A. 2005 Oct 25;102(43):15545-50.
  - Enrichment Map:
    - Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. Merico D, Isserlin R, Stueker O, Emili A, Bader GD. PLoS One. 2010 Nov 15;5(11):e13984.
How to use the service
Who can use the service
- You can use the service if you are a member of the Cancer Stem Cell program (http://www.cancerstemcells.ca). (Note: ask Melissa if I can use this URL). If you have large gene lists from large-scale 'omics' (e.g. genomics) projects that are ready to be analyzed (see the section "Data input requirement" below), you can book an appointment with us for an initial meeting.
What you can expect
How to book an appointment
- Look at the calendar below to see my available times on the day you want to meet (30 min to 1 h meeting). Be aware that I'm available for meetings only on Tuesdays!
- Send me an e-mail at [email protected] to indicate when you want to meet and the purpose of the meeting.
- I will send you an e-mail back to confirm the appointment.
- If we are meeting for the first time, I encourage you to send me a paper that best describes your work prior to our meeting.
- To cancel an appointment, send an e-mail to [email protected] at least 24 hours in advance.
Data input requirement: please have these data and information ready
- During the initial meeting, we are going to discuss:
  - the biological question(s) you want to answer
  - the experimental design
  - the platform you used to generate your data (e.g. Affymetrix or Illumina, the chip model, ...)
  - the quality controls and the input data format
- Your data should already have been statistically analyzed:
  - The data should have been normalized
  - Some quality control plots should have been produced:
    - Box plot of intensity (before and after normalization)
    - Principal Component Analysis (PCA)
    - Hierarchical clustering of samples (performed on all the data)
    - Please provide a PowerPoint presentation with a figure for each analysis
  - An appropriate statistical test of your hypothesis (your biological question) should have been performed
    - for example: moderated t-test, paired t-test, ANOVA, ...
  - If you need support for your statistical analyses, please contact Shaheena Bashir (Ph.D. in Statistics) at [email protected]. Located on the MaRS TMDT 15th floor, she offers free consultation on statistical analyses for the Cancer Stem Cell program. She will analyze your data and output the results in the right format for subsequent enrichment analyses. You are encouraged to contact Shaheena as soon as you plan your experiment: these genomics technologies are very sensitive to noise, and a well-designed experiment is essential for good-quality data.
 
- You need to provide us with one file (.txt):
  - Name your file as follows: yourname_date.txt (example: veronique_March21.txt)
  - Please rename your file with a new date if you resubmit it
  - Please follow this format description:
    - the first column contains Entrez IDs
      - An Entrez ID is a numerical value that uniquely identifies a gene.
      - For example, the Entrez ID for Myc (myelocytomatosis oncogene [ Mus musculus ]) is 17869: http://www.ncbi.nlm.nih.gov/gene/17869.
    - the second column contains a unique array identifier (ProbesetID for Affymetrix and sampleID for Illumina).
    - the third column contains gene names.
    - the fourth and fifth columns contain the statistical values:
      - the statistical values are the ones that tell you whether or not a gene is significantly differentially expressed, for example the t value followed by the p-value if you applied a t-test.
      - the whole table is sorted by the absolute value of the fourth column (the t value in this example), in decreasing order.
    - the additional columns contain the transformed (log2, for example) and normalized (RMA or quantile normalization, for example) values for each sample (= each chip for gene expression data).
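- Example (this table also includes a Gene Description column after the gene symbol, so here the t value and p-value appear in the fifth and sixth columns):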
 
| Entrez ID | Probeset ID | Gene Symbol | Gene Description | t value | p-value | sample1 | sample2 | sample3 | 
| 17218 | 10572906 | Mcm5 | minichromosome maintenance deficient 5, cell division cycle | 44.0079 | 0.001 | 9.13084 | 9.7166 | 8.76638 | 
| 27279 | 10448307 | Tnfrsf12a | tumor necrosis factor receptor superfamily, member 12a | -41.815 | 0.001 | 8.58977 | 9.29698 | 8.80844 | 
| 13215 | 10582809 | Tk1 | thymidine kinase 1 | 39.9456 | 0.001 | 8.94519 | 9.56513 | 8.38612 | 
| 12937 | 10384145 | H2afv | H2A histone family, member V | -33.6475 | 0.001 | 10.574 | 10.7741 | 10.5401 | 
| 207277 | 10526848 | A430033K04Rik | A430033K04Rik | 33.3352 | 0.001 | 8.25088 | 8.4121 | 8.2783 | 
- Note:
  - Each row of the table should correspond to a different gene. If several rows correspond to the same gene (same Entrez ID), there are two ways to remove the redundancy:
    - for each gene, keep only the row with the best t value
    - for each gene, average the normalized values of its rows before applying the t-test
    - the choice has to be made before the statistical analysis is performed. We can discuss it during the initial meeting.
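The redundancy removal and ordering described above can be sketched in a few lines of Python. This is only a minimal illustration, not the script used by the service. It assumes that the .txt file is tab-delimited, that the columns carry the header names used in the example table (`Entrez ID`, `t value`), and that the "best" t value means the one with the largest absolute value. The function name `prepare_table` and the duplicated Mcm5 row (probe set 99999999) are made up for the example.

```python
import csv
import io


def prepare_table(rows, id_col="Entrez ID", stat_col="t value"):
    """Keep one row per gene (largest |statistic|), then sort by |statistic|, descending."""
    best = {}
    for row in rows:
        gene = row[id_col]
        score = abs(float(row[stat_col]))
        # Assumption: the "best" row for a gene is the one with the largest |t|.
        if gene not in best or score > abs(float(best[gene][stat_col])):
            best[gene] = row
    return sorted(best.values(), key=lambda r: abs(float(r[stat_col])), reverse=True)


# Toy tab-delimited input. The second Mcm5 row is invented to show the redundancy case.
raw = (
    "Entrez ID\tProbeset ID\tGene Symbol\tt value\tp-value\n"
    "13215\t10582809\tTk1\t39.9456\t0.001\n"
    "17218\t10572906\tMcm5\t44.0079\t0.001\n"
    "17218\t99999999\tMcm5\t12.5\t0.01\n"
    "27279\t10448307\tTnfrsf12a\t-41.815\t0.001\n"
)

rows = list(csv.DictReader(io.StringIO(raw), delimiter="\t"))
table = prepare_table(rows)

for row in table:
    print(row["Entrez ID"], row["Gene Symbol"], row["t value"])
```

Looking columns up by header name rather than by position keeps the sketch usable whether or not the file contains a gene description column. If you prefer the second option (averaging the normalized values per gene), that has to happen before the t-test, so it cannot be done on a table like this one.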
Service SOP
Standard input requirements
Calendar
How to explore an interactive Enrichment Map on your computer
Download the software you need (see below for download and tutorial information)
- Cytoscape
- Enrichment Map plugin
- WordCloud plugin
Explore the Enrichment map using the .cys file that we give you
- Download the .cys file
  - Put the .cys file in the directory of your choice
  - In Cytoscape, go to File, Open, browse the directories to locate your file and click Open.
 
- Explore the map
  - The “Parameters” tab in the “Results Panel” on the right side of the window contains a legend mapping the colours to the phenotypes and displays the parameters used to create the map (cut-off values and data files).
  - The “Network” tab in the “Control Panel” on the left lists all available networks in the current session. At the bottom it shows an overview of the current network, which lets you navigate easily even at higher zoom levels by dragging the blue rectangle (the current view) over the network.
  - Clicking on a node (the circle that represents a gene set) opens the “EM Geneset Expression Viewer” tab in the “Data Panel”, showing a heatmap of the expression values of all genes in the selected gene set.
  - Clicking on an edge (the line between two nodes) opens the “EM Overlap Expression Viewer” tab in the “Data Panel”, showing a heatmap of the expression values of all genes shared by the two gene sets connected by this edge.
  - If several nodes and edges are selected (e.g. by dragging a selection box around the desired gene sets), the “EM Geneset Expression Viewer” shows the union of all genes in the selected gene sets and the “EM Overlap Expression Viewer” shows only those genes that all selected gene sets have in common.
  - The “Geneset Summary” tab in the “Results Panel” on the right contains information about which nodes and edges are selected.
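To make the difference between the two viewers concrete, here is a small Python sketch of the union and the intersection of several selected gene sets. The gene set names and their members are hypothetical and chosen only for illustration. The code does not use Cytoscape or the Enrichment Map plugin.

```python
# Hypothetical gene sets standing in for three selected nodes of a map.
selected = {
    "DNA_REPLICATION": {"Mcm5", "Tk1", "Pcna"},
    "CELL_CYCLE": {"Mcm5", "Tk1", "Ccnb1"},
    "S_PHASE": {"Mcm5", "Pcna", "Tk1"},
}

# Union: every gene that belongs to at least one selected gene set
# (what the "EM Geneset Expression Viewer" is described as showing).
union_genes = set().union(*selected.values())

# Intersection: only the genes present in all selected gene sets
# (what the "EM Overlap Expression Viewer" is described as showing).
common_genes = set.intersection(*selected.values())

print(sorted(union_genes))
print(sorted(common_genes))
```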
 
Tips
- Click on “View / Show Graphics Details” to see the map details even at low zoom levels
Online tutorials
All the software used is freely available (open-source) and easy to install on your computer.
Gene-Set Enrichment Analysis (GSEA)
- http://www.broadinstitute.org/gsea/index.jsp :
  - Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes). We use GSEA as the first analysis step. As we will perform this analysis for you, you don't need to download GSEA.
 
Cytoscape
- Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data.
  - Download Cytoscape 2.8.0 from http://www.cytoscape.org/download.html (you need to enter your name, institution and e-mail address, but no account is necessary)
  - Cytoscape tutorial: http://cytoscape.wodaklab.org/wiki/Presentations/Basic
EnrichmentMap
- http://baderlab.org/EnrichmentMap
  - Enrichment Map is a visualization method for gene set enrichment results which helps quickly find general functional themes in genomics data. Enrichment Map works as a plug-in for Cytoscape. To install it, download the zipped file, move it to the Cytoscape plugin directory and unzip it.
 
WordCloud
- WordCloud is a Cytoscape plugin that generates a word tag cloud from a user-defined node selection, summarizing an attribute of choice. It notably eases the interpretation of the Enrichment Map. Download the WordCloud plugin, unzip it and put the WordCloud.jar file in the Cytoscape plugin directory.
 
GeneMANIA
- GeneMANIA is a free public resource that offers a simple, intuitive web interface that shows the relationships between genes in a list and analyzes and extends the list to include other related genes. You can use GeneMANIA to find new members of a pathway or complex, find additional genes you may have missed in your screen or find new genes with a specific function. You also can use GeneMANIA as a Cytoscape plugin. You can find the GeneMANIA tutorial at: http://genemania.org/pages/help.jsf 
 
List of projects
- This section summarizes the current projects. The analysis status of each project is updated regularly, so you can follow the progress of your project and see the priority assigned to each project.
| project | lab | data received | data checked; OK for analysis | GSEA | First Map | Analysis report | additional analysis | status | priority | 
| EZ01 | Zacksenhaus | Feb 22 | Feb 23 | Feb 24 | Feb 25 | - | - | writing the report | 1 | 
| JD02-map1 | Dick | - | - | - | - | - | - | - | ? | 
| JD02-map2 | Dick | - | - | - | - | - | - | - | ? | 
| JD03 | Dick | - | - | - | - | - | - | - | ? | 
| JD04 | Dick | - | - | - | - | - | - | - | ? | 
| JD05 | Guidos | - | - | - | - | - | - | - | ? | 
? Link to results and reports ?
