| Size: 5224 Comment:  | Size: 6966 Comment:  | 
| Line 3: | Line 3: | 
| LOLA is currently in beta-testing. Version 1.0-beta is now available for download. | LOLA is currently in beta-testing. Version 1.1-beta is now available for download. | 
| Line 7: | Line 7: | 
| === Latest Download === | === Latest Release === | 
| Line 9: | Line 9: | 
| LOLA Version 1.0 Beta: [attachment:lola-1.0-beta.tar Download] | LOLA Version 1.1 Beta: [attachment:lola-1.0-beta.tar Download] ==== Release Notes ==== ''Version 1.1 beta:'' * Logo tree - basic functionality for PDZ or other terminal binding motifs not requiring profile alignment | 
| Line 29: | Line 34: | 
| LOLA accepts the BRAIN peptide file format. A peptide file describes a protein containing a specific domain, and provides known peptide ligands of this domain obtained by an experimental technique. | LOLA accepts one or more ''peptide file'' as shown in the example below. A peptide file describes a protein containing a specific domain, and provides known peptide ligands of this domain obtained by an experimental technique. | 
| Line 33: | Line 38: | 
| Here's a sample: | Example: | 
| Line 35: | Line 40: | 
| Gene Name       WWP2 Accession Refseq:NP_008945 Entrez:11060 | Gene Name       DLG1 Accession Refseq:NP_004078 | 
| Line 39: | Line 44: | 
| Domain Number   4 Domain Type WW Interpro ID IPR001202 Technique Peptide Chip Domain Sequence PALPPGWEMKYTSEGVRYFVDHNTRTTTFKDPRPG Domain Range 444-478 | Domain Number   3 Domain Type PDZ Interpro ID IPR001478 Technique Phage Display High Valency Domain sequence KVVLHRGSTGLGFNIVGGEDGEGIFISFILAGGPADLSGELRKGDRIISVNSVDLRAASHEQAAAALKNAGQAVTIVA Domain Range 466-525 | 
| Line 47: | Line 52: | 
| 1       NRLDLPPYETFEDX  1 2 RWDRPPPYVAPPSX 1 3 XXGYTGPPRPPPYG 1 4 XXPHPQPPPYGHCV 1 5 XXNFPAPPPYPGES 1 6 MTPYRSPPPYVPPX 1 7 PGTAPPPYTVGPGY 1 8 YVQPAPPPYPGPMG 1 9 YVQAPPPPYPGPMG 1 10 YVQPPAPPYPGPMG 1 11 TCICPPDYMQVNXX 1 12 HSPPLPPYTPPTLX 1 13 SRGLPPPYDLTWVN 1 14 CGTYPPSYNLTFXX 1 15 DDCQPPAYTYNNXX 1 16 SDLQPPNYYEVMXX 1 17 GLMRPPAYCDAKXX 1 18 YTDAPPAYSELYXX 1 19 STFKPPAYEDVVXX 1 20 SRGMPSYEEAVMAX 1 21 ATSFPPSYESVTXX 1 22 APSAPPSYEETVXX 1 23 NCDPPPTYEEATXX 1 24 LPEPPPPYEFSCXX 1 25 EPENPPPYEEAMXX 1 | 1       XLHFWRESSV      66 2 XXRLWKQTSL 3 3 ILKIWRETSL 3 4 KRTIWRETSL 2 A KNLRSNSMLG 2 6 HLKFWRSTRV 2 7 AHSKWRSTSV 2 8 XXXHRRETTV 1 9 VISRWRQTSL 1 10 TTWLGRQTRV 1 11 SRSSYRETSV 1 12 XXXSRRETSV 1 13 RLFRYRETSL 1 B PIRKRWTMTL 1 15 XXXNHRETSV 1 16 KIVRWKNTSV 1 17 KHRTWYETSV 1 18 XXXXFKQTSV 1 19 ARPKWRTTRV 1 20 ALPRRRETSV 1 | 
| Line 78: | Line 78: | 
| '''NOTE:''' This section is in a 2 column format. Field names must be separated from their values with a single TAB character. Multiple TABs or spaces are not accepted. | |
| Line 80: | Line 82: | 
| '''Accession:*''' A space-separated list of database accession identifier for the protein or corresponding gene. | '''Accession:*''' A space-separated list of database accession identifiers for the protein or corresponding gene. | 
| Line 84: | Line 86: | 
| '''NCBITaxonomyID:''' Taxon identifier from NCBI's Taxonomy repository. | '''NCBITaxonomyID*:''' Taxon identifier from NCBI's Taxonomy repository. | 
| Line 86: | Line 88: | 
| '''Domain Number:''' A number that represents the position of the domain sequence within the protein. For proteins containing multiple instances of the domain, this number helps distinguish the position of these instances. | '''Domain Number*:''' A number that represents the position of the domain sequence within the protein. For proteins containing multiple instances of the domain, this number helps distinguish the position of these instances. Set to "0" if instance information is not known. | 
| Line 90: | Line 92: | 
| '''InterproID:''' The Interpro database identifier for the domain. | '''Interpro ID:''' The Interpro database identifier for the domain. | 
| Line 102: | Line 104: | 
| Describes the experimentally determined peptide ligands. The peptides sequences must be in '''multiple alignment format'''. The sequences should contain '''no gaps''', and should be padded with the '''X''' symbol on both sides, where required, such that all sequences have identical length. | Describes the experimentally determined peptide ligands. The peptide sequences must be in '''multiple alignment format'''. The sequences should contain '''no gaps''', and should be padded with the '''X''' symbol on both sides, where required, such that all sequences have identical length. '''NOTE:''' This section is in a 5-column format. Column headers and values must be separated with a single TAB character. Multiple TABs or spaces are not accepted. | 
| Line 106: | Line 110: | 
| '''PeptideName:*''' A ''unique'' numerical symbol assigned to each peptide ligand. To omit a peptide, set to a non-numeric value (e.g. "A"). '''Values in this column must be unique.''' | '''!PeptideName:*''' A ''unique'' numerical symbol assigned to each peptide ligand. To omit a peptide, set to a non-numeric value (e.g. "A"). '''Values in this column must be unique.''' | 
| Line 110: | Line 114: | 
| '''CloneFrequency:''' Applies only to phage display data: the observed frequency of the peptide in the cloning step. | '''!CloneFrequency:''' Applies only to phage display data: the observed frequency of the peptide in the cloning step. | 
| Line 112: | Line 116: | 
| '''QuantData:''' A number that relatively or absolutely quantifies the protein-ligand interaction. E.g. The optical density (OD) from a protein chip experiment. | '''!QuantData:''' A number that relatively or absolutely quantifies the protein-ligand interaction. E.g. The optical density (OD) from a protein chip experiment. | 
| Line 114: | Line 118: | 
| '''ExternalIdentifier:''' A database identifier for the peptide. E.g. from the DOMINO repository. | '''!ExternalIdentifier:''' A database identifier for the peptide. | 
| Line 116: | Line 120: | 
| === Using Project Files === To open several peptide files at once, simply link them all in a single '''project file'''. A project file is a text file containing the absolute paths of multiple peptide files. Opening the project file in LOLA will open each of the underlying peptide files in a single step, allowing logos to be constructed for multiple profiles. Example: {{{ #ProjectFile /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/APBA3-1.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/CASK-1.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-1.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-2.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-3.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG2-3.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG3-2.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG4-3.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DVL2-1.pep.txt /Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/ERBB2IP-1-hi.pep.txt }}} '''NOTE:''' The first line of the project file '''must''' contain the text "#ProjectFile". | |
| Line 119: | Line 144: | 
| * Generate a "logo tree" by hiearchically clustering logos | * Generate a "logo tree" by hiearchically clustering logos [Initial PDZ (no alignment) use case completed in version 1.1] | 
LOLA (LOgos Look Amazing) is a tool for generating sequence logos using Position Weight Matrix based protein profiles. LOLA allows you to generate custom sequence logos by setting parameters such as logo height, trim percentage, and residue colour scheme. You can then save logos in various formats including PDF, PNG, and JPEG.
LOLA is currently in beta-testing. Version 1.1-beta is now available for download.
Latest Release
LOLA Version 1.1 Beta: [attachment:lola-1.0-beta.tar Download]
Release Notes
Version 1.1 beta:
- Logo tree - basic functionality for PDZ or other terminal binding motifs not requiring profile alignment
Requirements
Java Runtime Environment (JRE) 1.5 or later is required to run LOLA. All other dependencies are included in the download.
Installing and Running
- Extract the TAR file. This will create a directory named "lola".
- On Linux, open the command shell and run "lola.sh" from the "lola" directory.
- On Windows, double click "lola.bat".
- On Mac, double click "lola-1.0-beta.jar".
- You can open a single peptide file, or a project file linked to multiple peptide files.
Here's a view of LOLA after opening a PDZ domain project file:
attachment:lolaScreenShot.png
Input Format
LOLA accepts one or more peptide files in the format shown in the example below. A peptide file describes a protein containing a specific domain, and provides known peptide ligands of this domain obtained by an experimental technique.
The peptide file consists of a Header Section that describes the protein and domain sequence, and a Peptide Section that lists and describes the peptide ligands.
Example:
Gene Name	DLG1
Accession	Refseq:NP_004078
Organism	Homo Sapiens (Human)
NCBITaxonomyID	9606
Domain Number	3
Domain Type	PDZ
Interpro ID	IPR001478
Technique	Phage Display High Valency
Domain sequence	KVVLHRGSTGLGFNIVGGEDGEGIFISFILAGGPADLSGELRKGDRIISVNSVDLRAASHEQAAAALKNAGQAVTIVA
Domain Range	466-525
Comment
PeptideName	Peptide	CloneFrequency	QuantData	ExternalIdentifier
1	XLHFWRESSV	66
2	XXRLWKQTSL	3
3	ILKIWRETSL	3
4	KRTIWRETSL	2
A	KNLRSNSMLG	2
6	HLKFWRSTRV	2
7	AHSKWRSTSV	2
8	XXXHRRETTV	1
9	VISRWRQTSL	1
10	TTWLGRQTRV	1
11	SRSSYRETSV	1
12	XXXSRRETSV	1
13	RLFRYRETSL	1
B	PIRKRWTMTL	1
15	XXXNHRETSV	1
16	KIVRWKNTSV	1
17	KHRTWYETSV	1
18	XXXXFKQTSV	1
19	ARPKWRTTRV	1
20	ALPRRRETSV	1
Header Section
Describes the protein, domain, and experiment. Required fields are indicated with a *.
NOTE: This section is in a 2 column format. Field names must be separated from their values with a single TAB character. Multiple TABs or spaces are not accepted.
- Gene Name:* An identifier that represents the gene or protein sequence. Not required to be unique.
- Accession:* A space-separated list of database accession identifiers for the protein or corresponding gene.
- Organism: Description of taxon of the protein.
- NCBITaxonomyID:* Taxon identifier from NCBI's Taxonomy repository.
- Domain Number:* A number that represents the position of the domain sequence within the protein. For proteins containing multiple instances of the domain, this number helps distinguish the position of these instances. Set to "0" if instance information is not known.
- Domain Type:* The formal name of the domain, e.g. WW, PDZ, SH3.
- Interpro ID: The Interpro database identifier for the domain.
- Technique: The experimental method used to identify potential ligands of the protein.
- Domain Sequence:* The amino-acid sequence of the domain region.
- Domain Range: The amino-acid position range for the domain region within the protein.
- Comment: Notes, additional information, personal comments pertaining to this file.
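The two-column layout described above can be checked before a file is opened in LOLA. The following is a minimal Python sketch, not part of LOLA itself: the function name parse_header is hypothetical, and it assumes that field names can be compared case-insensitively (the example file writes "Domain sequence" while the field list writes "Domain Sequence") and that a field such as Comment may appear with no value.

```python
# Hypothetical helper, not part of LOLA: checks the 2-column header layout.
# Assumption: field names are matched case-insensitively.
REQUIRED = ("gene name", "accession", "ncbitaxonomyid",
            "domain number", "domain type", "domain sequence")


def parse_header(lines):
    """Parse 'Field<TAB>Value' lines into a dict keyed by lower-cased field name."""
    fields = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first TAB only; a line without a TAB has an empty value.
        name, _, value = line.partition("\t")
        if "\t" in value:
            raise ValueError("more than one TAB on line: %r" % line)
        fields[name.strip().lower()] = value.strip()
    missing = [f for f in REQUIRED if not fields.get(f)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return fields


SAMPLE_HEADER = [
    "Gene Name\tDLG1",
    "Accession\tRefseq:NP_004078",
    "Organism\tHomo Sapiens (Human)",
    "NCBITaxonomyID\t9606",
    "Domain Number\t3",
    "Domain Type\tPDZ",
    "Interpro ID\tIPR001478",
    "Domain sequence\tKVVLHRGSTGLGFNIVGGEDGEGIFISFILAGGPADLSGELRKGDRIISVNSVDLRAASHEQAAAALKNAGQAVTIVA",
    "Comment",
]

header = parse_header(SAMPLE_HEADER)
accessions = header["accession"].split()  # space-separated list of identifiers
```

Whether LOLA itself rejects or tolerates a field with an empty value is not stated here, so the sketch only enforces the rules that are written down: one TAB per line and the presence of every required field.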
Peptide Section
Describes the experimentally determined peptide ligands. The peptide sequences must be in multiple alignment format. The sequences should contain no gaps, and should be padded with the X symbol on both sides, where required, such that all sequences have identical length.
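The padding rule can be illustrated with a short Python sketch. It is an illustration only: the function name pad_peptides is hypothetical, and right-justifying the sequences (padding on the left) is an assumption that suits C-terminal motifs such as PDZ ligands, as in the example file. Other motifs may need padding on the right or on both sides.

```python
# Hypothetical helper, not part of LOLA: pads peptides with "X" to equal length.
# Assumption: align="right" keeps the C-termini aligned (padding on the left).
def pad_peptides(peptides, align="right", pad="X"):
    """Return the peptides padded with `pad` so that all have the same length."""
    for p in peptides:
        if "-" in p or " " in p:
            raise ValueError("gaps are not allowed: %r" % p)
    width = max(len(p) for p in peptides)
    if align == "right":
        return [p.rjust(width, pad) for p in peptides]
    if align == "left":
        return [p.ljust(width, pad) for p in peptides]
    raise ValueError("align must be 'right' or 'left'")


padded = pad_peptides(["LHFWRESSV", "RLWKQTSL", "ILKIWRETSL"])
```

With these three inputs the result matches the first three rows of the example file, where the shorter peptides carry leading X symbols.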
NOTE: This section is in a 5-column format. Column headers and values must be separated with a single TAB character. Multiple TABs or spaces are not accepted.
Required fields are indicated with a *.
- PeptideName:* A unique numerical symbol assigned to each peptide ligand. To omit a peptide, set to a non-numeric value (e.g. "A"). Values in this column must be unique.
- Peptide:* The peptide ligand sequence.
- CloneFrequency: Applies only to phage display data: the observed frequency of the peptide in the cloning step.
- QuantData: A number that relatively or absolutely quantifies the protein-ligand interaction. E.g. The optical density (OD) from a protein chip experiment.
- ExternalIdentifier: A database identifier for the peptide.
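The rules for the peptide section (TAB-separated columns, unique names, non-numeric names marking omitted peptides, identical sequence lengths) can be sketched in Python as follows. The function name parse_peptide_rows is hypothetical, and two points are assumptions rather than documented behaviour: trailing empty columns may be left off a row, as in the example file, and the length check applies to omitted peptides as well.

```python
# Hypothetical helper, not part of LOLA: reads and checks the 5-column peptide section.
COLUMNS = ("PeptideName", "Peptide", "CloneFrequency", "QuantData", "ExternalIdentifier")


def parse_peptide_rows(lines):
    """Parse the peptide rows (the lines after the column-header line)."""
    rows, seen = [], set()
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        cells = line.split("\t")
        if len(cells) > len(COLUMNS):
            raise ValueError("too many columns: %r" % line)
        # Assumption: trailing empty columns may be omitted from a row.
        cells += [""] * (len(COLUMNS) - len(cells))
        row = dict(zip(COLUMNS, cells))
        name = row["PeptideName"]
        if not name or not row["Peptide"]:
            raise ValueError("PeptideName and Peptide are required: %r" % line)
        if name in seen:
            raise ValueError("duplicate PeptideName: %r" % name)
        seen.add(name)
        # A non-numeric name marks a peptide that is to be omitted.
        row["omitted"] = not name.isdigit()
        rows.append(row)
    if len({len(r["Peptide"]) for r in rows}) > 1:
        raise ValueError("all peptide sequences must have identical length")
    return rows


SAMPLE_ROWS = ["1\tXLHFWRESSV\t66", "A\tKNLRSNSMLG\t2", "6\tHLKFWRSTRV\t2"]
kept = [r["PeptideName"] for r in parse_peptide_rows(SAMPLE_ROWS) if not r["omitted"]]
```

In the sample rows, the peptide named "A" is flagged as omitted, so only peptides 1 and 6 remain in the kept list.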
Using Project Files
To open several peptide files at once, simply link them all in a single project file. A project file is a text file containing the absolute paths of multiple peptide files. Opening the project file in LOLA will open each of the underlying peptide files in a single step, allowing logos to be constructed for multiple profiles.
Example:
#ProjectFile
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/APBA3-1.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/CASK-1.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-1.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-2.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG1-3.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG2-3.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG3-2.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DLG4-3.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/DVL2-1.pep.txt
/Users/moyez/research/ppi/profiles/PDZ/Human/SidhuPhage/ERBB2IP-1-hi.pep.txt
NOTE: The first line of the project file must contain the text "#ProjectFile".
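For a directory holding many peptide files, the project file can be generated rather than typed by hand. The Python sketch below is one way to do that under stated assumptions: the function name write_project_file is hypothetical, and the "*.pep.txt" pattern is taken from the file names in the example above, not from any requirement of LOLA.

```python
# Hypothetical helper, not part of LOLA: writes a project file for a directory.
from pathlib import Path


def write_project_file(peptide_dir, project_path, pattern="*.pep.txt"):
    """Write '#ProjectFile' followed by the absolute path of each peptide file."""
    # resolve() makes the directory absolute, so every globbed path is absolute too.
    paths = sorted(Path(peptide_dir).resolve().glob(pattern))
    lines = ["#ProjectFile"] + [str(p) for p in paths]
    Path(project_path).write_text("\n".join(lines) + "\n")
    return lines
```

A call such as write_project_file("profiles/PDZ", "pdz.project.txt") (both names are placeholders) would list every matching file under that directory, with the required "#ProjectFile" text on the first line.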
Future Developments
- Generate a "logo tree" by hierarchically clustering logos [Initial PDZ (no alignment) use case completed in version 1.1]
- Allow colours to be selected for individual amino-acids
- Add support for nucleic acids
- Additional visualization options (e.g. font, axis labels)
Contact
If you have any questions or feedback, please email Moyez Dharsee at [email protected].
