If you work with smaller peptides and want to register them in your CDD Vault by their abbreviated sequence, the guidelines below will help you register your molecules.
You can register peptides containing natural amino acids with their commonly used 1- or 3-letter abbreviations. Note that you cannot mix 1- and 3-letter formats in the same sequence, nor can you have both 1- and 3-letter sequences within a single import file. Use one format or the other, not both.
The sequences should be registered without any additional characters (no spaces or dashes), and the sequence is written from the N-terminus to the C-terminus.
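The formatting rules above can be checked before an import with a short script. The following is a minimal sketch, not CDD Vault's own validation: the function names are hypothetical, and it assumes that 1-letter codes are written in upper case and 3-letter codes in title case (e.g. "Ala"), which may differ from what your vault accepts.

```python
# Standard 20 natural amino acids: 1-letter code -> 3-letter code.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
THREE_LETTER = set(ONE_TO_THREE.values())


def sequence_format(seq: str) -> str:
    """Return "1-letter", "3-letter", or "invalid" for one sequence string.

    Assumption (not from CDD documentation): 1-letter codes are upper case
    and 3-letter codes are title case.
    """
    if not seq or not seq.isalpha():
        return "invalid"  # empty, or contains spaces, dashes, digits, ...
    if all(ch in ONE_TO_THREE for ch in seq):
        return "1-letter"
    if len(seq) % 3 == 0:
        chunks = [seq[i:i + 3] for i in range(0, len(seq), 3)]
        if all(chunk in THREE_LETTER for chunk in chunks):
            return "3-letter"
    return "invalid"  # unknown residue, or 1- and 3-letter codes mixed


def check_import(sequences: list[str]) -> list[str]:
    """Return a list of problems found among the sequences of one import file."""
    formats = [sequence_format(s) for s in sequences]
    problems = [
        f"invalid sequence: {s!r}"
        for s, fmt in zip(sequences, formats)
        if fmt == "invalid"
    ]
    if {"1-letter", "3-letter"} <= set(formats):
        problems.append("file mixes 1-letter and 3-letter sequences")
    return problems
```

With these illustrative inputs, `check_import(["ACDK", "GGH"])` reports no problems, while `check_import(["ACDK", "AlaGly"])` flags the file because it contains both formats.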
* Valid examples are:
* Invalid examples would be:
There are a few items you should keep in mind when registering sequences:
- If you register a sequence in single letters, the chemical cartridge automatically converts it to the 3-letter format, and the molecule image shows "H-" at the N-terminus and "-OH" at the C-terminus. E.g.:
- After about 10 amino acids, the structure depiction reduces in size.
- When registering sequences with 35 or more amino acids, CDD Vault will not render a structure image (the message "Structure to large to render" is displayed instead). Sequences longer than about 50 amino acids are not supported and may result in a registration error.
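The length limits above can be turned into a quick pre-check. This is only a sketch: the function name is hypothetical, and the cut-offs of 10 and 50 are approximate values taken from the list above, not exact limits.

```python
def expected_rendering(n_residues: int) -> str:
    """Describe what to expect for a sequence of the given length.

    Thresholds follow the guidelines above; "about 10" and "about 50"
    are approximate, so treat results near those values with caution.
    """
    if n_residues > 50:
        return "not supported; registration may fail"
    if n_residues >= 35:
        return "registered, but no structure image"
    if n_residues > 10:
        return "structure image shown at reduced size"
    return "structure image shown at normal size"
```

For a sequence in 1-letter format, `expected_rendering(len(sequence))` gives the residue count directly; for the 3-letter format, divide the string length by three first.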
As an alternative, you can use "Structureless registration" in CDD Vault for large molecules. For further reading on this topic, please visit:
Knowledgebase article "Is it possible to add structures to structureless molecules?"
To keep the original character string of your sequence, consider adding a batch field to your vault, named something like "Sequence N->C". The character limit on a Text-type Batch Field is relatively large (65,000 characters), so you should be safe saving the sequences of biomolecules as large as enzymes. It could be advantageous to make this a required field. Alternatives are possible, e.g. a text field containing the structure in HELM notation.
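If you choose to store HELM notation in a text field, a simple linear peptide can be converted from its 1-letter sequence as sketched below. The function name is hypothetical, and the sketch covers only unmodified, linear peptides of natural amino acids. It emits the original HELM form (a single `PEPTIDE1` polymer followed by four empty `$`-separated sections); check the HELM specification if your tools expect a version suffix.

```python
NATURAL_AA = set("ACDEFGHIKLMNPQRSTVWY")


def to_helm(one_letter_seq: str) -> str:
    """Convert a 1-letter peptide sequence (N->C) to a simple HELM string.

    Only handles linear peptides of the 20 natural amino acids; no
    connections (e.g. disulfide bridges) or modified residues.
    """
    if not one_letter_seq or any(ch not in NATURAL_AA for ch in one_letter_seq):
        raise ValueError(f"not a plain 1-letter sequence: {one_letter_seq!r}")
    # Monomers are separated by dots inside the braces of the polymer.
    return "PEPTIDE1{" + ".".join(one_letter_seq) + "}$$$$"
```

For example, the illustrative input `to_helm("ACDK")` returns `PEPTIDE1{A.C.D.K}$$$$`.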
When registering larger sequences, or sequences with unnatural amino acids, sidechains, siRNAs, and the like, structureless registration should work well.
In all cases, it is advisable to have a unique external identifier if you want batch-level control over your peptides. This could be an ID by name or number, or the sequence itself.
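An import file that carries both a unique identifier and the original sequence string could be prepared as follows. This is a sketch only: the column headers "External Identifier" and "Sequence" and the peptide IDs are hypothetical and would need to match, or be mapped to, the fields defined in your own vault. "Sequence N->C" is the batch field suggested above.

```python
import csv
import io


def build_import_csv(peptides: dict[str, str]) -> str:
    """Build CSV text with one row per peptide.

    `peptides` maps a unique external identifier to its sequence; using a
    dict guarantees that no identifier appears twice.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical headers; adapt them to your vault's field names.
    writer.writerow(["External Identifier", "Sequence", "Sequence N->C"])
    for external_id, sequence in peptides.items():
        # The sequence is written twice: once for structure registration,
        # once as plain text so the original string is preserved.
        writer.writerow([external_id, sequence, sequence])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_import_csv({"PEP-001": "ACDK", "PEP-002": "GGH"}))
```

Storing the sequence in a separate text column keeps the input string available even after single letters have been converted to the 3-letter format.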
Settings for your vault could contain the following entries (Settings > Vault > Batch Fields):
A batch entry would then look something like this:
Tips & Tricks
In case you don't have separate batch fields for your letter sequence, you can always retrieve the originally registered sequence:
Click on your structure to open the copy box. At the bottom you will see the text "Original". Click it and the box will show the original registration text (in MOL structure format; copy and paste this into an editor and you should see your original input sequence again).
For single-compound (sequence) registration, mixing multiple notations, as well as combining them with fully drawn-out chemical structures, is possible (again, note that single letters will be converted to the 3-letter notation).
E.g. within the editor, you can create disulfide bridges by drawing a line from a CYS to another CYS.
Further chemical modifications of an amino acid are also possible. Click on an amino acid residue to expand it, then add any modifications; see, for example, the His shown below with an added F and c-propyl group. Of course, you could also simply draw the full structure from scratch.
This can be done within the Vault's Chemaxon JS editor or, alternatively, in Chemaxon's standalone MarvinSketch followed by a copy/paste. SDF files created in e.g. MarvinSketch may also be imported via the file-import section or loaded into the editor.