If the feature is enabled in the Vault, users can register biological entities, such as nucleotide and amino acid sequences, either manually (one at a time) through the interface or by using the Data Import wizard. Whichever method is used, CDD Vault automatically calculates the properties described below for every biological entity registered.
Nucleotide Sequence Properties
Length
The number of nucleotide bases in the sequence
Molecular weight
The sum of the molecular weights of the nucleotides and an end weight correction. The compounds are treated as single-stranded sequences with hydroxyl ends (no phosphate on either end).
The compound is assumed to be DNA unless uracil is present. Molecular weight cannot be calculated for sequences containing ambiguous codes (i.e., N).
For DNA, the equation is:
[number of A] * 313.210 + [number of T] * 304.195 + [number of C] * 289.184 + [number of G] * 329.209 – 61.964
For RNA, the equation is:
[number of A] * 329.209 + [number of U] * 306.167 + [number of C] * 305.183 + [number of G] * 345.208 – 61.964
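The two equations above can be sketched in Python as follows. This is only an illustration of the arithmetic, not CDD Vault's actual implementation. The function name nucleotide_molecular_weight is hypothetical, and returning None for empty input, for ambiguous codes, and for sequences that mix T and U is an assumption made for this sketch.

```python
# Per-base weights and end correction taken from the equations above.
DNA_WEIGHTS = {"A": 313.210, "T": 304.195, "C": 289.184, "G": 329.209}
RNA_WEIGHTS = {"A": 329.209, "U": 306.167, "C": 305.183, "G": 345.208}
END_CORRECTION = -61.964


def nucleotide_molecular_weight(sequence):
    """Return the molecular weight of a single-stranded sequence, or None.

    Hypothetical helper: None signals that the weight cannot be calculated.
    """
    seq = sequence.upper()
    # The sequence is treated as DNA unless uracil is present.
    weights = RNA_WEIGHTS if "U" in seq else DNA_WEIGHTS
    # Assumption: any symbol without a listed weight (N, or T in an RNA
    # sequence) makes the weight incalculable, as does an empty sequence.
    if not seq or any(base not in weights for base in seq):
        return None
    return sum(weights[base] for base in seq) + END_CORRECTION
```

For example, round(nucleotide_molecular_weight("ATCG"), 3) gives 1173.834, and nucleotide_molecular_weight("ATNG") gives None.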
GC-content
The percentage of explicit guanine and cytosine in the sequence
ATU-content
The percentage of explicit adenine and thymine/uracil in the sequence
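As a rough illustration of the two content properties, the sketch below counts only the explicit bases and divides by the full sequence length. Using the full length (including any ambiguous codes) as the denominator is an assumption, and the function name base_content is hypothetical.

```python
def base_content(sequence):
    """Return (GC percentage, ATU percentage) for a nucleotide sequence.

    Hypothetical helper. Only explicit G/C and A/T/U symbols are counted;
    ambiguous codes such as N contribute to neither percentage.
    """
    seq = sequence.upper()
    if not seq:
        return None  # Assumption: no percentages for an empty sequence.
    gc_count = sum(seq.count(base) for base in "GC")
    atu_count = sum(seq.count(base) for base in "ATU")
    # Assumption: the denominator is the total length of the sequence.
    return 100 * gc_count / len(seq), 100 * atu_count / len(seq)
```

For example, base_content("GGCCAATN") returns (50.0, 37.5): four of the eight symbols are G or C, three are A or T, and the N counts toward neither.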
Amino Acid Sequence Properties
Length
The number of amino acids in the sequence
Molecular weight
The sum of the molecular weights of the amino acids plus the weights of the ends. The molecules are treated as sequences with a hydroxyl end and a hydrogen end. Molecular weight cannot be calculated for sequences containing ambiguous codes (i.e., B, Z, X).
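A minimal sketch of this calculation is shown below. This article does not list the per-residue weights CDD Vault uses, so the table here uses commonly published average residue masses and the average mass of water (one hydrogen end plus one hydroxyl end) as assumptions; CDD Vault's values may differ slightly. The function name peptide_molecular_weight is hypothetical.

```python
# Assumption: standard average residue masses (amino acid minus water).
# These are NOT taken from CDD Vault documentation.
RESIDUE_WEIGHTS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
    # Assumption: J (leucine or isoleucine) is computable because both
    # residues have the same weight.
    "J": 113.1594,
}
# Assumption: the hydrogen end and hydroxyl end together weigh one water.
END_WEIGHT = 18.01528


def peptide_molecular_weight(sequence):
    """Return the molecular weight of an amino acid sequence, or None."""
    seq = sequence.upper()
    # B, Z, X (and any other symbol without a weight) cannot be resolved.
    if not seq or any(residue not in RESIDUE_WEIGHTS for residue in seq):
        return None
    return sum(RESIDUE_WEIGHTS[residue] for residue in seq) + END_WEIGHT
```

Under these assumptions, the dipeptide "GA" comes to about 146.146, while "GX" returns None because X is ambiguous.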
Hydrophobicity
The percentage of explicit hydrophobic amino acids (A, C, F, I, J, L, M, V, W) in the sequence
Hydrophilicity
The percentage of explicit hydrophilic amino acids (B, D, E, K, N, Q, R, Z) in the sequence
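The hydrophobicity and hydrophilicity percentages can be sketched with the two residue sets given above. As before, this is an illustration rather than CDD Vault's own code: the function name residue_class_percentages is hypothetical, and dividing by the full sequence length is an assumption.

```python
# Residue sets as listed in the property descriptions above.
HYDROPHOBIC = set("ACFIJLMVW")
HYDROPHILIC = set("BDEKNQRZ")


def residue_class_percentages(sequence):
    """Return (hydrophobicity, hydrophilicity) as percentages, or None."""
    seq = sequence.upper()
    if not seq:
        return None  # Assumption: no percentages for an empty sequence.
    hydrophobic = sum(1 for residue in seq if residue in HYDROPHOBIC)
    hydrophilic = sum(1 for residue in seq if residue in HYDROPHILIC)
    # Assumption: the denominator is the total length of the sequence.
    return 100 * hydrophobic / len(seq), 100 * hydrophilic / len(seq)
```

For example, residue_class_percentages("AKDG") returns (25.0, 50.0): A is hydrophobic, K and D are hydrophilic, and G belongs to neither set.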