Properties calculated during Biological Entity (Nucleotide and Amino Acid) Registration – CDD Support

If biological entity registration is enabled in the Vault, users can register biological entities, such as nucleotide and amino acid sequences, either manually (one at a time) through the interface or by using the Data Import wizard. Whichever method is used, CDD Vault automatically calculates the properties described below for every registered biological entity.


Nucleotide Sequence Properties

Length

The number of nucleotide bases in the sequence

Molecular weight

The sum of the molecular weights of the nucleotides and an end weight correction. The compounds are treated as single-stranded sequences with hydroxyl ends (no phosphate on either end).

The compound is assumed to be DNA unless uracil is present. Molecular weight cannot be calculated for sequences containing ambiguous codes (e.g., N).

For DNA, the equation is:

[number of A] * 313.210 + [number of T] * 304.195 + [number of C] * 289.184 + [number of G] * 329.209 - 61.964

For RNA, the equation is:

[number of A] * 329.209 + [number of U] * 306.167 + [number of C] * 305.183 + [number of G] * 345.208 - 61.964
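
The two equations above can be reproduced outside the Vault with a short Python sketch. This is only an illustration, not CDD Vault's implementation: the function name nucleotide_molecular_weight is hypothetical, and returning None for empty sequences, for ambiguous codes, and for sequences that mix T and U is an assumption made here.

```python
from collections import Counter

# Per-nucleotide weights and end correction copied from the equations above.
DNA_WEIGHTS = {"A": 313.210, "T": 304.195, "C": 289.184, "G": 329.209}
RNA_WEIGHTS = {"A": 329.209, "U": 306.167, "C": 305.183, "G": 345.208}
END_CORRECTION = -61.964


def nucleotide_molecular_weight(sequence):
    """Return the molecular weight, or None when it cannot be calculated.

    Hypothetical helper: the None return value is an assumption of this
    sketch, not documented CDD Vault behavior.
    """
    seq = sequence.upper()
    # The sequence is assumed to be DNA unless uracil is present.
    weights = RNA_WEIGHTS if "U" in seq else DNA_WEIGHTS
    counts = Counter(seq)
    # Any symbol outside the chosen table (for example N) blocks the calculation.
    if not seq or any(base not in weights for base in counts):
        return None
    return sum(weights[base] * n for base, n in counts.items()) + END_CORRECTION
```

For example, nucleotide_molecular_weight("ACGT") evaluates the DNA equation with one of each base and gives approximately 1173.834, while a sequence such as "ACGN" returns None.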

GC-content

The percentage of explicit guanine and cytosine in the sequence

ATU-content

The percentage of explicit adenine and thymine/uracil in the sequence
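
As a rough sketch of the two content properties, the Python below counts only the explicit letters (G and C, or A, T and U) and divides by the full sequence length. The choice of denominator is an assumption, since the text above does not state it, and the function names are hypothetical.

```python
def base_content(sequence, bases):
    """Percentage of positions in `sequence` that are one of `bases`.

    Assumption: the denominator is the full sequence length, so ambiguous
    codes such as N or S count toward the length but never toward the match.
    """
    seq = sequence.upper()
    if not seq:
        return 0.0  # assumption: an empty sequence reports 0%
    matches = sum(seq.count(base) for base in bases)
    return 100.0 * matches / len(seq)


def gc_content(sequence):
    """Percentage of explicit guanine and cytosine."""
    return base_content(sequence, "GC")


def atu_content(sequence):
    """Percentage of explicit adenine and thymine/uracil."""
    return base_content(sequence, "ATU")
```

Under these assumptions, the two percentages sum to 100 only when the sequence contains no ambiguous codes.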

Amino Acid Sequence Properties

Length

The number of amino acids in the sequence

Molecular weight

The sum of the molecular weights of the amino acids plus the weights of the ends. The molecules are treated as sequences with a hydroxyl end and a hydrogen end. Molecular weight cannot be calculated for sequences containing ambiguous codes (e.g., B, Z, X).
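
The calculation described above can be sketched in Python as follows. This article does not list the per-residue weights CDD Vault uses, so the table below holds commonly tabulated average residue masses and is an assumption, as are the function name and the None return value; results may differ slightly from the Vault's.

```python
# Average residue masses (amino acid minus water). These are commonly
# tabulated values, NOT taken from CDD Vault, which may use different numbers.
RESIDUE_WEIGHTS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
# One hydroxyl end plus one hydrogen end together weigh one water molecule.
END_WEIGHT = 18.015


def amino_acid_molecular_weight(sequence):
    """Return the molecular weight, or None when it cannot be calculated.

    Hypothetical helper: any code without a table entry (B, Z, X and other
    non-standard letters) is treated as blocking the calculation.
    """
    seq = sequence.upper()
    if not seq or any(residue not in RESIDUE_WEIGHTS for residue in seq):
        return None
    return sum(RESIDUE_WEIGHTS[residue] for residue in seq) + END_WEIGHT
```

With this table, a single glycine ("G") comes out at about 75.067, which matches the weight of free glycine.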

Hydrophobicity

The percentage of explicit hydrophobic amino acids (A, C, F, I, J, L, M, V, W) in the sequence

Hydrophilicity

The percentage of explicit hydrophilic amino acids (B, D, E, K, N, Q, R, Z) in the sequence
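
The hydrophobicity and hydrophilicity percentages can be approximated with the Python sketch below. The two letter sets are copied from the definitions above. Dividing by the full sequence length is an assumption, and the function names are hypothetical.

```python
# Letter sets copied from the definitions above.
HYDROPHOBIC = set("ACFIJLMVW")
HYDROPHILIC = set("BDEKNQRZ")


def residue_percentage(sequence, residues):
    """Percentage of positions in `sequence` whose letter is in `residues`.

    Assumption: the denominator is the full sequence length.
    """
    seq = sequence.upper()
    if not seq:
        return 0.0  # assumption: an empty sequence reports 0%
    matches = sum(1 for residue in seq if residue in residues)
    return 100.0 * matches / len(seq)


def hydrophobicity(sequence):
    """Percentage of explicit hydrophobic amino acids."""
    return residue_percentage(sequence, HYDROPHOBIC)


def hydrophilicity(sequence):
    """Percentage of explicit hydrophilic amino acids."""
    return residue_percentage(sequence, HYDROPHILIC)
```

Residues in neither set (for example G, H, P, S, T and Y) count toward the length only, so the two percentages need not sum to 100.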