
Molecule Validation Rules during Registration

During the validation step of the data import (import step 3), CDD Vault performs a number of checks to ensure integrity and uniqueness of uploaded structures and molecule identifiers.

Checks are performed sequentially and a validation report is provided prior to approval of data for final import.

This article reviews the validation rules applied by CDD. See the corresponding articles for some of the common errors, suspicious events, and noteworthy events displayed in the validation report.

 

Read this article for molecule registration mapping (import step 2).

 

Salt and solvent of crystallization handling

There is an optional vault-level setting to disable chemical registration with salt stripping.

  • Only supported salts and solvents are stripped and stored in an automatically created batch field.
  • Only one salt form and one solvent of crystallization are supported.
  • Stoichiometry that resolves to a simple stoichiometric ratio of cores, salts, and solvents is supported.
  • If any of the above validations fails, the structure is not imported and an unrecognized structure error is reported.

 

Core structure handling

  • Structures are neutralized (unless chemical registration is turned off in the vault).
  • The structure is checked to be in a valid format (listed below). Invalid structures fail import validation with an unrecognized structure error.
    • MOL
    • canonical SMILES
    • CXSMILES
    • IUPAC
    • polymer notation is not supported
  • The chiral flag will automatically be set to absolute stereochemistry, but enhanced stereochemistry features will be preserved.
  • The original imported structure is saved as the original MOL or SMILES. This structure is displayed on the search page, on the molecule record, and in the exported structure image.
  • A copy of the original structure is standardized as follows:
    • extra atom labels are removed
    • extra annotation is removed
    • atom numbers are removed
    • aromatization is standardized using the ChemAxon basic method
    • explicit hydrogens are removed
    • chiral flag is set for absolute stereochemistry
    • converted to CXSMILES
  • Uniqueness validation is performed using the CXSMILES of the standardized structure.

 

Stoichiometry

  • Salt, solvent of crystallization, and core structure counts are recorded as batch information and used to calculate formula weight.
  • Stoichiometric ratios are confirmed to be whole numbers.
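As an illustration of the two rules above, here is a minimal Python sketch. It is not CDD Vault's implementation: the function names, the way counts are passed in, and the simple weighted sum used for formula weight are all assumptions made for this example.

```python
from fractions import Fraction


def whole_number_counts(core, salt=0, solvent=0):
    """Return the counts as integers, or raise if any count is not a
    non-negative whole number. (Hypothetical helper, for illustration only.)"""
    counts = {"core": core, "salt": salt, "solvent": solvent}
    result = {}
    for name, value in counts.items():
        # Going through str() avoids binary floating-point surprises,
        # so 2.0 is accepted and 0.5 is rejected.
        frac = Fraction(str(value))
        if frac.denominator != 1 or frac < 0:
            raise ValueError(f"{name} count {value} is not a whole number")
        result[name] = int(frac)
    return result


def formula_weight(counts, core_mw, salt_mw=0.0, solvent_mw=0.0):
    """Assumed formula: sum of each component's weight times its count."""
    return (
        counts["core"] * core_mw
        + counts["salt"] * salt_mw
        + counts["solvent"] * solvent_mw
    )


# Example: one core, two salt counterions, one solvent molecule.
counts = whole_number_counts(core=1, salt=2, solvent=1)
fw = formula_weight(counts, core_mw=100.0, salt_mw=36.46, solvent_mw=18.02)
```

A count such as 0.5 would raise a `ValueError` in this sketch, mirroring the rule that ratios must be whole numbers.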

 

Duplicate detections

  • Duplicates are detected across the vault (all projects) and within the uploaded file.
  • Enantiomers are detected and registered as distinct molecules:
    • Pure Isomers (R, S)
    • Unknown stereochemistry (wavy bond)
    • Unspecified stereochemistry
  • Possible tautomers are detected based on InChIKey and a warning is provided. Such duplicates will be registered as distinct molecules if the file is committed.
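The duplicate check can be pictured as a lookup on a standardized structure key. The Python sketch below is only an assumption about how such a check might look, not CDD Vault's code: `find_duplicates`, the row format, and the report strings are hypothetical, and plain SMILES strings stand in for the standardized CXSMILES.

```python
def find_duplicates(file_rows, vault_keys):
    """Classify each uploaded row by its standardized structure key.

    file_rows:  list of (row_id, standardized_key) tuples from the upload.
    vault_keys: set of keys already registered anywhere in the vault.
    """
    first_seen = {}  # key -> row_id of its first occurrence in this file
    report = {}
    for row_id, key in file_rows:
        if key in vault_keys:
            report[row_id] = "duplicate of existing vault molecule"
        elif key in first_seen:
            report[row_id] = f"duplicate of row {first_seen[key]} in this file"
        else:
            first_seen[key] = row_id
            report[row_id] = "new"
    return report


rows = [
    (1, "C[C@H](N)C(=O)O"),   # one enantiomer of alanine
    (2, "C[C@@H](N)C(=O)O"),  # the other enantiomer: a different key
    (3, "CC(N)C(=O)O"),       # unspecified stereochemistry: also different
    (4, "C[C@H](N)C(=O)O"),   # repeats row 1
    (5, "c1ccccc1"),          # benzene, already in the vault
]
report = find_duplicates(rows, vault_keys={"c1ccccc1"})
```

Because the stereo descriptors are part of the key, rows 1 to 3 come out as distinct molecules in this sketch, while row 4 is flagged as a duplicate within the file and row 5 as a duplicate of a vault molecule.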

 

Synonyms and Structure conflict detection

  • Synonyms that already exist in the Vault for another structure will result in an error.
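A rough way to think about this rule, sketched in Python under the assumption that the vault's synonyms can be viewed as a mapping from synonym to structure key (the function name and data shapes are hypothetical):

```python
def conflicting_synonyms(row_synonyms, row_structure_key, vault_synonyms):
    """Return the synonyms in a row that already belong to a different structure.

    vault_synonyms: dict mapping each existing synonym to its structure key.
    """
    return [
        name
        for name in row_synonyms
        if name in vault_synonyms and vault_synonyms[name] != row_structure_key
    ]


vault_synonyms = {"aspirin": "KEY-A"}
# Same synonym, different structure: reported as a conflict.
conflicts = conflicting_synonyms(["aspirin", "ASA"], "KEY-B", vault_synonyms)
# Same synonym, same structure: no conflict.
no_conflicts = conflicting_synonyms(["aspirin", "ASA"], "KEY-A", vault_synonyms)
```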

 

Well conflict detection

  • A well that has already been assigned another molecule or another batch of the same molecule will result in an error.

 

Batch identification and error detection

  • Registration of a molecule that already exists in the Vault will result in a new Batch.
  • Any unique batch identifier that has been previously assigned to another structure/batch will result in an error.
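The two batch rules can be sketched as follows. This is an illustrative Python model only: the `vault` dictionary layout, the `register_batch` name, and the returned strings are assumptions made for this example, not CDD Vault's data model.

```python
def register_batch(vault, structure_key, batch_id):
    """Register a batch, applying the two rules above.

    vault: dict with
      "molecules": structure_key -> list of batch identifiers
      "batch_ids": set of every batch identifier already in use
    """
    if batch_id in vault["batch_ids"]:
        # The identifier already belongs to another structure or batch.
        raise ValueError(f"batch identifier {batch_id!r} is already assigned")
    is_new_molecule = structure_key not in vault["molecules"]
    vault["molecules"].setdefault(structure_key, []).append(batch_id)
    vault["batch_ids"].add(batch_id)
    return "new molecule" if is_new_molecule else "new batch of existing molecule"


vault = {"molecules": {}, "batch_ids": set()}
first = register_batch(vault, "KEY-A", "B-001")   # molecule not yet in the vault
second = register_batch(vault, "KEY-A", "B-002")  # existing molecule, new batch
```

Calling `register_batch(vault, "KEY-B", "B-001")` afterwards would raise an error in this sketch, because `B-001` is already taken.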