You are on step 3 of the import of a compound library. You have completed the initial import and are now reviewing the validation or QC import report, which shows noteworthy events, suspicious events, and errors. What do these events mean, how do you correct them, and should you now commit the import? In this article we focus on the main structure of the report and error events.
Existing Molecule Name and Structure Conflict
New Molecule Name and Structure Conflicts
Read these articles for noteworthy and suspicious events.
The third and last step of the import process provides a detailed report of the events that will occur if you choose to commit (complete) the import. Once data are committed to the database, changes are much harder to undo, so the report gives you an opportunity to cancel the entire import, or to reject individual event types within an otherwise acceptable file.
Import summary:
- In the top left corner of the report, you see the file name with a link to review the mapping from the previous import step, and to save a mapping template.
- In the top right corner of the report, you see the project name where this file is imported, and the file owner's name.
- The yellow panel shows the import progress and the status of the file: after the initial import is complete, it shows the total number of records (rows of a CSV file, or individual records of an SDF file) that will be imported and/or rejected when the import is finalized.
The following report sections contain a breakdown of the imported and rejected records by category. In this article, we focus on the most commonly encountered events found in the "Errors" category during molecule registration. The other categories are noteworthy events, and suspicious events.
Errors:
"Unrecoverable. Associated records will not be imported."
Events in this section cannot be imported because of a serious error or a lack of project permissions. Errors cannot be resolved during the import, and every record associated with an error is automatically rejected. For example, if the error involves a conflicting molecule synonym assigned to two distinct structures, both structures will be automatically rejected.
- Click on the arrow next to the event type to expand the section.
- Read the short description, and click on the "learn more" link for additional details about the meaning of this event.
- Scroll to the right of the file preview section to see each record's description of the event, which often contains the most specific details. The preview section only shows 10 representative rows/records.
- Download a complete file containing all rows/records under the event type. The last column/field of each record will contain a specific description.
- By default, each event will be rejected during the final import. To import these records, download the error report, correct the underlying problem, and import the corrected report again.
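When a downloaded error report is long, it can help to tally the descriptions in its last column before deciding what to fix first. The following is a minimal sketch, assuming the report was downloaded as a CSV file with a header row; the column names and description texts in the sample are hypothetical and are not CDD's actual wording.

```python
import csv
import io
from collections import Counter


def summarize_error_report(csv_text: str) -> Counter:
    """Count how often each description appears in the last column."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (assumed to be present)
    counts = Counter()
    for row in reader:
        if row:  # ignore blank lines
            counts[row[-1].strip()] += 1
    return counts


# Hypothetical report contents, for illustration only.
sample = """Name,SMILES,Event
CMPD-1,CCO,Name already used by a different structure
CMPD-2,CCN,Name already used by a different structure
CMPD-3,C1CC1,Structure not recognized
"""

for description, count in summarize_error_report(sample).most_common():
    print(f"{count:>4}  {description}")
```

To use this on a real download, read the file's text (for example with `open(path, newline="").read()`) and pass it to `summarize_error_report`. An SDF download would need a different reader.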
Existing Molecule Name and Structure Conflict
"Record rejected because a molecule with the same name or synonym and a different structure already exists"
"Molecule names must be unique to each structure. Double-check the name of the new molecule, and make sure the structure is the same as the existing molecule. For example, enantiomers, tautomers or salt forms are recognized as different structures."
The screenshot above shows that 8 existing molecule name and structure conflicts were found in the file. The database compares the incoming structures and all unique molecule identifiers (synonyms) with those that already exist in your vault. When an incoming identifier or structure conflicts with an existing identifier or structure, this error is produced. The record is rejected because a molecule with the same name or synonym and a different structure already exists: someone may have made a mistake in the structure or the name, or the name may refer to a different stereoisomer or tautomer of the same structure.
Here are the steps to investigate and resolve this error:
- Expand the event section and "Download all" records as a file.
- Copy the SMILES or MOL from the downloaded file and perform a structure search in CDD. You may be able to find the existing molecule.
- Copy the molecule name or synonym from the downloaded file and perform a keyword search in CDD. This is another way to find the conflicting molecule.
- Compare the names and structures you found with the downloaded file, and decide what corrections are needed.
- Import the corrected file back into CDD using the same mapping as the original file. This time you should see no error, but some other "Noteworthy" event instead.
New Molecule Name and Structure Conflicts
"Record rejected because multiple lines of the file had the same molecule name or synonym with different structures."
"Molecule names must be unique. Download the error report to see which lines are conflicting. Look for small inconsistencies in the structure, for example, enantiomers, tautomers, salt forms."
The screenshot above indicates that 4 molecules within the import file have conflicting names and structures. This error is produced when a molecule with the same name or synonym and a different structure is encountered on another row in the file: someone may have made a mistake in the structure or the name, or the name (synonym) may refer to a different stereoisomer or tautomer of the same structure. Download the error report to see which lines conflict with which.
Here are the steps to investigate and resolve this error:
- Expand the event section and "Download all" records as a file.
- Edit the file to update the molecule names or synonyms and structures.
- Import the corrected file back into CDD using the same mapping as the original file. This time you should see no error, but some other "Noteworthy" event instead.
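Conflicts of this kind can often be spotted before importing, by checking whether any name in the file appears with more than one structure. The sketch below makes several assumptions: the file is a CSV with columns named `Name` and `SMILES` (hypothetical names; use your own headers), names are compared exactly, and structures are compared as plain text. It does not reproduce how CDD compares structures: two different SMILES strings for the same molecule would be reported as a conflict here, and synonyms held in other columns are not checked.

```python
import csv
import io
from collections import defaultdict


def find_name_conflicts(csv_text, name_col="Name", structure_col="SMILES"):
    """Return {name: [structures]} for names used with more than one structure."""
    structures_by_name = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get(name_col) or "").strip()
        structure = (row.get(structure_col) or "").strip()
        if name and structure:
            structures_by_name[name].add(structure)
    # Keep only the names that map to two or more distinct structure strings.
    return {
        name: sorted(structures)
        for name, structures in structures_by_name.items()
        if len(structures) > 1
    }


# Hypothetical file contents: CMPD-1 is listed as both enantiomers of alanine.
sample = """Name,SMILES
CMPD-1,C[C@H](N)C(=O)O
CMPD-1,C[C@@H](N)C(=O)O
CMPD-2,CCO
"""

for name, structures in find_name_conflicts(sample).items():
    print(name, "is used for", len(structures), "different structure strings")
```

Any name the check reports is worth reviewing by hand before the import, since the cause may be a naming mistake, a drawing mistake, or simply two notations for one structure.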
Unrecognized structure
"Record rejected because the structure is either invalid, a polymer or an unrecognized mixture. "
"If this is a chemically valid structure, it is not recognized by CDD for one of the following reasons: the salt form is unrecognized, this is a mixture of at least three unique components, or it is a polymer. Contact support@collaborativedrug.com for assistance."
The screenshot above shows that there is one molecule with an invalid structure. This error can come up for any of several reasons, so a detailed review of each structure is needed to determine the problem. Here are some common issues and their resolutions:
- The structure is not a valid format, such as an invalid MOL file or incorrectly formed SMILES: update the structure so that it's valid before reimporting.
- The structure is a mixture of several components that are not salts or hydrates: register it as a mixture.
- The structure is a polymer: edit the structure to store a monomer or dimer, and add a note in a separate field.
- You can't tell what the problem is: contact support@collaborativedrug.com.
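For structures supplied as SMILES, a rough first check for the mixture case is to count the distinct fragments separated by `.`, which is how SMILES writes disconnected components. The sketch below is a text-level heuristic only, under the assumption that each fragment is written the same way every time it appears. It does not validate the SMILES, and it cannot tell which components CDD treats as recognized salts or hydrates.

```python
def unique_components(smiles: str) -> set:
    """Split a SMILES string on '.' into its distinct disconnected fragments."""
    return {part for part in smiles.strip().split(".") if part}


def looks_like_multicomponent(smiles: str, threshold: int = 3) -> bool:
    """Flag structures with at least `threshold` distinct fragments for review."""
    return len(unique_components(smiles)) >= threshold


# Illustrative inputs: ethanol, sodium acetate, and a three-component mixture.
for smiles in ["CCO", "CC(=O)[O-].[Na+]", "CCO.CCN.c1ccccc1"]:
    status = "review" if looks_like_multicomponent(smiles) else "ok"
    print(f"{smiles:<20} {len(unique_components(smiles))} component(s)  {status}")
```

A flagged record is only a candidate for review: whether to register it as a mixture still depends on inspecting the structure itself.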
Commit/Reject Data Import
Regardless of the accept/reject settings you may have updated above, you will need to finalize the import by pressing "commit data import", or cancel the entire import by pressing "reject data import".
Of course, it's up to you how you choose to address the issues that the import report brings up. The general rule of thumb is that if fewer than 20% of the rows/records in your file are problematic, reject only the problematic events and commit the file. If more than 20% are problematic, reject the entire file and start over! Remember that until you press the final "commit" button, no data have actually been inserted into the database, and the import is very easy to cancel. After the import is committed, molecules, batches and plates can be edited or deleted manually, but not in bulk. If you do need to edit a large number of records, please contact our support.
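The 20% rule of thumb is simple arithmetic on the totals shown in the report. Here is a minimal sketch, with a hypothetical function name and the assumption that a file at exactly 20% is treated like one above it (the rule itself does not say which way that case goes).

```python
def recommend_action(total_records: int, problem_records: int,
                     threshold: float = 0.20) -> str:
    """Apply the rule of thumb to the record counts from the import report."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    if not 0 <= problem_records <= total_records:
        raise ValueError("problem_records must be between 0 and total_records")
    fraction = problem_records / total_records
    if fraction < threshold:
        return "reject the problematic events and commit the file"
    # Assumption: a file at or above the threshold is rejected as a whole.
    return "reject the entire import and start over"


# 12 problematic records out of 200 is 6%; 60 out of 200 is 30%.
print(recommend_action(200, 12))
print(recommend_action(200, 60))
```

The threshold is a parameter because the 20% figure is a guideline, not a hard limit.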
Learn about Noteworthy events
Learn about Errors