Bulk import allows you to programmatically import data into CDD Vault. There are two ways to map the data in an import file into CDD Vault with this API call. One is to use an existing mapping template (mapping templates are created through the CDD Vault web interface). The other is to create field-header mappings directly in the POST slurps API call.
Once a file has been uploaded through the API, data from the import is committed immediately unless there are errors or warnings. By default, any import errors or warnings (Suspicious Events) will cause the import to be REJECTED. (Note: the autoreject parameter can be set to false so that the import instead remains active, allowing the user to log in to the Import Data tab and resolve the issue(s) within the CDD Vault web interface.)
Note: You can import either protocol data or register molecules/entities using the API call described below. Any file type supported via the Import Data wizard (sdf, csv, xlsx, and/or zip) can be used with this POST Slurps API call.
Step 1: Upload a Data File and Assign Mapping Parameters
POST /api/v1/vaults/<vault_id>/slurps
This call will initiate the import. To import data, both a data file and a JSON object are required and should be provided under the ‘file’ and ‘json’ parameters.
Parameters:
JSON |
project | required The name or id of a single project. Names should be passed as strings and ids must be passed as integers. |
autoreject | optional Designates whether unresolved ambiguous events, suspicious events, or errors cause the import to be automatically rejected (default behaviour) or left active in the Import Data tab for the user to resolve within the interface. |
ambiguous_events_resolution | optional Designates how to resolve ambiguous events, if any occur. |
suspicious_events_resolution | optional Designates how to resolve suspicious events, if any occur. |
ignore_errors | optional (false by default) If true, ignore any errors. If false, see the autoreject parameter for behavior. |
mapping_template | optional The name or id of a mapping template that matches the attached file. If not provided, a mapping template that matches the file will be used. If there is more than one matching template, an error will be raised. Parameters can also be included to provide a mapping template on-the-fly (see Notes on Mapping Template Syntax). This "virtual" mapping template can be temporary or saved for future use (see persist=true). |
runs | optional Either a single run detail object which will be applied to all new runs, or an array of run detail objects that will be applied to new runs in the same order as specified by the mapping template. |
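As an illustration, the parameters above can be assembled into the text of the 'json' form field with a small helper. This is only a sketch: build_slurp_json is a hypothetical name (not part of any CDD library), it covers just the project, autoreject, mapping_template and runs parameters, and it follows the examples in this article in passing autoreject as the string "true" or "false".

```python
import json


def build_slurp_json(project, mapping_template=None, autoreject=True, runs=None):
    """Return the JSON text for the 'json' form field of POST slurps.

    Hypothetical helper for illustration; parameter names follow the table above.
    """
    # Project names are passed as strings, project ids as integers.
    if isinstance(project, bool) or not isinstance(project, (str, int)):
        raise TypeError("project must be a name (str) or an id (int)")
    payload = {
        "project": project,
        # The examples in this article pass autoreject as "true"/"false" strings.
        "autoreject": "true" if autoreject else "false",
    }
    if mapping_template is not None:
        # Name or id of an existing template, or a dict describing one on the fly.
        payload["mapping_template"] = mapping_template
    if runs is not None:
        # A single run detail object, or a list applied to new runs in order.
        payload["runs"] = runs
    return json.dumps(payload)


example = build_slurp_json(
    34888,
    mapping_template="SomeMappingTemplate",
    runs=[{"run_date": "2021-12-25", "place": "SCRNLab", "person": "DrGina"}],
)
```

The resulting string is what would be sent under the 'json' parameter, alongside the data file under 'file'.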
Notes on Date/Time
The CDD Vault API accepts ISO 8601 date/time formats in any API call that allows a date-type parameter. For example, the full date and timestamp may be used in GET calls that support a date parameter. You may still simply provide a date-only parameter like "created_after=2020-05-20".
You may also specify a date + timestamp, like "created_after=2020-05-20 14:53:12", to indicate "20 May 2020 14:53:12 PDT" (PDT is based on the user's time zone setting). The timestamp portion can also include a UTC (Coordinated Universal Time) offset, like "created_after=2020-05-27T14:48:40-07:00", which indicates that the time specified is 7 hours behind UTC.
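If you generate these values in a script, the standard library can produce both forms. The sketch below uses Python's datetime module to build the date-only value and the value with a UTC offset from the example above, then URL-encodes the latter for use in a query string. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# 27 May 2020 14:48:40, seven hours behind UTC, as in the example above.
local_time = datetime(2020, 5, 27, 14, 48, 40, tzinfo=timezone(timedelta(hours=-7)))

date_only = local_time.date().isoformat()  # "2020-05-27"
with_offset = local_time.isoformat()       # "2020-05-27T14:48:40-07:00"

# Characters such as ":" (and "+" in positive offsets) are escaped for the URL.
query = urlencode({"created_after": with_offset})
```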
Notes on Mapping Template Syntax
A syntax is available for creating field-header mappings directly from the POST slurps API call. Simply pass JSON that includes details of how each field in your file should be mapped. These mappings can be submitted with each import and they can be saved as a mapping template for future imports.
The parameters required for a successful data import are included in JSON which is submitted via the POST slurps API call. In addition to the previously allowed criteria, you can also specify the type of import and the field-header pairs. By default, the mapping will only be used for the specific import but it can be saved as a mapping template with a name.
As an example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault. Note: the csv file that is also passed has 4 columns - SMILES, Vendor, Person and Date.
{
"project": "Internal Data",
"autoreject": "true",
"mapping_template": {
"registration_type": "CHEMICAL_STRUCTURE",
"header_mappings": [
{
"header": {"name": "SMILES", "position": 0},
"definition": { "id": 2, "type": "InternalFieldDefinition::MoleculeStructure" }
},
{
"header": {"name": "Vendor", "position": 1},
"definition": {"id": 40714,"type": "InternalFieldDefinition::BatchFieldDefinition"}
},
{
"header": {"name": "Person","position": 2},
"definition": {"id": 40724, "type":"InternalFieldDefinition::BatchFieldDefinition"}
},
{
"header": {"name": "Date","position": 3},
"definition": {"id": 40713,"type":"InternalFieldDefinition::BatchFieldDefinition"}
}
],
"mapping_options": {"slurp_type": "Register with structures"},
"persist": "true",
"name": "New Molecule Reg Template"
}
}
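When a file has many columns, the header_mappings array can be generated rather than typed by hand. The following is a minimal sketch that rebuilds the JSON above from a list of columns; header_mappings() is a hypothetical helper name, and the definition ids are the ones from this example, which are specific to the Vault it was written for.

```python
import json


def header_mappings(columns):
    """Build the header_mappings list from (name, definition_id, definition_type) tuples.

    The position of each column is its zero-based index in the list.
    """
    return [
        {
            "header": {"name": name, "position": position},
            "definition": {"id": definition_id, "type": definition_type},
        }
        for position, (name, definition_id, definition_type) in enumerate(columns)
    ]


# Columns and ids copied from the example above.
columns = [
    ("SMILES", 2, "InternalFieldDefinition::MoleculeStructure"),
    ("Vendor", 40714, "InternalFieldDefinition::BatchFieldDefinition"),
    ("Person", 40724, "InternalFieldDefinition::BatchFieldDefinition"),
    ("Date", 40713, "InternalFieldDefinition::BatchFieldDefinition"),
]

payload = {
    "project": "Internal Data",
    "autoreject": "true",
    "mapping_template": {
        "registration_type": "CHEMICAL_STRUCTURE",
        "header_mappings": header_mappings(columns),
        "mapping_options": {"slurp_type": "Register with structures"},
        "persist": "true",
        "name": "New Molecule Reg Template",
    },
}

# Text to send under the 'json' parameter of the POST slurps call.
json_field = json.dumps(payload)
```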
In this partial example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault without structures. This is useful to either (1) create new No-Struct Molecules or (2) add a new Batch to an existing Molecule using the Molecule Name. Note: the csv file that is also passed has the 1st column as the MoleculeName column. Of course, other Batch fields could be included and mapped.
{
"project": "Internal Data",
"autoreject": "true",
"mapping_template": {
"registration_type": "CHEMICAL_STRUCTURE",
"header_mappings": [
{
"header": {
"name": "MoleculeName",
"position": 0
},
"definition": {
"id": 3,
"type": "InternalFieldDefinition::MoleculeSynonym"
}
},
.
.
.
"mapping_options": {
"slurp_type": "Register without structures"
},
"persist": "true",
"name": "New Batch Reg Template"
}
}
As another example, the following JSON can be passed with the POST slurps API call to register a new Run of a Protocol. Note: the csv file that is also passed has 3 columns - Molecule-Batch ID, Inhibition and SEM.
{
"project": "Internal Data",
"autoreject": "true",
"mapping_template": {
"registration_type": "",
"header_mappings": [
{
"header": {"name": "Molecule-Batch ID","position": 0},
"definition": {"id": 65061,"type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
},
{
"header": {"name": "Inhibition","position": 1},
"definition": {"id": 283130,"type": "ReadoutDefinition"},
"run_grouping": 1
},
{
"header": {"name": "SEM","position": 2},
"definition": {"id": 283131,"type": "ReadoutDefinition"},
"run_grouping": 1
}
],
"mapping_options": {"slurp_type": "Add readouts"},
"persist": "true",
"name": "New Inh Import Template"
},
"runs":[{"run_date":"2022-04-29","place":"API", "person":"APIScientist"}]
}
As a final example, the following JSON can be passed with the POST slurps API call to create Plates, associate Batches with the Wells on the Plates and register a new Run of a Protocol. Note: the csv file that is also passed has 5 columns - MoleculeBatchID, Plate, Well, Conc and Raw.
{
"project": "Internal Data",
"autoreject": "true",
"mapping_template": {
"registration_type": "",
"header_mappings": [
{
"header": {"name": "MoleculeBatchID","position": 0},
"definition": {"id": 65061,"type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
},
{
"header": {"name": "Plate","position": 1},
"definition": {"id": 148, "type": "InternalFieldDefinition::PlateName"}
},
{
"header": {"name": "Well","position": 2},
"definition": {"id": 147, "type": "InternalFieldDefinition::WellLocation"}
},
{
"header": {"name": "Conc","position": 3},
"definition": {"id": 249536,"type": "ReadoutDefinition"},
"run_grouping": 1
},
{
"header": {"name": "Raw","position": 4},
"definition": {"id": 249540,"type": "ReadoutDefinition"},
"run_grouping": 1
}
],
"mapping_options": {"slurp_type": "Add readouts"},
"persist": "true",
"name": "New Plate DR Import Template"
},
"runs":[{"run_date":"2023-01-03","place":"API", "person":"APIScientist"}]
}
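Hand-written mapping JSON is easy to get subtly wrong, for instance with a mistyped key or a column position used twice. The sketch below is one possible pre-flight check to run before posting. check_header_mappings is a hypothetical helper, and the rules it applies (each header has a name and position, each definition has an id and type, positions are unique) are inferred from the examples in this article rather than taken from an official schema.

```python
def check_header_mappings(mapping_template):
    """Return a list of problems found in a mapping_template dict (empty if none)."""
    problems = []
    seen_positions = set()
    for index, mapping in enumerate(mapping_template.get("header_mappings", [])):
        header = mapping.get("header", {})
        definition = mapping.get("definition", {})
        for key in ("name", "position"):
            if key not in header:
                problems.append(f"mapping {index}: header is missing '{key}'")
        for key in ("id", "type"):
            if key not in definition:
                problems.append(f"mapping {index}: definition is missing '{key}'")
        position = header.get("position")
        if position is not None:
            if position in seen_positions:
                problems.append(f"mapping {index}: position {position} is used twice")
            seen_positions.add(position)
    return problems


# A template with a mistyped " type" key and a repeated position.
template = {
    "header_mappings": [
        {"header": {"name": "Plate", "position": 1},
         "definition": {"id": 148, "type": "InternalFieldDefinition::PlateName"}},
        {"header": {"name": "Well", "position": 1},
         "definition": {"id": 147, " type": "InternalFieldDefinition::WellLocation"}},
    ]
}
problems = check_header_mappings(template)
```

An empty list means the template passed these basic checks; it does not guarantee that the Vault will accept the mapping.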
Helpful Hints
- Remember, you can get the list of existing mapping templates using the GET mapping_templates API call.
- The GET mapping_templates/<template_id> API call provides all of the details of an existing mapping template, including the header mappings. Use this as guidance when creating your new mapping templates.
- The GET protocols/<protocol_id> API returns the internal IDs of each of the readout definitions within that Protocol - these correspond to the IDs used in the header_mappings (definition) section of the above JSON.
- The GET Fields API call is also helpful in determining how to dynamically map data fields in a POST Slurps API call.
This GET Fields API call will provide you with the “type” and “name” values of all fields within a Vault. The JSON returned by this API call is organized into the following sections of fields:
- Internal
- Batch
- Molecule
- Protocol
If you are creating dynamic mappings for data files being imported via the POST Slurps API call, this GET Fields API call will be useful in discovering the details of how each field in your data file should be mapped. For example, when using the header_mappings parameter in a POST Slurps API call, you need to include the ID and type of data field being mapped to and this new GET Fields API call provides these details.
Curl Example
curl -H "X-CDD-Token:$TOKEN" --form-string 'json={"project": "34888", "autoreject":"false", "mapping_template": "36708","runs":[{"run_date": "2017-05-26","place": "basement lab"}, {"run_date": "2017-05-28","person":"Lab assistant"}]}' -F 'file=@path/to/file.csv' 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'
Caveat when using curl:
Curl appears to have a bug when using large files where, if the JSON comes after the file, it can be cut off and repeated, leading to an "unexpected token" error response.
We recommend placing the JSON before the file to avoid this.
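The same part ordering can be kept when the request is assembled in a script. The sketch below builds (but does not send) a multipart request with only the Python standard library, writing the 'json' part before the 'file' part. build_slurp_request is a hypothetical helper name, and the generic application/octet-stream content type for the file part is an assumption.

```python
import json
import urllib.request
import uuid


def build_slurp_request(url, api_token, payload, filename, file_bytes):
    """Build (but do not send) a multipart POST with the 'json' part before 'file'."""
    boundary = uuid.uuid4().hex
    # JSON part first, as recommended above.
    json_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="json"\r\n'
        "\r\n"
        f"{json.dumps(payload)}\r\n"
    ).encode("utf-8")
    # File part second.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = json_part + file_header + file_bytes + closing
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-CDD-Token": api_token,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned request can then be sent with urllib.request.urlopen(request), reading the file bytes beforehand with open(path, "rb").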
Ruby REST Client Example
require 'rest-client'
RestClient.post 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps', {:file => File.new("path/to/file.csv"), :json => '{"project":"34888", "autoreject":"true", "mapping_template":"36708", "runs":{"run_date":"2010-10-26"}}'}, {:"X-CDD-Token" => '<API_TOKEN>'}
Python Requests Example
import json
import requests

api_token = '<API_TOKEN>'
url = 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'
headers = {'X-CDD-Token': api_token}
data = {
    'project': 34888,
    'runs': {'run_date': '2001-01-01', 'conditions': '33C'}
}
# Open the data file in binary mode and close it once the upload finishes
with open('file.csv', 'rb') as data_file:
    files = {'file': data_file}
    response = requests.post(url, headers=headers, data={'json': json.dumps(data)}, files=files)
R Example
Please visit the R scripting language page for a Bulk Import example script.
Response
{
"id": 845005082,
"class": "slurp",
"state": "queued_for_processing",
"api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082"
}
Notes on the AUTOREJECT parameter
The autoreject parameter is set to true by default so that imports initiated via the API will be REJECTED when Suspicious Events or Errors occur. This allows other file imports in the queue to continue processing.
POST /api/v1/vaults/<vault_id>/slurps
{
"project":"MyProject",
"mapping_template":"SomeMappingTemplate",
"autoreject":"true",
"runs":[
{"run_date":"2021-12-25","place":"SCRNLab", "person":"DrGina"}
]
}
Change the autoreject parameter to false to maintain the old behaviour where an import with Suspicious Events or Errors must be handled via the CDD Vault Import Data interface.
POST /api/v1/vaults/<vault_id>/slurps
{
"project":"MyProject",
"mapping_template":"SomeMappingTemplate",
"autoreject":"false",
"runs":[
{"run_date":"2021-12-25","place":"SCRNLab", "person":"DrGina"}
]
}
Step 2: Check Import Status
GET /api/v1/vaults/<vault_id>/slurps/<slurp_id>
Parameters:
show_events | Boolean. Include in the returned JSON the details for the import_warnings (ambiguous_events and suspicious_events) and import_errors. |
It is likely that Step #1 above will suffice for most bulk imports since the data is committed automatically if there are no errors or warnings. If Step #1 is not successful, the user will receive an email notification alerting them of any errors or warnings. However, programmatically checking the status of a bulk import is an option.
Once a file has been uploaded for import, you can check the import status using the ‘slurp’ id. A JSON representation of the slurp will be returned, including the slurp’s current state, number of records processed, and number of errors and warnings. If there are any errors or warnings, the response will also include a link to the import summary and a message letting you know to go and resolve them.
The “state” of a bulk import typically progresses from “mapping” to “committed” in the order listed below. The “mapping” state is completed by the time a non-error response is returned from creating a slurp.
All possible slurp states:
- mapping
- queued_for_processing
- processing
- processed
- queued_for_committing
- committing
- committed
- canceled
- rejected
- invalid
API bulk imports will automatically commit if there are no Errors or Suspicious Events. If the import generates Errors or Suspicious Events and autoreject is set to false, these will need to be resolved within the CDD Vault web interface; with the default autoreject setting, the import is rejected instead.
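A script that checks the status programmatically would typically poll until the import stops advancing. The following is a minimal sketch of such a loop. wait_for_slurp is a hypothetical helper, and treating committed, canceled, rejected and invalid as the states where polling should stop is an assumption based on the list above. The fetch_status argument is any function that performs the GET call and returns the parsed JSON, which keeps the loop independent of the HTTP library you use.

```python
import time

# ASSUMPTION: states after which the import no longer advances on its own.
FINAL_STATES = {"committed", "canceled", "rejected", "invalid"}


def wait_for_slurp(fetch_status, poll_seconds=5.0, max_polls=60):
    """Poll until the slurp reaches a final state or needs user attention.

    fetch_status is a callable returning the parsed JSON (a dict) of
    GET /api/v1/vaults/<vault_id>/slurps/<slurp_id>.
    Returns the last status seen, even if max_polls was reached first.
    """
    status = fetch_status()
    for _ in range(max_polls - 1):
        if status.get("state") in FINAL_STATES:
            break
        # A "processed" import with errors or warnings waits for a user to resolve them.
        if status.get("state") == "processed" and (
            status.get("import_errors") or status.get("import_warnings")
        ):
            break
        time.sleep(poll_seconds)
        status = fetch_status()
    return status
```

For example, with the requests library, fetch_status could be lambda: requests.get(api_url, headers=headers).json(), where api_url is the value returned in the Step 1 response.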
Example
Using Curl
curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082'
Response
{
"id": 845005082,
"class": "slurp",
"state": "committed",
"api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082",
"total_records": 1,
"records_processed": 1,
"records_committed": 1,
"import_warnings": 0,
"import_errors": 0
}
Response with Import Errors
{
"id": 845005095,
"class": "slurp",
"state": "processed",
"api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005095",
"total_records": 1,
"records_processed": 1,
"records_committed": 0,
"import_warnings": 0,
"import_errors": 1,
"message": "This slurp has generated import errors or warnings. Please use web application to resolve them.",
"web_url": "https://app.collaborativedrug.com/vaults/23/slurps/845005095"
}
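The two responses above can be told apart in code by looking at the state and the error and warning counts. The sketch below turns a parsed status response into a one-line summary; summarize_slurp is a hypothetical helper, and the field names are the ones shown in the responses above.

```python
def summarize_slurp(status):
    """Return a one-line summary of a parsed GET slurps/<slurp_id> response."""
    errors = status.get("import_errors", 0)
    warnings = status.get("import_warnings", 0)
    if status.get("state") == "committed":
        committed = status.get("records_committed", 0)
        total = status.get("total_records", 0)
        return f"committed {committed} of {total} records"
    if errors or warnings:
        # web_url is included when there are errors or warnings to resolve.
        link = status.get("web_url", "the Import Data tab")
        return f"{errors} error(s), {warnings} warning(s); resolve at {link}"
    return f"still in state '{status.get('state')}'"


# Values copied from the "Response with Import Errors" example above.
with_errors = {
    "id": 845005095,
    "state": "processed",
    "total_records": 1,
    "records_committed": 0,
    "import_warnings": 0,
    "import_errors": 1,
    "web_url": "https://app.collaborativedrug.com/vaults/23/slurps/845005095",
}
summary = summarize_slurp(with_errors)
```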
Step 3: View Protocol and Run Information for the Import
GET /api/v1/vaults/<vault_id>/protocols?slurp=<slurp_id>
Once an import has been committed, you can return additional JSON results that will expose the Protocol and Run(s) of data that were imported.
Example
Using Curl
curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/protocols?slurp=845005082'
Response
{
"count": {
"247571": 1
},
"offset": 0,
"page_size": 50,
"objects": [
{
"id": 28889,
"class": "protocol",
"created_at": "2017-04-18T16:16:51.000Z",
"modified_at": "2018-11-26T18:34:27.000Z",
"name": "Inhibition",
"category": "Enzyme",
"data_set": {
"name": "Sandbox",
"id": 23
},
"readout_definitions": [
{
"id": 320811,
"class": "readout definition",
"created_at": "2017-04-18T16:18:09.000Z",
"modified_at": "2017-04-18T16:18:09.000Z",
"name": "Inhibition",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"description": "Inhibition Results",
"protocol_condition": false
},
{
"id": 320812,
"class": "readout definition",
"created_at": "2017-04-18T16:18:46.000Z",
"modified_at": "2017-04-18T16:18:46.000Z",
"name": "SEM",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"description": "Standard Error of Mean",
"protocol_condition": false
},
{
"id": 320813,
"class": "readout definition",
"created_at": "2017-04-18T16:21:45.000Z",
"modified_at": "2017-04-18T16:21:45.000Z",
"name": "Average Inhibition",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"aggregation": "batch",
"description": "Average Inhibition",
"protocol_condition": false,
"calculation": 7369
},
{
"id": 320821,
"class": "readout definition",
"created_at": "2017-04-18T16:23:45.000Z",
"modified_at": "2017-04-18T16:23:45.000Z",
"name": "Average SEM",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"aggregation": "batch",
"protocol_condition": false,
"calculation": 7370
}
],
"calculations": [
{
"id": 7369,
"class": "custom calculation",
"created_at": "2017-04-18T16:21:45.000Z",
"modified_at": "2017-10-18T20:41:31.000Z",
"inputs": {
"input_readout_definitions": [
320811
]
},
"outputs": {
"output_readout_definition": 320813
},
"formula": "average([320811])",
"aggregate_readouts_by": "batch and run"
},
{
"id": 8986,
"class": "custom calculation",
"created_at": "2017-10-18T19:29:42.000Z",
"modified_at": "2017-10-18T20:41:32.000Z",
"inputs": {
"input_readout_definitions": [
320813
]
},
"outputs": {
"output_readout_definition": 346952
},
"formula": "[320813]*1"
},
{
"id": 7370,
"class": "custom calculation",
"created_at": "2017-04-18T16:23:45.000Z",
"modified_at": "2017-10-18T20:41:38.000Z",
"inputs": {
"input_readout_definitions": [
320812
]
},
"outputs": {
"output_readout_definition": 320821
},
"formula": "average([320812])",
"aggregate_readouts_by": "batch and run"
}
],
"hit_definitions": [
{
"id": 1888,
"class": "hit definition",
"created_at": null,
"modified_at": null,
"readout_definition_id": 320813,
"operator": ">=",
"value": 30,
"color": "green"
}
],
"runs": [
{
"id": 247571,
"class": "run",
"created_at": "2018-11-26T18:34:27.000Z",
"modified_at": "2018-11-26T18:34:27.000Z",
"run_date": "2018-11-26",
"person": "User ABC"
}
],
"projects": [
{
"name": "Data",
"id": 6847
}
],
"owner": "User ABC"
}
]
}
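In practice, only a few values from this response are usually needed, such as which Protocol and Run(s) the import touched. The sketch below pulls those out of the parsed JSON. imported_runs is a hypothetical helper name, and the sample dictionary is a trimmed copy of the response above.

```python
def imported_runs(protocols_response):
    """Return (protocol name, run id, run date) tuples from the Step 3 response."""
    rows = []
    for protocol in protocols_response.get("objects", []):
        for run in protocol.get("runs", []):
            rows.append((protocol.get("name"), run.get("id"), run.get("run_date")))
    return rows


# Trimmed copy of the response above.
response = {
    "objects": [
        {
            "id": 28889,
            "class": "protocol",
            "name": "Inhibition",
            "runs": [
                {"id": 247571, "class": "run", "run_date": "2018-11-26", "person": "User ABC"}
            ],
        }
    ]
}
rows = imported_runs(response)  # [("Inhibition", 247571, "2018-11-26")]
```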
Step 4: Resolve Any Errors via the CDD Vault Import Data Tab
Log in to CDD Vault, click the Import Data tab and locate the Report link for your API import file (or simply use the web_url from the Import Status JSON). Review the details and either Commit or Reject the import. For information on handling Errors and Suspicious Events, please see the Import Validation Knowledge Base article.