Slurps [GET, POST] - i.e. Bulk Import of Data via Files

Bulk import lets you programmatically import data into CDD Vault. There are two ways to map the data in an import file with this API call. One option is to use an existing mapping template. (Mapping templates are created through the CDD Vault web interface.) The other option is to create field-header mappings directly in the POST slurps API call.

Once a file has been uploaded through the API, data from the import is committed immediately unless there are errors or warnings. By default, any import errors or warnings (Suspicious Events) will cause the import to be REJECTED. (Note: setting the autoreject parameter to false will instead leave the import active so that the user can log in to the Import Data tab and resolve the issue(s) within the CDD Vault web interface.)

 Note: You can import either protocol data or register molecules/entities using the API call described below. Any file type supported via the Import Data wizard (sdf, csv, xlsx, and/or zip) can be used with this POST Slurps API call.
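
As a minimal sketch, a script could pre-check the file name against the file types listed above before uploading. The helper name is hypothetical, and the check looks at the extension only; the validation performed by CDD Vault remains authoritative.

```python
from pathlib import Path

# Extensions taken from the note above (sdf, csv, xlsx, zip).
SUPPORTED_EXTENSIONS = {".sdf", ".csv", ".xlsx", ".zip"}


def is_supported_import_file(path):
    """Return True if the file name ends in one of the listed extensions.

    Hypothetical client-side helper: it inspects the name only, not the content.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_import_file("plates/run_01.CSV"))  # True
print(is_supported_import_file("notes.txt"))          # False
```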

 

Step 1: Upload a Data File and Assign Mapping Parameters

POST /api/v1/vaults/<vault_id>/slurps

This call will initiate the import. To import data, both a data file and a JSON object are required and should be provided under the ‘file’ and ‘json’ parameters.

 

Parameters:

JSON  
project required The name or id of a single project. Names should be passed as strings and ids must be passed as integers.
autoreject optional Designate if unresolved ambiguous events, suspicious events, or errors will cause the import to be automatically rejected (default behaviour) or be left active in the Import Data tab for the user to resolve within the interface.
ambiguous_events_resolution optional Designate how to resolve ambiguous events, if any occur.

  • none (default) - Do not resolve ambiguous events. See autoreject parameter for behavior.
  • reject - Automatically reject all ambiguous events.
  • new_molecule - Automatically create new molecules for all ambiguous events.
  • new_batch - Automatically create new batches on the first matching molecule for all ambiguous events.

suspicious_events_resolution optional Designate how to resolve suspicious events, if any occur.

  • none (default) - Do not resolve suspicious events. See autoreject parameter for behavior.
  • reject - Automatically reject all suspicious events.
  • accept - Automatically accept all suspicious events.

ignore_errors optional (false by default) If true, ignore any errors. If false, see autoreject parameter for behavior.

mapping_template optional The name or id of a mapping template that matches the attached file. If not provided, a mapping template that matches the file will be used. If there is more than one matching template, an error will be raised. The following parameters can be included to provide a mapping template on-the-fly. This "virtual" mapping template can be temporary or saved for future use (see persist=true).
Mapping Template details
name required when persist = true
Unique name of the mapping template.
registration_type required, except when slurp_type is Add readouts, in which case it should be left blank
When registering or updating molecules/entities, you must include (in all caps): CHEMICAL_STRUCTURE, MIXTURE, NUCLEOTIDE, AMINO_ACID, or OTHER.
mapping_options {“slurp_type”} required
Register with structures, Register without structures, Update, Add or update, Add readouts
persist optional (set to false by default)
Boolean: true or false.
header_mappings required
Details on how the columns/fields in the data file map into the Vault.

 

header

  • name (name of the column/field)
  • position (position within the file, beginning with 0. For example column A in a csv file would be represented by a position value of 0.)
 

definition

  • id (internal id of destination field/readout)
  • type (optional for Molecule/Batch fields, required for Protocol readouts) - available options include:
    •  "InternalFieldDefinition::MoleculeStructure"

    • "InternalFieldDefinition::MoleculeBatchIdentifier"

    • "InternalFieldDefinition::MoleculeFieldDefinition"

    • "InternalFieldDefinition::BatchFieldDefinition"

    • "ReadoutDefinition"

 

run_grouping (required for readout definitions) - an arbitrary number that identifies the columns/fields that will be imported into the same run. Should not be used in any InternalFieldDefinition sections.

runs optional Either a single run detail object which will be applied to all new runs, or an array of run detail objects that will be applied to new runs in the same order as specified by the mapping template.
Run details
run_date optional
Use YYYY-MM-DDThh:mm:ss±hh:mm. Default is today’s date.
place optional
This field is called “Lab” within the CDD Vault web interface. No default value provided.
person optional
Default is user’s full name.
conditions optional
No default value provided.
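
The parameters above can be assembled and sanity-checked before the request is sent. The following is a minimal sketch, not part of the CDD Vault API. The helper name and its keyword arguments are hypothetical, and autoreject is serialized as the strings "true"/"false" only because the examples in this article pass it that way.

```python
import json

# Allowed values copied from the parameter descriptions above.
AMBIGUOUS_RESOLUTIONS = {"none", "reject", "new_molecule", "new_batch"}
SUSPICIOUS_RESOLUTIONS = {"none", "reject", "accept"}


def build_slurp_json(project, mapping_template=None, autoreject=True,
                     ambiguous="none", suspicious="none", runs=None):
    """Assemble the 'json' form field for POST slurps (hypothetical helper)."""
    # Names are passed as strings and ids as integers; bool is a subclass of
    # int in Python, so it is excluded explicitly.
    if isinstance(project, bool) or not isinstance(project, (str, int)):
        raise TypeError("project must be a name (str) or an id (int)")
    if ambiguous not in AMBIGUOUS_RESOLUTIONS:
        raise ValueError(f"unknown ambiguous_events_resolution: {ambiguous!r}")
    if suspicious not in SUSPICIOUS_RESOLUTIONS:
        raise ValueError(f"unknown suspicious_events_resolution: {suspicious!r}")

    payload = {
        "project": project,
        # Assumption: string form, mirroring the examples in this article.
        "autoreject": "true" if autoreject else "false",
        "ambiguous_events_resolution": ambiguous,
        "suspicious_events_resolution": suspicious,
    }
    if mapping_template is not None:
        payload["mapping_template"] = mapping_template
    if runs is not None:
        # Either one run detail object or a list of them.
        payload["runs"] = runs
    return json.dumps(payload)


body = build_slurp_json(34888, mapping_template=36708, autoreject=False,
                        runs=[{"run_date": "2017-05-26", "place": "basement lab"}])
```

The resulting string is what would be sent under the 'json' parameter, alongside the data file under 'file'.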

 

Notes on Date/Time 

The CDD Vault API accepts ISO 8601 date/time formats in any API call that allows a date-type parameter. For example, the full date and timestamp may be used in GET calls that support a date parameter. You may still simply provide a date-only parameter like "created_after=2020-05-20".

You may also specify a date + timestamp, like "created_after=2020-05-20 14:53:12", to indicate "20 May 2020 14:53:12 PDT" (the time zone, PDT in this example, is based on the user's time zone setting). The timestamp portion can also include a UTC (Coordinated Universal Time) offset, like "created_after=2020-05-27T14:48:40-07:00", which indicates that the time specified is 7 hours behind UTC.
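
A timestamp with a UTC offset in the format described above can be produced with Python's standard library. This is a minimal sketch. The helper name is hypothetical, and the query string is percent-encoded because a literal "+" in an offset may otherwise be decoded as a space.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def iso8601_with_offset(dt):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss±hh:mm."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.isoformat(timespec="seconds")


# Fixed -07:00 offset, matching the example in the note above.
minus_seven = timezone(timedelta(hours=-7))
stamp = iso8601_with_offset(datetime(2020, 5, 27, 14, 48, 40, tzinfo=minus_seven))
print(stamp)  # 2020-05-27T14:48:40-07:00

# Percent-encode the value before appending it to a GET URL.
query = urlencode({"created_after": stamp})
print(query)  # created_after=2020-05-27T14%3A48%3A40-07%3A00
```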

 

Notes on Mapping Template Syntax

 A syntax is available for creating field-header mappings directly from the POST slurps API call.  Simply pass JSON that includes details of how each field in your file should be mapped. These mappings can be submitted with each import and they can be saved as a mapping template for future imports.

The parameters required for a successful data import are included in JSON which is submitted via the POST slurps API call. In addition to the previously allowed criteria, you can also specify the type of import and the field-header pairs. By default, the mapping will only be used for the specific import but it can be saved as a mapping template with a name.

As an example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault.  Note: the csv file that is also passed has 4 columns - SMILES, Vendor, Person and Date.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_type": "CHEMICAL_STRUCTURE",
    "header_mappings": [
      {
        "header": {"name": "SMILES", "position": 0},
        "definition": { "id": 2, "type": "InternalFieldDefinition::MoleculeStructure" }
      },
      {
        "header": {"name": "Vendor", "position": 1},
        "definition": {"id": 40714,"type": "InternalFieldDefinition::BatchFieldDefinition"}
      },
      {
        "header": {"name": "Person","position": 2},
        "definition": {"id": 40724, "type":"InternalFieldDefinition::BatchFieldDefinition"}
      },
      {
        "header": {"name": "Date","position": 3},
        "definition": {"id": 40713,"type":"InternalFieldDefinition::BatchFieldDefinition"}
      }
    ],
    "mapping_options": {"slurp_type": "Register with structures"},
    "persist": "true",
    "name": "New Molecule Reg Template"
  }
}

In this partial example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault without structures. This is useful to either (1) create new No-Struct Molecules or (2) add a new Batch to an existing Molecule using the Molecule Name. Note: the csv file that is also passed has the 1st column as the MoleculeName column. Of course, other Batch fields could be included and mapped.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_type": "CHEMICAL_STRUCTURE",
    "header_mappings": [
      {
        "header": {"name": "MoleculeName", "position": 0},
        "definition": {"id": 3, "type": "InternalFieldDefinition::MoleculeSynonym"}
      },
      .
      .
      .
    ],
    "mapping_options": {"slurp_type": "Register without structures"},
    "persist": "true",
    "name": "New Batch Reg Template"
  }
}

 

As another example, the following JSON can be passed with the POST slurps API call to register a new Run of a Protocol.  Note: the csv file that is also passed has 3 columns - Molecule-Batch ID, Inhibition and SEM.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_type": "",
    "header_mappings": [
      {
        "header": {"name": "Molecule-Batch ID","position": 0},
        "definition": {"id": 65061,"type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
      },
      {
        "header": {"name": "Inhibition","position": 1},
        "definition": {"id": 283130,"type": "ReadoutDefinition"},
        "run_grouping": 1
      },
      {
        "header": {"name": "SEM","position": 2},
        "definition": {"id": 283131,"type": "ReadoutDefinition"},
        "run_grouping": 1
      }
    ],
    "mapping_options": {"slurp_type": "Add readouts"},
    "persist": "true",
    "name": "New Inh Import Template"
  },
 "runs":[{"run_date":"2022-04-29","place":"API", "person":"APIScientist"}]
}
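
The header_mappings array in the JSON above can also be generated from a list of columns instead of being written by hand. This is a minimal sketch. The helper name is hypothetical, the ids are the sample values from the example above, and it assumes every readout column in the file belongs to the same run, so all of them receive the same run_grouping.

```python
READOUT_TYPE = "ReadoutDefinition"


def build_header_mappings(columns, run_grouping=1):
    """Build header_mappings from (header name, definition id, definition type).

    Columns must be listed in file order; positions start at 0.
    """
    mappings = []
    for position, (name, definition_id, definition_type) in enumerate(columns):
        entry = {
            "header": {"name": name, "position": position},
            "definition": {"id": definition_id, "type": definition_type},
        }
        # run_grouping applies to readout definitions only, never to
        # InternalFieldDefinition entries.
        if definition_type == READOUT_TYPE:
            entry["run_grouping"] = run_grouping
        mappings.append(entry)
    return mappings


columns = [
    ("Molecule-Batch ID", 65061, "InternalFieldDefinition::MoleculeBatchIdentifier"),
    ("Inhibition", 283130, "ReadoutDefinition"),
    ("SEM", 283131, "ReadoutDefinition"),
]
mappings = build_header_mappings(columns)
```

The returned list can be placed under "header_mappings" inside the "mapping_template" object.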

 

As a final example, the following JSON can be passed with the POST slurps API call to create Plates, associate Batches with the Wells on the Plates and register a new Run of a Protocol.  Note: the csv file that is also passed has 5 columns - MoleculeBatchID, Plate, Well, Conc and Raw.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_type": "",
    "header_mappings": [
      {
        "header": {"name": "MoleculeBatchID", "position": 0},
        "definition": {"id": 65061, "type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
      },
      {
        "header": {"name": "Plate", "position": 1},
        "definition": {"id": 148, "type": "InternalFieldDefinition::PlateName"}
      },
      {
        "header": {"name": "Well", "position": 2},
        "definition": {"id": 147, "type": "InternalFieldDefinition::WellLocation"}
      },
      {
        "header": {"name": "Conc", "position": 3},
        "definition": {"id": 249536, "type": "ReadoutDefinition"},
        "run_grouping": 1
      },
      {
        "header": {"name": "Raw", "position": 4},
        "definition": {"id": 249540, "type": "ReadoutDefinition"},
        "run_grouping": 1
      }
    ],
    "mapping_options": {"slurp_type": "Add readouts"},
    "persist": "true",
    "name": "New Plate DR Import Template"
  },
  "runs": [{"run_date": "2023-01-03", "place": "API", "person": "APIScientist"}]
}

Helpful Hints

  • Remember, you can get the list of existing mapping templates using the GET mapping_templates API call.
  • The GET mapping_templates/<template_id> API call provides all of the details of an existing mapping template, including the header mappings. Use this as guidance when creating your new mapping templates.
  • The GET protocols/<protocol_id> API returns the internal IDs of each of the readout definitions within that Protocol - these correspond to the IDs used in the header_mappings (definition) section of the above JSON.
  • The GET Fields API call is also helpful in determining how to dynamically map data fields in a POST Slurps API call.

 

This GET Fields API call will provide you with the “type” and “name” values of all fields within a Vault. The JSON returned by this API call is organized into the following sections of fields:

  • Internal
  • Batch
  • Molecule
  • Protocol

If you are creating dynamic mappings for data files being imported via the POST Slurps API call, this GET Fields API call will be useful in discovering the details of how each field in your data file should be mapped. For example, when using the header_mappings parameter in a POST Slurps API call, you need to include the ID and type of data field being mapped to and this new GET Fields API call provides these details.
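
As a sketch of the GET protocols/<protocol_id> hint above, readout ids can be looked up by name and turned into header mappings. The helper name is hypothetical. The sample input is a hand-trimmed protocol object that keeps only the "readout_definitions", "id" and "name" keys, following the protocol JSON shown in the Step 3 response of this article.

```python
def readout_ids_by_name(protocol):
    """Map readout definition names to their internal ids."""
    return {rd["name"]: rd["id"] for rd in protocol.get("readout_definitions", [])}


# Trimmed sample, not a live API response.
sample_protocol = {
    "id": 28889,
    "name": "Inhibition",
    "readout_definitions": [
        {"id": 320811, "name": "Inhibition"},
        {"id": 320812, "name": "SEM"},
    ],
}

ids = readout_ids_by_name(sample_protocol)

# Headers of the data file, in column order. Only columns whose header matches
# a readout name are mapped here; identifier columns need their own mapping.
file_headers = ["Molecule-Batch ID", "Inhibition", "SEM"]
readout_mappings = [
    {
        "header": {"name": header, "position": position},
        "definition": {"id": ids[header], "type": "ReadoutDefinition"},
        "run_grouping": 1,
    }
    for position, header in enumerate(file_headers)
    if header in ids
]
```

Matching on the header text assumes the file's column names equal the readout names exactly; any other naming scheme would need an explicit lookup table.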

 

Curl Example

 curl -H "X-CDD-Token:$TOKEN" --form-string 'json={"project": "34888", "autoreject":"false", "mapping_template": "36708","runs":[{"run_date": "2017-05-26","place": "basement lab"}, {"run_date": "2017-05-28","person":"Lab assistant"}]}' -F 'file=@path/to/file.csv' 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'

 Caveat when using curl:

Curl appears to have a bug with large files: if the JSON comes after the file, it can be cut off and repeated, leading to an "unexpected token" error response.

We recommend placing the JSON before the file to avoid this.

 

Ruby REST Client Example

require 'rest-client'
RestClient.post 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps', {:file => File.new("path/to/file.csv"), :json => '{"project":"34888", "autoreject":"true", "mapping_template":"36708", "runs":{"run_date":"2010-10-26"}}'}, {:"X-CDD-Token" => '<API_TOKEN>'}

 

Python Requests Example

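import json
import requests

api_token = '<API_TOKEN>'  # replace with your API token
url = 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'
headers = {"X-CDD-Token": api_token}
# 'project' is passed as an integer id; 'runs' is a single run detail object
data = {
    'project': 34888,
    'runs': {'run_date': '2001-01-01', 'conditions': '33C'}
}
# Open the data file in binary mode and send it together with the JSON
with open('file.csv', 'rb') as data_file:
    response = requests.post(url, headers=headers, data={'json': json.dumps(data)}, files={'file': data_file})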

 

R Example

Please visit the R scripting language page for a Bulk Import example script.

Response

{
"id": 845005082,
"class": "slurp",
"state": "queued_for_processing",
"api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082"
}

 

Notes on the AUTOREJECT parameter 

The autoreject parameter is set to true by default so that imports initiated via the API will be REJECTED when Suspicious Events or Errors occur. This allows other file imports in the queue to continue processing.

POST /api/v1/vaults/<vault_id>/slurps

 

{
  "project": "MyProject",
  "mapping_template": "SomeMappingTemplate",
  "autoreject": "true",
  "runs": [
    {"run_date": "2021-12-25", "place": "SCRNLab", "person": "DrGina"}
  ]
}

 


 

Change the autoreject parameter to false to maintain the old behaviour where an import with Suspicious Events or Errors must be handled via the CDD Vault Import Data interface.

 

POST /api/v1/vaults/<vault_id>/slurps

 

{
  "project": "MyProject",
  "mapping_template": "SomeMappingTemplate",
  "autoreject": "false",
  "runs": [
    {"run_date": "2021-12-25", "place": "SCRNLab", "person": "DrGina"}
  ]
}


 

 

Step 2: Check Import Status

GET /api/v1/vaults/<vault_id>/slurps/<slurp_id>

show_events Boolean. Include in the returned JSON the details for the import_warnings (ambiguous_events and suspicious_events) and import_errors.

 It is likely that Step #1 above will suffice for most bulk imports since the data is committed automatically if there are no errors or warnings. If Step #1 is not successful, the user will receive an email notification alerting them of any errors or warnings. However, programmatically checking the status of a bulk import is an option.

 Once a file has been uploaded for import, you can check the import status using the ‘slurp’ id. A JSON representation of the slurp will be returned, including the slurp’s current state, number of records processed, and number of errors and warnings. If there are any errors or warnings, the response will also include a link to the import summary and a message letting you know to go and resolve them.

The “state” of a bulk import typically progresses from “mapping” to “committed” in the order listed below. The “mapping” state is completed by the time a non-error response is returned from creating a slurp.

All possible slurp states:

  • mapping
  • queued_for_processing
  • processing
  • processed
  • queued_for_committing
  • committing
  • committed
  • canceled
  • rejected
  • invalid

API bulk imports will automatically commit if there are no Errors or Suspicious Events. If the import generates Errors or Suspicious Events and autoreject is set to false, these will need to be resolved within the CDD Vault web interface; with the default autoreject setting, the import is rejected instead.

 

Example

Using Curl

curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082'

Response

{
"id": 845005082,
"class": "slurp",
"state": "committed",
"api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082",
"total_records": 1,
"records_processed": 1,
"records_committed": 1,
"import_warnings": 0,
"import_errors": 0
}

Response with Import Errors

{
  "id": 845005095,
  "class": "slurp",
  "state": "processed",
  "api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005095",
  "total_records": 1,
  "records_processed": 1,
  "records_committed": 0,
  "import_warnings": 0,
  "import_errors": 1,
  "message": "This slurp has generated import errors or warnings. Please use web application to resolve them.",
  "web_url": "https://app.collaborativedrug.com/vaults/23/slurps/845005095"
}
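
Checking the status programmatically usually means polling until the import stops moving. The loop below is a minimal sketch under stated assumptions. The function name is hypothetical. Treating "committed", "canceled", "rejected" and "invalid" as final states is an inference from the state list above. The "message" key is used as the signal that an import is waiting for manual resolution because it appears in the sample response with import errors. The status fetcher is passed in as a callable so the loop works with any HTTP client.

```python
import time

# Assumption: no further state change is expected after these states.
FINAL_STATES = {"committed", "canceled", "rejected", "invalid"}


def wait_for_slurp(fetch_status, max_polls=60, poll_seconds=5.0, sleep=time.sleep):
    """Poll until the slurp is final or needs manual resolution.

    fetch_status is any zero-argument callable returning the decoded JSON of
    GET /api/v1/vaults/<vault_id>/slurps/<slurp_id>.
    """
    status = {}
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("state") in FINAL_STATES:
            return status
        # An import left active can stay at "processed" until a user resolves
        # its errors or warnings in the web interface.
        if status.get("state") == "processed" and "message" in status:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"slurp still in state {status.get('state')!r}")


# Usage with canned responses instead of a live Vault.
responses = iter([
    {"state": "queued_for_processing"},
    {"state": "processing"},
    {"state": "committed", "records_committed": 1},
])
final = wait_for_slurp(lambda: next(responses), sleep=lambda seconds: None)
print(final["state"])  # committed
```

In a real script, fetch_status would wrap an authenticated GET request and return the parsed JSON body.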

 

 

Step 3: View Protocol and Run Information for the Import

GET /api/v1/vaults/<vault_id>/protocols?slurp=<slurp_id>

Once an import has been committed, you can return additional JSON results that will expose the Protocol and Run(s) of data that were imported.

 

Example

Using Curl

curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/protocols?slurp=845005082'

Response

{
"count": {
"247571": 1
},
"offset": 0,
"page_size": 50,
"objects": [
{
"id": 28889,
"class": "protocol",
"created_at": "2017-04-18T16:16:51.000Z",
"modified_at": "2018-11-26T18:34:27.000Z",
"name": "Inhibition",
"category": "Enzyme",
"data_set": {
"name": "Sandbox",
"id": 23
},
"readout_definitions": [
{
"id": 320811,
"class": "readout definition",
"created_at": "2017-04-18T16:18:09.000Z",
"modified_at": "2017-04-18T16:18:09.000Z",
"name": "Inhibition",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"description": "Inhibition Results",
"protocol_condition": false
},
{
"id": 320812,
"class": "readout definition",
"created_at": "2017-04-18T16:18:46.000Z",
"modified_at": "2017-04-18T16:18:46.000Z",
"name": "SEM",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"description": "Standard Error of Mean",
"protocol_condition": false
},
{
"id": 320813,
"class": "readout definition",
"created_at": "2017-04-18T16:21:45.000Z",
"modified_at": "2017-04-18T16:21:45.000Z",
"name": "Average Inhibition",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"aggregation": "batch",
"description": "Average Inhibition",
"protocol_condition": false,
"calculation": 7369
},
{
"id": 320821,
"class": "readout definition",
"created_at": "2017-04-18T16:23:45.000Z",
"modified_at": "2017-04-18T16:23:45.000Z",
"name": "Average SEM",
"unit_label": "%",
"data_type": "Number",
"precision_type": "significant figure",
"precision_number": 3,
"aggregation": "batch",
"protocol_condition": false,
"calculation": 7370
}
],
"calculations": [
{
"id": 7369,
"class": "custom calculation",
"created_at": "2017-04-18T16:21:45.000Z",
"modified_at": "2017-10-18T20:41:31.000Z",
"inputs": {
"input_readout_definitions": [
320811
]
},
"outputs": {
"output_readout_definition": 320813
},
"formula": "average([320811])",
"aggregate_readouts_by": "batch and run"
},
{
"id": 8986,
"class": "custom calculation",
"created_at": "2017-10-18T19:29:42.000Z",
"modified_at": "2017-10-18T20:41:32.000Z",
"inputs": {
"input_readout_definitions": [
320813
]
},
"outputs": {
"output_readout_definition": 346952
},
"formula": "[320813]*1"
},
{
"id": 7370,
"class": "custom calculation",
"created_at": "2017-04-18T16:23:45.000Z",
"modified_at": "2017-10-18T20:41:38.000Z",
"inputs": {
"input_readout_definitions": [
320812
]
},
"outputs": {
"output_readout_definition": 320821
},
"formula": "average([320812])",
"aggregate_readouts_by": "batch and run"
}
],
"hit_definitions": [
{
"id": 1888,
"class": "hit definition",
"created_at": null,
"modified_at": null,
"readout_definition_id": 320813,
"operator": ">=",
"value": 30,
"color": "green"
}
],
"runs": [
{
"id": 247571,
"class": "run",
"created_at": "2018-11-26T18:34:27.000Z",
"modified_at": "2018-11-26T18:34:27.000Z",
"run_date": "2018-11-26",
"person": "User ABC"
}
],
"projects": [
{
"name": "Data",
"id": 6847
}
],
"owner": "User ABC"
}
]
}
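
A response like the one above can be reduced to the Protocol and Run details that matter for a given import. This is a minimal sketch. The helper name is hypothetical, and the sample input is a hand-trimmed copy of the response above that keeps only the keys the helper reads.

```python
def imported_runs(protocols_response):
    """Return (protocol name, run id, run date) for every run in the response."""
    rows = []
    for protocol in protocols_response.get("objects", []):
        for run in protocol.get("runs", []):
            rows.append((protocol["name"], run["id"], run.get("run_date")))
    return rows


# Trimmed sample, not a live API response.
sample_response = {
    "objects": [
        {
            "id": 28889,
            "name": "Inhibition",
            "runs": [
                {"id": 247571, "run_date": "2018-11-26", "person": "User ABC"}
            ],
        }
    ]
}

print(imported_runs(sample_response))  # [('Inhibition', 247571, '2018-11-26')]
```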

 

 

Step 4: Resolve Any Errors via the CDD Vault Import Data Tab

Log in to CDD Vault, click the Import Data tab and locate the Report link for your API import file (or simply use the web_url from the Import Status JSON). Review the details and either Commit or Reject the import. For information on handling Errors and Suspicious Events, please see the Import Validation Knowledge Base article.