Slurps [GET, POST] - i.e. Bulk Import of Data via Files – CDD Support

Bulk import allows you to programmatically import data into CDD Vault. There are two options for mapping the data in an import file into the CDD Vault with this API call. One is to use an existing mapping template (mapping templates are created through the CDD Vault web interface). The other is to create field-header mappings directly from the POST slurps API call.

Once a file has been uploaded through the API, data from the import is committed immediately unless there are errors or warnings. By default, any import errors or warnings (Suspicious Events) will cause the import to be REJECTED. (Note: setting the autoreject parameter to false causes the import to remain active instead, so that the user can log in to the Import Data tab and resolve the issue(s) within the CDD Vault web interface.)

Note: You can import either protocol data or register molecules/entities using the API call described below. Any file type supported via the Import Data wizard (sdf, csv, xlsx, and/or zip) can be used with this POST Slurps API call.
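As a small illustration of the file types listed above, a client could check a file's extension before starting an upload. This is only a sketch: the helper name is_supported_import_file is hypothetical, and the check looks at the extension only, not at the file contents.

```python
from pathlib import Path

# File types the note above lists as supported by the Import Data wizard.
SUPPORTED_EXTENSIONS = {".sdf", ".csv", ".xlsx", ".zip"}


def is_supported_import_file(path):
    """Return True if the file extension is one of the listed types.

    Hypothetical client-side helper; the comparison is case-insensitive.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A call such as is_supported_import_file("plate_data.csv") can be used to fail early, before any request is sent.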

Step 1: Upload a Data File and Assign Mapping Parameters

POST /api/v1/vaults/<vault_id>/slurps

This call will initiate the import. To import data, both a data file and a JSON object are required and should be provided under the ‘file’ and ‘json’ parameters.

Parameters:

JSON

project

required The name or id of a single project. Names should be passed as strings and ids must be passed as integers.

autoreject

optional Designate if unresolved ambiguous events, suspicious events, or errors will cause the import to be automatically rejected (default behaviour) or be left active in the Import Data tab for the user to resolve within the interface.

ambiguous_events_resolution

optional Designate how to resolve ambiguous events, if any occur.

none	Default Do not resolve ambiguous events. See autoreject parameter for behavior.
reject	Automatically reject all ambiguous events.
new_molecule	Automatically create new molecules for all ambiguous events.
new_batch	Automatically create new batches on the first matching molecule for all ambiguous events.

suspicious_events_resolution

optional Designate how to resolve suspicious events, if any occur.

none	Default Do not resolve suspicious events. See autoreject parameter for behavior.
reject	Automatically reject all suspicious events.
accept	Automatically accept all suspicious events.

ignore_errors

optional (false by default)

If true, ignore any errors. If false, see autoreject parameter for behavior.

mapping_template

optional The name or id of a mapping template that matches the attached file. If not provided, a mapping template that matches the file will be used. If there is more than one matching template, an error will be raised. The following parameters can be included to provide a mapping template on-the-fly. This "virtual" mapping template can be temporary or saved for future use (see persist=true).

Mapping Template details
name	required when persist = true Unique name of the mapping template.
	registration_form Specifying a registration_form (or registration_type, read below) is required except when slurp_type is 'Add readouts'. When slurp_type is 'Add readouts', no registration_form (or registration_type, the old parameter) should be specified. When registering or updating molecules/entities, specify the registration form by using one of the following: { "registration_form": { "id": <number> } } or { "registration_form": { "name": <name> } }. To find out the registration form information for a specific vault, use the Registration Form(s) GET endpoint. The old parameter, registration_type, can still be used for backwards compatibility in vaults where there is only 1 registration_form per registration_type. We highly recommend updating any integrations to use registration_form instead.
mapping_options {“slurp_type”}	required Register with structures, Register without structures, Update, Add or update, Add readouts
persist	optional (set to false by default) Boolean: true or false.
header_mappings	required Details on how the columns/fields in the data file map into the Vault.
	header name (name of the column/field) position (position within the file, beginning with 0. For example column A in a csv file would be represented by a position value of 0.)
	definition id (internal id of destination field/readout) type (optional for Molecule/Batch fields, required for Protocol readouts) - available options include: "InternalFieldDefinition::MoleculeStructure" "InternalFieldDefinition::MoleculeBatchIdentifier" "InternalFieldDefinition::MoleculeFieldDefinition" "InternalFieldDefinition::BatchFieldDefinition" "ReadoutDefinition"
	run_grouping (required for readout definitions) - an arbitrary number that identifies the columns/fields that will be imported into the same run. Should not be used in any InternalFieldDefinition sections.

custom_parser

optional The name or id of a parser template that matches the attached file.

runs

optional Either a single run detail object which will be applied to all new runs, or an array of run detail objects that will be applied to new runs in the same order as specified by the mapping template.

Run details
run_date	optional Use YYYY-MM-DDThh:mm:ss±hh:mm. Default is today’s date.
place	optional This field is called “Lab” within the CDD Vault web interface. No default value provided.
person	optional Default is user’s full name.
conditions	optional No default value provided.
run_fields	optional Each vault has its own settings on the minimum information required to create a new Run (a Vault Administrator can change which Run fields are required under Settings > Vault > Run Fields).
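The JSON parameters described above can be assembled in code before the upload. The following is a minimal sketch, not an official client: the function name build_slurp_json and its argument names are hypothetical. It passes autoreject as the strings "true" and "false" because the JSON examples in this article do so, and it assumes the default resolution options can simply be left out.

```python
import json

# Allowed values, copied from the parameter tables above.
AMBIGUOUS_RESOLUTIONS = {"none", "reject", "new_molecule", "new_batch"}
SUSPICIOUS_RESOLUTIONS = {"none", "reject", "accept"}


def build_slurp_json(project, mapping_template=None, runs=None,
                     autoreject=True,
                     ambiguous_events_resolution="none",
                     suspicious_events_resolution="none"):
    """Return the text for the 'json' form parameter (hypothetical helper)."""
    # Project names are strings and project ids are integers.
    if isinstance(project, bool) or not isinstance(project, (str, int)):
        raise TypeError("project must be a name (str) or an id (int)")
    if ambiguous_events_resolution not in AMBIGUOUS_RESOLUTIONS:
        raise ValueError("unknown ambiguous_events_resolution")
    if suspicious_events_resolution not in SUSPICIOUS_RESOLUTIONS:
        raise ValueError("unknown suspicious_events_resolution")

    payload = {
        "project": project,
        # Assumption: sent as a string, mirroring the examples in this article.
        "autoreject": "true" if autoreject else "false",
    }
    # Optional parameters are only included when they differ from the default.
    if mapping_template is not None:
        payload["mapping_template"] = mapping_template
    if runs is not None:
        payload["runs"] = runs
    if ambiguous_events_resolution != "none":
        payload["ambiguous_events_resolution"] = ambiguous_events_resolution
    if suspicious_events_resolution != "none":
        payload["suspicious_events_resolution"] = suspicious_events_resolution
    return json.dumps(payload)
```

The returned string is what would be sent under the 'json' parameter, alongside the data file under the 'file' parameter.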

Notes on Date/Time

The CDD Vault API accepts ISO 8601 date/time formats in any API call that allows a date-type parameter. For example, the full date and timestamp may be used in GET calls that support a date parameter. You may still simply provide a date-only parameter like "created_after=2020-05-20".

You may also specify a date + timestamp, like "created_after=2020-05-20 14:53:12", to indicate "20 May 2020 14:53:12 PDT" (PDT is based on the user's time zone setting). The timestamp portion can also include a UTC (Coordinated Universal Time) offset, like "created_after=2020-05-27T14:48:40-07:00", which indicates that the specified time is 7 hours behind UTC.
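As a sketch of how such timestamps might be produced in a script, the Python standard library can format a timezone-aware datetime in the ISO 8601 form shown above and URL-encode it for a query string. The helper names to_iso8601 and date_query are hypothetical. Encoding matters mostly for positive offsets, because a literal "+" in a query string is normally read as a space.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def to_iso8601(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss±hh:mm."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime so the offset is explicit")
    return moment.isoformat(timespec="seconds")


def date_query(param, moment):
    """Build a URL-encoded query fragment such as created_after=<timestamp>."""
    return urlencode({param: to_iso8601(moment)})


# 27 May 2020 14:48:40 at a UTC offset of -07:00, as in the text above.
example = datetime(2020, 5, 27, 14, 48, 40, tzinfo=timezone(timedelta(hours=-7)))
print(to_iso8601(example))                  # 2020-05-27T14:48:40-07:00
print(date_query("created_after", example))
```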

Notes on Populating User-Defined Run Fields

As an example, the following JSON can be passed with the POST Slurps API call to populate user-defined Run Fields when loading Protocol (assay) data into a new Run.

{
  "project":"New Project",
  "run_date":"2025-05-08",
  "conditions":"New Condition",
  "place":"New Lab",
  "person":"New Person",
  "run_fields": {"CRO":"New CRO"}
}

Notes on Mapping Template Syntax

A syntax is available for creating field-header mappings directly from the POST slurps API call. Simply pass JSON that includes details of how each field in your file should be mapped. These mappings can be submitted with each import and they can be saved as a mapping template for future imports.

The parameters required for a successful data import are included in JSON which is submitted via the POST slurps API call. In addition to the previously allowed criteria, you can also specify the type of import and the field-header pairs. By default, the mapping will only be used for the specific import but it can be saved as a mapping template with a name.

As an example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault. Note: the csv file that is also passed has 4 columns - SMILES, Vendor, Person and Date.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_form": { "id": 108463874 },
    "header_mappings": [
      {
        "header": {"name": "SMILES", "position": 0},
        "definition": { "id": 2, "type": "InternalFieldDefinition::MoleculeStructure" }
      },
      {
        "header": {"name": "Vendor", "position": 1},
        "definition": {"id": 40714,"type": "InternalFieldDefinition::BatchFieldDefinition"}
      },
      {
        "header": {"name": "Person","position": 2},
        "definition": {"id": 40724, "type":"InternalFieldDefinition::BatchFieldDefinition"}
      },
      {
        "header": {"name": "Date","position": 3},
        "definition": {"id": 40713,"type":"InternalFieldDefinition::BatchFieldDefinition"}
      }
    ],
    "mapping_options": {"slurp_type": "Register with structures"},
    "persist": "true",
    "name": "New Molecule Reg Template"
  }
}
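When the columns of a file are known in code, the header_mappings list shown above can be generated instead of written by hand. The sketch below assumes that positions follow the column order in the file, starting at 0. The helper names header_mapping and mappings_from_columns are hypothetical, and the ids are the illustrative ones from the example above.

```python
def header_mapping(name, position, definition_id, definition_type=None):
    """Build one header_mappings entry (hypothetical helper)."""
    definition = {"id": definition_id}
    # The type is optional for Molecule/Batch fields, so it is only added when given.
    if definition_type is not None:
        definition["type"] = definition_type
    return {
        "header": {"name": name, "position": position},
        "definition": definition,
    }


def mappings_from_columns(columns):
    """columns: (name, definition_id, definition_type) tuples in file order."""
    return [
        header_mapping(name, position, definition_id, definition_type)
        for position, (name, definition_id, definition_type) in enumerate(columns)
    ]


BATCH_FIELD = "InternalFieldDefinition::BatchFieldDefinition"
columns = [
    ("SMILES", 2, "InternalFieldDefinition::MoleculeStructure"),
    ("Vendor", 40714, BATCH_FIELD),
    ("Person", 40724, BATCH_FIELD),
    ("Date", 40713, BATCH_FIELD),
]
header_mappings = mappings_from_columns(columns)
```

The resulting list can be placed under "header_mappings" inside the "mapping_template" object before the JSON is serialized.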

In this partial example, the following JSON can be passed with the POST Slurps API call to register new Molecules (or new Batches) in CDD Vault without structures. This is useful to either (1) create new No-Struct Molecules or (2) add a new Batch to an existing Molecule using the Molecule Name. Note: the csv file that is also passed has the 1st column as the MoleculeName column. Of course, other Batch fields could be included and mapped.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_form": { "id": 108463874 },
    "header_mappings": [
      {
        "header": {"name": "MoleculeName", "position": 0},
        "definition": {"id": 3, "type": "InternalFieldDefinition::MoleculeSynonym"}
      }
    ],
    "mapping_options": {"slurp_type": "Register without structures"},
    "persist": "true",
    "name": "New Batch Reg Template"
  }
}

As another example, the following JSON can be passed with the POST slurps API call to register a new Run of a Protocol. Note: the csv file that is also passed has 3 columns - Molecule-Batch ID, Inhibition and SEM.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_form": "",
    "header_mappings": [
      {
        "header": {"name": "Molecule-Batch ID","position": 0},
        "definition": {"id": 65061,"type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
      },
      {
        "header": {"name": "Inhibition","position": 1},
        "definition": {"id": 283130,"type": "ReadoutDefinition"},
        "run_grouping": 1
      },
      {
        "header": {"name": "SEM","position": 2},
        "definition": {"id": 283131,"type": "ReadoutDefinition"},
        "run_grouping": 1
      }
    ],
    "mapping_options": {"slurp_type": "Add readouts"},
    "persist": "true",
    "name": "New Inh Import Template"
  },
  "runs":[{"run_date":"2022-04-29","place":"API", "person":"APIScientist"}]
}
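The run_grouping rule (required for readout definitions, not to be used with InternalFieldDefinition entries) is easy to get wrong in a hand-written mapping. A possible client-side check is sketched below, under the assumption that an entry is a readout exactly when its definition type is "ReadoutDefinition". The function name check_run_groupings is hypothetical.

```python
def check_run_groupings(header_mappings):
    """Return a list of problems found; an empty list means the rule holds."""
    problems = []
    for entry in header_mappings:
        name = entry["header"]["name"]
        is_readout = entry["definition"].get("type") == "ReadoutDefinition"
        has_grouping = "run_grouping" in entry
        if is_readout and not has_grouping:
            problems.append(f"{name}: readout mapping is missing run_grouping")
        if has_grouping and not is_readout:
            problems.append(f"{name}: run_grouping is only for readout definitions")
    return problems


# The header_mappings from the example above pass the check.
mappings = [
    {"header": {"name": "Molecule-Batch ID", "position": 0},
     "definition": {"id": 65061, "type": "InternalFieldDefinition::MoleculeBatchIdentifier"}},
    {"header": {"name": "Inhibition", "position": 1},
     "definition": {"id": 283130, "type": "ReadoutDefinition"}, "run_grouping": 1},
    {"header": {"name": "SEM", "position": 2},
     "definition": {"id": 283131, "type": "ReadoutDefinition"}, "run_grouping": 1},
]
print(check_run_groupings(mappings))
```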

As a final example, the following JSON can be passed with the POST slurps API call to create Plates, associate Batches with the Wells on the Plates, and register a new Run of a Protocol. Note: the csv file that is also passed has 5 columns - MoleculeBatchID, Plate, Well, Conc and Raw.

{
  "project": "Internal Data",
  "autoreject": "true",
  "mapping_template": {
    "registration_form": "",
    "header_mappings": [
      {
        "header": {"name": "MoleculeBatchID","position": 0},
        "definition": {"id": 65061,"type": "InternalFieldDefinition::MoleculeBatchIdentifier"}
      },
      {
        "header": {"name": "Plate","position": 1},
        "definition": {"id": 148, "type": "InternalFieldDefinition::PlateName"}
      },
      {
        "header": {"name": "Well","position": 2},
        "definition": {"id": 147,"type": "InternalFieldDefinition::WellLocation"}
      },
      {
        "header": {"name": "Conc","position": 3},
        "definition": {"id": 249536,"type": "ReadoutDefinition"},
        "run_grouping": 1
      },
      {
        "header": {"name": "Raw","position": 4},
        "definition": {"id": 249540,"type": "ReadoutDefinition"},
        "run_grouping": 1
      }
    ],
    "mapping_options": {"slurp_type": "Add readouts"},
    "persist": "true",
    "name": "New Plate DR Import Template"
  },
  "runs":[{"run_date":"2023-01-03","place":"API", "person":"APIScientist"}]
}

Helpful Hints

Remember, you can get the list of existing mapping templates using the GET mapping_templates API call.
The GET mapping_templates/<template_id> API call provides all of the details of an existing mapping template, including the header mappings. Use this as guidance when creating your new mapping templates.
The GET protocols/<protocol_id> API returns the internal IDs of each of the readout definitions within that Protocol - these correspond to the IDs used in the header_mappings (definition) section of the above JSON.
The GET Fields API call is also helpful in determining how to dynamically map data fields in a POST Slurps API call.

This GET Fields API call will provide you with the “type” and “name” values of all fields within a Vault. The JSON returned by this API call is organized into the following sections of fields:

Internal
Batch
Molecule
Protocol

If you are creating dynamic mappings for data files being imported via the POST Slurps API call, this GET Fields API call will be useful in discovering the details of how each field in your data file should be mapped. For example, when using the header_mappings parameter in a POST Slurps API call, you need to include the ID and type of data field being mapped to and this new GET Fields API call provides these details.

Curl Example

 curl -H "X-CDD-Token:$TOKEN" --form-string 'json={"project": "34888", "autoreject":"false", "mapping_template": "36708","runs":[{"run_date": "2017-05-26","place": "basement lab"}, {"run_date": "2017-05-28","person":"Lab assistant"}]}' -F 'file=@path/to/file.csv' 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'

Caveat when using curl:

Curl appears to have a bug with large files: if the JSON comes after the file, it can be cut off and repeated, leading to an "unexpected token" error response.

We recommend you ensure placement of the JSON prior to the file to avoid this.
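The same ordering can be made explicit in clients that build the multipart body themselves. The sketch below uses only the Python standard library and always writes the 'json' part before the 'file' part. The function name encode_slurp_form is hypothetical, and whether the ordering matters outside of curl is not something this article states.

```python
import json
import uuid


def encode_slurp_form(json_payload, filename, file_bytes):
    """Return (body, content_type) with the 'json' part placed before 'file'."""
    boundary = uuid.uuid4().hex
    parts = [
        # Part 1: the JSON parameters, deliberately first.
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="json"',
        b"",
        json.dumps(json_payload).encode("utf-8"),
        # Part 2: the data file.
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        # Closing boundary.
        f"--{boundary}--".encode(),
        b"",
    ]
    body = b"\r\n".join(parts)
    return body, f"multipart/form-data; boundary={boundary}"


body, content_type = encode_slurp_form(
    {"project": 34888, "autoreject": "false"}, "file.csv", b"a,b\n1,2\n"
)
```

The body could then be sent with, for example, urllib.request.Request(url, data=body, headers={"Content-Type": content_type, "X-CDD-Token": token}), where url and token are placeholders for your own values.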

Ruby REST Client Example

require 'rest-client'

RestClient.post(
  'https://app.collaborativedrug.com/api/v1/vaults/23/slurps',
  {
    :file => File.new("path/to/file.csv"),
    :json => '{"project":"34888", "autoreject":"true", "mapping_template":"36708", "runs":{"run_date":"2010-10-26"}}'
  },
  { :"X-CDD-Token" => '<API_TOKEN>' }
)

Python Requests Example

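import json
import requests

api_token = '<API_TOKEN>'  # placeholder: replace with your CDD Vault API token
url = 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps'
headers = {'X-CDD-Token': api_token}
data = {
    'project': 34888,
    'runs': {'run_date': '2001-01-01', 'conditions': '33C'}
}
# Open the data file in a context manager so it is closed after the upload.
with open('file.csv', 'rb') as data_file:
    files = {'file': data_file}
    response = requests.post(url, headers=headers, data={'json': json.dumps(data)}, files=files)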

R Example

Please visit the R scripting language page for a Bulk Import example script.

Response

{
  "id": 845005082,
  "class": "slurp",
  "state": "queued_for_processing",
  "api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082"
}

Notes on the AUTOREJECT parameter

The autoreject parameter is set to true by default so that imports initiated via the API will be REJECTED when Suspicious Events or Errors occur. This allows other file imports in the queue to continue processing.

POST /api/v1/vaults/<vault_id>/slurps

{
"project":"MyProject",
"mapping_template":"SomeMappingTemplate",
"autoreject":"true",
"runs":[
            {"run_date":"2021-12-25","place":"SCRNLab", "person":"DrGina"}
           ]
}

Change the autoreject parameter to false to maintain the old behaviour where an import with Suspicious Events or Errors must be handled via the CDD Vault Import Data interface.

POST /api/v1/vaults/<vault_id>/slurps

{
"project":"MyProject",
"mapping_template":"SomeMappingTemplate",
"autoreject":"false",
"runs":[
            {"run_date":"2021-12-25","place":"SCRNLab", "person":"DrGina"}
           ]
}

Step 2: Check Import Status

GET /api/v1/vaults/<vault_id>/slurps/<slurp_id>

show_events

Boolean. If true, the returned JSON includes the details of the import_warnings (ambiguous_events and suspicious_events) and import_errors.

It is likely that Step #1 above will suffice for most bulk imports since the data is committed automatically if there are no errors or warnings. If Step #1 is not successful, the user will receive an email notification alerting them of any errors or warnings. However, programmatically checking the status of a bulk import is an option.

Once a file has been uploaded for import, you can check the import status using the ‘slurp’ id. A JSON representation of the slurp will be returned, including the slurp’s current state, number of records processed, and number of errors and warnings. If there are any errors or warnings, the response will also include a link to the import summary and a message letting you know to go and resolve them.

The “state” of a bulk import typically progresses from “mapping” to “committed” in the order listed below. The “mapping” state is completed by the time a non-error response is returned from creating a slurp.

All possible slurp states:

mapping
queued_for_processing
processing
processed
queued_for_committing
committing
committed
canceled
rejected
invalid

API bulk imports will automatically commit if there are no Errors or Suspicious Events. If the API import generates Errors or Suspicious Events and autoreject is set to false, these will need to be resolved within the CDD Vault web interface (with the default autoreject setting, the import is rejected instead).
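A status check like this is usually wrapped in a polling loop. The sketch below is one possible shape for it, not an official client. It assumes that "committed", "canceled", "rejected" and "invalid" are final states, and that a "processed" slurp reporting errors or warnings is waiting for someone to resolve it in the web interface. The function wait_for_slurp and its fetch_status argument are hypothetical. fetch_status is any callable that returns the slurp JSON as a dict, which keeps the loop independent of the HTTP library in use.

```python
import time

# Assumption: states after which the import does not progress on its own.
FINAL_STATES = {"committed", "canceled", "rejected", "invalid"}


def wait_for_slurp(fetch_status, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the slurp reaches a final state or needs manual attention."""
    for _ in range(max_polls):
        slurp = fetch_status()
        if slurp["state"] in FINAL_STATES:
            return slurp
        # A processed import with errors or warnings waits for a user decision.
        needs_review = slurp.get("import_errors", 0) or slurp.get("import_warnings", 0)
        if slurp["state"] == "processed" and needs_review:
            return slurp
        sleep(poll_seconds)
    raise TimeoutError(f"slurp did not finish after {max_polls} status checks")
```

In practice fetch_status might wrap a GET request to /api/v1/vaults/<vault_id>/slurps/<slurp_id> with the X-CDD-Token header and return the decoded JSON.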

Example

Using Curl

curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082'

Response

{
  "id": 845005082,
  "class": "slurp",
  "state": "committed",
  "api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005082",
  "total_records": 1,
  "records_processed": 1,
  "records_committed": 1,
  "import_warnings": 0,
  "import_errors": 0
}

Response with Import Errors

{
  "id": 845005095,
  "class": "slurp",
  "state": "processed",
  "api_url": "https://app.collaborativedrug.com/api/v1/vaults/23/slurps/845005095",
  "total_records": 1,
  "records_processed": 1,
  "records_committed": 0,
  "import_warnings": 0,
  "import_errors": 1,
  "message": "This slurp has generated import errors or warnings. Please use web application to resolve them.",
  "web_url": "https://app.collaborativedrug.com/vaults/23/slurps/845005095"
}
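The two responses above can be told apart in code using only the fields they contain. The following sketch turns a status response into a one-line summary. The function name describe_slurp is hypothetical, and it treats any non-zero import_errors or import_warnings count on an uncommitted slurp as needing attention.

```python
def describe_slurp(slurp):
    """Summarize a slurp status response in one line (hypothetical helper)."""
    errors = slurp.get("import_errors", 0)
    warnings = slurp.get("import_warnings", 0)
    if slurp["state"] == "committed":
        committed = slurp.get("records_committed", 0)
        total = slurp.get("total_records", 0)
        return f"committed {committed} of {total} records"
    if errors or warnings:
        # web_url is only present when there is something to resolve.
        where = slurp.get("web_url", "the Import Data tab")
        return f"{errors} error(s), {warnings} warning(s); resolve at {where}"
    return f"state: {slurp['state']}"
```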

Step 3: View Protocol and Run Information for the Import

GET /api/v1/vaults/<vault_id>/protocols?slurp=<slurp_id>

Once an import has been committed, you can return additional JSON results that will expose the Protocol and Run(s) of data that were imported.

Example

Using Curl

curl -H "X-CDD-Token:$TOKEN" 'https://app.collaborativedrug.com/api/v1/vaults/23/protocols?slurp=845005082'

Response

{
 "count": {
 "247571": 1
 },
 "offset": 0,
 "page_size": 50,
 "objects": [
 {
 "id": 28889,
 "class": "protocol",
 "created_at": "2017-04-18T16:16:51.000Z",
 "modified_at": "2018-11-26T18:34:27.000Z",
 "name": "Inhibition",
 "category": "Enzyme",
 "data_set": {
 "name": "Sandbox",
 "id": 23
 },
 "readout_definitions": [
 {
 "id": 320811,
 "class": "readout definition",
 "created_at": "2017-04-18T16:18:09.000Z",
 "modified_at": "2017-04-18T16:18:09.000Z",
 "name": "Inhibition",
 "unit_label": "%",
 "data_type": "Number",
 "precision_type": "significant figure",
 "precision_number": 3,
 "description": "Inhibition Results",
 "protocol_condition": false
 },
 {
 "id": 320812,
 "class": "readout definition",
 "created_at": "2017-04-18T16:18:46.000Z",
 "modified_at": "2017-04-18T16:18:46.000Z",
 "name": "SEM",
 "unit_label": "%",
 "data_type": "Number",
 "precision_type": "significant figure",
 "precision_number": 3,
 "description": "Standard Error of Mean",
 "protocol_condition": false
 },
 {
 "id": 320813,
 "class": "readout definition",
 "created_at": "2017-04-18T16:21:45.000Z",
 "modified_at": "2017-04-18T16:21:45.000Z",
 "name": "Average Inhibition",
 "unit_label": "%",
 "data_type": "Number",
 "precision_type": "significant figure",
 "precision_number": 3,
 "aggregation": "batch",
 "description": "Average Inhibition",
 "protocol_condition": false,
 "calculation": 7369
 },
 {
 "id": 320821,
 "class": "readout definition",
 "created_at": "2017-04-18T16:23:45.000Z",
 "modified_at": "2017-04-18T16:23:45.000Z",
 "name": "Average SEM",
 "unit_label": "%",
 "data_type": "Number",
 "precision_type": "significant figure",
 "precision_number": 3,
 "aggregation": "batch",
 "protocol_condition": false,
 "calculation": 7370
 }
 ],
 "calculations": [
 {
 "id": 7369,
 "class": "custom calculation",
 "created_at": "2017-04-18T16:21:45.000Z",
 "modified_at": "2017-10-18T20:41:31.000Z",
 "inputs": {
 "input_readout_definitions": [
 320811
 ]
 },
 "outputs": {
 "output_readout_definition": 320813
 },
 "formula": "average([320811])",
 "aggregate_readouts_by": "batch and run"
 },
 {
 "id": 8986,
 "class": "custom calculation",
 "created_at": "2017-10-18T19:29:42.000Z",
 "modified_at": "2017-10-18T20:41:32.000Z",
 "inputs": {
 "input_readout_definitions": [
 320813
 ]
 },
 "outputs": {
 "output_readout_definition": 346952
 },
 "formula": "[320813]*1"
 },
 {
 "id": 7370,
 "class": "custom calculation",
 "created_at": "2017-04-18T16:23:45.000Z",
 "modified_at": "2017-10-18T20:41:38.000Z",
 "inputs": {
 "input_readout_definitions": [
 320812
 ]
 },
 "outputs": {
 "output_readout_definition": 320821
 },
 "formula": "average([320812])",
 "aggregate_readouts_by": "batch and run"
 }
 ],
 "hit_definitions": [
 {
 "id": 1888,
 "class": "hit definition",
 "created_at": null,
 "modified_at": null,
 "readout_definition_id": 320813,
 "operator": ">=",
 "value": 30,
 "color": "green"
 }
 ],
 "runs": [
 {
 "id": 247571,
 "class": "run",
 "created_at": "2018-11-26T18:34:27.000Z",
 "modified_at": "2018-11-26T18:34:27.000Z",
 "run_date": "2018-11-26",
 "person": "User ABC"
 }
 ],
 "projects": [
 {
 "name": "Data",
 "id": 6847
 }
 ],
 "owner": "User ABC"
 }
 ]
}
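A response of this size is easier to use once it is reduced to the parts an import script typically needs. The sketch below relies only on keys visible in the response above ("objects", "id", "name", "runs", "readout_definitions"). The function name imported_runs is hypothetical.

```python
def imported_runs(protocols_response):
    """List each protocol with its run ids and readout definition ids."""
    summary = []
    for protocol in protocols_response.get("objects", []):
        summary.append({
            "protocol_id": protocol["id"],
            "protocol_name": protocol["name"],
            "run_ids": [run["id"] for run in protocol.get("runs", [])],
            # Maps readout name to id, as used in header_mappings definitions.
            "readouts": {
                readout["name"]: readout["id"]
                for readout in protocol.get("readout_definitions", [])
            },
        })
    return summary


# A trimmed copy of the response above, for illustration.
sample = {
    "objects": [{
        "id": 28889,
        "name": "Inhibition",
        "readout_definitions": [{"id": 320811, "name": "Inhibition"},
                                {"id": 320812, "name": "SEM"}],
        "runs": [{"id": 247571, "run_date": "2018-11-26"}],
    }]
}
print(imported_runs(sample))
```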

Step 4: Resolve Any Errors via the CDD Vault Import Data Tab

Log into CDD Vault, click the Import Data tab, and locate the Report link for your API import file (or simply use the web_url from the Import Status JSON). Review the details and either Commit or Reject the import. For information on handling Errors and Suspicious Events, please see the Import Validation Knowledge Base article.