API calls related to molecule objects:
Retrieve
GET /api/v1/vaults/<vault_id>/molecules/<id>
Return a single molecule and its batches
GET /api/v1/vaults/<vault_id>/molecules
Return a list of molecules and their batches, based on the parameters.
If batch date constraints are specified, only batches matching the search criteria are included in the returned molecule object.
Query Parameters (all optional):
molecules |
Comma-separated list of ids | Cannot be used with other parameters |
names |
Comma-separated list of names/synonyms | |
async |
Boolean. If true, do an asynchronous export (see Async Export). Use for large data sets. | This is strongly recommended any time you want to download more than page_size results. Note: any page_size parameter used in an API GET call that also uses the async=true parameter will be ignored. The call returns all data matching the request. |
no_structures |
Boolean. If true, omit structure representations for a smaller and faster response. Default: false. | |
include_original_structures |
Boolean. If true, include the original user-defined structure for each molecule. Default: false. | Independent of no_structures. |
only_ids |
Boolean. If true, only the Molecule IDs are returned, allowing for a smaller and faster response. Default: false. | Can be added to any GET Molecules call. Async should still be used when many IDs are expected. |
only_batch_ids |
Boolean. If true, the full Molecule details are still returned but the Batch-level information is left out of the JSON results. (Only the IDs of the Batches belonging to the Molecules are still included.) Default: false. | |
created_before |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | |
created_after |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | |
modified_before |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | |
modified_after |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | |
batch_created_before |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | A molecule with any batch that has a creation date on or before the parameter will be included |
batch_created_after |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | A molecule with any batch that has a creation date on or after the parameter will be included |
batch_field_before_name |
Batch field name | Specifies a user-defined batch field for batch_field_before_date |
batch_field_before_date |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | A molecule with any batch that has a batch_field_before_name value date on or before the parameter will be included |
batch_field_after_name |
Batch field name | Specifies a user-defined batch field for batch_field_after_date |
batch_field_after_date |
Date (YYYY-MM-DDThh:mm:ss±hh:mm) | A molecule with any batch that has a batch_field_after_name value date on or after the parameter will be included |
offset |
The index of the first object actually returned. Defaults to 0. | |
page_size |
The maximum number of objects to return in this call. Default is 50, maximum is 1000. | If the response exceeds the page_size , we strongly recommend using the async option instead of downloading multiple chunks. Note: any page_size parameter used in an API GET call that also uses the async=true parameter will be ignored. The GET call will return all valid data for the given GET call. |
projects |
Comma-separated list of project ids. Defaults to all available projects. | Limits scope of query. |
data_sets |
Comma-separated list of public dataset ids. Defaults to no data sets. | Limits scope of query. |
structure |
SMILES, cxsmiles or mol string for the query structure | Returns Molecules from the Vault that match the structure-based query submitted via this API call. |
structure_search_type |
Available options are: “exact”, “similarity” or “substructure” | Default option is substructure. |
structure_similarity_threshold |
A number between 0 and 1 | Include this parameter only if the structure_search_type is "similarity". |
inchikey |
A valid InChIKey | Use this parameter instead of the "structure" and "structure_search_type" parameters. |
molecule_fields |
Array of Molecule field names to include in the resulting JSON | Use this parameter to limit the number of Molecule UDF Fields to return |
batch_fields |
Array of Batch field names to include in the resulting JSON | Use this parameter to limit the number of Batch UDF Fields to return |
fields_search |
Array of Molecule field names and values | Used to filter Molecules returned based on query values |
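The structure-search parameters in the table above depend on one another: structure_similarity_threshold applies only when structure_search_type is "similarity", and inchikey is used instead of structure and structure_search_type. The sketch below shows one way a client might assemble that part of the request body in Python. The helper name structure_query, its validation rules, and the 0.7 threshold are illustrative assumptions and are not part of the CDD Vault API.

```python
import json


def structure_query(structure=None, search_type="substructure",
                    threshold=None, inchikey=None):
    """Build the structure-search part of a GET Molecules JSON body.

    Hypothetical helper: it only mirrors the rules stated in the
    parameter table and does not send any request.
    """
    if inchikey is not None:
        if structure is not None:
            raise ValueError("use inchikey instead of structure")
        return {"inchikey": inchikey}
    if structure is None:
        raise ValueError("either structure or inchikey is required")
    if search_type not in ("exact", "similarity", "substructure"):
        raise ValueError("unknown structure_search_type: " + search_type)
    body = {"structure": structure, "structure_search_type": search_type}
    if threshold is not None:
        if search_type != "similarity":
            raise ValueError("threshold applies only to similarity searches")
        if not 0 <= threshold <= 1:
            raise ValueError("threshold must be between 0 and 1")
        body["structure_similarity_threshold"] = threshold
    return body


# 0.7 is an arbitrary illustrative threshold.
print(json.dumps(structure_query("CCCCCC1=CC(CC)SC1=O", "similarity", 0.7)))
```

The resulting dictionary can be merged with other parameters (for example projects or created_after) before it is serialized and sent.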
Notes on the molecule_fields and batch_fields parameters - returning only specified data (udf) fields when making the GET Molecules API call
To instruct the GET Molecules API call to return ONLY a subset of Molecule/Batch data fields, you may pass the molecule_fields and/or batch_fields parameters as JSON with the GET Molecules API call. The JSON passed with the GET Molecules API call to achieve this should follow this syntax:
GET https://app.collaborativedrug.com/api/v1/vaults/<vault_id>/molecules
As is typical for JSON parameters that take multiple values, an array of values in square brackets is supported. For example, a batch_fields array that names the ID and Note fields results in only those 2 Batch fields being returned.
GET https://app.collaborativedrug.com/api/v1/vaults/<vault_id>/molecules
Also note, Molecule/Batch User-Defined Fields are only returned if they have values registered for them. The JSON returned will not include blank UDFs.
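As a hedged illustration of the syntax described above, the following Python sketch builds a body that limits the returned fields. "ID" and "Note" are the two Batch fields mentioned in the text; "Compound Class" is a hypothetical Molecule field name used only for illustration.

```python
import json

# Field names are illustrative: "ID" and "Note" are the two Batch fields
# mentioned in the text; "Compound Class" is a hypothetical Molecule UDF.
body = {
    "molecule_fields": ["Compound Class"],
    "batch_fields": ["ID", "Note"],
}

payload = json.dumps(body)
print(payload)
```

The printed JSON can be saved to a file and passed with curl -d "@file.json", in the same way as StructureSearch.json in Example 1.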
Notes on the fields_search parameter - passing Molecule field query values via the API
To pass Molecule field query values via the API, you must pass the fields_search parameter as JSON with the GET Molecules API call. The fields_search section of the JSON passed with the GET Molecules API call should follow this syntax:
{ "fields_search": [
{ "name": "<Molecule udf field name>", "<value_type>": "<value to be searched>" },
{ "name": "<Molecule udf field name>", "<value_type>": "<value to be searched>" }]
}
The allowed values for value_type are:
- text_value
- float_value
- date_value
As an example, a JSON body can search for Molecules created after a specified date that have a Purity value of 90. If the no_structures (true) and async (false) parameters are also included, no structural information is returned and the API call is not performed in the background.
GET https://app.collaborativedrug.com/api/v1/vaults/<vault_id>/molecules
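A minimal Python sketch of such a body follows. It assumes a Molecule field named "Purity" that stores a number (hence float_value), writes the booleans as JSON true/false, and uses a placeholder date because the text does not state which date was used. The function name purity_search_body is hypothetical.

```python
import json


def purity_search_body(created_after, purity):
    """Assemble a GET Molecules body that filters on a Purity field.

    Assumes a Molecule UDF named "Purity" holding a number, so
    float_value is used as the value type.
    """
    return {
        "created_after": created_after,
        # Assumption: booleans are sent as JSON true/false.
        "no_structures": True,
        "async": False,
        "fields_search": [
            {"name": "Purity", "float_value": purity},
        ],
    }


# The date is a placeholder; the text does not state which date was used.
body = purity_search_body("2020-05-20", 90)
print(json.dumps(body, indent=2))
```

Each additional entry in the fields_search array adds another field/value pair to the query.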
Notes on Date/Time Formats
The CDD Vault API accepts ISO 8601 date/time formats in any API call that allows a date-type parameter. For example, the full date and timestamp may be used in GET calls that support a date parameter. You may still simply provide a date-only parameter like "created_after=2020-05-20".
You may also specify a date + timestamp, like "created_after=2020-05-20 14:53:12", to indicate "20 May 2020 14:53:12 PDT" (PDT is based on the user's time zone setting). The timestamp portion can also include a UTC (Coordinated Universal Time) offset, like "created_after=2020-05-27T14:48:40-07:00", which indicates that the time specified is 7 hours behind UTC.
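The formats described above can be produced with Python's standard datetime module. This is a small sketch, not CDD-specific code; the variable names are illustrative and the dates are the ones used in the text.

```python
from datetime import datetime, timedelta, timezone

# Date-only form.
date_only = datetime(2020, 5, 20).date().isoformat()

# Date + time with an explicit UTC offset of -07:00.
minus_seven = timezone(timedelta(hours=-7))
local_time = datetime(2020, 5, 27, 14, 48, 40, tzinfo=minus_seven)
with_offset = local_time.isoformat()

# The same instant expressed in UTC.
in_utc = local_time.astimezone(timezone.utc).isoformat()

print(date_only)    # 2020-05-20
print(with_offset)  # 2020-05-27T14:48:40-07:00
print(in_utc)       # 2020-05-27T21:48:40+00:00
```

Including the offset explicitly avoids depending on the time zone setting of the user who owns the API token.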
Example 1 - Exact Match Search by chemical structure (also including other parameters)
curl -H "X-CDD-Token: $TOKEN" -H "Content-Type: application/json" -X GET -d "@StructureSearch.json" "https://app.collaborativedrug.com/api/v1/vaults/<vault_id>/molecules"
File StructureSearch.json:
{"created_after":"2022-03-10",
"no_structures":"true",
"projects":"7090",
"structure": "CCCCCC1=CC(CC)SC1=O",
"structure_search_type": "exact"
}
Returns:
{
"count": 1,
"offset": 0,
"page_size": 50,
"objects": [
{
"id": 107763291,
"class": "molecule",
"created_at": "2022-03-15T18:15:56.000Z",
"modified_at": "2022-05-11T15:01:10.000Z",
"name": "DEMO-1000211",
"synonyms": [
"DEMO-1000211"
],
"registration_type": "CHEMICAL_STRUCTURE",
"projects": [
{
"name": "CRO Project",
"id": 7090
}
],
"owner": "Charlie Weatherall",
"batches": [
{
"id": 114276274,
"class": "batch",
"created_at": "2022-03-15T18:15:56.000Z",
"modified_at": "2022-05-11T15:01:10.000Z",
"name": "001",
"molecule_batch_identifier": "DEMO-1000211-001",
"owner": "Charlie Weatherall",
"projects": [
{
"name": "CRO Project",
"id": 7090
}
],
"salt_name": "Hydrochloride",
"solvent_of_crystallization_name": "H2O",
"formula_weight": 252.79500000000002,
"batch_fields": {
"Date": "2022-03-15",
"Person": "Charlie Weatherall",
"Vendor": "Graceland CRO",
"External Identifier": "Grace-10-005",
"Place": "Graceland CRO",
"Purity": 95.0,
"Inv Initial Amount": 100.0,
"Inv Current Amount": 100.0
},
"stoichiometry": {
"core_count": 1,
"salt_count": 1,
"solvent_of_crystallization_count": 1
}
}
]
}
]
}
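For readers who script this call instead of using curl, the sketch below builds an equivalent request with Python's standard urllib.request module: a GET request whose parameters travel in a JSON body. The function name build_molecules_request is hypothetical, and the vault id 12345 and token "apitoken" are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://app.collaborativedrug.com/api/v1"


def build_molecules_request(vault_id, token, params):
    """Return an unsent GET request carrying the parameters as a JSON body."""
    url = f"{BASE_URL}/vaults/{vault_id}/molecules"
    return urllib.request.Request(
        url,
        data=json.dumps(params).encode("utf-8"),
        headers={"X-CDD-Token": token, "Content-Type": "application/json"},
        method="GET",
    )


# Same parameters as StructureSearch.json in Example 1.
params = {
    "created_after": "2022-03-10",
    "no_structures": "true",
    "projects": "7090",
    "structure": "CCCCCC1=CC(CC)SC1=O",
    "structure_search_type": "exact",
}
# 12345 and "apitoken" are placeholders for a real vault id and API token.
request = build_molecules_request(12345, "apitoken", params)
print(request.get_method(), request.full_url)

# To send it (requires network access and valid credentials):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```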
Example 2 - Return a Molecule using its internal ID
Note: passing parameters in the GET Molecules URL is obsolete. Going forward, create a JSON file containing the parameters you wish to submit with the GET Molecules API call (see Example 1 above).
curl -H "X-CDD-Token: $TOKEN" https://app.collaborativedrug.com/api/v1/vaults/489978881/molecules/927454016
Returns:
{
"id": 927454016,
"class": "molecule",
"name": "Taurine",
"synonyms": [ "Taurine" ],
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
} ],
"collections": [
{
"name": "super user hits",
"id": 90769562
} ],
"owner": "Super User",
"created_at": "2010-11-17",
"modified_at": "2017-12-12",
"smiles": "NCCS(O)(=O)=O",
"cxsmiles": "NCCS(O)(=O)=O",
"inchi": "InChI=1S/C2H7NO3S/c3-1-2-7(4,5)6/h1-3H2,(H,4,5,6)",
"inchi_key": "XOAAWQZATWQOTB-UHFFFAOYSA-N",
"iupac_name": "2-aminoethane-1-sulfonic acid",
"molfile": "taurine\n Mrv1770 12121711132D \n\n 7 6 0 0 0 0 999 V2000\n 0.1105 -2.0625 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0\n 0.1105 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 -0.8250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 0.0000 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 0.8250 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 1.6500 -0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 1 2 1 0 0 0 0\n 2 3 1 0 0 0 0\n 3 4 1 0 0 0 0\n 4 5 1 0 0 0 0\n 4 6 2 0 0 0 0\n 4 7 2 0 0 0 0\nM END\n",
"molecular_weight": 125.14,
"log_p": -2.61469,
"log_d": -2.61939,
"log_s": 0.996488,
"num_h_bond_donors": 2,
"num_h_bond_acceptors": 4,
"num_rule_of_5_violations": 0,
"formula": "C2H7NO3S",
"isotope_formula": "C2H7NO3S",
"dot_disconnected_formula": "C2H7NO3S",
"p_k_a": -1.49,
"p_k_a_type": "Acidic",
"exact_mass": 125.014664263,
"heavy_atom_count": 7,
"composition": "C (19.2%), H (5.64%), N (11.19%), O (38.35%), S (25.62%)",
"isotope_composition": "C (19.2%), H (5.64%), N (11.19%), O (38.35%), S (25.62%)",
"topological_polar_surface_area": 80.39,
"num_rotatable_bonds": 2,
"cns_mpo_score": 4.82803,
"fsp3": 1.0,
"batches": [
{
"id": 136660530,
"class": "batch",
"owner": "Super User",
"created_at": "2007-02-04",
"modified_at": "2007-12-31",
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
} ]
} ]
}
Example 3 - Return Molecules with internal IDs 3 and 927454016
Note: passing parameters in the GET Molecules URL is obsolete. Going forward, create a JSON file containing the parameters you wish to submit with the GET Molecules API call (see Example 1 above).
curl -H "X-CDD-Token: $TOKEN" "https://app.collaborativedrug.com/api/v1/vaults/489978881/molecules?molecules=3,927454016"
Returns:
{
"count": 2,
"offset": 0,
"page_size": 50,
"objects": [
{
"id": 3,
"class": "molecule",
"name": "Cyanide",
"synonyms": [ "Cyanide" ],
"cdd_registry_number": 3,
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
} ],
"collections": [
{
"name": "super user hits",
"id": 90769562
},
{
"name": "all",
"id": 557113845
} ],
"owner": "Full-Access User",
"created_at": "1999-01-01",
"modified_at": "1999-01-01",
"smiles": "C#N",
"cxsmiles": "C#N",
"inchi": "InChI=1S/CHN/c1-2/h1H",
"inchi_key": "LELOWRISYMNNSU-UHFFFAOYSA-N",
"iupac_name": "formonitrile",
"molfile": "\n Mrv1770 12121711262D \n\n 2 1 0 0 0 0 999 V2000\n 0.8250 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0\n 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 1 2 3 0 0 0 0\nM END\n",
"molecular_weight": 27.026,
"log_p": -0.346198,
"log_d": -0.339047,
"log_s": 0.321662,
"num_h_bond_donors": 0,
"num_h_bond_acceptors": 1,
"num_rule_of_5_violations": 0,
"formula": "CHN",
"isotope_formula": "CHN",
"dot_disconnected_formula": "CHN",
"p_k_a": 9.5,
"p_k_a_type": "Acidic",
"exact_mass": 27.010899036,
"heavy_atom_count": 2,
"composition": "C (44.44%), H (3.73%), N (51.83%)",
"isotope_composition": "C (44.44%), H (3.73%), N (51.83%)",
"topological_polar_surface_area": 23.79,
"num_rotatable_bonds": 0,
"cns_mpo_score": 5.1895,
"fsp3": 0.0,
"batches": [
{
"id": 618771089,
"class": "batch",
"name": "KGB-51AD88",
"owner": "Full-Access User",
"created_at": "2007-02-04",
"modified_at": "2007-12-31",
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
} ],
"formula_weight": 27.026
} ]
},
{
"id": 927454016,
"class": "molecule",
"name": "Taurine",
"synonyms": [ "Taurine" ],
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
} ],
"collections": [
{
"name": "super user hits",
"id": 90769562
} ],
"owner": "Super User",
"created_at": "2010-11-17",
"modified_at": "2017-12-12",
"smiles": "NCCS(O)(=O)=O",
"cxsmiles": "NCCS(O)(=O)=O",
"inchi": "InChI=1S/C2H7NO3S/c3-1-2-7(4,5)6/h1-3H2,(H,4,5,6)",
"inchi_key": "XOAAWQZATWQOTB-UHFFFAOYSA-N",
"iupac_name": "2-aminoethane-1-sulfonic acid",
"molfile": "taurine\n Mrv1770 12121711262D \n\n 7 6 0 0 0 0 999 V2000\n 0.1105 -2.0625 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0\n 0.1105 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 -0.8250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 0.0000 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0\n 0.8250 0.8250 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 1.6500 -0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n 1 2 1 0 0 0 0\n 2 3 1 0 0 0 0\n 3 4 1 0 0 0 0\n 4 5 1 0 0 0 0\n 4 6 2 0 0 0 0\n 4 7 2 0 0 0 0\nM END\n",
"molecular_weight": 125.14,
"log_p": -2.61469,
"log_d": -2.61939,
"log_s": 0.996488,
"num_h_bond_donors": 2,
"num_h_bond_acceptors": 4,
"num_rule_of_5_violations": 0,
"formula": "C2H7NO3S",
"isotope_formula": "C2H7NO3S",
"dot_disconnected_formula": "C2H7NO3S",
"p_k_a": -1.49,
"p_k_a_type": "Acidic",
"exact_mass": 125.014664263,
"heavy_atom_count": 7,
"composition": "C (19.2%), H (5.64%), N (11.19%), O (38.35%), S (25.62%)",
"isotope_composition": "C (19.2%), H (5.64%), N (11.19%), O (38.35%), S (25.62%)",
"topological_polar_surface_area": 80.39,
"num_rotatable_bonds": 2,
"cns_mpo_score": 4.82803,
"fsp3": 1.0,
"batches": [
{
"id": 136660530,
"class": "batch",
"owner": "Super User",
"created_at": "2007-02-04",
"modified_at": "2007-12-31",
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
} ]
} ]
} ]
}
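List responses such as the one above wrap the molecules in an "objects" array alongside count, offset and page_size. The following Python sketch shows one way to walk that structure and collect the batch ids of each molecule. The response text is a trimmed stand-in that uses only keys and values from the example above, and the helper name batch_ids_by_molecule is illustrative.

```python
import json

# Trimmed stand-in for a GET Molecules response; only keys that appear in
# the example above are used.
response_text = """
{
  "count": 2,
  "offset": 0,
  "page_size": 50,
  "objects": [
    {"id": 3, "name": "Cyanide", "batches": [{"id": 618771089}]},
    {"id": 927454016, "name": "Taurine", "batches": [{"id": 136660530}]}
  ]
}
"""


def batch_ids_by_molecule(response):
    """Map each molecule name to the ids of its batches."""
    return {
        molecule["name"]: [batch["id"] for batch in molecule.get("batches", [])]
        for molecule in response["objects"]
    }


result = batch_ids_by_molecule(json.loads(response_text))
print(result)
```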
Create
POST /api/v1/vaults/<vault_id>/molecules/
Creates a new molecule.
Note: this operation is not permitted in registration vaults. Creation of new molecules in registration vaults must be accomplished using POST batch.
Allowed JSON keys for molecules:
class |
Optional. If present, must be “molecule” |
smiles |
Only one of these fields can be present. “structure” accepts SMILES strings or Molfiles as values. For molfiles, replace all new lines with \n (JSON requirement). |
name |
String (Required) |
description |
String |
synonyms |
An array of strings |
udfs (user defined fields) |
{<udf_name>: <udf_value>, ... } |
projects |
An array of project ids and/or names (Required) |
collections |
An array of collection ids and/or names |
Example
curl -H "X-CDD-Token: $TOKEN" -X POST -H "Content-Type: application/json" -d "@data.json" https://app.collaborativedrug.com/api/v1/vaults/489978881/molecules
File data.json:
{
"name": "methane",
"projects": [938429932, "Super User Private Stuff"],
"udfs": {
"field1": "value1",
"field2": "value2"
},
"molfile": "\n MJ171500 \n\n 1 0 0 0 0 0 0 0 0 0999 V2000\n -0.4464 1.3727 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\nM END\n"
}
Returns:
{
"id": 1070752025,
"class": "molecule",
"name": "methane",
"synonyms": [ "methane" ],
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
}
],
"owner": "Super User",
"created_at": "2017-12-12",
"modified_at": "2017-12-12",
"smiles": "C",
"cxsmiles": "C",
"inchi": "InChI=1S/CH4/h1H4",
"inchi_key": "VNWKTOKETHGBQD-UHFFFAOYSA-N",
"iupac_name": "methane",
"molfile": "\n MJ171500 \n\n 1 0 0 0 0 0 0 0 0 0999 V2000\n -0.4464 1.3727 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\nM END\n",
"molecular_weight": 16.043,
"log_p": 1.08092,
"log_d": 1.08092,
"log_s": 0.588218,
"num_h_bond_donors": 0,
"num_h_bond_acceptors": 0,
"num_rule_of_5_violations": 0,
"formula": "CH4",
"isotope_formula": "CH4",
"dot_disconnected_formula": "CH4",
"exact_mass": 16.0313001288,
"heavy_atom_count": 1,
"composition": "C (74.87%), H (25.13%)",
"isotope_composition": "C (74.87%), H (25.13%)",
"topological_polar_surface_area": 0.0,
"num_rotatable_bonds": 0,
"cns_mpo_score": 5,
"fsp3": 1.0,
"udfs":
{
"field1": "value1",
"field2": "value2"
}
}
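The requirement to replace new lines in a molfile with \n is handled automatically by a JSON serializer. The Python sketch below demonstrates this with a placeholder multi-line string standing in for a real molfile. The variable names are illustrative; the project values are taken from the data.json example above.

```python
import json

# Placeholder text with real line breaks; a real molfile read from disk
# would go here instead.
molfile_text = """
  placeholder header line

  (counts line, atom block and bond block go here)
M  END
"""

payload = {
    "name": "methane",
    "projects": [938429932, "Super User Private Stuff"],
    "molfile": molfile_text,
}

# json.dumps escapes each line break as the two characters \ and n.
encoded = json.dumps(payload)
print(encoded)
```

Decoding the string with json.loads restores the original line breaks, so the molfile content itself is not altered by the escaping.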
Update
PUT /api/v1/vaults/<vault_id>/molecules/<id>
Updates an existing molecule.
See Create for valid fields. Fields not specified in the JSON are not changed.
To delete a molecule, simply submit with an empty projects array.
Note that the ID for the molecule is specified as part of the URL, not in the JSON (the JSON can contain an ID field as long as its value matches the URL).
Notes on how particular fields are processed:
name |
If this field is supplied, the old name is automatically added as a synonym. To delete a molecule name, you must exclude it from the list of synonyms. Names cannot be changed in registration vaults. |
synonyms |
If supplied, the list of synonyms replaces the existing list |
udfs |
Only fields explicitly mentioned will be changed. A user-defined field can be removed by using the value null (no quotes). |
Examples
curl -H "X-CDD-Token: $TOKEN" -X PUT -H "Content-Type: application/json" -d "@data.json" https://app.collaborativedrug.com/api/v1/vaults/489978881/molecules/1070752025
File data.json:
{
"name": "a new name",
"udfs": {
"field1": null
}
}
Returns:
{
"id": 1070752025,
"class": "molecule",
"name": "a new name",
"synonyms": [ "methane", "a new name" ],
"projects": [
{
"name": "McKerrow Vault",
"id": 938429932
},
{
"name": "Super User Private Stuff",
"id": 3870925
}
],
"owner": "Super User",
"created_at": "2017-12-12",
"modified_at": "2017-12-12",
"smiles": "C",
"cxsmiles": "C",
"inchi": "InChI=1S/CH4/h1H4",
"inchi_key": "VNWKTOKETHGBQD-UHFFFAOYSA-N",
"iupac_name": "methane",
"molfile": "\n MJ171500 \n\n 1 0 0 0 0 0 0 0 0 0999 V2000\n -0.4464 1.3727 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\nM END\n",
"molecular_weight": 16.043,
"log_p": 1.08092,
"log_d": 1.08092,
"log_s": 0.588218,
"num_h_bond_donors": 0,
"num_h_bond_acceptors": 0,
"num_rule_of_5_violations": 0,
"formula": "CH4",
"isotope_formula": "CH4",
"dot_disconnected_formula": "CH4",
"exact_mass": 16.0313001288,
"heavy_atom_count": 1,
"composition": "C (74.87%), H (25.13%)",
"isotope_composition": "C (74.87%), H (25.13%)",
"topological_polar_surface_area": 0.0,
"num_rotatable_bonds": 0,
"cns_mpo_score": 5.0,
"fsp3": 1.0,
"udfs":
{
"field2": "value2"
}
}
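The field-processing rules for Update can be pictured with a small local model. The Python sketch below is only an illustration of the documented behavior applied to a dictionary and is not the server's implementation. The function name apply_update is hypothetical, and the handling of a request that supplies both name and synonyms is an assumption, since the text does not spell it out.

```python
def apply_update(molecule, patch):
    """Return a copy of `molecule` with `patch` applied per the notes above."""
    updated = dict(molecule)
    updated["synonyms"] = list(molecule.get("synonyms", []))
    updated["udfs"] = dict(molecule.get("udfs", {}))

    if "synonyms" in patch:
        # A supplied list replaces the existing synonyms.
        updated["synonyms"] = list(patch["synonyms"])
    if "name" in patch:
        updated["name"] = patch["name"]
        # Assumption: names are added as synonyms only when the caller did
        # not supply an explicit synonyms list.
        if "synonyms" not in patch:
            for name in (molecule["name"], patch["name"]):
                if name not in updated["synonyms"]:
                    updated["synonyms"].append(name)
    for field, value in patch.get("udfs", {}).items():
        if value is None:
            # null removes the user-defined field.
            updated["udfs"].pop(field, None)
        else:
            updated["udfs"][field] = value
    return updated


before = {
    "name": "methane",
    "synonyms": ["methane"],
    "udfs": {"field1": "value1", "field2": "value2"},
}
after = apply_update(before, {"name": "a new name", "udfs": {"field1": None}})
print(after)
```

Applied to the values from the example above, the model yields the same name, synonyms and udfs as the documented response.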
Deleting a molecule
curl -H "X-CDD-Token: $TOKEN" -X PUT -H "Content-Type: application/json" -d "@data.json" https://app.collaborativedrug.com/api/v1/vaults/489978881/molecules/1070752025
File data.json:
{
"projects": []
}
Returns:
{
"message": "Object has been removed from all projects, so it has been deleted"
}
(Note: this is only possible in a "non-registration" vault; the majority of vaults are so-called "registration" vaults. To delete a molecule in a registration vault, you must do so at the batch level; see the related API article.)