You can search on fields associated with molecules and batches in the Keywords box on the Search page on the Explore Data tab. While Protocol data cannot be searched in this section, it is available in the Protocols section of the Search interface and may be combined with the keyword search.
The keyword search option consists of:
- Drop-down list of available molecule and batch fields
- Drop-down list of operators
- Text-box
- Links to add a term to the search or remove a term from it
General search rules
- Search values are not case-sensitive.
- Search terms will be matched exactly.
- A search can contain up to 20,000 terms.
- Searches can include any combination of terms in any order.
- A search can contain up to 65,535 characters.
- Retrieve all molecules and batches by selecting Any field and (Any Value).
- Complex searches can be created by adding terms.
- The information retrieved is determined by the fields you use in the search.
- All molecules matching the criteria and all of their batches are retrieved when searching only molecule fields or when searching Any field.
- Searching on only batch fields retrieves batches of molecules matching the criteria.
- Searching on both molecule and batch fields retrieves batches of molecules matching the criteria.
- When the results table is loaded, additional fields may be added via the Customize your report link at the top of the table.
Exact Match or Partial Match Search
By default, the search will return results that match your terms exactly, allowing you to find a precise list of molecules, user-defined fields, or any combination of keywords. It is possible to perform a partial match search by using a wildcard, a special character or an alphanumeric string.
Wildcard - Star (*)
- To compose flexible, partial match searches, use the "*" wildcard to match 1 or more additional characters. For example, "Ben*" will find anything that starts with "ben": benzene, benzodiazepine, benadryl, etc.
- Note: Keyword searching is prefix-based, meaning that the search term must include the initial characters of a term and cannot start from the middle or tail of it. For example, to find benzodiazepine, the search must include "ben" or "benzo". Searching for "diazepine" finds nothing, and adding a leading wildcard does not help: "*diazepine" will not work.
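The prefix rule above can be illustrated with a minimal Python sketch. This is not how the search is implemented internally; the function name `matches_keyword` is hypothetical, and the sketch compares a term against a whole value, ignoring the splitting rules described in the following sections.

```python
def matches_keyword(term: str, value: str) -> bool:
    """Return True if `term` matches `value` under the rules sketched here."""
    term = term.strip().lower()    # search values are not case-sensitive
    value = value.strip().lower()
    if term.endswith("*"):
        prefix = term[:-1]
        # Trailing star: the value must start with the prefix and, following
        # the "1 or more additional characters" wording, be longer than it.
        return value.startswith(prefix) and len(value) > len(prefix)
    # No trailing star: exact match only. A leading star gets no special
    # treatment, so "*diazepine" never matches "benzodiazepine".
    return term == value


names = ["benzene", "benzodiazepine", "benadryl", "aspirin"]
hits = [n for n in names if matches_keyword("Ben*", n)]
print(hits)
```

With the sample list, "Ben*" selects the three names starting with "ben", while "diazepine" and "*diazepine" select nothing.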
Special characters
- Search terms are split into separate terms at any of the special characters listed below.
: / . , \ ~ ! @ # $ % ^ & ( ) - _ + = < > ?
- These characters divide the search term into sections that may be searched independently.
- e.g. "ABD-123873-old-5" is broken up into separate terms "ABD" and "123873" and "old" and "5".
- Searching for "123873" will retrieve "ABD-123873-old-5".
- The order of the terms is preserved.
- e.g. searching for “old.5” will match "ABD-123873-old-5" but “5-old” will not.
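The splitting and ordering rules for special characters can be sketched in Python as follows. The function names `split_terms` and `matches_in_order` are hypothetical, and the sketch assumes that a multi-part query must match a contiguous run of the stored terms; the text only states that order is preserved, so contiguity is an assumption. The letter/number splitting described under Alpha-numeric is not handled here.

```python
import re

# The special characters listed above.
SEPARATORS = r""":/.,\~!@#$%^&()-_+=<>?"""
_SPLIT = re.compile("[" + re.escape(SEPARATORS) + "]+")


def split_terms(text: str) -> list:
    """Split text at the special characters and lowercase the pieces."""
    return [part.lower() for part in _SPLIT.split(text) if part]


def matches_in_order(query: str, value: str) -> bool:
    """True if the query's terms appear in the value in the same order.

    Assumption: the query terms must form a contiguous run in the value.
    """
    q, v = split_terms(query), split_terms(value)
    if not q:
        return False
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))


print(split_terms("ABD-123873-old-5"))
print(matches_in_order("old.5", "ABD-123873-old-5"))
print(matches_in_order("5-old", "ABD-123873-old-5"))
```

Because "." and "-" are both separators, "old.5" and "old-5" produce the same terms, which is why the first query matches and the reversed one does not.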
Alpha-numeric
- Alpha-numeric strings containing both letters and numbers will be split where the letters change to numbers or vice versa. When the string is split, leading zeros in the numeric portion are ignored.
- e.g. Both “ABC1234” and “ABC01234” will be broken into “ABC” and “1234”. Searching on “ABC1234”, “ABC”, “ABC*”, or “1234” will retrieve both “ABC1234” and “ABC01234”. Note: “ABC0*” will not retrieve “ABC01234”, but “ABC1*” will.
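As an illustration of the alpha-numeric rule, the following Python sketch splits a string where letters change to numbers and drops leading zeros from the numeric parts. The function name `split_alphanumeric` is hypothetical, and the sketch only covers plain ASCII letters and digits.

```python
import re


def split_alphanumeric(text: str) -> list:
    """Split at letter/number boundaries; drop leading zeros from numbers."""
    parts = re.findall(r"[A-Za-z]+|[0-9]+", text)
    result = []
    for part in parts:
        if part.isdigit():
            # "01234" becomes "1234"; an all-zero run is kept as "0".
            result.append(part.lstrip("0") or "0")
        else:
            result.append(part.lower())
    return result


print(split_alphanumeric("ABC1234"))
print(split_alphanumeric("ABC01234"))
```

Both inputs yield the same two terms, which explains the note above: the stored numeric term is "1234", so a prefix of "1" can match it but a prefix of "0" cannot.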
Field Name
Any field
- Default setting of the Keyword Search.
- The search will look across all available fields. Restricting the search to any of the fields below will limit the search context to that field.
e.g. Search for a list of molecule names to retrieve all molecules in the list.
Molecule name
- The search will only look at the Molecule name field. Identifiers assigned as molecule names will be normalized prior to searching so that:
- The text-based prefix is separated from the numeric sequence. This applies whether the prefix directly precedes the numeric sequence (e.g. ABC001) or is separated from it by one of the following delimiters. In the latter case, the delimiter is removed for the purposes of searching and uniqueness validation.
: / . , \ ~ ! @ # $ % ^ & ( ) - _ + = < > ?
e.g. 'ABC-001', 'ABC.001', and 'ABC/001' all represent the same ID
- Leading zeros are removed from the numeric sequence, which is then searched independently of the prefix.
- Neither the leading zero padding nor the ID prefix needs to be included in this search.
e.g. The search below returns DEMO-0000012, DEMO-0000054 and DEMO-0000758.
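The normalization of molecule name identifiers can be approximated with the Python sketch below. The names `normalize_identifier` and `name_matches` are hypothetical, and the sketch assumes identifiers consist of an optional letter prefix, at most one delimiter, and a numeric sequence; names that do not fit this shape are simply not matched here.

```python
import re

DELIMITERS = r""":/.,\~!@#$%^&()-_+=<>?"""
_ID = re.compile(
    r"(?:([A-Za-z]+)[" + re.escape(DELIMITERS) + r"]?)?(\d+)"
)


def normalize_identifier(text: str):
    """Return (prefix or None, number) for an ID-like string, else None."""
    m = _ID.fullmatch(text.strip())
    if m is None:
        return None
    prefix, digits = m.groups()
    # int() discards the leading zero padding.
    return (prefix.upper() if prefix else None, int(digits))


def name_matches(query: str, name: str) -> bool:
    """True if the query identifies the name; prefix and padding optional."""
    q, n = normalize_identifier(query), normalize_identifier(name)
    if q is None or n is None:
        return False
    q_prefix, q_number = q
    n_prefix, n_number = n
    return q_number == n_number and (q_prefix is None or q_prefix == n_prefix)


print(normalize_identifier("ABC-001") == normalize_identifier("ABC/001"))
print(name_matches("12", "DEMO-0000012"))
```

Under these assumptions, "12", "DEMO12", and "DEMO-0000012" all identify the same molecule name.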
Synonyms
- The search will only look at the Synonym field.
- Identifiers are normalized in the same manner as Molecule names.
e.g. The search, “Asp*” will retrieve synonyms such as aspirin, asparagine, etc.
Data set
- Search the data set name for publicly available molecules. Make sure to select the data sets from the Explore Data sidebar under Public Data before executing your search. For example, select all data sets and search for "Malaria". Molecules returned will belong to Malaria-related data sets.
CDD#
- Search the CDD# assigned to all publicly registered molecules, either independently or alongside any privately assigned molecule names or synonyms. For example, searching for "CDD" or "CDD-" with only your vault data selected will return any molecules in your projects that have been registered in CDD public.
Note: If a private molecule in your vault appears with a CDD#, this means it is also available in CDD's public domain. Your private information is never shared.
Batch fields
- This is a general section heading that can be searched if you don't know the name of the specific batch field.
- Under Batch fields, there is batch name (- Name). There may be any number of batch fields beyond batch name, depending on business rules.
- If you are not able to find what you need under a specific field, try expanding your search to include all "batch fields" or "all fields".
-Name
- Search the Batch ID assigned by CDD. This is a required field.
- In Vaults with prefix-based batch names, this ID will be formatted as XX-000, where the prefix reflects the salt code and the suffix is the sequence number for this salt form.
- In Vaults without prefix-based batch names, this ID is a 3-digit number, e.g. 001.
- The search will return batches that match the Batch ID.
-Salt
- Search for molecules of a particular salt form, using the full salt name (rather than the salt code described by Batch Name). This is generated during compound registration. Available CDD salt names and salt codes are listed here.
- The returned list will include all batches containing the specified salt.
- To view the salt data in the results table, change display options to include additional batch fields.
e.g. Search for "Hydrochloride" will return any batch that has an HCl salt.
-Other fields
- Other common batch fields may include Person, Vendor, External ID, Date and Note. Batch fields will vary depending on your vault's business rules, so make sure to look in the keyword drop-down for your fields.
User defined fields
- User defined fields (UDFs) are molecule fields. Depending on the business rules, there may be no UDFs, or may be any number of pre-defined fields.
- This is a general section heading which can be used if you don't know which user defined field contains the desired information.
- If you are not able to find what you need under a specific field, try expanding your search to include all "user defined fields" or "all fields".
Operators
(Any Value)
- Retrieves any molecules or batches of molecules with a value in the specified field.
- Text box should be left blank. Entering a term in the text box will automatically change (Any Value) to has.
- When combined with Any field, all molecules will be retrieved.
e.g. Searching for (Any Value) in the Synonyms field will retrieve molecules with a value in the Synonyms field.
has
- Returns results containing the specified term(s).
- To search for multiple values in a specific field, enter a list of values (1 value per row).
- Wildcards, the star (*), are allowed.
e.g. Searching for either has “Hydrochloride” or has “Hydrochl*” in the Salt field will retrieve batches of molecules containing a hydrochloride salt.
From
- Use to search for a range of numeric values.
- Only supported for numeric fields.
e.g. To find all batches where the current amount is less than or equal to 5, search for from “0” to “5” in the Inv Current Amount field.
Not
- Retrieves any molecules or batches of molecules not containing the specified term(s).
- Wildcards, the star (*), are allowed.
e.g. To find all batches that have a salt, search for not “No Salt*” in the Salt field.
(No Value)
- Retrieves any molecules or batches of molecules without any value in the specified field.
e.g. Searching for (No Value) in the Purity field will retrieve batches of molecules where the Purity field is empty or blank.
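The five operators above can be pictured as simple predicates on a single field value, as in this Python sketch. All function names are hypothetical. The sketch makes several assumptions that the text does not settle: the lower bound of a from range is inclusive, not also returns records whose field is empty, and a term is compared against the whole field value rather than against split terms.

```python
from typing import Optional, Sequence


def _term_matches(term: str, value: str) -> bool:
    """Case-insensitive exact match, or prefix match with a trailing star."""
    term, value = term.strip().lower(), value.strip().lower()
    if term.endswith("*"):
        return value.startswith(term[:-1])
    return term == value


def any_value(value: Optional[str]) -> bool:
    """(Any Value): the field holds something."""
    return value is not None and value.strip() != ""


def no_value(value: Optional[str]) -> bool:
    """(No Value): the field is empty or blank."""
    return not any_value(value)


def has(value: Optional[str], terms: Sequence[str]) -> bool:
    """has: the field matches at least one of the listed terms."""
    return any_value(value) and any(_term_matches(t, value) for t in terms)


def not_(value: Optional[str], terms: Sequence[str]) -> bool:
    """not: the field matches none of the terms (assumed true for empty)."""
    return not has(value, terms)


def from_to(value: Optional[float], low: float, high: float) -> bool:
    """from ... to: numeric range, assumed inclusive at both ends."""
    return value is not None and low <= value <= high


print(has("Hydrochloride", ["Hydrochl*"]))
print(not_("No Salt", ["No Salt*"]))
print(from_to(5, 0, 5))
```

Passing several terms to `has` mirrors entering one value per row in the text box.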
Add a term/Remove a term
Searches containing multiple search terms can be constructed by adding one or more terms. The terms are combined with either "and" or "or".
And
- All conditions must be met.
e.g. To find batches in Freezer 1 with an amount less than 5, search for Current Amount from “0” to “5” then add a term for Location has Freezer 1. Click the and radio button.
Or
- One condition must be met.
e.g. To find batches from vendors that start with Sig or from TCI, search for Vendor has “Sig*” then add a term for Vendor has “TCI”. Click the or radio button.
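The two examples above can be sketched in Python by treating each term as a predicate and combining the predicates with `all` or `any`. The function `run_search`, the field names, and the sample batches are all hypothetical and serve only to illustrate the logic; the sketch also assumes a single and/or choice applies to every term in the search.

```python
from typing import Any, Callable, Dict, List

Batch = Dict[str, Any]
Term = Callable[[Batch], bool]


def run_search(batches: List[Batch], terms: List[Term], mode: str = "and") -> List[Batch]:
    """Keep batches that satisfy all terms ("and") or at least one ("or")."""
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    combine = all if mode == "and" else any
    return [b for b in batches if combine(term(b) for term in terms)]


# Hypothetical sample data.
batches = [
    {"name": "001", "location": "Freezer 1", "amount": 3.0, "vendor": "Sigma"},
    {"name": "002", "location": "Freezer 1", "amount": 12.0, "vendor": "TCI"},
    {"name": "003", "location": "Freezer 2", "amount": 1.0, "vendor": "Acme"},
]

low_amount = lambda b: 0 <= b["amount"] <= 5                    # from "0" to "5"
in_freezer_1 = lambda b: b["location"].lower() == "freezer 1"   # has Freezer 1
vendor_sig = lambda b: b["vendor"].lower().startswith("sig")    # has "Sig*"
vendor_tci = lambda b: b["vendor"].lower() == "tci"             # has "TCI"

and_hits = [b["name"] for b in run_search(batches, [low_amount, in_freezer_1], "and")]
or_hits = [b["name"] for b in run_search(batches, [vendor_sig, vendor_tci], "or")]
print(and_hits, or_hits)
```

With this sample data, the "and" search keeps only the batch that is both in Freezer 1 and low in amount, while the "or" search keeps the batches from either vendor.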