This article covers the details of biological protocol set-up from raw data. We'll assume here that you've created protocols (also referred to as protocol definitions) before, so if you have not, you can start with this introductory tutorial instead. IC50 Dose-response protocols are described separately here.
This type of protocol will be suitable for any end-point data entry such as cytotoxicity, DMPK, or single-point screening data. Each protocol definition will consist of the following steps:
Create a new protocol (definition)
Before runs can be created and assay results can be imported, we will need to set up a protocol. Protocol architecture in CDD is very flexible, and many kinds of data can be accommodated, including data from enzymatic and cell-based assays, in vitro and in vivo ADME/TOX screens, as well as in vivo pharmacodynamic and efficacy data.
For any one assay, there are several ways to design a protocol, depending on how you plan to aggregate the data (calculate averages and such), and how you plan to search/mine it in the future. If you're unsure of the best design for your specific protocol, you can always contact your account manager, or Support directly.
With this in mind, here is a list of things to consider while planning your protocols:
- What is your raw data? (that you collect directly from your instruments)
- What is your primary result? (e.g. % Inhibition, Ki, IC50, etc)
- What are the conditions that you need to capture to give enough context to the results?
- What are the calculations you need to perform on your data?
- Do you need to aggregate/average your data?
With a good plan in hand, it's time to build your protocol definition. Think of this as the set of instructions that tells the system what to do with the data associated with the entities you register, and how and where to do it.
Create a new protocol
On the Explore Data tab, click Create New at the top of the sidebar, and choose Protocol from the drop-down.
Protocol Fields in the Create a New Protocol dialog are configured and managed by your Vault Administrator under the Vault Settings. More on that in a bit. Name, Category and Description are default Protocol Fields supplied by the system.
Name - The protocol's name should be short and descriptive of the assay. It's a good idea to create some protocol-naming guidelines so that later, when your vault has dozens or even hundreds of protocols created by different users, you'll be able to pick the one you need from a list.
Category - This field is not required, but it can be very helpful later for grouping protocols together and for narrowing down search results in your vault. By default it is a text field, though we recommend configuring it as a pick list so that users categorize protocol definitions consistently and searches retrieve them reliably. Commonly used Category terms include cellular assay, biochemical, ADME, binding, In Vivo, etc.
Description - Another optional field that will be very helpful to other scientists who want to understand your protocol or to you in a year when you want to remember what was done.
Additional Protocol Fields - Other optional fields may be available to populate if your Vault Administrator has created custom Protocol Fields. These fields not only help you retrieve specific data, they also allow you to sort data when accessing the Protocol Index Page.
Project - The project field is required because it determines who has access to the data in the new protocol. When creating a new protocol you can select only one project from the drop-down menu, but projects can be added or removed later.
Don't forget to click "Create protocol" at the bottom of the form.
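To make the form easier to picture, here is a minimal sketch of the same fields as a small record. It is purely illustrative: `ProtocolDraft` and its attribute names are hypothetical and are not part of any CDD Vault API. It only encodes what is stated above, namely that a project must be chosen and that Category and Description are optional. Treating an empty name as a problem is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProtocolDraft:
    """Hypothetical stand-in for the Create a New Protocol form."""

    name: str
    project: str                       # exactly one project at creation time
    category: Optional[str] = None     # optional, ideally from a pick list
    description: Optional[str] = None  # optional
    custom_fields: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return a list of reasons the draft is not ready to be created."""
        issues = []
        if not self.name.strip():
            issues.append("name is empty")  # assumption: a name is expected
        if not self.project.strip():
            issues.append("project is required")
        return issues


draft = ProtocolDraft(name="Kinase X single-point screen", project="Oncology")
print(draft.problems())  # an empty list means nothing is missing
```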
You are now taken to the "protocol details" page where you will continue to build out the protocol by creating readout definitions.
Readout definitions represent all of the result types that are captured for any specific assay. The result types may include all of the following: conditions, experimental data, calculated results, and meta-data such as experiment # and assay order. While "readout definition" is the default name of this record type, it may have alternate nomenclature within your specific vault. For example, it may be called "assay parameter" or "result type".
A basic readout definition has a name and a data type; this is the required minimum. The readout is imported directly from an import file with no calculations applied, so the values appear exactly as they do in the file. You may also choose to assign a display format and/or a unit of measure.
A normalized readout definition is the result of a calculation based on an imported basic readout. The normalization options become available when you choose the "number" data type for the basic readout definition. When left on the default "do not normalize", the readout is simply the basic readout, i.e. the end-point value imported from the file. Calculations cannot be performed on non-number data types.
Here are the available fields and options for readout definitions:
Name- required. This is the label of the raw data readout. This could be a measured readout or a captured observation, e.g., RFU, Abs, Counts.
Required- Check this box if the readout you are creating must be populated for every row of readout data registered/imported. If a null value is read, the entire data row will be rejected.
Data type- required. This sets the type of data that is permitted in this readout definition.
- Number- the most common data type for HTS data. Only numeric values are permitted, optionally preceded by a modifier (>, <, >=, <=). This means that things like N/A or * or "precipitated" are not allowed.
- Text- alphanumeric values are permitted. Qualitative results and hyperlinks should also be entered as this data type.
- Pick List- a pre-defined list of alphanumeric values that may be entered for this readout. This gives the protocol owner the ability to control the values that are imported into the vault. Data such as phenotypes, descriptions, cell lines etc. should be defined as pick lists. Note that you cannot calculate across pick-list values! Learn more about pick lists.
- Batch Link- allows linking Batch records to other entities stored in the same CDD Vault or across separate CDD Vaults (if the Link Across Vaults feature is enabled for your CDD Account).
- File- file attachments of any file type and size. Image previews will be generated for JPG, GIF, BMP, PNG, TIFF and PDF formats. All other files will be available for download to view with their native software.
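To illustrate the rule for the Number data type, here is a minimal sketch of a check you could run on an import file before uploading it. The function name `parse_number_readout` is hypothetical and this is not CDD Vault's own parser; accepting scientific notation and reporting a missing modifier as "=" are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Optional modifier (>=, <=, >, <) followed by a plain or scientific-notation number.
_NUMBER = re.compile(
    r"^\s*(>=|<=|>|<)?\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*$"
)


def parse_number_readout(raw: str) -> Optional[Tuple[str, float]]:
    """Return (modifier, value) for an acceptable entry, or None otherwise."""
    match = _NUMBER.match(raw)
    if match is None:
        return None
    modifier, number = match.groups()
    # "=" is used here only as a placeholder meaning "no modifier given".
    return (modifier or "=", float(number))


for entry in ["12.3", ">10", "<=0.5", "N/A", "*", "precipitated"]:
    print(repr(entry), "->", parse_number_readout(entry))
```

Entries that come back as `None` would be better placed in a Text or Pick List readout.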
Protocol Condition - Check this box if the readout you are creating should be used by CDD Vault to aggregate data. Details on Protocol Conditions are documented here, in the Knowledgebase.
Display Format- defaults to 3 significant figures. This will determine the number of digits that appear throughout the vault. Much like Excel, this format is only a style, while all calculations are performed on the underlying full number.
- Decimal places- choose the number of decimal places following the decimal separator.
- Significant figures- choose the number of significant figures. If you remember your high school math, significant figures include all digits except leading zeros; trailing zeros count only when the number contains a decimal point.
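The difference between the two display formats, and the fact that formatting is only a style, can be sketched as follows. The helper names are hypothetical and the rounding shown is ordinary Python rounding, which is an assumption and may differ in edge cases from what the vault displays.

```python
from math import floor, log10


def format_decimal_places(value: float, places: int) -> str:
    """Keep a fixed number of digits after the decimal separator."""
    return f"{value:.{places}f}"


def format_significant(value: float, figures: int = 3) -> str:
    """Keep a fixed number of significant figures, without scientific notation."""
    if value == 0:
        return "0"
    # Number of decimals needed so that `figures` digits remain in total.
    decimals = figures - 1 - floor(log10(abs(value)))
    rounded = round(value, decimals)
    return f"{rounded:.{max(decimals, 0)}f}"


stored = 12.3456  # the underlying full number
print(format_significant(stored, 3))     # display only
print(format_decimal_places(stored, 2))  # display only
print(stored)                            # calculations still use this value
```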
Unit- optional. While the unit of measurement is an optional field, if you remember your high school science, you should include a unit whenever you are taking a measurement. This is a free text field, so any unit can go in here. The units should stay consistent throughout the protocol. This is another thing they teach in Chemistry/Biology 101, so we won't harp on it.
Description- optional. Typically the description will be necessary if you did the calculation outside of CDD and are importing a final value, or if you have a scoring system that needs explanation.
Normalization- optional. Common data normalization options for HTS data are supported.
The drop-down for normalization includes the following options which will influence the fit validation as well as plot scales, so choose one that best describes your data:
- Normalize within each plate - both positive and negative controls are run on each screening plate. Controls will be averaged per plate, and test data normalization will be performed per plate. This will help remove any plate-to-plate variation.
- Normalize within each run - positive and negative controls are present on one or on some plates. All controls will be averaged together across plates before test data are normalized.
- Already normalized - if you perform another normalization outside of CDD, this is the best option to use.
- No controls (do not normalize) - choose this option if you're using data that will not have a consistent scale, e.g., fold-change.
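The practical difference between the first two options is where the control averages come from. The sketch below assumes a simple percent-inhibition scaling between the negative and positive control means; that formula is a common choice used here for illustration and is not necessarily the function your readout applies. All function names and the sample numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, List

Plate = Dict[str, List[float]]  # keys: "neg", "pos", "samples"


def percent_inhibition(raw: float, neg_mean: float, pos_mean: float) -> float:
    """Assumed scaling: negative control mean = 0 %, positive control mean = 100 %."""
    return 100.0 * (neg_mean - raw) / (neg_mean - pos_mean)


def normalize_within_each_plate(plates: Dict[str, Plate]) -> Dict[str, List[float]]:
    """Average the controls per plate, then normalize each plate on its own."""
    result = {}
    for plate_id, plate in plates.items():
        neg, pos = mean(plate["neg"]), mean(plate["pos"])
        result[plate_id] = [percent_inhibition(v, neg, pos) for v in plate["samples"]]
    return result


def normalize_within_each_run(plates: Dict[str, Plate]) -> Dict[str, List[float]]:
    """Average all controls across plates first, then normalize every plate."""
    neg = mean(v for p in plates.values() for v in p["neg"])
    pos = mean(v for p in plates.values() for v in p["pos"])
    return {
        plate_id: [percent_inhibition(v, neg, pos) for v in plate["samples"]]
        for plate_id, plate in plates.items()
    }


# Plate P2 reads about twice as high as P1 overall (plate-to-plate variation).
run = {
    "P1": {"neg": [100.0, 110.0], "pos": [10.0, 20.0], "samples": [60.0]},
    "P2": {"neg": [200.0, 220.0], "pos": [20.0, 40.0], "samples": [120.0]},
}
print(normalize_within_each_plate(run))
print(normalize_within_each_run(run))
```

With these made-up numbers, per-plate normalization puts both samples on the same scale, while per-run normalization lets the plate offset leak into the results.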
Data normalization functions -
Fit parameters: Min, Max, and Hill Slope
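The curve fit is performed using the standard Hill equation, also known as the four-parameter logistic curve:
$$\text{Response} = \text{Baseline response} + \frac{\text{Maximum response} - \text{Baseline response}}{1 + \left(\dfrac{\text{EC50}}{\text{Concentration}}\right)^{\text{Hill Slope}}}$$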
Response is the measured response on the Y axis.
Baseline response is the minimum response at the bottom of the plateau.
Maximum response is the maximum response at the top of the plateau.
EC50 is the concentration that produces a response halfway between the baseline and maximum responses.
Concentration is the measured drug concentration on the X axis.
Hill Slope is the Hill coefficient that describes the steepness of the curve.
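As a rough illustration of how these terms fit together, the sketch below evaluates the standard four-parameter logistic form, Response = Baseline + (Maximum - Baseline) / (1 + (EC50 / Concentration)^Hill Slope). The function name `hill_response` is hypothetical, the sketch only evaluates the curve (it does not fit it), and it assumes a response that rises with concentration for a positive Hill slope.

```python
def hill_response(
    concentration: float,
    baseline: float,
    maximum: float,
    ec50: float,
    hill_slope: float,
) -> float:
    """Evaluate the four-parameter logistic curve at one concentration (> 0)."""
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    return baseline + (maximum - baseline) / (
        1.0 + (ec50 / concentration) ** hill_slope
    )


# At the EC50 the response sits halfway between baseline and maximum.
print(hill_response(1.0, baseline=0.0, maximum=100.0, ec50=1.0, hill_slope=1.5))
# Far below and far above the EC50 the curve approaches the two plateaus.
print(hill_response(0.001, baseline=0.0, maximum=100.0, ec50=1.0, hill_slope=1.0))
print(hill_response(1000.0, baseline=0.0, maximum=100.0, ec50=1.0, hill_slope=1.0))
```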
Calculated Readout Definition- Data in import files or reader files can be stored as basic readouts. CDD Vault includes the arithmetic and geometric mean calculations for those readouts. Visualization subscribers may access further functionality for calculations, including variables and calculated chemical properties. In order to expose any calculated values, these calculated readout definitions need to be built into the protocol definitions.
For details please refer to this knowledge base article:
Custom calculations within one protocol
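For readers deciding between the two built-in means, here is a small sketch of how they differ on the same replicate values. It uses Python's standard `statistics` module (Python 3.8 or later) rather than anything from CDD Vault, and the function name `summarize` and the replicate values are made up for illustration.

```python
from statistics import geometric_mean, mean
from typing import Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Return both means for a list of positive replicate values."""
    return {
        "arithmetic_mean": mean(values),
        "geometric_mean": geometric_mean(values),
    }


# Replicates spanning orders of magnitude: the arithmetic mean is pulled
# toward the largest value, the geometric mean is not.
print(summarize([1.0, 10.0, 100.0]))
```

The geometric mean is often the better summary for values that vary on a log scale, such as potencies.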
Choose a control plate layout if you have defined any normalized readouts, with the exception of sample-based z-score. The control layouts are used by the normalization functions that need to calculate the negative and positive control means and standard deviations. Here's a complete article that addresses control layouts.
Control layouts defined on the protocol details tab are the default layouts applied to all plates imported into your protocol. These defaults can be overridden on individual runs, or even on individual plates.
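One way to picture the defaults and overrides is as a simple fallback chain. The sketch below is hypothetical and not taken from CDD Vault; in particular, it assumes that the most specific setting wins, so a plate-level layout takes precedence over a run-level layout, which in turn takes precedence over the protocol default.

```python
from typing import Optional


def effective_layout(
    protocol_default: str,
    run_override: Optional[str] = None,
    plate_override: Optional[str] = None,
) -> str:
    """Pick the control layout for one plate (assumed precedence: plate, run, protocol)."""
    if plate_override is not None:
        return plate_override
    if run_override is not None:
        return run_override
    return protocol_default


# Layout names are made-up labels for the example.
print(effective_layout("columns 1-2 controls"))
print(effective_layout("columns 1-2 controls", run_override="columns 23-24 controls"))
print(
    effective_layout(
        "columns 1-2 controls",
        run_override="columns 23-24 controls",
        plate_override="row A controls",
    )
)
```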