Information submitted through the support site is private but is not hosted within your secure CDD Vault. Please do not include sensitive intellectual property in your support requests.

Custom calculations within one protocol

Calculations performed between numeric protocol readouts, chemical properties, and constant numbers, can be defined in CDD Vault when the CDD Visualization extension is enabled in the Vault. Contact support to find out how to become a CDD Vision subscriber.

In this article, we are looking at performing calculations between readouts of a single protocol and calculations with chemical properties. Go to this article for calculations that involve multiple protocols.

Basic steps to add a calculated readout

Reference Variables

Readout definitions

Chemical properties

Functions

Average

Geomean

Median

Sum

Aggregation Scope

Operators

 

Example Calculations

pIC50

Geometric mean of IC50

Absolute IC50

FP ratio

BEI

LiPE (LLE)

Reagent Inventory

 

Calculations are based on existing assay data within your CDD Vault protocols, as well as chemical properties of the registered compounds, and numeric constants. In this manner, you can build dynamic mathematical functions that use protocol readouts as variables, and are automatically updated when new raw data is imported into the Vault.

Calculations are defined as new readouts in existing protocols, based on previously defined readouts. In a new protocol, calculations will be added after the initial set of raw data readout definitions.

We assume here that you are familiar with defining protocols and readouts already, and if you are new to protocol creation, please start here.

 

Basic steps to add a calculated readout

On the "Protocol Details" tab, and click "Add/edit readout definitions".

The three options you see are:

Screen_Shot_2014-05-05_at_3.45.49_PM.png

Chose to add a calculated readout definition. Note that if you don't see the formula box as shown below, you do not have access to advanced calculations. Contact your CDD representative or support to enable this.

You will need to provide a name for the readout definition (that does not conflict with any other readout names in this protocol of course), and then fill in the formula in the provided formula builder.

Screen_Shot_2014-05-05_at_3.50.10_PM.png

Begin typing in the Formula box to see what kinds of functions can be applied and combined.

Type a left square bracket [ to bring up a list of available readout definitions.

Type a curly bracket { to reveal a list of chemical properties.

Auto-complete and syntax help will appear as you type.

Use parentheses ( ) to indicate the order of operations.

 

Reference Variables

Calculations performed on referenced variables are dynamic: when the underlying variables are changed, the calculation is updated automatically. In this way, a single calculation can be defined to be performed on previously imported data, and on any incoming data.

 

Readout definitions

For your calculation, you can reference other existing readout definitions from the current protocol. Make sure to define all of the basic readout definitions before you start to combine them using calculations. For example, if you plan to build a calculation that includes % Inhibition values, % Inhibition needs to be defined ahead of the calculation- otherwise it will not be available as a reference variable when you start to build your formula. 

 

Syntax

Type in an open square bracket, then click a readout name in the drop-down list: the matching bracket will be inserted automatically. The list of available options includes all available readout definitions with your current protocol appearing first, followed by all other protocols in alphabetical order. 

[Protocol name -> readout definition name (units)]   e.g. [Primary Screen -> Raw Fluorescence (RFU)]

 

Scope

By default, when using above syntax, calculations are performed for each molecule/batch with data in the referenced readout in per readout row mode.

  • If the referenced readout is blank for a given batch, no calculated value will be returned for this batch.
  • If there are multiple values in the referenced readout for a given batch (replicate data), then multiple values will be returned by the calculation.
  • If the calculation references two readout definitions, a calculated value will be returned only if both of the referenced readout values occur on the same row. This happens when the data are added to the protocol in a single import event, and are both on the same row of the import file.

Here's an example of taking the ratio of two readouts for a single batch: the ratio value is calculated only when both CFP (477nm) and YFP(514nm) values appear on the same row, even though there are individual data points for both readouts on two other rows.

Screen_Shot_2014-05-05_at_4.43.41_PM.png

To perform a calculation based on multiple readout rows per batch, such as the scenario shown in the screen-shot above, the referenced readout(s) should aggregated with Average or Median functions.

 

Chemical properties

When you import molecule structures into CDD Vault, chemical properties are automatically calculated, and displayed on the molecule details page, or in the search results. These properties may be referenced as variables within calculated readout definitions.

 

Syntax

Type in an open curly bracket, then click a chemical property name in the drop-down list: the matching bracket will be inserted automatically.

{chemical property name}  e.g. {Molecular Weight}

Here's the list of available chemical properties:

    • Molecular weight (g/mol)
    • log P
    • H-bond donors
    • H-bond acceptors
    • Lipinski violations
    • logD
    • logS (solubility model)
    • pKa
    • CNS MPO score
    • Topological polar surface area
    • Exact mass (g/mol)
    • Fsp3
    • Heavy atom count
    • Rotatable bonds
    • Formula weight (g/mol)

Scope

When a chemical property is referenced in a calculated readout, molecule level properties are used, with the exception of formula weight, which is calculated per batch of molecule.

 

Functions

The following functions can be performed on referenced readout definitions. These functions may be used as part of another calculation, and in fact should be used on replicate data from multiple runs, or imported from multiple files.

Average

Average returns an average of all readout values in the selected aggregation scope for the chosen readout definition. When performed as a standalone calculation (not part of a larger formula), the returned result also includes the standard deviation, and the number (n) of values.

output: Avg ± StDev (n) 
23.43 ± 3.16 (n=8)

A bounded value (that contains "less than" or "greater than" numeric modifier) will not be included when computing the average if there is another measurement that is consistent with it and is more specific. Exact values are always included.

  • Less specific bounded values are not included:
    Average(<1, <10) = <1
    Average(20, 30, <100)  = 25
    Average(80, <100, 150)  = 115 (<100 is consistent with 80 and less specific)
  • Bounded values are retained if they are not consistent with an exact value:
    Average(<2, 8) = <5

This rule is based on how drug discovery data is normally generated. For example, if you perform a concentration-response assay between 100 to 10,000 nM, and the IC50 comes out as <100 nM, you will test at lower concentrations, say between 1 and 200 nM. If the IC50 in the second test was 80 nM, it makes sense to leave out the <100 nM from the first assay, because it has been superseded by the more specific 80 nM measurement.

On the other hand, if you ran the 100 to 10,000 nM assay twice and the only IC50s you had were <100 and 120 nM, we shouldn’t ignore the <100 nM, because doing so would result in a number that was artificially high.

 

Geomean

Geomean returns a geometric average of all readout values in the selected aggregation scope of the chosen readout definition. When performed as a standalone calculation (not part of a larger formula), the returned result also includes the standard deviation, and the number (n) of values.

Geomean is automatically inserted when averaging CDD-computed EC50 values. To over-ride the geomean aggregation, use "mean" in the formula.

output: Avg ×/÷ StDev (n) 
23.43 ×/÷ 3.16 (n=8)

A bounded value (that contains "less than" or "greater than" numeric modifier)  will not be included when computing the average if there is another measurement that is consistent with it and is more specific. Exact values are always included.

  • Less specific bounded values are not included:
    Average(<1, <10) = <1
    Average(20, 30, <100)  = 25
    Average(80, <100, 150)  = 115 (<100 is consistent with 80 and less specific)
  • Bounded values are retained if they are not consistent with an exact value:
    Average(<2, 8) = <5

This rule is based on how drug discovery data is normally generated. For example, if you perform a concentration-response assay between 100 to 10,000 nM, and the IC50 comes out as <100 nM, you will test at lower concentrations, say between 1 and 200 nM. If the IC50 in the second test was 80 nM, it makes sense to leave out the <100 nM from the first assay, because it has been superseded by the more specific 80 nM measurement.

On the other hand, if you ran the 100 to 10,000 nM assay twice and the only IC50s you had were <100 and 120 nM, we shouldn’t ignore the <100 nM, because doing so would result in a number that was artificially high.

Median

Median returns the median of all readout values in the selected aggregation scope for the chosen readout definition. When performed as a standalone calculation (not part of a larger formula), the returned result also includes the standard deviation, and the number (n) of values.

output: Median (n) 
23.43 (n=8)

Sum

Sum function adds all readout values in the selected aggregation scope of the chosen readout definition. When performed as a standalone calculation (not part of a larger formula), the returned result also includes the number (n) of values.

output: Sum (n)
        187.04 (n=8)

 

Syntax

average([Protocol name -> readout definition name (units)])   e.g. [Primary Screen -> Inhibition (%)]
geomean([Protocol name -> readout definition name (units) (CDD)]) e.g. geomean([Secondary Screen ->IC50 (uM)(CDD)])
sum([Protocol name -> readout definition name (units)]) e.g. sum([Inventory -> Amount Debited (mg)])


Aggregation Scope

Aggregation scope appears as a drop-down below the formula field. The same aggregation scopes apply to both average and median calculations.

Screen_Shot_2014-06-27_at_6.16.03_PM.png

  • Aggregate by batch and run - A single aggregate value is calculated per batch of molecule, for each individual run of the current protocol. This is the best option for replicate data.
  • Aggregate by molecule and protocol -A single aggregate value is calculated across all batches of a molecule and across all runs of the current protocol. Only one average value per molecule will ever be reported for the current protocol. This is the best summary level result.
  • Aggregate by batch and protocol - An individual aggregate value is calculated for each batch of a molecule, across all runs of the current protocol. There will be as many average values, as there were tested batches of a molecule in a protocol. This is the best option to keep track of compound performance.

 Here's an illustration of the different aggregation scope results:

Screen_Shot_2014-05-05_at_5.50.53_PM.png

 

Operators

Mathematical operations can be performed on constant numbers, readouts, including calculated readouts and chemical properties using the following operators

* multiplication
/ division
+ addition
- subtraction
^ power :  base^exponent
log() logarithm base 10 : log(number)
ln() natural logarithm, base e : ln(number)
exp() e to the power of number : exp(number)

 

Example calculations

 

pIC50

Don't forget to convert IC50 to Molar units

-log([Secondary Screen -> IC50 (uM)(CDD)] * 10^-6)

 

Geometric mean of the IC50 

If you don't use pIC50s (you really should), then IC50 values can be averaged using the geometric mean, since they are based on a logarithmic scale of concentrations. When you average the CDD-calculated IC50, the formula will automatically default to the geometric mean.

geomean([Secondary Screen -> IC50 (uM)(CDD)])

output: Avg ± StDev (n)
23.43 ± 3.16 (n=8)

 

Absolute IC50

Use the calculated fit parameters from the Dose-Response calculation, and substitute the absolute inhibition value. We show the equation for 50% activity, but a similar calculation can be done at any other activity. Thus, the formula will be applicable to other IC## or EC## calculations as well.

= [Protocol Name-> EC50 (uM)]/((([Protocol Name-> Maximum Response (%)] - [Protocol Name-> Baseline Response (%)])/(50 - [Protocol Name-> Baseline Response (%)])-1)^(1/[Protocol Name-> Hill Slope]))

 

Fluorescence polarization ratio

([Primary Screen -> nFparallel]-[Primary Screen -> Fperpendicular])/[Primary Screen -> Fparallel]+[Primary Screen  -> Fperpendicular]

 

Binding Efficiency Index

BEI is defined as (pIC50/molecular weight (kDa)). Based on the calculated IC50 from a protocol and molecular weight, reported in CDD in g/mol. Don't forget to convert IC50 to Molar units, and to convert molecular weight to KiloDaltons.

IC50(uM), MW (g/mol)
Create a pIC50 readout definition. Use the pIC50 to calculate BEI:

average([Secondary Screen -> pIC50]) / ({Molecular Weight} * 10^-3)

 

Lipophilic Efficiency

Lipophilic Efficienty is defined as (pIC50-cLogP). You will find clogP as part of the calculated properties: The partition coefficient is the ratio of the concentration of the molecule in octanol to the concentration of the molecule in water. ChemAxon uses a fragment based approach which some researchers refer to as a cLogP calculation. For more details on the algorithm used, seeChemAxon's website

Create a pIC50 readout definition. Use the pIC50 to calculate LiPE:

average([Secondary Screen -> pIC50]) - {log P}

 

Reagent Inventory

You can use calculations to create a simple reagent or compound inventory with amount debiting. Define "Initial amount" as a simple readout, "Amount debited" as a simple readout, and "Amount remaining" as a calculated readout with batch/protocol aggregation scope, so that you can keep track of amounts remaining for each batch separately. Assuming you have named your inventory protocol "Inventory", the calculation will be as follows:

[Inventory -> Initial amount (mg)] - sum( [ Inventory -> Amount debited (mg)] 
Aggregate by batch and protocol

 

Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.