CDD Visualization is a software tool that allows users to plot and analyze large data sets to identify patterns, hotspots and outliers, as well as share findings in publication-quality graphics. It is available as part of the CDD Vault informatics platform and also as a free, browser-based application.
- Sending data into the Visualization Tool
- Creating charts, graphs, maps, etc
- 2D and 3D Scatterplots
- Bar charts
- Filtering the data
- Molecule Overview Card - Carousel Feature
- The Visualization Tool data table
- The SuperSAR Visualization Tool
- Add new data column
- Top Toolbar
- Exporting plots and data table
- Saving a collection of Molecules
- Sharing the visualization session
- Launching a new session
- Analyzing non-registered molecules
Sending search results from CDD Vault to Visualization
Since the Visualization session may draw the input data from CDD Vault, you will need to configure a search results table within CDD Vault such that it displays all of the numeric results you wish to pass to Visualization. Here's a refresher on how to customize your report to adjust the data in the results table.
At least one numeric data series must be present in the search results table for the "Launch Visualization" button to light up within the header of the search results table. Once you have a set of results loaded on the Explore Data -> Search tab, you will see the "Launch Visualization" button prominently in the search results table header.
The session is launched in a new browser tab with a default view that includes a 2D scatter plot, a data table, default data filters, and a menu of actions and controls. (For the best experience, please use the latest version of a modern web browser, such as Chrome or Firefox.)
When clicking the "Launch Visualization" button, all Numeric data (such as numeric Molecule fields, numeric Batch fields, aggregate readouts from Protocols, as well as chemical properties) are passed from your CDD Vault search results table into a Visualization session.
A special note on Protocol Readouts
The Protocol readout data passed into Visualization is always aggregated at the Molecule/Protocol level, which is an important distinction from the CDD Vault search results table, where you can display multiple/replicate data points per molecule. This means that any results passed from CDD Vault will be averaged across all Runs and across Batches to yield a single value per Molecule. With the notable exception of end-point readouts in a Protocol that contains Conditions, this aggregation is performed automatically by Visualization.
|Protocol with conditions
|Protocol without conditions
|Readout marked as condition
(plain readout definition)
|Normalized against controls
Example 1: If your search results table in the Vault includes multiple IC50 values per molecule, and the protocol does not have any conditions, you will see a single IC50 geometric average within the Visualization session.
Example 2: If the search results table in the Vault includes IC50s under several conditions, you will see a separate IC50 geometric average for each condition.
Example 3: If your search results table in the Vault contains a dose-response set, including concentrations, inhibitions and IC50s, you will see a single IC50 value per molecule in the Visualization session, while concentrations and inhibitions will not be included.
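The aggregation described in Examples 1 and 2 can be sketched in a few lines of Python. This is only an illustration of a geometric average per molecule and condition, not the code Visualization runs; the function names, the `(molecule, condition, ic50)` row layout, and the sample values are all assumptions made for the example.

```python
import math
from collections import defaultdict


def geometric_mean(values):
    """Geometric mean of positive values, computed as exp(mean of logs)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def aggregate_ic50(rows):
    """Collapse replicate IC50 rows to one value per (molecule, condition).

    Each row is a hypothetical (molecule, condition, ic50) tuple. For a
    protocol without conditions, condition is None, so the result holds
    a single value per molecule.
    """
    groups = defaultdict(list)
    for molecule, condition, ic50 in rows:
        groups[(molecule, condition)].append(ic50)
    return {key: geometric_mean(vals) for key, vals in groups.items()}


# Example 1: no conditions, replicates collapse to one value per molecule.
no_conditions = aggregate_ic50([
    ("CDD-1", None, 1.0),
    ("CDD-1", None, 100.0),
    ("CDD-2", None, 5.0),
])

# Example 2: with conditions, one value per molecule and condition.
with_conditions = aggregate_ic50([
    ("CDD-1", "pH 7", 2.0),
    ("CDD-1", "pH 7", 8.0),
    ("CDD-1", "pH 5", 50.0),
])
```

Note that the geometric average of 1 and 100 is 10, not the arithmetic mean of 50.5, which is why replicate IC50s spanning orders of magnitude are usually averaged this way.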
Importing a file of data into Visualization
Users can now visualize data that have not yet been registered in any CDD Vault by launching a new Visualization session and clicking “Import file”. When you do this, the data files do not leave the local browser, and CDD does not store the files. To launch a new Visualization session, users may:
- Navigate directly to the Visualization tool using its dedicated URL
- From within a current Visualization session, click the Actions menu and select the “Launch session” option.
Creating charts, graphs, maps, scatter plots
CDD Vault Visualization is a visual analysis tool to generate publication-ready graphs and charts on the spot with real-time data. We’re constantly improving our software’s ability to produce compelling visualizations that are ready to share as soon as you are. Currently, scatter plots, histograms and bar charts are supported, with additional plot types to be added in the future.
The most prominent part of the work space is the multi-parameter scatter plot. Click on the "expand" icon in the top right corner to maximize the plot.
X and Y Axes
By default, the initial plot is a two-dimensional plot of the first two columns of the data table on the X and Y axes, with the same data properties used for color-coding and the size of the points.
- Select the data properties you wish to plot on the X and Y axes by clicking the drop-down arrows displayed on each axis label. The plot will adjust dynamically.
- Click the Set Range link to manually set the range of each axis or to switch the axis to a logarithmic scale.
- From the top toolbar, you can add another 2D scatter plot, histogram, bar chart or 3D scatter plot by clicking the corresponding plot type. You may add up to 3 more plots at a time, placed in the quadrants of the page.
Settings menu – set size, shape, and color of your data points
- Control the presentation of all your plots using the Visualization Settings icon in the top left of each plot. Using these settings, you can greatly increase the amount of data you convey in your plot.
Moving and deleting the plots
When creating multiple plots/charts, you can rearrange your plots on the page by clicking the Move shuffle icon located in the top center of each plot. Simply click, grab, and reposition your plot as desired. Also, delete a plot by clicking the “X” in the top-right corner next to the expand icon.
- In this section we’ll take a regular 2-parameter plot (X and Y axis) and upgrade it to show three more parameters. Here is a generic two-parameter diagram showing logP as a function of molecular weight for a collection of small molecules.
- Clicking on the Settings Wheel will open your plot settings.
- Before we add any more parameters to the plot, we can see more in the first plot by moving the Opacity slider at the bottom of the Appearance tab.
- Now in a crowded field of overlapping points, you can see where the highest density is and better pick out points of interest. This also gives you a more intuitive view of the linear regression line coming up at the end of this section.
- Using this Settings menu, you can also add three more parameters to your plot by using different color, size, and shape appearance settings. The color parameter can be mapped anywhere along the color spectrum, and the size parameter anywhere on a scale from small to large.
- All three of the new parameters (color, size and shape) can also be ‘binned’. The default is a 3-bin setting but additional bins can be added, and bin sizes adjusted.
- Here is the same plot from above but now showing the distribution coefficient, logD, along a color scale with logP on the Y axis. Set the colors on the spectrum circle by dragging the endpoints around the circle. The plot will be updated dynamically so you can pick your most desirable color scheme. Each parameter is described in the legend at the upper right corner of the plot showing the chosen parameter's scale.
- To show off a 5-parameter plot that doesn’t appear too busy, use shapes for one biological activity assay and size for another.
- Notice that the legend now shows each of the parameters displayed in the plot in addition to the X and Y axes. Note: This legend can be collapsed by clicking the arrow symbol in the upper right corner of the legend.
- The second tab in the settings wheel for CDD Visualization is the Tooltip Tab. Notice that, when you hover over any point on the plot, a window appears showing the structure and/or the name of the compound you are highlighting. Shift-click to have the compound info persist on the plot with a pin (more like a needle) in the point. Click on the window and drag it wherever you want on the plot, and adjust its size by dragging the lower right-hand corner.
- The Tooltip tab in the settings wheel lets you control what you see in the point info window. Check the box for structure and/or compound name to see these when you hover, and select whichever parameters you want to see in the ‘pinned’ points.
- The third tab in the settings wheel for CDD Visualization is the Statistics Tab. From here you can draw a linear regression line on your plot with the slope listed in the legend and show ‘whiskers’ on each point showing the standard error or standard deviation for points which have been aggregated together.
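For readers who want to check the numbers behind the Statistics tab, here is a minimal Python sketch of an ordinary least-squares line and of the two whisker quantities. It is an assumption that the tool uses these textbook definitions (for instance, sample rather than population standard deviation); the function names and sample values are invented for illustration.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def whisker_lengths(replicates):
    """Sample standard deviation and standard error of the mean
    for the replicate values behind one aggregated point."""
    sd = statistics.stdev(replicates)
    return sd, sd / math.sqrt(len(replicates))


# Hypothetical data: molecular weight on X, logP on Y.
slope, intercept = fit_line([200, 300, 400, 500], [1.0, 2.0, 3.0, 4.0])

# Hypothetical replicates aggregated into a single plotted point.
sd, sem = whisker_lengths([4.0, 6.0, 8.0])
```

The standard error shrinks as more replicates are aggregated, while the standard deviation does not, so the two whisker options answer different questions about the same point.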
Frequency distributions, the probability of a particular outcome’s occurrence, are generally presented in histograms. You may add a histogram to your plot area by clicking the Actions Menu and selecting the “Add histogram” option.
- The x-axis of the histogram can be set much like setting the x-axis of a scatter plot, by clicking the drop-down arrow displayed on the x-axis label. The histogram will adjust dynamically.
- The color of your bars within the histogram can be assigned to a data parameter and displayed using all of the colors around the color wheel or binned accordingly using the Settings menu.
- In this Settings menu, you can also set the number of bins used for the bars on the histogram.
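To illustrate what changing the number of bins does, the following Python sketch counts values in equal-width bins. It assumes equal-width binning between the minimum and maximum value, which may differ from how the tool places its bin edges; `histogram_counts` and the sample molecular weights are hypothetical.

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    if width == 0:
        # All values are identical, so they share the first bin.
        counts[0] = len(values)
        return edges, counts
    for v in values:
        # The maximum value is placed in the last bin rather than past it.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return edges, counts


# Hypothetical molecular weights split into 3 bins of width 100.
weights = [150, 220, 250, 310, 330, 340, 410, 450]
edges, counts = histogram_counts(weights, 3)
```

With the same data, a larger `n_bins` gives narrower bars and a more detailed but noisier picture of the distribution.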
Coming soon, the ability to create bar charts in the CDD Visualization tool.
Filtering the data
Scatter plot zoom and point selection
- Click and drag the mouse over a selection of data points on any of your plots to zoom in on that area of the plot. The plot area will resize dynamically, and you will see the number of selected molecules update in the top right corner of the plot, and the filter histograms will update to reflect plot selection.
- Double-click in the empty plot area to reset the zoom. You can also reset using the "Undo" and "Reset" links at the top of the Visualization window.
- Each point on the plot represents data from a single molecule. Recall that protocol data is aggregated per molecule.
- Molecules that are missing in the plotted series will be omitted and you will see an orange exclamation point in the Selected Molecules banner. Click the orange exclamation point to view the data omitted from the plot.
- You may also mouse over a data point to see a preview of the molecule structure and click a data point to see a pop-up Molecule Overview card.
- Use the shift-click option to make a pop-up Molecule Overview card persist on the plot.
- Click the Settings tool to customize what is displayed on these persisting Molecule Overview cards.
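Conceptually, the drag selection and the handling of missing values described above behave like the following Python sketch. It is a simplified model rather than the tool's implementation; the function name, the dictionary layout of each molecule, and the sample values are assumptions.

```python
def select_in_rectangle(molecules, x_key, y_key, x_range, y_range):
    """Split molecules into those inside the dragged rectangle and
    those omitted because a plotted value is missing."""
    selected, omitted = [], []
    for mol in molecules:
        x, y = mol.get(x_key), mol.get(y_key)
        if x is None or y is None:
            # No point can be drawn without both coordinates.
            omitted.append(mol["name"])
        elif x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]:
            selected.append(mol["name"])
    return selected, omitted


# Hypothetical molecules; CDD-3 has no logP value.
molecules = [
    {"name": "CDD-1", "mw": 250.0, "logp": 1.2},
    {"name": "CDD-2", "mw": 410.0, "logp": 3.8},
    {"name": "CDD-3", "mw": 320.0, "logp": None},
    {"name": "CDD-4", "mw": 480.0, "logp": 5.1},
]
selected, omitted = select_in_rectangle(
    molecules, "mw", "logp", x_range=(200, 450), y_range=(0, 4)
)
```

Here `selected` corresponds to the updated molecule count, and `omitted` corresponds to the list behind the orange exclamation point.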
Molecule Overview Card Carousel Feature
To help users more easily navigate through the list of Molecules currently selected within the Visualization tool, a new Carousel feature has been added. When you click on a data point within a plot, or a Molecule Name within the data table, you can now navigate through the entire list of selected Molecules by clicking on the left and right arrow buttons.
The Visualization Tool data table
Near the bottom of the Visualization tool, usually underneath the plots, you will find a dynamic data table containing the molecule structures and all of the numeric data. If you are a CDD Vault customer, you may recognize that this data table is similar to the search results table in CDD Vault.
- If this data table is minimized, you can always expand it to full screen by clicking the "expand" icon in the upper right corner of the data table.
- The table dynamically updates to reflect the molecule selection in the plot area.
- The check-boxes along the left edge of each row let you cherry-pick the data further. All molecules are selected by default, and when you de-select a molecule, the corresponding point on the plots is greyed out, and the count is updated.
- Each molecule structure displays a funnel-shaped icon in its top-right corner. Clicking it shows all possible fragments and substructures of that molecule. Selecting one of the fragments filters the data across the entire Visualization session. You can also use the substructure search on the right-hand side if you know the molecule ID and would like to choose a substructure of that molecule.
- The data table can be color coded based on custom criteria by clicking on the paint can icon in the lower right corner of the data table.
- Here, each column can be colored based on threshold values you assign.
- Hint: these colors are also exported into Excel when the data table is exported out of CDD Visualization.
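Threshold-based coloring of a column amounts to a simple lookup, sketched below in Python. The rule format (ascending upper bounds, each paired with a color) and the names used are assumptions for illustration and do not reflect how the tool stores its color settings.

```python
def color_for_value(value, rules, default="none"):
    """Return the color of the first rule whose upper bound the value
    does not exceed. rules is a list of (upper_bound, color) pairs
    sorted by ascending upper bound."""
    if value is None:
        return default
    for upper_bound, color in rules:
        if value <= upper_bound:
            return color
    return default


# Hypothetical thresholds for an IC50 column in micromolar.
ic50_rules = [(1.0, "green"), (10.0, "yellow"), (float("inf"), "red")]
column = [0.5, 10.0, 25.0, None]
colors = [color_for_value(v, ic50_rules) for v in column]
```

Ending the rule list with an infinite upper bound guarantees that every numeric value receives a color, while empty cells stay uncolored.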
Add new data column
If you would like to perform further calculations on existing data, you can create a new column and enter a formula directly in the Visualization session, giving you more parameters to analyze in the plot area.
For example, to create a pIC50 column from existing IC50 values, click Add Column. A pop-up window will appear where you can name the column and enter a formula.
The newly created pIC50 column is then available as a selectable parameter in the plots and as a column in the data table.
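As a reference for the arithmetic behind such a column, pIC50 is the negative base-10 logarithm of the IC50 expressed in molar units. The Python sketch below shows the calculation only; it does not reproduce the formula syntax of the Add Column window, and the function name and unit table are assumptions.

```python
import math

# Conversion factors from common concentration units to molar.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50(ic50, unit="uM"):
    """pIC50 = -log10(IC50 in molar). Raises ValueError for
    non-positive values, which have no logarithm."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * UNIT_TO_MOLAR[unit])


# An IC50 of 1 uM corresponds to a pIC50 of about 6,
# and an IC50 of 10 nM to a pIC50 of about 8.
one_micromolar = pic50(1.0, "uM")
ten_nanomolar = pic50(10.0, "nM")
```

Because the result depends on the unit, it is worth confirming which unit the source IC50 column uses before entering the formula.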
The Top Toolbar, located at the top of the Visualization tool, allows you to:
- Create a new Collection of Molecules in your CDD Vault containing the currently selected molecules from your Visualization session (requires you to be a CDD Vault user with an active Vault)
- Export your plot to PNG or PDF
- Save the data table, including color coding, out to an Excel XLSX file
- Launch a new Visualization session where you can import a data file for visualization
- Share the URL of the analyzed Visualization session with other CDD users
- Reset the plots to their original state with the Reset icon, or undo changes with the back arrow, at any time during the session