When data coming from CRO are delivered in a human readable spreadsheet, or when the plate reader creates matrix-like data files, it constitutes a considerable effort to convert all those numbers to a tabular structure. In CDD Vault this copy-paste step can be skipped by using the Preview file parser module within the Import Data process.
To access the Preview file parser when importing new datasheet you’ll need to select the “Preview File…” option in the Import Data tab, as shown in the red box here below:
Once you click Upload File, you will be redirected to the Preview file parser interface where you’ll be able to define which part of the datasheet corresponds to your data, parameters, etc…
In this article we will cover how to set up the Preview file parser to import:
- Pre-analyzed datasets (e.g. from CRO reports)
- Raw data (e.g. concentration or time series)
- Data in matrix format (e.g. from plate readers)
Importing data reports
When working with a pre-made report (e.g. from CRO), the analyzed data are usually shown in various tables within the datasheet with additional parameters and information spread throughout it.
The first step is to define where the data lies within the spreadsheet. To do that, click first on the first cell of the first row cell, then on the last cell of the first row. Create the minimum table by clicking on “Tabular Data”, then include all the data of the table by clicking on “Extend Content Area”.
The data table will be now highlighted in light blue, with the automatically detected headers in dark blue. On the Assign box, the table will be defined according to its size parameters via the Columns and Rows fields, and the “Join Field” field will acquire the value of the leftmost column header. “Join field” will help us join data coming from different tables, plates, or tabs in the same file.
This can be enough if there is no more information to add. In that case, you can click on the “Preview Import Table” tab to check whether the data resemble your desired tabular format. If so, you can continue to the conventional import process by clicking on the “Proceed” button:
You may want to add further information, or for example join tables showing different values for the same molecules. Here below we go through these potential cases.
Join two tables
In some cases you may have two tables with different columns or experiments for the same molecules. Both table’s data should be subsequently highlighted and then assigned as Tabular Data as previously described.
Two independent Tabular Data sections are now appearing in the Assign box. “Join Field” should be here used to define which column creates the relationship between the two tables: here the column “Name” tells us the molecule, and therefore should be used in the “Join Field” to connect the information between the two tables.
Please note: if the different tables have different values for the same name, you should rename them as the Preview file parser accepts only columns with a unique name.
If we proceed to the Preview Import Table, we will now see the resulting table joined by the “Name” field. If this fits the requirements, you can click on “Proceed” and continue as usual.
Fixed Value as set parameters
The next step is to add parameters that are unique to each table. In this specific example, it could be the different dose and formulation that were administered via IV and PO. These values should be present as well in the datasheet, for example in a table below or in another datasheet.
To connect these values to each table, select the cell with the value and then click on the “Associate Fixed Value” button of the respective table.
Please note: the default label for the new fixed value will be the content of the cell to the left. Make sure that the labels do not have the same name, otherwise they will overwrite each other in the tabular view.
Perform again the same steps for the second table to connect its own value of the same parameter to it.
If the parameter is common to the entire datasheet, as for example the date of the report, this can be assigned to the whole table by using the Fixed Value button at the top of the Assign box. As for the previous fixed values, they do not need to be on the same datasheet.
Once in Preview, you will see that the two additional columns containing the new parameters as well as the column containing the date are part of the resulting table. You will be able later on to import each column to the corresponding readout in your protocol(s).
Importing raw data
Raw data coming in tabular format from devices (e.g. spectra or time series from plate readers) can be imported in CDD Vault and analyzed directly within it using the Curves module. This helps to skip laborious steps of formatting the spreadsheet, therefore maintaining data integrity and allowing you to harvest all the power of CDD Vault’s Curve analysis, which includes data normalization and fitting.
If you have one shared column for all the data series (in the example below one time column for different concentration measurements), you will need to select multiple times the whole table, including one new column at a time. In the example below you will need to:
- Select first data row from first to second column → click “Tabular Data” → click “Extend Content Area”
- Select first data row from first to third column → click “Tabular Data” → click “Extend Content Area”
In this example we are importing two time series, one for the PO route of administration and one for the IV route. In both cases, the data follow the same time trend, but this is not fundamental. A similar case would be a concentration range of a molecule vs the measured signal.
Further information such as the molecule’s ID is assigned using the “Fixed Value” button at the top of the Assign box. In this case we assigned to each administration route the corresponding dose as a fixed value using the respective “Associated Fixed Value” button. These steps are not always necessary and they depend on the assay.
If you have instead two time series or spectra that belong to different molecules, you will have to add to “Join Field” both the Time values as well as the molecule IDs.
Plate note: the label of the associated fixed value has to be the same for all tabular data in order to be used in “Join Field”
Replicates
The Preview file parser allows users to define replicates within the Tabular Data, which will be imported into CDD Vault along with the average and standard deviation.
The different measurements for the same molecule/parameter should be organized in different columns, and the whole table should be selected in one single Tabular Data space.
On the Assign box you should create one replicate per column using the “Add Replicate” button. Then, from each of the resulting dropdown menus, you should choose the corresponding replicate column. Each column should be assigned to one of the Replicate fields, so to obtain a result similar to the one below.
You can see that the two additional and optional fields have been filled:
- "Field Name" defines the column header of the processed table containing the values for all replicates
- "Column Value" defines instead the name of the column where the replicate’s label will be stored
The resulting Import Table will show values and replicate labels in two different columns. You will be able to use the values in the replicate labels’ column for potential conditions that you have defined as readout in your protocols.
The same type of parsing can be applied to time or concentration series for a single or multiple molecules, as well as previously shown functionalities, like the possibility to join multiple tables by a shared column or value, remain viable for further refining the parsing.
Changing headers
The column names may be not representative of the real content of the columns, but for example they may contain the unit of measure. If you desire to change it using other cells within the spreadsheet, you can do that as long as the new headers are part of the same Tabular Data “rectangle” and above the default header.
As usual, select the data to create a Tabular Data section, then select the row with the correct headers, and click on “Reassign Header Columns” in the Assign box to reach a similar result to the table below:
Importing plate/matrix-based data
CDD Vault’s Preview file parser allows you to work with raw data coming from plate readers and similar devices without the need of changing the data structure from the matrix format.
As for Tabular Data, you will need to select all the cells that represent the wells, without including the index row and column. Afterwards, instead of “Tabular Data”, click on “Plate Block”. In the Assign box, a new section will appear which will look similar to the one below:
In the Assign box the various fields have the following meanings:
- “Area” identifies the size of the matrix as columns x rows
- “Well Field” is automatically set to Well Name, so that a column with the well index (letter + number) will be added to the Import Table
- “Value Field” is the container for the resulting column header. While blank by default, it must be customized to your needs, in particular if you have multiple matrixes. In this example it was labelled it as “Signal (F.U.)”
- “Join Field” works as previously described, and it comes useful when dealing with several Plate Blocks with different information. One should be always set to “Well Name” to use the well indexes as a common column, and an additional one should be added to join the various data by the Plate ID if multiple plates are analyzed
- “Well Padding” defines the index format in the Well Name column of the resulting Import table:
- Set to “strip zeros” the label A1 stays A1. If the column index was 01, it would still become A1
- Set to 2 digits, the label A1 becomes A01
- Set to 3 digits, the label A1 becomes A001
- “Repeat” allows you to add multiple plates within the same import process. Please note: you will need to add both “Well Name” as well as an associated field for the plate ID to the “Join Field” section
When working with plate based matrixes you will then need to create as many “Plate Blocks” as matrixes you have in your spreadsheet. The data will be then joined using the “Well Name” column and appear as tabular in the “Preview Import table” tab.
Please note: you do not need to add any Well Name column, as it is automatically created using the “Plate Block” function
Repeated tables for multiple plates
If you have a spreadsheet where the data of multiple plates are present, you will need to use the “Repeat” function to find all the plates and join them via their plate ID.
You will need to select the first plate in the tab, and then the “Repeat: this tab” option. Remember to create an associated fixed value for the plate ID to the plate block, which in the example below is in the top left corner, but could be as well located in another tab.
It is fundamental that the associated fixed value has the same label in all the tables, otherwise it won’t be possible to set up correctly the new additional field in the “Join Field” section.
Please note: you will need to press the + sign on the right of the existing field in the “Join Field” section to create a new field for the plate ID
Set parameters and define multiple assays per plate
As for Tabular Data, Plate Block allows you to add fixed values via either the “Fixed Value” or the “Associate Fixed Value” buttons. This will generate an additional column populated with the selected fixed value for each row.
If a single plate/matrix contains multiple assays or conditions for the same molecules and you want to assign them to different protocol or protocol’s readouts, you can proceed in the following way:
- Select the whole plate/matrix and define it as a single “Plate Block”
- Select within the defined block the area of the first assay and click “Shrink Active Area”.
- Proceed by selecting the remaining part of the plate where the second assay was performed and click on “Add Area”
This will result in two different columns that will be joined by the well name, but that can be indexed with different headers and therefore imported in different CDD Vault’s protocols. Remember to label the “Value Field” it the corresponding assay title or desired column header.
Saving custom Preview file parser templates
If a report is imported several times with the same format, and it needs to be repeatedly parsed, it is possible to save the parsing template to easily reapply it each time it is necessary to import that kind of data.
Once you have set up the desired Preview file parser settings, you can click on the “Save custom parser…” option in the top right corner of the Preview file parser’s interface.
Do not worry about variability in the length of the columns: the Preview file parser will automatically detect and adapt the length of the different Tabular Data sections you previously created.
Next time to use the saved template, select it from the Parser type dropdown menu during the first step of the Import Data process: