Example Data Files
Example Input Data
The example data files and their corresponding metadata files provided here can be downloaded and used to test the application. The data files contain log2-transformed expression ratios, normalized to solvent controls, from TK6 cells treated with different chemicals for 4 hours. Each metadata file identifies the chemical(s) and concentration(s) tested, keyed to the column labels of its data file, and a metadata file must be included for each data file to be processed. Sample formats for data files and metadata files are displayed in the Sample Input Data Format section below.
The example data represent three different study designs: a study of a single chemical tested at multiple concentrations; multiple chemicals tested at a single concentration; and multiple chemicals tested at multiple concentrations. Data from 200 test agents/concentrations and 3000 genes can be analyzed by both classifiers in ~5 minutes.
NOTE: All input data and metadata files must be in TSV format.
File Description | Example Files |
---|---|
Log2 normalized data from a study of 1 chemical tested at 4 concentrations plus metadata file | One-Chem_Multi-Conc_log2_Norm_Data.tsv One-Chem_Multi-Conc_Metadata.tsv |
Log2 normalized data from a study of 3 chemicals at 1 concentration plus metadata file | Multi-Chem_One-Conc_log2_Norm_Data.tsv Multi-Chem_One-Conc_Metadata.tsv |
Log2 normalized data from a study of 3 chemicals at 4 concentrations each plus metadata file | Multi-Chem_&_Conc_log2_Norm_Data.tsv Multi-Chem_&_Conc_Metadata.tsv |
Use the example file sets to test the functionality of the classification tool:
- Download the data file(s) you wish to test.
- In the 'Study Information' section of the application, keep the default 'Cell line' (TK6), 'Exposure duration' (4 hr), and 'Post-exposure sample time' (0 hr) to obtain optimal results.
- In the 'Expression platform' field, keep the default (TempO-Seq). The remaining fields can be set as desired for testing purposes.
- In the 'File Upload' section, select the classifier to be used for data analysis from the drop-down list (TGx-HDACi, TGx-DDI, or Both).
- Click 'Choose File' to select and upload the data file of interest, then repeat this process to select the corresponding metadata file.
- Click [Submit] to begin data analysis. A 'File Verification' page that displays two tables should appear:
  - The upper table shows the metadata file contents, which are used to label the output results; and
  - The lower table lists the Column Names used in the data file along with a Column Index number.
- Review these tables for accuracy, completeness, and exact agreement of column names. Column names that do not match between the files will be highlighted in yellow, and the 'Start Process' button will remain inactive until they agree.
- Click [START PROCESS] to begin the data analysis.
- When data processing is complete, the results should be displayed in a Results table, as shown below in the `Sample Results Table`.
Sample Input Data Format
A portion of data from an example input file is displayed below that shows the log2 expression data normalized to the solvent controls from a study of multiple chemicals tested at multiple concentrations. The data are shown in the required format for use by the classification tool. Note: the data column labels do not include any spaces.
Probe Name | ChemX_12.5uM | ChemX_25uM | ChemX_75uM | ChemY_0.5uM | ChemY_1uM | ChemY_2.5uM | ChemZ_50uM | ChemZ_100uM |
---|---|---|---|---|---|---|---|---|
AKAP8 | -0.44382 | -1.01739 | -0.81992 | -1.37537 | -0.82716 | -1.64859 | -0.93796 | -0.46685 |
AP5S1 | -0.27618 | -0.50385 | -0.69536 | -1.57468 | -1.5786 | -1.76764 | -1.26408 | -0.38821 |
ATP1B1 | 0.748028 | 1.479563 | 1.66343 | 1.101814 | 0.612622 | 1.029037 | 0.965245 | 1.131768 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
ZNF280C | 0.388573 | 0.538586 | 0.853168 | 0.235886 | 0.130409 | 0.111331 | 0.663534 | 0.244081 |
ZNF282 | -0.3493 | -0.68269 | -0.65219 | -0.69001 | -0.28267 | -0.66982 | -0.72557 | -0.22885 |
ZNF383 | -0.53505 | -0.84094 | -0.77104 | -1.15616 | -0.82889 | -1.40235 | -0.7623 | -0.12039 |
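Before uploading, a data file can be sanity-checked locally. The following is a minimal Python sketch, not part of the classification tool: the function name `check_data_file` is hypothetical, and the checks for a consistent field count and numeric values are assumptions about what a well-formed file looks like. Only the tab-separated layout and the no-spaces rule for data column labels come from the description above.

```python
import csv
import io


def check_data_file(tsv_text):
    """Return a list of problems found in a tab-separated data file.

    Hypothetical helper for local checking only; the online tool
    performs its own verification.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    problems = []

    # The first column holds the probe names; the remaining columns
    # are the data columns, whose labels must not contain spaces.
    for label in header[1:]:
        if any(ch.isspace() for ch in label):
            problems.append(f"column label contains whitespace: {label!r}")

    # Assumption: every row has one field per header column and every
    # data value parses as a number.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_number}: expected {len(header)} fields, got {len(row)}"
            )
            continue
        for value in row[1:]:
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_number}: non-numeric value {value!r}")
    return problems


sample = (
    "Probe Name\tChemX_12.5uM\tChemX_25uM\n"
    "AKAP8\t-0.44382\t-1.01739\n"
    "AP5S1\t-0.27618\t-0.50385\n"
)
print(check_data_file(sample))
```

An empty list means that none of these checks found a problem; it does not guarantee that the tool will accept the file.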
Sample Metadata File Format
A portion of the metadata file that corresponds to the sample data file shown above is displayed here in the required format. All data fields must be included and complete. The `COLUMN_NAME` values must match the data file column headers exactly (case sensitive, no spaces) and be listed in the same order.
COLUMN_NAME | CHEMICAL_NAME | SHORT_LABEL | CONCENTRATION | CONCENTRATION_UNIT |
---|---|---|---|---|
ChemX_12.5uM | Chemical-X 2'N 123F | ChemX | 12.5 | uM |
ChemX_25uM | Chemical-X 2'N 123F | ChemX | 25 | uM |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
ChemZ_50uM | Chemical-Z hydroxide | ChemZ | 50 | uM |
ChemZ_100uM | Chemical-Z hydroxide | ChemZ | 100 | uM |
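The agreement between the two files can also be checked before upload. The sketch below is an illustration under stated assumptions, not the tool's own verification logic: the function name `compare_column_names` is hypothetical, the first column of the data file is assumed to hold the probe names, and the reported position is simply a 1-based count of the data columns, which may not correspond to the Column Index shown on the 'File Verification' page.

```python
import csv
import io
from itertools import zip_longest


def compare_column_names(data_tsv, metadata_tsv):
    """Return (position, data_label, metadata_label) for each mismatch.

    Comparison is exact (case sensitive) and order sensitive.
    A missing label on either side is reported as None.
    """
    data_header = next(csv.reader(io.StringIO(data_tsv), delimiter="\t"))
    data_labels = data_header[1:]  # skip the probe-name column

    metadata_rows = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    metadata_labels = [row["COLUMN_NAME"] for row in metadata_rows]

    mismatches = []
    pairs = zip_longest(data_labels, metadata_labels)
    for position, (data_label, metadata_label) in enumerate(pairs, start=1):
        if data_label != metadata_label:
            mismatches.append((position, data_label, metadata_label))
    return mismatches


data = "Probe Name\tChemX_12.5uM\tChemX_25uM\nAKAP8\t-0.44382\t-1.01739\n"
metadata = (
    "COLUMN_NAME\tCHEMICAL_NAME\tSHORT_LABEL\tCONCENTRATION\tCONCENTRATION_UNIT\n"
    "ChemX_12.5uM\tChemical-X 2'N 123F\tChemX\t12.5\tuM\n"
    "ChemX_25uM\tChemical-X 2'N 123F\tChemX\t25\tuM\n"
)
print(compare_column_names(data, metadata))
```

An empty list indicates that the labels agree in spelling, case, and order.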
Sample Results Table
The table below shows results for the first three concentrations of the sample data illustrated above. The table can be downloaded as a tab-delimited text file. Additional options to download data files and/or plot files are available from the Results table displayed online, including an option to download all files.
Batch Column Name | Classifier | Prediction | Positive Probability | Negative Probability | Chemical Name | Concentration | Cell Line | Exposure Duration | Post-Exposure Sampling | Expression Platform | Submitted Files | Submission ID |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ChemX_12.5uM | DDI | Non DNA Damage Inducing | 0.001600058219 | 0.998399941781 | Chemical-X 2'N 123F | 12.5 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
ChemX_25uM | DDI | Non DNA Damage Inducing | 5.16e-20 | 1 | Chemical-X 2'N 123F | 25 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
ChemX_75uM | DDI | Non DNA Damage Inducing | 0.031206 | 0.9687933 | Chemical-X 2'N 123F | 75 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
ChemX_12.5uM | HDACi | Non HDAC Inhibiting | 5.16e-20 | 1 | Chemical-X 2'N 123F | 12.5 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
ChemX_25uM | HDACi | HDAC Inhibiting | 0.968793348463 | 0.031206651536 | Chemical-X 2'N 123F | 25 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
ChemX_75uM | HDACi | HDAC Inhibiting | 1 | 0 | Chemical-X 2'N 123F | 75 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
NOTE: "Chemical Name" and "Concentration" are obtained from the metadata file that is submitted along with the normalized log2 data.
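Once the Results table has been downloaded as a tab-delimited text file, the predictions from both classifiers can be lined up per sample. The sketch below assumes that the downloaded file carries the same header names as the table shown above ("Batch Column Name", "Classifier", "Prediction"); that is an assumption, as is the function name `predictions_by_sample`, so adjust the keys if the actual file differs.

```python
import csv
import io
from collections import defaultdict


def predictions_by_sample(results_tsv):
    """Map each batch column name to {classifier: prediction}.

    Assumes the header names match the Results table displayed online.
    """
    reader = csv.DictReader(io.StringIO(results_tsv), delimiter="\t")
    summary = defaultdict(dict)
    for row in reader:
        summary[row["Batch Column Name"]][row["Classifier"]] = row["Prediction"]
    return dict(summary)


# A reduced example with only the three columns the function reads.
results = (
    "Batch Column Name\tClassifier\tPrediction\n"
    "ChemX_25uM\tDDI\tNon DNA Damage Inducing\n"
    "ChemX_25uM\tHDACi\tHDAC Inhibiting\n"
)
for sample, calls in predictions_by_sample(results).items():
    print(sample, calls)
```

When reading a real file, pass its contents as text, for example `predictions_by_sample(open(path, newline="").read())`, where `path` is wherever the file was saved.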
The [DOWNLOAD RESULTS TABLE FILES] button displayed below the Results table is used to download all of the output files generated during the classification process, including a tab-delimited text file of the Results table itself. When all files are downloaded, they are compressed in a single zip file and include:
- Classifier results [TXT format]
- Heat map [PNG format]
- Dendrogram [PNG format]
- Principal Component Analysis (PCA) [PNG format]
- Fold change data [TXT format]
- Gene Cluster distance [TXT format]
- Chem Cluster distance [TXT format]
- Prediction p-value/class [TXT format]
Links to individual output files are also available in the “Data Files” and “Plot Files” columns of the Results table to view and download output files separately.
- PNG: Portable Network Graphics
- TSV: Tab Separated Values
- TXT: Text