Example Data Files

Example Input Data

The example data, and corresponding metadata files, provided here can be downloaded and used for testing the application. The data files contain transformed log2 ratios normalized to solvent controls from TK6 cells treated with different chemicals for 4 hours. The metadata files identify the chemical(s) and concentration(s) tested based on the data file column labels and must be included for each data file to be processed. Sample formats for data files and metadata files are displayed in the Sample Input Data Format section below.
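
As a rough illustration of what a log2 ratio relative to a solvent control means, the sketch below computes log2(treated / control) for made-up expression values. The function name `log2_ratio` and the numbers are hypothetical, and the actual normalization used to produce the example files is not described here, so this shows only the arithmetic.

```python
import math


def log2_ratio(treated, control):
    """Return log2(treated / control) for two positive expression values."""
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(treated / control)


# Hypothetical values: expression halved, doubled, and unchanged versus control.
print(log2_ratio(50.0, 100.0))   # -1.0
print(log2_ratio(200.0, 100.0))  # 1.0
print(log2_ratio(100.0, 100.0))  # 0.0
```

Negative values therefore indicate lower expression than the solvent control, and positive values indicate higher expression.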

The example data represent three different study designs: a study of a single chemical tested at multiple concentrations; multiple chemicals tested at a single concentration; and multiple chemicals tested at multiple concentrations. Data from 200 test agents/concentrations and 3000 genes can be analyzed by both classifiers in ~5 minutes.

NOTE: All input data and metadata files must be in TSV format.

Example Input Data Files

| File Description | Example Files |
| --- | --- |
| Log2 normalized data from a study of 1 chemical tested at 4 concentrations, plus metadata file | One-Chem_Multi-Conc_log2_Norm_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv |
| Log2 normalized data from a study of 3 chemicals at 1 concentration, plus metadata file | Multi-Chem_One-Conc_log2_Norm_Data.tsv; Multi-Chem_One-Conc_Metadata.tsv |
| Log2 normalized data from a study of 3 chemicals at 4 concentrations each, plus metadata file | Multi-Chem_&_Conc_log2_Norm_Data.tsv; Multi-Chem_&_Conc_Metadata.tsv |

Use the example file sets to test the functionality of the classification tool:

  • Download the data file(s) you wish to test.
  • In the 'Study Information' section of the application, keep the defaults for 'Cell line' (TK6), 'Exposure duration' (4 hr), and 'Post-exposure sample time' (0 hr) to obtain optimal results.
  • In the 'Expression platform' field, keep the default (TempO-Seq). The remaining fields can be set to any value for testing purposes.
  • In the 'File Upload' section, select the classifier to be used for data analysis from the drop-down list (TGx-HDACi, TGx-DDI, or Both).
  • Click 'Choose File' to select and upload the data file of interest, then repeat this process to select the corresponding metadata file.
  • Click [Submit] to begin data analysis. A 'File Verification' page that displays two tables should appear:
    • The upper table shows the metadata file contents, which are used to label the output results; and
    • The lower table lists the Column Names used in the data file along with a Column Index number.
  • Review these tables for accuracy, completeness, and exact agreement of column names. Column names that do not match between the files will be highlighted in yellow, and the 'Start Process' button will be inactive.
  • Click [START PROCESS] to begin the data analysis.
  • When data processing is complete, the results should be displayed in a Results table, as shown below in the `Sample Results Table`.
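
The column-name agreement that the 'File Verification' page checks can be approximated locally before uploading. The sketch below is not the application's own code: `find_mismatches` is a hypothetical helper, and it assumes the comparison is exact, case sensitive, and position by position, with a 1-based index that is only illustrative.

```python
from itertools import zip_longest


def find_mismatches(data_columns, metadata_columns):
    """Return (index, data_name, metadata_name) for each position that differs.

    Names are compared exactly (case sensitive) and in order. A missing name
    on either side is reported as None.
    """
    mismatches = []
    pairs = zip_longest(data_columns, metadata_columns)
    for index, (data_name, meta_name) in enumerate(pairs, start=1):
        if data_name != meta_name:
            mismatches.append((index, data_name, meta_name))
    return mismatches


# Data file column labels (probe name column excluded) and metadata names.
data_columns = ["ChemX_12.5uM", "ChemX_25uM", "ChemX_75uM"]
metadata_columns = ["ChemX_12.5uM", "ChemX_25um", "ChemX_75uM"]  # note "um"

print(find_mismatches(data_columns, metadata_columns))
# prints [(2, 'ChemX_25uM', 'ChemX_25um')]
```

An empty list means the two files agree in name, case, and order.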

Sample Input Data Format

A portion of an example input file is displayed below. It shows log2 expression data, normalized to the solvent controls, from a study of multiple chemicals tested at multiple concentrations. The data are shown in the format required by the classification tool. Note: the data column labels do not include any spaces.

| Probe Name | ChemX_12.5uM | ChemX_25uM | ChemX_75uM | ChemY_0.5uM | ChemY_1uM | ChemY_2.5uM | ChemZ_50uM | ChemZ_100uM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AKAP8 | -0.44382 | -1.01739 | -0.81992 | -1.37537 | -0.82716 | -1.64859 | -0.93796 | -0.46685 |
| AP5S1 | -0.27618 | -0.50385 | -0.69536 | -1.57468 | -1.5786 | -1.76764 | -1.26408 | -0.38821 |
| ATP1B1 | 0.748028 | 1.479563 | 1.66343 | 1.101814 | 0.612622 | 1.029037 | 0.965245 | 1.131768 |
| ZNF280C | 0.388573 | 0.538586 | 0.853168 | 0.235886 | 0.130409 | 0.111331 | 0.663534 | 0.244081 |
| ZNF282 | -0.3493 | -0.68269 | -0.65219 | -0.69001 | -0.28267 | -0.66982 | -0.72557 | -0.22885 |
| ZNF383 | -0.53505 | -0.84094 | -0.77104 | -1.15616 | -0.82889 | -1.40235 | -0.7623 | -0.12039 |
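
A data file in this layout can be read with Python's standard `csv` module. The sketch below assumes the first column holds the probe name and every other column holds numeric log2 values. `parse_expression_tsv` is a hypothetical helper, and the sample text is a trimmed copy of the rows shown above.

```python
import csv
import io


def parse_expression_tsv(handle):
    """Read a tab-separated expression table.

    Returns (sample_labels, values), where values maps each probe name to a
    list of floats in the same order as sample_labels.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sample_labels = header[1:]  # first column is the probe name
    with_spaces = [label for label in sample_labels if " " in label]
    if with_spaces:
        raise ValueError(f"column labels contain spaces: {with_spaces}")
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"wrong number of fields for probe {row[0]!r}")
        values[row[0]] = [float(field) for field in row[1:]]
    return sample_labels, values


# Trimmed sample: first three data columns and two probes from the table above.
sample_text = "\n".join(
    "\t".join(fields)
    for fields in [
        ["Probe Name", "ChemX_12.5uM", "ChemX_25uM", "ChemX_75uM"],
        ["AKAP8", "-0.44382", "-1.01739", "-0.81992"],
        ["ATP1B1", "0.748028", "1.479563", "1.66343"],
    ]
)

labels, values = parse_expression_tsv(io.StringIO(sample_text))
print(labels)
print(values["AKAP8"])
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.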

Sample Metadata File Format

A portion of the metadata file that corresponds to the sample data file shown above is displayed here in the required format. All data fields must be included and complete. The `COLUMN_NAME` values must match the data file column headers exactly (case sensitive, no spaces) and be listed in the same order.

| COLUMN_NAME | CHEMICAL_NAME | SHORT_LABEL | CONCENTRATION | CONCENTRATION_UNIT |
| --- | --- | --- | --- | --- |
| ChemX_12.5uM | Chemical-X 2'N 123F | ChemX | 12.5 | uM |
| ChemX_25uM | Chemical-X 2'N 123F | ChemX | 25 | uM |
| ChemZ_50uM | Chemical-Z hydroxide | ChemZ | 50 | uM |
| ChemZ_100uM | Chemical-Z hydroxide | ChemZ | 100 | uM |
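
The requirement that all metadata fields be included and complete can be checked before uploading. The sketch below is one possible pre-check, not the application's validation: `check_metadata` is a hypothetical helper, and the only rules it applies are that all five header fields exist, that no value is empty, and that `COLUMN_NAME` values contain no spaces.

```python
import csv
import io

REQUIRED_FIELDS = [
    "COLUMN_NAME",
    "CHEMICAL_NAME",
    "SHORT_LABEL",
    "CONCENTRATION",
    "CONCENTRATION_UNIT",
]


def check_metadata(handle):
    """Return a list of problem descriptions; an empty list means no problems."""
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    missing = [field for field in REQUIRED_FIELDS if field not in fieldnames]
    if missing:
        return ["missing header field(s): " + ", ".join(missing)]
    problems = []
    for row_number, row in enumerate(reader, start=1):
        for field in REQUIRED_FIELDS:
            if not (row.get(field) or "").strip():
                problems.append(f"data row {row_number}: {field} is empty")
        if " " in (row.get("COLUMN_NAME") or ""):
            problems.append(f"data row {row_number}: COLUMN_NAME contains a space")
    return problems


header = "\t".join(REQUIRED_FIELDS)
good = header + "\nChemX_12.5uM\tChemical-X 2'N 123F\tChemX\t12.5\tuM\n"
bad = header + "\nChemZ_50uM\tChemical-Z hydroxide\t\t50\tuM\n"  # empty SHORT_LABEL

print(check_metadata(io.StringIO(good)))  # prints []
print(check_metadata(io.StringIO(bad)))   # prints ['data row 1: SHORT_LABEL is empty']
```

Spaces inside `CHEMICAL_NAME` are left alone, since the sample metadata above uses them.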

Sample Results Table

The table below shows results for the first three concentrations of the sample data illustrated above. The table can be downloaded as a tab-delimited text file. Additional options to download data files and/or plot files are available from the Results table displayed online, including an option to download all files.

| Batch Column Name | Classifier | Prediction | Positive Probability | Negative Probability | Chemical Name | Concentration | Cell Line | Exposure Duration | Post-Exposure Sampling | Expression Platform | Submitted Files | Submission ID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ChemX_12.5uM | DDI | Non DNA Damage Inducing | 0.001600058219 | 0.998399941781 | Chemical-X 2'N 123F | 12.5 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
| ChemX_25uM | DDI | Non DNA Damage Inducing | 5.16e-20 | 1 | Chemical-X 2'N 123F | 25 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
| ChemX_75uM | DDI | Non DNA Damage Inducing | 0.031206 | 0.9687933 | Chemical-X 2'N 123F | 75 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
| ChemX_12.5uM | HDACi | Non HDAC Inhibiting | 5.16e-20 | 1 | Chemical-X 2'N 123F | 12.5 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
| ChemX_25uM | HDACi | HDAC Inhibiting | 0.968793348463 | 0.031206651536 | Chemical-X 2'N 123F | 25 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |
| ChemX_75uM | HDACi | HDAC Inhibiting | 1 | 0 | Chemical-X 2'N 123F | 75 uM | TK6 | 4 hr | 0 hr | TempO-Seq | One-Chem_Multi-Conc_log2_Data.tsv; One-Chem_Multi-Conc_Metadata.tsv | 20211217_7byn |

NOTE: "Chemical Name" and "Concentration" are obtained from the metadata file that is submitted along with the normalized log2 data.
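
When both classifiers are run, each data column appears once per classifier in the Results table. The sketch below regroups such rows so that both predictions for a sample sit side by side. It works on plain dictionaries. The key names are copied from the table header as displayed and are an assumption about the downloaded file, and `predictions_by_sample` is a hypothetical helper.

```python
from collections import defaultdict

# Assumed header text for the sample identifier; adjust to the actual file.
SAMPLE_KEY = "Batch Column Name"


def predictions_by_sample(rows):
    """Map each sample label to {classifier: prediction}."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row[SAMPLE_KEY]][row["Classifier"]] = row["Prediction"]
    return dict(grouped)


# Four rows taken from the sample Results table above (other columns omitted).
rows = [
    {SAMPLE_KEY: "ChemX_12.5uM", "Classifier": "DDI", "Prediction": "Non DNA Damage Inducing"},
    {SAMPLE_KEY: "ChemX_25uM", "Classifier": "DDI", "Prediction": "Non DNA Damage Inducing"},
    {SAMPLE_KEY: "ChemX_12.5uM", "Classifier": "HDACi", "Prediction": "Non HDAC Inhibiting"},
    {SAMPLE_KEY: "ChemX_25uM", "Classifier": "HDACi", "Prediction": "HDAC Inhibiting"},
]

summary = predictions_by_sample(rows)
print(summary["ChemX_25uM"])
# prints {'DDI': 'Non DNA Damage Inducing', 'HDACi': 'HDAC Inhibiting'}
```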

The [DOWNLOAD RESULTS TABLE FILES] button displayed below the Results table is used to download all of the output files generated during the classification process, including a tab-delimited text file of the Results table itself. When all files are downloaded, they are compressed in a single zip file and include:

  • Classifier results [TXT format]
  • Heat map [PNG format]
  • Dendrogram [PNG format]
  • Principal Component Analysis (PCA) [PNG format]
  • Fold change data [TXT format]
  • Gene Cluster distance [TXT format]
  • Chem Cluster distance [TXT format]
  • Prediction p-value/class [TXT format]

Links to individual output files are also available in the “Data Files” and “Plot Files” columns of the Results table to view and download output files separately.

PNG: Portable Network Graphics
TSV: Tab-Separated Values
TXT: Text