Downloads


DSSTox Identifier to PubChem Identifier Mapping File Posted: 11/14/2016

The DSSTox to PubChem Identifiers mapping file is in TXT format and includes the PubChem SID, PubChem CID and DSSTox substance identifier (DTXSID).


DSSTox identifiers mapped to CAS Numbers and Names File Posted: 11/14/2016

The DSSTox Identifiers file is in Excel format and includes the CAS Number, DSSTox substance identifier (DTXSID), Preferred Name, DSSTox structure identifier (DTXCID), Standard InChI String and Standard InChIKey (UPDATED APRIL 2019).


DSSTox MS Ready Mapping File Posted: 11/14/2016

The CompTox Chemistry Dashboard can be used by mass spectrometrists for structure identification. A normal formula search matches the exact formula associated with a chemical, whether or not it includes solvents of hydration, salts or multiple components. However, mass spectrometry detects ionized chemical structures, so molecular formula searches should be based on desalted and desolvated structures with stereochemistry removed. We refer to these as “MS-ready structures”. The MS-ready mappings are delivered as Excel spreadsheets containing the Preferred Name, CAS-RN, DTXSID, Formula, Formula of the MS-ready structure and associated masses, SMILES and InChI Strings/Keys. (UPDATED APRIL 2019)


DSSTox SDF File Posted: 12/14/2016

This zip file contains the entire DSSTox chemical structure collection of over 850,000 chemicals in one large SDF file. The file contains the structure, the DSSTox Structure Identifier (DTXCID), the DSSTox Substance Identifier (DTXSID, listed as PubChem External Data Source), the associated Dashboard URL, associated synonyms and Quality Control Level details. To view an SDF file you will need software that can open SDF files. Examples include ChemAxon JChem, ACD/ChemFolder and ChemDraw. (UPDATED APRIL 2019)
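
For users who only need the identifiers and synonyms rather than a structure viewer, the data fields of an SDF file can also be read with a short script. The following is a minimal sketch, assuming the standard SDF layout in which each data field starts with a "> <TAG>" line, ends with a blank line, and each record ends with "$$$$". The tag names DTXSID and Synonyms and the placeholder synonyms are hypothetical; check the downloaded file for the actual field names.

```python
import io
import re

# A data-field header looks like "> <TAG>" (extra text may follow the tag).
TAG_LINE = re.compile(r"^>\s*<([^>]+)>")


def iter_sdf_tags(handle):
    """Yield one {tag: value} dict per SDF record, streaming line by line."""
    tags, current = {}, None
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line == "$$$$":  # end of the current record
            yield {name: "\n".join(values) for name, values in tags.items()}
            tags, current = {}, None
            continue
        match = TAG_LINE.match(line)
        if match:  # start of a new data field
            current = match.group(1)
            tags[current] = []
        elif current is not None:
            if line == "":  # a blank line closes the field
                current = None
            else:
                tags[current].append(line)
        # any other line belongs to the structure block and is ignored


# Tiny stand-in record; tag names and synonyms are hypothetical.
sample = """\
example
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <DTXSID>
DTXSID7020001

> <Synonyms>
name one
name two

$$$$
"""

records = list(iter_sdf_tags(io.StringIO(sample)))
print(records[0]["DTXSID"])
```

Because the generator reads one line at a time, it should cope with the single large file without loading it into memory; for the real download, pass an open file handle instead of the in-memory sample.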


PHYSPROP Analysis File Posted: 12/14/2016

The data associated with the publication “An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modeling” are the curated data behind the OPERA models used to predict properties for the CompTox Chemistry Data. The data include the training and test data sets as well as the KNIME workflows used to curate the data. For a full understanding of the data and workflows we recommend also reading the publication.


DSSTox Mapping File Posted: 12/14/2016

The DSSTox mapping file contains mappings between the DSSTox substance identifier (DTXSID) and the associated InChI String and InChIKey. The file is made available as a Tab Separated Value (TSV) file with each entry represented as shown:
DTXSID7020001  InChI=1S/C11H9N3/c12-10-6-5-8-7-3-1-2-4-9(7)13-11(8)14-10/h1-6H,(H3,12,13,14)  FJTNLJLPLJDTRM-UHFFFAOYSA-N
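
The three-column layout shown above can be loaded with a few lines of Python. This is a minimal sketch, assuming one entry per line, tab-separated fields in the order DTXSID, InChI String, InChIKey, and that any line not starting with "DTXSID" (for example a header row, if one exists) can be skipped. The function name is illustrative only.

```python
import csv
import io


def read_dsstox_mapping(handle):
    """Yield (dtxsid, inchi, inchikey) tuples from a tab-separated mapping file."""
    # QUOTE_NONE: InChI strings contain commas and parentheses but no quoting.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3 or not row[0].startswith("DTXSID"):
            continue  # skip blank, malformed or header lines
        dtxsid, inchi, inchikey = (field.strip() for field in row)
        yield dtxsid, inchi, inchikey


# The example entry from the description, with explicit tab separators.
sample = (
    "DTXSID7020001\t"
    "InChI=1S/C11H9N3/c12-10-6-5-8-7-3-1-2-4-9(7)13-11(8)14-10/h1-6H,(H3,12,13,14)\t"
    "FJTNLJLPLJDTRM-UHFFFAOYSA-N\n"
)

# Build a lookup from InChIKey to DTXSID.
by_inchikey = {key: sid for sid, _, key in read_dsstox_mapping(io.StringIO(sample))}
print(by_inchikey["FJTNLJLPLJDTRM-UHFFFAOYSA-N"])
```

For the downloaded file, replace the in-memory sample with an open file handle, for example `open(path, newline="", encoding="utf-8")`.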


DSSTox Predicted Property Data Posted: 12/14/2016

A number of property prediction models were developed using curated data as described in the publication “An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modelling”. These property prediction models include logP, water solubility, bioconcentration factor and many others. The files include DTXSIDs, names and the predicted properties where possible. The models cannot predict properties for all chemicals contained in the database (for example, inorganics, organometallics and elements cannot be handled).


DSSTox Synonyms File Posted: 12/14/2016

The DSSTox synonyms file is in SDF format and includes the DSSTox substance identifier (DTXSID), the preferred name, the CAS Registry Number and the list of associated synonyms for over 720,000 chemicals. To view an SDF file you will need software that can open SDF files. Examples include ChemAxon JChem, ACD/ChemFolder and ChemDraw.


PubMed Abstract Sifter Posted: 07/06/2017

The Abstract Sifter is a Microsoft Excel based tool that greatly enhances literature searching in PubMed. The tool implements a novel “sifter” functionality for relevance ranking, giving the researcher a way to find articles of interest quickly. The Sifter helps researchers triage results and keep track of articles of interest. The tool also gives researchers a view of the literature landscape for a set of entities such as chemicals or genes and makes it easy to dive deeper into areas of interest.


Tandem Mass Spectrometry Fragment Summary File Posted: 08/22/2017

A new “Tandem Mass Spectrometry Fragment Summary File” has been added to the downloads page for our mass spectrometry users. This file contains DTXSIDs, structural and neutral mass information from the CompTox Chemistry Dashboard mapped to precursor and MS/MS fragment summaries from mass spectral records submitted to European MassBank (MassBank.EU) and contained within the MASSBANKREF and MASSBANKEUSP lists. For more details download the ZIP file and examine the README file.


INVITRODB_Mapping Posted: 03/06/2019

ToxCast assay endpoint names (aenm) may have changed between the previous release of the CompTox Chemicals Dashboard (released August 2018) and the March 2019 release because updates have been made to the ToxCast assay data source, invitrodb. The ToxCast assay annotation data in invitrodb version 3.1 (https://doi.org/10.23645/epacomptox.6062623.v3), as viewed in the CompTox Chemicals Dashboard, may have been updated, resulting in aenm changes. The numeric assay endpoint identifier (aeid) is intended to remain static between invitrodb releases. To enable easier mapping between aenm in the previous data from invitrodb_v2 (https://doi.org/10.23645/epacomptox.6062623.v1) and the latest release (invitrodb version 3.1), a mapping file is provided for reference.
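
Because the aeid is intended to stay fixed while the aenm may change, old and new endpoint names can be reconciled by joining on aeid. The following is a minimal sketch of that idea; the endpoint names and identifiers are invented placeholders, and the layout of the actual mapping file is not assumed here.

```python
def renamed_endpoints(old_names, new_names):
    """Return {old_aenm: new_aenm} for every aeid whose name changed.

    Both arguments map the numeric aeid to the aenm used in one release.
    """
    shared = old_names.keys() & new_names.keys()  # aeids present in both releases
    return {
        old_names[aeid]: new_names[aeid]
        for aeid in shared
        if old_names[aeid] != new_names[aeid]
    }


# Placeholder data only: these are not real ToxCast endpoint names or aeids.
old = {1: "EXAMPLE_ASSAY_A", 2: "EXAMPLE_ASSAY_B"}
new = {1: "EXAMPLE_ASSAY_A", 2: "EXAMPLE_ASSAY_B_up", 3: "EXAMPLE_ASSAY_C"}

print(renamed_endpoints(old, new))
```

Endpoints that exist in only one release (aeid 3 in the placeholder data) are left out of the result, since there is no earlier name to map from.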


CPCATARCHIVE Posted: 03/21/2019

The EPA CPCat (Chemical and Product Categories) database was released in May 2014. It maps >43,000 chemicals to a set of terms categorizing their usage or function. We have compiled a comprehensive list of chemicals with associated categories of chemical and product use from publicly available sources. Sources include, but are not limited to: the Substances in Preparation in Nordic Countries (SPIN) database; information provided by companies, trade associations, and regulatory agencies such as the U.S. Environmental Protection Agency (EPA) and Food and Drug Administration (FDA); the DrugBank database of pharmaceutical products; and information mined from the Aggregated Computational Toxicology Resource (ACToR) database developed by the U.S. EPA. Unique use category taxonomies from each source are mapped onto a single common set of ~800 terms. The user can search for chemicals by chemical name, Chemical Abstracts Service Registry Number (CASRN), or CPCat terms (i.e. category names) associated with chemicals. See Dionisio et al., 2014 for a full description of the database, sources used, interpretation of chemical categories, and potential applications. The .zip file available at the "Download" tab of this website provides a full copy of the database, available for free download, which can be freely searched and sorted for data analysis. The .zip file includes a list of all chemicals included in CPCat. A list of all sources included in CPCat is provided in the table below. This is an archive of the file that is available via the CPCat web application.


CPDATdownload Posted: 04/10/2019

Quantitative data on product chemical composition is a necessary parameter for characterizing near-field exposure. This data set comprises reported and predicted information on >75,000 chemicals contained in >15,000 consumer products. The data’s primary intended use is for exposure, risk, and safety assessments. The data set includes specific products with quantitative or qualitative ingredient information, which has been publicly disclosed through material safety data sheets (MSDS) and ingredient lists. A single product category from a refined and harmonized set of categories has been assigned to each product. The data set also contains information on the functional role of chemicals in products, which can inform predictions of the concentrations in which they occur. These data will be useful to exposure and risk assessors evaluating chemical and product safety. The data set presented here is in the form of a MySQL relational database, which mimics CPDat data available under the ‘Exposure’ tab of the CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard) as of August 2017.


MetFrag Posted: 04/11/2019

The integration of metadata files from the CompTox Chemicals Dashboard with the online version of the MetFrag in silico fragmenter has previously been discussed in the paper “‘MS-Ready’ structures for non-targeted high-resolution mass spectrometry screening studies” by McEachran et al. (https://dx.doi.org/10.1186/s13321-018-0299-2). For local installations of MetFrag we have provided here downloadable versions of these metadata files.