NP-Scout: quantification and visualization of NP-likeness
NP-Scout is a free web service for the
- Identification of natural products in large molecular libraries
- Quantification of natural product-likeness of small molecules
- Visualization of atoms and areas in small molecules that are characteristic of natural products or synthetic molecules (based on similarity maps)
NP-Scout utilizes random forest classifiers trained on more than 161k natural products and an equal number of synthetic molecules.
Molecular structures can be entered in three ways: by drawing a molecule directly in the JSME Molecular editor, by pasting a SMILES into the field “Enter SMILES”, or by uploading a text file containing a list of SMILES. NP-Scout runs a thorough data preparation protocol to standardize the input, so users do not need to preprocess chemical structures with respect to hydrogen annotation, aromatization, protonation, tautomerism or stereochemistry. Salts are also recognized, and their minor components are removed prior to calculations.
Example upload file
Lists of SMILES should be formatted as shown in the following examples:
Example 1: One SMILES per row with no additional data
Example 2: One SMILES per row with additional data
The following separators may be used: " " (space character), "\t" (tab), ";" or ",".
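The file layout described above can be checked locally before uploading. The following is a minimal sketch, not NP-Scout's own parser: the function name `parse_smiles_lines` is hypothetical, and it assumes that the first field of each row is the SMILES and that everything after the first separator is the molecule name.

```python
import re

# Separators listed above: space, tab, ";" and ",".
_SEPARATORS = re.compile(r"[ \t;,]+")


def parse_smiles_lines(text):
    """Return (smiles, name) pairs; name is None when a row holds only a SMILES.

    Assumption: the first field is the SMILES and the remainder of the
    row is the additional data (treated here as the molecule name).
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank rows
        # Split only once so that names containing spaces stay intact.
        parts = _SEPARATORS.split(line, maxsplit=1)
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 and parts[1].strip() else None
        records.append((smiles, name))
    return records


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    sample = "CCO\nc1ccccc1;benzene\nCC(=O)O\tacetic acid\n"
    for smiles, name in parse_smiles_lines(sample):
        print(smiles, name)
```

Splitting only at the first separator keeps multi-word names together, which matters because the space character is itself an allowed separator.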
Running the calculations
Calculations are started by clicking the “Submit” button. A new web page will load that reports on the progress of calculations and displays a web link that allows users to return and inspect the results once all calculations have been completed.
The results page mainly consists of a table which presents the predictions for the query molecules.
- Column "SMILES" reports the input SMILES.
- Column "Molecule name" reports the name of a molecule. If not specified, an index is reported.
- Column "Error/Warning" reports any errors or warnings.
  - Possible errors are:
    - Empty input
    - Invalid SMILES
    - SMILES is too long (longer than 500 characters)
  - Possible warnings are:
    - The salt filter identified a multi-compound SMILES for which the core component could not be determined
    - The salt filter has removed at least one component of the input SMILES
    - The molecular weight of a molecule is below 150 Da or above 1500 Da
    - The molecule contains element types not supported by the model
    - The tautomer canonicalizer identified a tautomer for which the main component could not be determined
    - The tautomer canonicalizer has changed the input SMILES into one of its tautomers
- Column "NP class probability" reports the NP class probability.
- Column "Similarity maps" shows a visualization of similarity maps. Green highlights mark atoms contributing to the classification of a molecule as a natural product, whereas orange highlights mark atoms contributing to its classification as a synthetic molecule.
Note that similarity maps are not calculated for molecules with a molecular weight below 150 Da or above 1500 Da.
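Some of the errors and warnings listed above can be anticipated before submission. The sketch below is an illustration only and is not the validation code used by NP-Scout: the function name `precheck_smiles` is hypothetical, and it covers only the checks that need no chemistry toolkit (empty input, the 500-character limit, and the "." that separates disconnected components in SMILES). Validity, molecular weight, element types and tautomers are not checked here.

```python
MAX_SMILES_LENGTH = 500  # limit stated in the list of possible errors


def precheck_smiles(smiles):
    """Return a list of (level, message) tuples for one input SMILES.

    An empty list means that none of the toolkit-free checks fired;
    it does not guarantee that the SMILES is valid.
    """
    s = smiles.strip()
    if not s:
        return [("error", "Empty input")]
    issues = []
    if len(s) > MAX_SMILES_LENGTH:
        issues.append(("error", "SMILES is too long"))
    if "." in s:
        # "." separates disconnected components, e.g. a salt and its counterion.
        issues.append(("warning", "Multi-component SMILES; the salt filter may remove components"))
    return issues


if __name__ == "__main__":
    for query in ["CCO", "", "[Na+].CC(=O)[O-]"]:
        print(repr(query), precheck_smiles(query))
```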
The results can be downloaded as a .csv file, together with a .log file, for further use. The .log file lists the errors and warnings; the .csv file contains the table of results without the similarity maps.
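The downloaded .csv file can be filtered with a few lines of Python. This is a sketch under stated assumptions: it presumes that the file is comma-separated and that its column headers match the results table ("SMILES", "NP class probability"), which should be verified against an actual download. The function name `select_likely_nps`, the 0.5 threshold and the sample rows are illustrative and do not come from NP-Scout.

```python
import csv
import io


def select_likely_nps(csv_text, threshold=0.5, column="NP class probability"):
    """Return (smiles, probability) pairs whose probability is >= threshold.

    Assumptions: comma-separated input and header names identical to the
    results table. Rows without a usable probability are skipped.
    """
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            probability = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # e.g. rows that reported an error instead of a prediction
        if probability >= threshold:
            selected.append((row.get("SMILES"), probability))
    return selected


if __name__ == "__main__":
    # Made-up values, for illustration only.
    sample = (
        "SMILES,Molecule name,Error/Warning,NP class probability\n"
        "CCO,1,,0.12\n"
        "XYZ,2,Invalid SMILES,\n"
        "C1CCCCC1O,3,,0.81\n"
    )
    print(select_likely_nps(sample))
```

For a real download, the file contents would be read with `open(path, newline="").read()` and passed to the same function.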