
Simple command line tools to generate cyclic and linear oligopeptides in SMILES format, filter them by drug-likeness, output stats, and prepare ligand files with Open Babel, producing ready-to-dock mol2 files. Supports Windows, OSX and Linux.
Before installing, ensure you have the following requirements
Once you have the requirements, run the following in a cmd/terminal:
npm install -g oplgen
If you are on Linux or OSX you may need to add sudo at the start of the install command like so:
sudo npm install -g oplgen
To update to the latest version, use the following command (with sudo at the start if needed):
npm install -g oplgen@latest
This program contains the following command line utilities:
- opl-generate (alias oplgen) - Generate oligopeptide SMILES files
- opl-filter (alias oplflt) - Sort, filter, output stats and mol2
- opl-subunits (alias oplsub) - Insert the default subunits.json into the working directory
- opl-dockstats (alias opldst) - Get stats from a mol2 file docked with dock6
The correct procedure for using these commands is to create a separate folder for each type/length of chain you are interested in generating/filtering, and then run opl-generate followed by opl-filter in each folder with the settings for that chain type. For example, if you want to generate cyclic chains of length 5, make a folder named cyclo.5 or something similar, and open a cmd/terminal in that folder to run the generation/filtering. Then, if you want to generate linear chains of length 4 with ADDA conserved at position 1, make another folder named linear.4.1:ADDA or similar (on Windows, where : is not allowed in folder names, use something like linear.4.1-ADDA) and open a cmd/terminal there to do the generation/filtering for that chain type.
Generate a specified number of oligopeptides in SMILES format from a collection of subunits stored in JSON format. Duplicates will not be created. For an example of how the subunits JSON is structured, see the default subunits.json, which opl-subunits copies into the working directory.
Once you have run oplgen in a working directory with some arguments, they are stored in a .params file, so if you want to generate more using the same arguments you can just run oplgen without having to respecify the settings. This is useful because the number of possible oligopeptides is huge for anything but very small lengths, so you may want to generate in batches. You can still specify a different value for -n/--number if you want to generate more or fewer each time, but you shouldn't change any of the other options in the current working directory. If you accidentally generated the wrong type of chain the first time you ran oplgen in a new folder, just delete the smiles folder and the .params file and start again. Note that if you have lots of SMILES files in the smiles folder, deleting it via your file explorer may be slow; instead, delete it from the cmd/terminal using something like rm -rf smiles for OSX/Linux and rmdir smiles /S /Q for Windows.
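To give a feel for why batch generation is suggested, here is a rough sketch that counts how many sequences are possible. It assumes every non-conserved position can take any subunit, and it ignores the fact that some cyclic sequences may be equivalent under rotation, so treat the result as an upper bound. The function name count_sequences and the pool size of 20 subunits are hypothetical and not taken from oplgen.

```python
def count_sequences(num_subunits: int, length: int, num_conserved: int = 0) -> int:
    """Upper bound on distinct chains: one free choice per non-conserved position."""
    free_positions = length - num_conserved
    if num_subunits < 1 or free_positions < 0:
        raise ValueError("need at least one subunit and no more conserved positions than length")
    return num_subunits ** free_positions


if __name__ == "__main__":
    # Hypothetical pool of 20 subunits; the real number depends on your subunits.json.
    for length in (3, 5, 7, 9):
        print(f"length {length}: up to {count_sequences(20, length)} sequences")
```

Each extra position multiplies the count by the number of subunits, while each conserved position removes one factor.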
Once you have generated the desired number of SMILES, use opl-filter to select a subset of them based on drug-likeness, output stats about them, and convert them to mol2 using Open Babel.
The following options are available:
-n --number
- default: 100000
-l --sequenceLength
- default: 5
-s --subunits
- default: subunits.json
- If no subunits.json file is present in the working directory, or if you don't pass anything for this argument, the default subunits JSON is used
- To customize the subunits, use the opl-subunits command. This will insert the default JSON into the current working directory, for you to add/remove/modify subunits as needed
-c --conserve
- Format: -c POSITION:SUBUNIT,POSITION:SUBUNIT
- To conserve ADDA at position 1 of all generated chains: -c 1:ADDA
- To conserve ADDA at position 2 and D-ALA at position 5: -c 2:ADDA,5:D-ALA
--linear
- Passing the --linear argument makes the generated chains linear (they are cyclic by default)
-r --ringClosureDigit
- default: 9
Generate SMILES for 100,000 cyclic oligopeptides of length 5 if in a fresh directory, or generate more oligopeptides using previous settings in a directory that has had opl-generate run in it before:
opl-generate
Generate SMILES for 100,000 linear oligopeptides of length 7:
opl-generate -l 7 --linear
Generate SMILES for 20,000 cyclic oligopeptides of length 6 with the ADDA subunit conserved at position 1 and D-ALA at position 4:
opl-generate -l 6 -n 20000 -c 1:ADDA,4:D-ALA
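The POSITION:SUBUNIT,POSITION:SUBUNIT format accepted by -c/--conserve can be illustrated with a small parser. This is only a sketch of how such a value could be read, not the code oplgen uses. The function name parse_conserve is hypothetical, and 1-based positions are an assumption inferred from the examples above.

```python
from typing import Dict


def parse_conserve(spec: str, sequence_length: int) -> Dict[int, str]:
    """Turn e.g. '2:ADDA,5:D-ALA' into {2: 'ADDA', 5: 'D-ALA'}."""
    conserved: Dict[int, str] = {}
    for item in spec.split(","):
        position_text, separator, subunit = item.strip().partition(":")
        if not separator or not subunit:
            raise ValueError(f"expected POSITION:SUBUNIT, got {item!r}")
        position = int(position_text)  # raises ValueError if not a whole number
        # Assumption: positions are 1-based and must fall inside the chain.
        if not 1 <= position <= sequence_length:
            raise ValueError(f"position {position} is outside 1..{sequence_length}")
        if position in conserved:
            raise ValueError(f"position {position} is specified more than once")
        conserved[position] = subunit
    return conserved


if __name__ == "__main__":
    print(parse_conserve("1:ADDA,4:D-ALA", sequence_length=6))
```

Checking positions against the sequence length up front catches mistakes such as conserving position 6 in a chain of length 5 before any generation starts.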
Take a large number of SMILES files generated by opl-generate, sort them by drug-likeness, select a range from the top scorers, output stats about this selected range, and optionally convert them to mol2 and create an output.mol2 ligand file ready for dock6. If an output.mol2 file already exists in the current working directory, newly converted mol2s will be appended to it. As with opl-generate, duplicates are not created, and you may want to do this stage in batches since the Open Babel mol2 conversion may take a while for each SMILES file. You can also use a tool like UCSF Chimera to view the individual mol2 files that are output in the mol2 folder.
This command generates 2 stats files, stats-totals.txt and stats-files.txt, both containing data about the property values of the SMILES that opl-filter picked. Both contain a header line with a description of the data set and the command used to create it, followed by CSV-style data about the selected SMILES. If you are interested in the specific property values of each SMILES file that was chosen, look in stats-files.txt. If you want to see statistics such as the mean, median, min, max and standard deviation for all the SMILES chosen by this filter run, look in stats-totals.txt. If you run opl-filter multiple times, the new stats will be appended at the end of the stats txt files.
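As a rough illustration of the kind of summary found in stats-totals.txt, the sketch below computes the same five statistics for a list of property values using Python's statistics module. The function name summarize and the sample values are hypothetical, and using the sample standard deviation is an assumption, since the text does not say which variant opl-filter reports.

```python
import statistics
from typing import Dict, Sequence


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Return mean, median, min, max and sample standard deviation."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "stdev": statistics.stdev(values),
    }


if __name__ == "__main__":
    # Hypothetical molecular weights for a handful of selected SMILES.
    weights = [512.6, 498.3, 530.1, 505.9, 521.4]
    summary = summarize(weights)
    print(",".join(summary.keys()))
    print(",".join(f"{value:.2f}" for value in summary.values()))
```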
The following options are available:
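-n --number
- default: 100
- Pass -n 0 to select all available SMILES files
-r --range
- default: same as --number
- How many of the top scorers the selection (the number specified by the -n/--number argument) is drawn from
- By default this is the same as -n/--number, meaning you get the exact top 100 of the sorted list
- As with --number, if you pass -r 0 this will set the range to include all SMILES files
-s --stats
- By default oplflt will sort, generate stats, and convert to mol2
- To generate stats only, without the mol2 conversion, pass -s or --stats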
Generate stats and mol2 files for the exact top scoring 100 SMILES files (default):
opl-filter
Generate stats and mol2 files for 200 SMILES files randomly selected from the top scoring 1000:
opl-filter -n 200 -r 1000
Generate only stats for 300 SMILES randomly selected from all available SMILES files:
opl-filter -n 300 -r 0 -s
Generate only stats for all available SMILES files:
opl-filter -n 0 -s
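The way -n/--number and -r/--range interact in the examples above can be sketched as follows. This is a reading of the documented behaviour, not the actual opl-filter implementation. The function name select_subset is hypothetical, and the (filename, score) tuples with higher scores ranking first are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

Scored = Tuple[str, float]  # (SMILES filename, drug-likeness score)


def select_subset(
    scored: List[Scored],
    number: int = 100,
    range_: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> List[Scored]:
    """Pick `number` entries at random from the top `range_` scorers."""
    # Assumption: a higher score means more drug-like.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if number == 0:  # -n 0 selects everything
        return ranked
    if range_ is None:  # -r defaults to the value of -n
        range_ = number
    pool = ranked if range_ == 0 else ranked[:range_]  # -r 0 means all files
    if number >= len(pool):  # nothing to sample: exact top of the list
        return pool
    return (rng or random.Random()).sample(pool, number)


if __name__ == "__main__":
    files = [(f"{i}.smiles", float(i)) for i in range(1000)]
    picked = select_subset(files, number=5, range_=50)
    print(sorted(picked, key=lambda item: item[1], reverse=True))
```

With the defaults the pool and the requested count are the same size, so no randomness is involved and the result is the exact top of the sorted list.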
Copy the default subunits.json into the current working directory. You can then edit it as needed, and opl-generate and opl-filter will use your local copy when running in that directory, instead of the default subunits JSON file.
opl-subunits
oplsub
Create a dock-stats.txt file containing the min, max, median, mean and standard deviation of the energy scores contained in a mol2 file that has been successfully docked with dock6. By default it looks for a docked.mol2 file in the working directory, or you can pass the filename, e.g. opl-dockstats my_docked_file.mol2
Important: If you put your docked mol2 file in the working directory, don't name it output.mol2, or the energy scores from dock6 will be overwritten if you run opl-filter again in this directory.
opl-dockstats
opldst docked_file.mol2