Ligand Preparation (Building)

Building ligands is the process of preparing them in the db2 file format for use in the DOCK software. It consists of generating protonation states, tautomers, and conformations, and calculating partial charges, atomic desolvation, and strain.

The commands for creating building jobs on Wynton and Gimel are below. All submission options are listed at the bottom.

A note on open source

Starting around Dec 12, 2025, we are transitioning away from the commercial ChemAxon and OpenEye licenses to open source tools for our building pipeline. The instructions here use the open source version of building in preparation for this transition. Further down the page are the instructions for building with the old pipeline, which will soon be deprecated.

Job Setup on Wynton (Open Source)

Source the environment

source /wynton/group/bks/soft/DOCK-3.8.6/DOCK-3.8.6.1/env.sh

Create the building jobs

python $DOCK_INSTALL_PATH/zinc22-3d/submit/submit_building_docker.py --use_open_source --output_folder [output_folder_name] --bundle_size [bundle_size] --skip_name_check --scheduler sge --container_software apptainer --container_path_or_name $DOCK_INSTALL_PATH/building_pipeline_oss.sif [smi_file]

Submit the building jobs

qsub building_array_job.sh

Job Setup on Gimel (Open Source)

Run these steps on a SLURM node (epyc, epyc2, gimel2, etc.).

Source the environment

source /mnt/nfs/soft/dock/versions/dock386/DOCK-3.8.6.1/env.sh

Create the building jobs

python $DOCK_INSTALL_PATH/zinc22-3d/submit/submit_building_docker.py --use_open_source --container_path_or_name building_pipeline_oss --output_folder [output_folder_name] --bundle_size [bundle_size] --skip_name_check [smi_file]

Submit the building jobs

sbatch --exclude=$(paste -sd, /mnt/nfs/exk/work/bwhall61/deploy_building_pipeline_docker/broken_nodes_oss.txt) building_array_job.sh

Job Setup on Wynton (Commercial - Soon to be deprecated)

Source the environment

source /wynton/group/bks/soft/DOCK-3.8.5/env.sh

Create the building jobs

python $DOCK_INSTALL_PATH/zinc22-3d/submit/submit_building_docker.py --output_folder [output_folder_name] --bundle_size [bundle_size] --skip_name_check --scheduler sge --container_software apptainer --container_path_or_name $DOCK_INSTALL_PATH/building_pipeline.sif [smi_file]

Submit the building jobs

qsub building_array_job.sh

Job Setup on Gimel (Commercial - Soon to be deprecated)

If this is your first time building on Gimel, ask John and his team to add you to the “docker” permissions group before you start.

Run these steps on a SLURM node (epyc, epyc2, gimel2, etc.).

Source the environment

source /mnt/nfs/soft/dock/versions/dock385/env.sh

Create the building jobs

python $DOCK_INSTALL_PATH/zinc22-3d/submit/submit_building_docker.py --output_folder [output_folder_name] --bundle_size [bundle_size] --skip_name_check [smi_file]

Submit the building jobs

sbatch --exclude=$(paste -sd, /mnt/nfs/exk/work/bwhall61/deploy_building_pipeline_docker/broken_nodes.txt) building_array_job.sh

Understanding What’s Happening

When you run the Python script, your SMILES file is split into smaller units called “bundles”. Each bundle of ligands is built by an individual job submitted to the scheduler.

The Python script creates an output folder with the name you provide. Inside the output folder are subfolders named 1, 2, …, N, one per bundle. Your built molecules are written inside these subfolders.
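The bundling described above can be pictured with a short Python sketch. This is an illustration only, not the code in submit_building_docker.py: the function name split_into_bundles is hypothetical, and the sketch assumes that bundles are filled in input order, that subfolders are numbered starting from 1, and that each input line holds a SMILES string followed by a molecule name.

```python
from typing import Dict, List


def split_into_bundles(lines: List[str], bundle_size: int) -> Dict[str, List[str]]:
    """Group input lines into consecutive bundles keyed by subfolder name.

    Hypothetical helper for illustration: assumes bundles are filled in
    input order and that subfolders are named "1", "2", ..., "N".
    """
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    bundles: Dict[str, List[str]] = {}
    for start in range(0, len(lines), bundle_size):
        folder = str(start // bundle_size + 1)
        bundles[folder] = lines[start:start + bundle_size]
    return bundles


# Example: 5 molecules with a bundle size of 2 give 3 bundles,
# the last of which is only partly filled.
example = ["CCO mol1", "CCN mol2", "CCC mol3", "c1ccccc1 mol4", "CO mol5"]
layout = split_into_bundles(example, bundle_size=2)
for folder, members in layout.items():
    print(folder, len(members))
```

Each key in layout corresponds to one subfolder of the output folder and to one job submitted to the scheduler.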

Options

--output_folder: The name of the output folder to store results.

--bundle_size: The number of molecules to include in a single bundle. You will end up with [total_num_molecules]/[bundle_size] output bundles (rounded up). Note: If you are building a large number of molecules, try to keep the total to no more than a few thousand bundles to limit the number of subfolders in the output folder. You may need to split your input SMILES file into smaller chunks and run building jobs for each one.

--minutes_per_mol: The number of minutes to allow for building each molecule. An individual bundle’s job will run for at most [minutes_per_mol]*[bundle_size] minutes.

--skip_name_check: If you know that the molecule names in your input SMILES file are unique, you should use this option.

--scheduler: sge or slurm. Defaults to slurm.

--container_software: docker or apptainer. Defaults to docker.

--container_path_or_name: If docker is used, this is the name of the docker image. If apptainer is used, this is the path to the apptainer image. Defaults to building_pipeline (the docker image name on Gimel).

--use_open_source: Add this flag to use the open source version of building.
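The arithmetic behind the bundle_size, minutes_per_mol, and skip_name_check options can be sketched in Python before submitting. Everything here is an assumption made for illustration: the function names are hypothetical and not part of the building scripts, the limit of 5000 bundles per run stands in for "a few thousand", the value of 2 minutes per molecule is arbitrary, and the input is assumed to hold a SMILES string followed by a molecule name on each line.

```python
from collections import Counter
from typing import List


def count_bundles(total_molecules: int, bundle_size: int) -> int:
    """Number of bundles: total / bundle_size, rounded up."""
    return (total_molecules + bundle_size - 1) // bundle_size


def max_job_minutes(minutes_per_mol: int, bundle_size: int) -> int:
    """Upper bound on the run time of one bundle's job, in minutes."""
    return minutes_per_mol * bundle_size


def chunks_needed(total_molecules: int, bundle_size: int, max_bundles: int) -> int:
    """How many input chunks keep each run at or below max_bundles bundles."""
    bundles = count_bundles(total_molecules, bundle_size)
    return (bundles + max_bundles - 1) // max_bundles


def duplicate_names(smi_lines: List[str]) -> List[str]:
    """Names that occur more than once (assumes 'SMILES name' per line)."""
    names = []
    for line in smi_lines:
        fields = line.split()
        if len(fields) >= 2:
            names.append(fields[1])
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Example with illustrative numbers: 1,000,000 molecules, 100 per bundle.
print(count_bundles(1_000_000, 100))        # 10000 bundles
print(chunks_needed(1_000_000, 100, 5000))  # 2 input chunks
print(max_job_minutes(2, 100))              # 200 minutes per job at most
print(duplicate_names(["CCO a", "CCN b", "CCC a"]))  # ['a']
```

If duplicate_names returns an empty list for your input, the names are unique under the assumed file layout, which is the situation in which --skip_name_check is meant to be used.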