Cheminformatics Engines

Bioshift integrates RDKit and OpenBabel for molecular processing, structure optimization, conformer generation, and format conversion.

Integrated Toolkits

Two powerful cheminformatics engines working together.

RDKit

Version: latest release, installed via pip

Industry-standard cheminformatics toolkit

Key Capabilities:

  • 2D/3D structure generation
  • Molecular descriptor calculation
  • Fingerprint generation
  • Substructure searching
  • Force field minimization (UFF, MMFF94)
  • Conformer generation (ETKDG)

OpenBabel

Version: 3.1.1 (executables included)

Universal molecular format converter

Key Capabilities:

  • 100+ file format conversions
  • 3D structure generation
  • Force field optimization
  • SMILES/InChI generation
  • Hydrogen addition/removal
  • Charge calculation

Structure Optimization Nodes

Energy minimization and conformer generation capabilities.

RDKit Minimize

Energy minimization using RDKit force fields

Type: rdkit_minimize

Key Features

  • UFF universal force field
  • MMFF94 for drug-like molecules
  • Automatic hydrogen addition
  • Batch processing support
  • Energy calculation
  • Constraint support

Properties

Property          Type    Default  Description
force_field       string  UFF      Force field (UFF or MMFF94)
max_iterations    int     200      Maximum iterations
energy_tolerance  float   1e-4     Convergence threshold
add_hydrogens     bool    true     Add missing hydrogens
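
The properties of this node correspond closely to options in RDKit's force-field API. The following is a minimal sketch of what such a minimization might look like in plain RDKit, not Bioshift's actual implementation. The function name `minimize` is hypothetical, and mapping `energy_tolerance` to RDKit's `energyTol` argument is an assumption.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def minimize(smiles, force_field="UFF", max_iterations=200,
             energy_tolerance=1e-4, add_hydrogens=True):
    """Embed a molecule in 3D and minimize it with UFF or MMFF94.

    Returns (mol, energy_before, energy_after, converged).
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    if add_hydrogens:
        mol = Chem.AddHs(mol)

    # A force field needs 3D coordinates, so embed one conformer first.
    if AllChem.EmbedMolecule(mol, randomSeed=42) != 0:
        raise RuntimeError("3D embedding failed")

    if force_field == "UFF":
        ff = AllChem.UFFGetMoleculeForceField(mol)
    elif force_field == "MMFF94":
        props = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant="MMFF94")
        if props is None:
            raise ValueError("MMFF94 parameters are not available for this molecule")
        ff = AllChem.MMFFGetMoleculeForceField(mol, props)
    else:
        raise ValueError(f"unsupported force field: {force_field}")

    energy_before = ff.CalcEnergy()
    # Minimize returns 0 on convergence and 1 if more iterations are needed.
    # Assumption: energy_tolerance is passed as RDKit's energyTol.
    status = ff.Minimize(maxIts=max_iterations, energyTol=energy_tolerance)
    energy_after = ff.CalcEnergy()
    return mol, energy_before, energy_after, status == 0
```

Calling `minimize("CCO", force_field="MMFF94")` returns the hydrogen-complete molecule with one optimized conformer, plus the energies (in kcal/mol) before and after minimization.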

OpenBabel Minimize

Structure optimization with OpenBabel force fields

Type: openbabel_minimize

Key Features

  • Multiple force fields (GAFF, MMFF94, UFF, Ghemical)
  • Steepest descent optimization
  • Conjugate gradient method
  • Format conversion during optimization
  • Partial charge assignment
  • Constrained minimization

Properties

Property       Type    Default  Description
force_field    string  MMFF94   Force field type
steps          int     2500     Optimization steps
output_format  string  sdf      Output file format
add_hydrogens  bool    true     Add hydrogens
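
For orientation, these properties resemble options of the `obabel` command-line tool that ships with Open Babel. The sketch below only builds the argument list; whether Bioshift invokes `obabel` this way is an assumption, and the helper name `build_obabel_minimize_cmd` and the file paths are hypothetical.

```python
def build_obabel_minimize_cmd(input_path, output_path, force_field="MMFF94",
                              steps=2500, output_format="sdf",
                              add_hydrogens=True):
    """Return an obabel argument list that minimizes a structure file."""
    # -o<format> selects the output format, -O names the output file.
    cmd = ["obabel", input_path, "-o" + output_format, "-O", output_path]
    if add_hydrogens:
        cmd.append("-h")  # add hydrogens before optimization
    # --minimize runs a force-field optimization with the given settings.
    cmd += ["--minimize", "--ff", force_field, "--steps", str(steps)]
    return cmd


cmd = build_obabel_minimize_cmd("ligand.mol2", "ligand_min.sdf")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `obabel` executable is on the PATH.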

Conformer Generation

Generate 3D conformers using RDKit ETKDG

Type: conformer_gen

Key Features

  • ETKDG algorithm (v3)
  • Distance geometry approach
  • Torsion angle preferences
  • RMSD-based pruning
  • Energy minimization
  • Chirality preservation

Properties

Property           Type   Default  Description
num_conformers     int    10       Number of conformers
max_attempts       int    1000     Maximum attempts
prune_rms_thresh   float  0.5      RMSD pruning threshold
use_random_coords  bool   false    Random initial coords
enforce_chirality  bool   true     Preserve stereochemistry
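
The property names closely follow RDKit's embedding parameters. As a rough sketch of how they might be passed to ETKDGv3 in plain RDKit (the function name `generate_conformers` is hypothetical, and mapping `max_attempts` to RDKit's `maxIterations` is an assumption):

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def generate_conformers(smiles, num_conformers=10, max_attempts=1000,
                        prune_rms_thresh=0.5, use_random_coords=False,
                        enforce_chirality=True, seed=42):
    """Generate up to num_conformers 3D conformers with ETKDGv3."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # hydrogens matter for realistic 3D geometry

    params = AllChem.ETKDGv3()
    params.maxIterations = max_attempts        # assumption: max_attempts
    params.pruneRmsThresh = prune_rms_thresh   # drop near-duplicate conformers
    params.useRandomCoords = use_random_coords
    params.enforceChirality = enforce_chirality
    params.randomSeed = seed                   # fixed seed for reproducibility

    conf_ids = list(AllChem.EmbedMultipleConfs(mol, num_conformers, params))
    return mol, conf_ids
```

Because of RMSD pruning, the number of conformers returned can be lower than `num_conformers`, especially for small or rigid molecules.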

Molecular Descriptors

Comprehensive molecular property calculations available through RDKit.

Molecular Properties

  • Molecular Weight
  • LogP (lipophilicity)
  • TPSA (polar surface area)
  • Number of rotatable bonds
  • H-bond donors/acceptors
  • Molar refractivity
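
Each of the properties listed above has a direct counterpart in RDKit's `Descriptors` module. The following sketch shows one way to compute them for a single molecule; the function name `basic_descriptors` and the dictionary keys are illustrative choices, not Bioshift output fields.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def basic_descriptors(smiles):
    """Compute common 2D descriptors for one molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return {
        "mol_weight": Descriptors.MolWt(mol),
        "logp": Descriptors.MolLogP(mol),          # Crippen estimate
        "tpsa": Descriptors.TPSA(mol),
        "rotatable_bonds": Descriptors.NumRotatableBonds(mol),
        "hbd": Descriptors.NumHDonors(mol),
        "hba": Descriptors.NumHAcceptors(mol),
        "molar_refractivity": Descriptors.MolMR(mol),
    }


# Aspirin as a small worked example.
for name, value in basic_descriptors("CC(=O)Oc1ccccc1C(=O)O").items():
    print(name, value)
```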

Drug-likeness

  • Lipinski's Rule of Five
  • QED (Quantitative Estimate of Drug-likeness)
  • Synthetic accessibility score
  • Lead-likeness
  • PAINS alerts
  • Bioavailability score

Molecular Fingerprints

  • Morgan fingerprints (ECFP)
  • MACCS keys
  • Topological fingerprints
  • Atom pair fingerprints
  • RDKit fingerprints
  • Pharmacophore fingerprints

3D Descriptors

  • Radius of gyration
  • Inertial shape descriptors
  • Plane of best fit
  • Spherocity
  • Asphericity
  • 3D pharmacophores

Supported File Formats

Universal format conversion with OpenBabel and RDKit.

Input Formats

Format   Description            Content
SDF/MOL  Structure Data File    2D/3D structures
MOL2     Tripos MOL2            3D with charges
PDB      Protein Data Bank      Macromolecules
PDBQT    AutoDock format        Docking
XYZ      Cartesian coordinates  3D geometry
SMILES   Linear notation        2D structure
InChI    IUPAC identifier       Structure identifier

Output Formats

Format             Description               Content
All input formats  Full compatibility        Various
CML                Chemical Markup Language  XML-based
PNG/SVG            2D depictions             Images
FASTA              Sequence format           Biopolymers
Gaussian           Computational chemistry   QM input

Common Use Cases

Typical cheminformatics workflows in Bioshift.

Lead Optimization

Optimize molecular structures for drug discovery

  1. Load lead compounds (SDF)
  2. Generate 3D conformers
  3. Energy minimization with MMFF94
  4. Calculate drug-like properties
  5. Filter by Lipinski's Rule of Five
  6. Export optimized structures
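
The filtering step in this workflow can be illustrated without any toolkit once the descriptors are known. Lipinski's Rule of Five flags compounds with molecular weight above 500, LogP above 5, more than 5 H-bond donors, or more than 10 H-bond acceptors, and a common convention tolerates one violation. The sketch below assumes the descriptors were computed in the previous step; the `CompoundProps` class, the compound names, and their values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class CompoundProps:
    name: str
    mol_weight: float
    logp: float
    hbd: int  # H-bond donors
    hba: int  # H-bond acceptors


def lipinski_violations(c):
    """Count how many of the four Rule of Five limits are exceeded."""
    return sum([c.mol_weight > 500, c.logp > 5, c.hbd > 5, c.hba > 10])


def passes_rule_of_five(c, max_violations=1):
    """Apply the common convention of tolerating at most one violation."""
    return lipinski_violations(c) <= max_violations


# Placeholder values for illustration only, not measured data.
library = [
    CompoundProps("cmpd-A", 320.4, 2.1, 2, 5),
    CompoundProps("cmpd-B", 612.8, 6.3, 1, 9),
    CompoundProps("cmpd-C", 540.0, 3.9, 3, 8),
]
kept = [c.name for c in library if passes_rule_of_five(c)]
print(kept)
```

Setting `max_violations=0` gives the strict form of the rule, which would also reject compounds with a single violation.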

Virtual Library Generation

Create and process virtual compound libraries

  1. Generate SMILES from scaffold
  2. Enumerate chemical library
  3. Convert to 3D structures
  4. Minimize with UFF
  5. Calculate fingerprints
  6. Cluster by similarity
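
The last two steps usually rest on Tanimoto similarity between fingerprints. The following toolkit-free sketch represents each fingerprint as a set of "on" bit indices and groups compounds with a simple leader clustering. Both the set representation and the leader algorithm are illustrative assumptions; the page does not state which clustering method Bioshift uses.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(fp_a & fp_b) / union


def leader_cluster(fingerprints, threshold=0.5):
    """Assign each fingerprint to the first cluster whose leader is similar.

    Returns a list of clusters, each a list of indices into `fingerprints`.
    The first index in each cluster is its leader.
    """
    clusters = []
    for i, fp in enumerate(fingerprints):
        for cluster in clusters:
            leader = fingerprints[cluster[0]]
            if tanimoto(leader, fp) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar leader found: start a new cluster
    return clusters


# Toy fingerprints (sets of on-bit indices), for illustration only.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 12, 13}]
print(leader_cluster(fps))
```

Leader clustering depends on input order and on the threshold, so results should be checked against a few known analog series before relying on them.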

Structure Preparation

Prepare structures for docking or MD

  1. Load protein-ligand complex
  2. Add missing hydrogens
  3. Optimize hydrogen positions
  4. Assign partial charges
  5. Generate conformers
  6. Export in target format