oddt.toolkits package
Subpackages
Submodules
oddt.toolkits.common module
Code common to all toolkits
oddt.toolkits.ob module
class oddt.toolkits.ob.Atom(OBAtom)
Bases: pybel.Atom
Attributes
atomicmass
atomicnum
bonds
cidx
coordidx
coords
exactmass
formalcharge
heavyvalence
heterovalence
hyb
idx
implicitvalence
isotope
neighbors
partialcharge
residue
spin
type
valence
vector
class oddt.toolkits.ob.Bond(OBBond)
Bases: object
Attributes
atoms
isrotor
order
class oddt.toolkits.ob.Fingerprint(fingerprint)
Bases: pybel.Fingerprint
Attributes
bits
raw
class oddt.toolkits.ob.Molecule(OBMol=None, source=None, protein=False)
Bases: pybel.Molecule
Attributes
OBMol
atom_dict
atoms
bonds
canonic_order – Returns np.array with canonic order of heavy atoms in the molecule
charge
charges
clone
conformers
coords
data
dim
energy
exactmass
formula
molwt
num_rotors – Number of strict rotatable bonds
res_dict
residues
ring_dict
smiles
spin
sssr
title
unitcell
Methods
addh([only_polar]) – Add hydrogens
calccharges([model]) – Estimates atomic partial charges in the molecule.
calcdesc([descnames]) – Calculate descriptor values.
calcfp([fptype]) – Calculate a molecular fingerprint.
clone_coords(source)
convertdbonds() – Convert Dative Bonds.
draw([show, filename, update, usecoords]) – Create a 2D depiction of the molecule.
localopt([forcefield, steps]) – Locally optimize the coordinates.
make2D() – Generate 2D coordinates for molecule
make3D([forcefield, steps]) – Generate 3D coordinates
removeh() – Remove hydrogens
write([format, filename, overwrite, opt, size])
calccharges(model='mmff94')
Estimates atomic partial charges in the molecule.
Optional parameters:
- model – default is “mmff94”. See the charges variable for a list of available charge models (in shell, obabel -L charges)
This method populates the partialcharge attribute of each atom in the molecule in place.
calcdesc(descnames=[])
Calculate descriptor values.
Optional parameter:
- descnames – a list of names of descriptors
If descnames is not specified, all available descriptors are calculated. See the descs variable for a list of available descriptors.
calcfp(fptype='FP2')
Calculate a molecular fingerprint.
Optional parameters:
- fptype – the fingerprint type (default is “FP2”). See the fps variable for a list of available fingerprint types.
canonic_order
Returns np.array with canonic order of heavy atoms in the molecule
convertdbonds()
Convert Dative Bonds.
draw(show=True, filename=None, update=False, usecoords=False)
Create a 2D depiction of the molecule.
Optional parameters:
- show – display on screen (default is True)
- filename – write to file (default is None)
- update – update the coordinates of the atoms to those determined by the structure diagram generator (default is False)
- usecoords – don’t calculate 2D coordinates, just use the current coordinates (default is False)
Tkinter and Python Imaging Library are required for image display.
localopt(forcefield='mmff94', steps=500)
Locally optimize the coordinates.
Optional parameters:
- forcefield – default is “mmff94”. See the forcefields variable for a list of available forcefields.
- steps – default is 500
If the molecule does not have any coordinates, make3D() is called before the optimization. Note that the molecule needs to have explicit hydrogens. If not, call addh().
num_rotors
Number of strict rotatable bonds
class oddt.toolkits.ob.MoleculeData(obmol)
Bases: pybel.MoleculeData
Methods
clear()
has_key(key)
items()
iteritems()
keys()
to_dict()
update(dictionary)
values()
class oddt.toolkits.ob.Outputfile(format, filename, overwrite=False, opt=None)
Bases: pybel.Outputfile
Methods
close() – Close the Outputfile to further writing.
write(molecule) – Write a molecule to the output file.
write(molecule)
Required parameters:
- molecule
class oddt.toolkits.ob.Residue(OBResidue)
Bases: object
Represent a Pybel residue.
Required parameter:
- OBResidue – an Open Babel OBResidue
Attributes:
- atoms, idx, name.
(refer to the Open Babel library documentation for more info).
The original Open Babel residue can be accessed using the attribute:
- OBResidue
Attributes
atoms
idx
name
class oddt.toolkits.ob.Smarts(smartspattern)
Bases: pybel.Smarts
Initialise with a SMARTS pattern.
Methods
findall(molecule) – Find all matches of the SMARTS pattern to a particular molecule.
match(molecule) – Checks if there is any match.
findall(molecule)
Required parameters:
- molecule
oddt.toolkits.rdk module
rdkit - A Cinfony module for accessing the RDKit from CPython
Global variables:
- Chem and AllChem - the underlying RDKit Python bindings
- informats - a dictionary of supported input formats
- outformats - a dictionary of supported output formats
- descs - a list of supported descriptors
- fps - a list of supported fingerprint types
- forcefields - a list of supported forcefields
class oddt.toolkits.rdk.Atom(Atom)
Bases: object
Represent an rdkit Atom.
- Required parameters:
- Atom – an RDKit Atom
- Attributes:
- atomicnum, coords, formalcharge
- The original RDKit Atom can be accessed using the attribute:
- Atom
Attributes
atomicnum
bonds
coords
formalcharge
idx – Note that this index is 1-based and RDKit’s internal index is 0-based. Changed to be compatible with OpenBabel.
neighbors
partialcharge
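The idx note above can be illustrated with a minimal sketch. It does not use RDKit or ODDT; the helper names to_toolkit_idx and to_rdkit_idx are hypothetical and only show the off-by-one conversion between the 1-based idx and a 0-based internal index.

```python
def to_toolkit_idx(rdkit_idx):
    """Convert a 0-based internal index to the 1-based idx described above."""
    if rdkit_idx < 0:
        raise ValueError("0-based indices cannot be negative")
    return rdkit_idx + 1


def to_rdkit_idx(toolkit_idx):
    """Convert a 1-based idx back to a 0-based internal index."""
    if toolkit_idx < 1:
        raise ValueError("1-based indices start at 1")
    return toolkit_idx - 1


# Stand-in for the atoms of ethanol stored in a 0-based container.
symbols = ["C", "C", "O"]

# The atom reported with idx 3 lives at position 2 of the 0-based container.
print(symbols[to_rdkit_idx(3)])
```

Keeping the conversion in one place avoids scattering "+ 1" and "- 1" through code that mixes both conventions.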
class oddt.toolkits.rdk.Bond(Bond)
Bases: object
Attributes
atoms
isrotor
order
class oddt.toolkits.rdk.Fingerprint(fingerprint)
Bases: object
A Molecular Fingerprint.
Required parameters:
- fingerprint – a vector calculated by one of the fingerprint methods
Attributes:
- fp – the underlying fingerprint object
- bits – a list of bits set in the Fingerprint
Methods:
The “|” operator can be used to calculate the Tanimoto coefficient. For example, given two Fingerprints ‘a’ and ‘b’, the Tanimoto coefficient is given by:
tanimoto = a | b
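As a rough illustration of how a “|” operator can yield a Tanimoto coefficient, here is a minimal pure-Python sketch. The BitFingerprint class is hypothetical and is not the toolkit's Fingerprint; it assumes a fingerprint is represented by the positions of its set bits, like the bits attribute.

```python
class BitFingerprint:
    """Hypothetical stand-in that stores the positions of the set bits."""

    def __init__(self, bits):
        # Keep each set-bit position once, in sorted order.
        self.bits = sorted(set(bits))

    def __or__(self, other):
        # Tanimoto = shared set bits / distinct set bits across both.
        a, b = set(self.bits), set(other.bits)
        union = a | b
        if not union:
            # Convention chosen for this sketch: two empty fingerprints score 0.
            return 0.0
        return len(a & b) / len(union)


a = BitFingerprint([1, 5, 9, 12])
b = BitFingerprint([1, 5, 10])
tanimoto = a | b
print(tanimoto)  # 2 shared bits out of 5 distinct bits -> 0.4
```

Overloading “|” keeps the call site identical to the one-line usage shown above, at the cost of giving the operator a meaning that differs from a plain bitwise OR.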
Attributes
raw
class oddt.toolkits.rdk.Molecule(Mol=None, source=None, protein=False)
Bases: object
Represent an rdkit Molecule.
- Required parameter:
- Mol – an RDKit Mol or any type of cinfony Molecule
- Attributes:
- atoms, data, formula, molwt, title
- Methods:
- addh(), calcfp(), calcdesc(), draw(), localopt(), make3D(), removeh(), write()
- The underlying RDKit Mol can be accessed using the attribute:
- Mol
Attributes
Mol
atom_dict
atoms
bonds
canonic_order – Returns np.array with canonic order of heavy atoms in the molecule
charges
clone
coords
data
formula
molwt
num_rotors
res_dict
residues
ring_dict
smiles
sssr
title
Methods
addh([only_polar]) – Add hydrogens.
calcdesc([descnames]) – Calculate descriptor values.
calcfp([fptype, opt]) – Calculate a molecular fingerprint.
clone_coords(source)
localopt([forcefield, steps]) – Locally optimize the coordinates.
make2D() – Generate 2D coordinates for molecule
make3D([forcefield, steps]) – Generate 3D coordinates.
removeh(**kwargs) – Remove hydrogens.
write([format, filename, overwrite, size]) – Write the molecule to a file or return a string.
calcdesc(descnames=None)
Calculate descriptor values.
Optional parameter:
- descnames – a list of names of descriptors
If descnames is not specified, all available descriptors are calculated. See the descs variable for a list of available descriptors.
calcfp(fptype='rdkit', opt=None)
Calculate a molecular fingerprint.
Optional parameters:
- fptype – the fingerprint type (default is “rdkit”). See the fps variable for a list of available fingerprint types.
- opt – a dictionary of options for fingerprints. Currently only used for radius and bitInfo in Morgan fingerprints.
canonic_order
Returns np.array with canonic order of heavy atoms in the molecule
localopt(forcefield='uff', steps=500)
Locally optimize the coordinates.
Optional parameters:
- forcefield – default is “uff”. See the forcefields variable for a list of available forcefields.
- steps – default is 500
If the molecule does not have any coordinates, make3D() is called before the optimization.
make3D(forcefield='mmff94', steps=50)
Generate 3D coordinates.
Optional parameters:
- forcefield – default is “mmff94”, as given in the signature. See the forcefields variable for a list of available forcefields.
- steps – default is 50
Once coordinates are generated, a quick local optimization is carried out with the given forcefield and number of steps. Call localopt() if you want to improve the coordinates further.
write(format='smi', filename=None, overwrite=False, size=None, **kwargs)
Write the molecule to a file or return a string.
Optional parameters:
- format – see the outformats variable for a list of available output formats (default is “smi”)
- filename – default is None
- overwrite – if the output file already exists, should it be overwritten? (default is False)
If a filename is specified, the result is written to a file. Otherwise, a string is returned containing the result.
To write multiple molecules to the same file you should use the Outputfile class.
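The file-or-string behaviour of write() can be sketched without either toolkit. The helper write_text below is hypothetical, and raising IOError when the file exists and overwrite is False is an assumption of this sketch, not something the text above specifies.

```python
import os
import tempfile


def write_text(text, filename=None, overwrite=False):
    """Hypothetical helper: return text, or write it to filename."""
    if filename is None:
        # No filename given: hand the result back as a string.
        return text
    if not overwrite and os.path.isfile(filename):
        # Assumption: refusing to replace an existing file raises IOError.
        raise IOError("%s already exists; pass overwrite=True" % filename)
    with open(filename, "w") as handle:
        handle.write(text)
    return None


smiles_line = "CCO\tethanol\n"  # stand-in for a molecule rendered as "smi"
as_string = write_text(smiles_line)  # no filename: the string comes back

path = os.path.join(tempfile.mkdtemp(), "out.smi")
write_text(smiles_line, filename=path)  # first write creates the file
try:
    write_text(smiles_line, filename=path)  # second write is refused
except IOError as error:
    print("refused:", error)
write_text(smiles_line, filename=path, overwrite=True)  # explicit overwrite
```

Defaulting overwrite to False makes accidental data loss an explicit opt-in, which matches the default shown in the signature above.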
class oddt.toolkits.rdk.MoleculeData(Mol)
Bases: object
Store molecule data in a dictionary-type object
- Required parameters:
- Mol – an RDKit Mol
Methods and accessor methods are like those of a dictionary except that the data is retrieved on-the-fly from the underlying Mol.
Example:
>>> mol = next(readfile("sdf", 'head.sdf'))
>>> data = mol.data
>>> print(data)
{'Comment': 'CORINA 2.61 0041 25.10.2001', 'NSC': '1'}
>>> print(len(data), data.keys(), data.has_key("NSC"))
2 ['Comment', 'NSC'] True
>>> print(data['Comment'])
CORINA 2.61 0041 25.10.2001
>>> data['Comment'] = 'This is a new comment'
>>> for k,v in data.items():
...    print(k, "-->", v)
Comment --> This is a new comment
NSC --> 1
>>> del data['NSC']
>>> print(len(data), data.keys(), data.has_key("NSC"))
1 ['Comment'] False
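One way to get a dictionary-like object whose contents are retrieved on the fly is to forward every access to the owning molecule. The sketch below is only an illustration of that idea: FakeMol and LiveData are hypothetical names, and a plain dict stands in for the properties stored on an RDKit Mol.

```python
from collections.abc import MutableMapping


class FakeMol:
    """Hypothetical stand-in for a molecule that owns its properties."""

    def __init__(self, props):
        self.props = dict(props)


class LiveData(MutableMapping):
    """Dictionary-like view: nothing is cached, every call hits the molecule."""

    def __init__(self, mol):
        self._mol = mol

    def __getitem__(self, key):
        return self._mol.props[key]

    def __setitem__(self, key, value):
        self._mol.props[key] = value

    def __delitem__(self, key):
        del self._mol.props[key]

    def __iter__(self):
        return iter(self._mol.props)

    def __len__(self):
        return len(self._mol.props)

    def to_dict(self):
        # Snapshot copy, detached from the molecule.
        return dict(self._mol.props)


mol = FakeMol({"Comment": "CORINA 2.61 0041 25.10.2001", "NSC": "1"})
data = LiveData(mol)
data["Comment"] = "This is a new comment"
del data["NSC"]
print(len(data), list(data.keys()), "NSC" in data)  # 1 ['Comment'] False
```

Subclassing MutableMapping supplies keys(), items(), values(), update() and clear() from the five methods defined here, so the view stays consistent with the molecule without duplicating state.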
Methods
clear()
has_key(key)
items()
iteritems()
keys()
to_dict()
update(dictionary)
values()
class oddt.toolkits.rdk.Outputfile(format, filename, overwrite=False)
Bases: object
Represent a file to which output is to be sent.
Required parameters:
- format - see the outformats variable for a list of available output formats
- filename
Optional parameters:
- overwrite – if the output file already exists, should it be overwritten? (default is False)
Methods:
- write(molecule)
- close()
Methods
close() – Close the Outputfile to further writing.
write(molecule) – Write a molecule to the output file.
class oddt.toolkits.rdk.Residue(ParentMol, atom_path)
Bases: object
Represent an RDKit residue.
Required parameters:
- ParentMol – Parent molecule (Mol) object
- path – atoms path of a residue
Attributes:
- atoms, idx, name.
(refer to the Open Babel library documentation for more info).
The Mol object constructed from the residue’s atoms can be accessed using the attribute:
- Residue
Attributes
atoms
idx
name
class oddt.toolkits.rdk.Smarts(smartspattern)
Bases: object
Initialise with a SMARTS pattern.
Methods
findall(molecule) – Find all matches of the SMARTS pattern to a particular molecule.
match(molecule) – Find all matches of the SMARTS pattern to a particular molecule.
oddt.toolkits.rdk.base_feature_factory = <rdkit.Chem.rdMolChemicalFeatures.MolChemicalFeatureFactory object>
Global feature factory based on BaseFeatures.fdef
oddt.toolkits.rdk.
descs
= ['fr_C_O_noCOO', 'PEOE_VSA3', 'Chi4v', 'fr_Ar_COO', 'fr_SH', 'Chi4n', 'SMR_VSA10', 'fr_para_hydroxylation', 'fr_barbitur', 'fr_Ar_NH', 'fr_halogen', 'fr_dihydropyridine', 'fr_priamide', 'SlogP_VSA4', 'fr_guanido', 'MinPartialCharge', 'fr_furan', 'fr_morpholine', 'fr_nitroso', 'NumAromaticCarbocycles', 'fr_COO2', 'fr_amidine', 'SMR_VSA7', 'fr_benzodiazepine', 'ExactMolWt', 'fr_Imine', 'MolWt', 'fr_hdrzine', 'fr_urea', 'NumAromaticRings', 'fr_quatN', 'NumSaturatedHeterocycles', 'NumAliphaticHeterocycles', 'fr_benzene', 'fr_phos_acid', 'fr_sulfone', 'VSA_EState10', 'fr_aniline', 'fr_N_O', 'fr_sulfonamd', 'fr_thiazole', 'TPSA', 'EState_VSA8', 'PEOE_VSA14', 'PEOE_VSA13', 'PEOE_VSA12', 'PEOE_VSA11', 'PEOE_VSA10', 'BalabanJ', 'fr_lactone', 'fr_Al_COO', 'EState_VSA10', 'EState_VSA11', 'HeavyAtomMolWt', 'fr_nitro_arom', 'Chi0', 'Chi1', 'NumAliphaticRings', 'MolLogP', 'fr_nitro', 'fr_Al_OH', 'fr_azo', 'NumAliphaticCarbocycles', 'fr_C_O', 'fr_ether', 'fr_phenol_noOrthoHbond', 'fr_alkyl_halide', 'NumValenceElectrons', 'fr_aryl_methyl', 'fr_Ndealkylation2', 'MinEStateIndex', 'fr_term_acetylene', 'HallKierAlpha', 'fr_C_S', 'fr_thiocyan', 'fr_ketone_Topliss', 'VSA_EState4', 'Ipc', 'VSA_EState6', 'VSA_EState7', 'VSA_EState1', 'VSA_EState2', 'VSA_EState3', 'fr_HOCCN', 'fr_phos_ester', 'BertzCT', 'SlogP_VSA12', 'EState_VSA9', 'SlogP_VSA10', 'SlogP_VSA11', 'fr_COO', 'NHOHCount', 'fr_unbrch_alkane', 'NumSaturatedRings', 'MaxPartialCharge', 'fr_methoxy', 'fr_thiophene', 'SlogP_VSA8', 'SlogP_VSA9', 'MinAbsPartialCharge', 'SlogP_VSA5', 'SlogP_VSA6', 'SlogP_VSA7', 'SlogP_VSA1', 'SlogP_VSA2', 'SlogP_VSA3', 'NumRadicalElectrons', 'fr_NH2', 'fr_piperzine', 'fr_nitrile', 'NumHeteroatoms', 'fr_NH1', 'fr_NH0', 'MaxAbsEStateIndex', 'LabuteASA', 'fr_amide', 'Chi3n', 'fr_imidazole', 'SMR_VSA3', 'SMR_VSA2', 'SMR_VSA1', 'Chi3v', 'SMR_VSA6', 'Kappa3', 'Kappa2', 'EState_VSA6', 'EState_VSA7', 'SMR_VSA9', 'EState_VSA5', 'EState_VSA2', 'EState_VSA3', 'fr_Ndealkylation1', 'EState_VSA1', 'fr_ketone', 
'SMR_VSA5', 'MinAbsEStateIndex', 'fr_diazo', 'SMR_VSA4', 'fr_Ar_N', 'fr_Nhpyrrole', 'fr_ester', 'VSA_EState5', 'EState_VSA4', 'NumHDonors', 'fr_prisulfonamd', 'fr_oxime', 'SMR_VSA8', 'fr_isocyan', 'Chi2n', 'Chi2v', 'HeavyAtomCount', 'fr_azide', 'NumHAcceptors', 'fr_lactam', 'fr_allylic_oxid', 'VSA_EState8', 'fr_oxazole', 'VSA_EState9', 'fr_piperdine', 'fr_Ar_OH', 'fr_sulfide', 'fr_alkyl_carbamate', 'NOCount', 'Chi1n', 'PEOE_VSA8', 'PEOE_VSA7', 'PEOE_VSA6', 'PEOE_VSA5', 'PEOE_VSA4', 'MaxEStateIndex', 'PEOE_VSA2', 'PEOE_VSA1', 'NumSaturatedCarbocycles', 'fr_imide', 'FractionCSP3', 'Chi1v', 'fr_Al_OH_noTert', 'fr_epoxide', 'fr_hdrzone', 'fr_isothiocyan', 'NumAromaticHeterocycles', 'fr_bicyclic', 'Kappa1', 'Chi0n', 'fr_phenol', 'MolMR', 'PEOE_VSA9', 'fr_aldehyde', 'fr_pyridine', 'fr_tetrazole', 'RingCount', 'fr_nitro_arom_nonortho', 'Chi0v', 'fr_ArN', 'NumRotatableBonds', 'MaxAbsPartialCharge']¶ A list of supported descriptors
oddt.toolkits.rdk.forcefields = ['mmff94', 'uff']
A list of supported forcefields
oddt.toolkits.rdk.fps = ['rdkit', 'layered', 'maccs', 'atompairs', 'torsions', 'morgan']
A list of supported fingerprint types
oddt.toolkits.rdk.informats = {'inchi': 'InChI', 'mol2': 'Tripos MOL2 file', 'sdf': 'MDL SDF file', 'smi': 'SMILES', 'mol': 'MDL MOL file'}
A dictionary of supported input formats
oddt.toolkits.rdk.outformats = {'inchikey': 'InChIKey', 'sdf': 'MDL SDF file', 'can': 'Canonical SMILES', 'smi': 'SMILES', 'mol': 'MDL MOL file', 'inchi': 'InChI'}
A dictionary of supported output formats
oddt.toolkits.rdk.readfile(format, filename, lazy=False, opt=None, *args, **kwargs)
Iterate over the molecules in a file.
Required parameters:
- format - see the informats variable for a list of available input formats
- filename
You can access the first molecule in a file using the next() method of the iterator:
mol = next(readfile("smi", "myfile.smi"))
You can make a list of the molecules in a file using:
mols = list(readfile("smi", "myfile.smi"))
You can iterate over the molecules in a file as shown in the following code snippet:
>>> atomtotal = 0
>>> for mol in readfile("sdf", "head.sdf"):
...     atomtotal += len(mol.atoms)
...
>>> print(atomtotal)
43
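The three access patterns above (next(), list(), and a for loop) all follow from readfile returning an iterator. The sketch below reproduces them with a hypothetical generator, iter_smiles, that parses a SMILES file line by line; it is not the toolkit's reader and yields plain (smiles, title) tuples rather than Molecule objects.

```python
import os
import tempfile


def iter_smiles(filename):
    """Hypothetical lazy reader: yields (smiles, title) one line at a time."""
    with open(filename) as handle:
        for line in handle:
            fields = line.split(None, 1)
            if not fields:
                continue  # skip blank lines
            title = fields[1].strip() if len(fields) > 1 else ""
            yield fields[0], title


# Build a small throwaway SMILES file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".smi", delete=False) as tmp:
    tmp.write("CCO ethanol\nc1ccccc1 benzene\nCC(=O)O acetic acid\n")
    path = tmp.name

first = next(iter_smiles(path))            # only the first record is yielded
records = list(iter_smiles(path))          # every record, held in memory
count = sum(1 for _ in iter_smiles(path))  # streamed, nothing is stored
os.remove(path)

print(first, len(records), count)
```

A generator keeps memory use flat for large files, while list() remains available when random access to all records is needed.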