Command Reference
This chapter describes all SQL commands provided by Mychem. These commands are classified in five sections:
- Conversion Commands - this section details functions that convert chemical files.
- Helper Commands - this section details functions that return informations about the Mychem environment.
- Modification Commands - this section details functions that modify chemical structures.
- Molmatch Commands - this section details functions that compare chemical structures.
- Property Commands - this section details functions that compute molecular properties.
The Default Molecule Type
Many functions are using or returning a molecule in DEFAULT_TYPE format. Since version 0.6.0 of Mychem, the DEFAULT_TYPE format is MDL Molfile (V2000). In earlier versions, the DEFAULT_TYPE format was InChI.
The speed of several functions has been measured when using InChI and MDL Molfile formats. It has been shown that using MDL Molfile is a bit faster than InChI. Using the MDL Molfile format is also interesting, as it permits to store 2D or 3D coordinates. Moreover, the MDL Molfile format is commonly used by many softwares.
Conversion Commands
Conversion commmands permit to convert chemical data from a format to another format. Mychem uses the Open Babel library for the conversion, but does not implement all the 80 chemical file formats supported by Open Babel. Mychem supports only few of them. They are detailed in the Molecule Formats appendix.
To avoid having too many functions, a set of two functions is available for each format: FORMAT_TO_MOLECULE and MOLECULE_TO_FORMAT. If you need other formats, please open a ticket on the feature tracker.
Note
If a function of this section fails, it returns an empty string.
CML_TO_MOLECULE(molecule)CML_TO_MOLECULEconverts a molecule in CML format to a molecule in DEFAULT_TYPE format.mysql> SELECT CML_TO_MOLECULE(cml_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809032D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 M END
MOLECULE_TO_CML(molecule)MOLECULE_TO_CMLconverts a molecule in DEFAULT_TYPE format to a molecule in CML format.mysql> SELECT MOLECULE_TO_CML(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> <molecule id="id2-Aminoacetic acid"> <name>2-Aminoacetic acid</name> <atomArray> <atom id="a1" elementType="H" x3="-0.106800" y3="-1.052100" z3="1.150900"/> <atom id="a2" elementType="C" x3="0.087700" y3="-0.079800" z3="0.647700"/> <atom id="a3" elementType="H" x3="-0.287000" y3="0.708200" z3="1.333100"/> <atom id="a4" elementType="N" x3="1.518500" y3="0.191900" z3="0.395100"/> <atom id="a5" elementType="H" x3="2.016800" y3="0.126600" z3="1.256800"/> <atom id="a6" elementType="H" x3="1.889000" y3="-0.469300" z3="-0.254400"/> <atom id="a7" elementType="C" x3="-0.677500" y3="-0.076700" z3="-0.657800"/> <atom id="a8" elementType="O" x3="-0.445500" y3="-0.696800" z3="-1.679800"/> <atom id="a9" elementType="O" x3="-1.785100" y3="0.699100" z3="-0.670500"/> <atom id="a10" elementType="H" x3="-2.210100" y3="0.648900" z3="-1.521200"/> </atomArray> <bondArray> <bond atomRefs2="a1 a2" order="1"/> <bond atomRefs2="a2 a3" order="1"/> <bond atomRefs2="a2 a4" order="1"/> <bond atomRefs2="a2 a7" order="1"/> <bond atomRefs2="a4 a5" order="1"/> <bond atomRefs2="a4 a6" order="1"/> <bond atomRefs2="a7 a8" order="2"/> <bond atomRefs2="a7 a9" order="1"/> <bond atomRefs2="a9 a10" order="1"/> </bondArray> </molecule>
FINGERPRINT(molecule, type)FINGERPRINTconverts a molecule in DEFAULT_TYPE format to a fingerprint. The fingerprint type is specified by the second argument (FP2, FP3 or FP4). The SQL type of the converted molecule is a binary string for all kinds of fingerprint.mysql> SELECT FINGERPRINT(molecule_col, "FP2") FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> binary fingerprint (type FP2)
Note
Tanimoto scoring can be improved by using a concatenation of different fingerprint types. The concatenation of fingerprints can be performed with the following query:
mysql> SELECT CONCAT(FINGERPRINT(molecule_col,"FP2"), -> FINGERPRINT(molecule_col,"FP3")) FROM tbl_name -> WHERE id=9; -> binary fingerprint (type FP2 + type FP3)
FINGERPRINT2(molecule)FINGERPRINT2converts a molecule in DEFAULT_TYPE format to a FP2 fingerprint. The SQL type of FP2 fingerprints is a binary string.mysql> SELECT FINGERPRINT2(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> binary fingerprint (type FP2)
FINGERPRINT3(molecule)FINGERPRINT3converts a molecule in DEFAULT_TYPE format to a FP3 fingerprint. The SQL type of FP3 fingerprints is a binary string.mysql> SELECT FINGERPRINT3(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> binary fingerprint (type FP3)
FINGERPRINT4(molecule)FINGERPRINT4converts a molecule in DEFAULT_TYPE format to a FP4 fingerprint. The SQL type of FP4 fingerprints is a binary string.mysql> SELECT FINGERPRINT4(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> binary fingerprint (type FP4)
INCHI_TO_MOLECULE(molecule)INCHI_TO_MOLECULEconverts a molecule in InChI format to a molecule in DEFAULT_TYPE format.mysql> SELECT INCHI_TO_MOLECULE(inchi_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> OpenBabel11190809142D 5 4 0 0 0 0 0 0 0 0999 V2000 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 1 3 1 0 0 0 0 2 4 2 0 0 0 0 2 5 1 0 0 0 0 M END
MOLECULE_TO_INCHI(molecule)MOLECULE_TO_INCHIconverts a molecule in DEFAULT_TYPE format to a molecule in InChI format.mysql> SELECT MOLECULE_TO_INCHI(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)
Note
The
INCHI_VERSIONfunction permits to know which version of the InChI library is used.MOL2_TO_MOLECULE(molecule)MOL2_TO_MOLECULEconverts a molecule in Sybyl Mol2 format to a molecule in DEFAULT_TYPE format.mysql> SELECT MOL2_TO_MOLECULE(mol2_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809032D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 M END
MOLECULE_TO_MOL2(molecule)MOLECULE_TO_MOL2converts a molecule in DEFAULT_TYPE format to a molecule in Sybyl Mol2 format.mysql> SELECT MOLECULE_TO_MOL2(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; ->@<TRIPOS>MOLECULE 2-Aminoacetic acid 10 9 0 0 0 SMALL GASTEIGER @<TRIPOS>ATOM 1 HA1 -0.1068 -1.0521 1.1509 H 1 GLY1 0.0537 2 CA 0.0877 -0.0798 0.6477 C.3 1 GLY1 0.0918 3 HA2 -0.2870 0.7082 1.3331 H 1 GLY1 0.0537 4 N 1.5185 0.1919 0.3951 N.3 1 GLY1 -0.3209 5 H1 2.0168 0.1266 1.2568 H 1 GLY1 0.1187 6 H2 1.8890 -0.4693 -0.2544 H 1 GLY1 0.1187 7 C -0.6775 -0.0767 -0.6578 C.2 1 GLY1 0.3185 8 O -0.4455 -0.6968 -1.6798 O.2 1 GLY1 -0.2496 9 OXT -1.7851 0.6991 -0.6705 O.3 1 GLY1 -0.4797 10 HXT -2.2101 0.6489 -1.5212 H 1 GLY1 0.2951 @<TRIPOS>BOND 1 1 2 1 2 2 3 1 3 2 4 1 4 2 7 1 5 4 5 1 6 4 6 1 7 7 8 2 8 7 9 1 9 9 10 1
MOLECULE_TO_MOLECULE(molecule)MOLECULE_TO_MOLECULEconverts a molecule in the old DEFAULT_TYPE format (InChI) to a molecule in the current DEFAULT_TYPE format (MDL Molfile). This function is useful when updating a database with a new version of Mychem.mysql> SELECT MOLECULE_TO_MOLECULE(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> OpenBabel11190809092D 5 4 0 0 0 0 0 0 0 0999 V2000 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 1 3 1 0 0 0 0 2 4 2 0 0 0 0 2 5 1 0 0 0 0 M END
MOLECULE_TO_SERIALIZEDOBMOL(molecule)MOLECULE_TO_SERIALIZEDOBMOLconverts a molecule in DEFAULT_TYPE format to a serialized OBMol object. The SQL type of the serialized object is a binary string.mysql> SELECT MOLECULE_TO_SERIALIZEDOBMOL(molecule_col) -> FROM tbl_name WHERE name='2-Aminoacetic acid'; -> binary string
MOLFILE_TO_MOLECULE(molecule)MOLFILE_TO_MOLECULEconverts a molecule in MDL Molfile (V2000) format to a molecule in DEFAULT_TYPE format.mysql> SELECT MOLFILE_TO_MOLECULE(molfile_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel12091111263D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 M END
MOLECULE_TO_MOLFILE(molecule)MOLECULE_TO_MOLFILEconverts a molecule in DEFAULT_TYPE format to a molecule in MDL Molfile (V2000) format.mysql> SELECT MOLECULE_TO_MOLFILE(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809062D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 5 4 0 0 0 0 0 M END
PDB_TO_MOLECULE(molecule)PDB_TO_MOLECULEconverts a molecule in PDB format to a molecule in DEFAULT_TYPE format.mysql> SELECT PDB_TO_MOLECULE(pdb_col) FROM tbl_name -> WHERE name='1CRN'; -> OpenBabel08210907243D 327337 0 0 0 0 0 0 0 0999 V2000 17.0470 14.0990 3.6250 N 0 0 0 0 0 16.9670 12.7840 4.3380 C 0 0 0 0 0 15.6850 12.7550 5.1330 C 0 0 0 0 0 15.2680 13.8250 5.5940 O 0 0 0 0 0 18.1700 12.7030 5.3370 C 0 0 0 0 0 19.3340 12.8290 4.4630 O 0 0 0 0 0 18.1500 11.5460 6.3040 C 0 0 0 0 0 15.1150 11.5550 5.2650 N 0 0 0 0 0 13.8560 11.4690 6.0660 C 0 0 0 0 0 14.1640 10.7850 7.3790 C 0 0 0 0 0 14.9930 9.8620 7.4430 O 0 0 0 0 0 12.7320 10.7110 5.2610 C 0 0 0 0 0 13.3080 9.4390 4.9260 O 0 0 0 0 0 12.4840 11.4420 3.8950 C 0 0 0 0 0 13.4880 11.2410 8.4170 N 0 0 0 0 0 13.6600 10.7070 9.7870 C 0 0 0 0 0 12.2690 10.4310 10.3230 C 0 0 0 0 0 11.3930 11.3080 10.1850 O 0 0 0 0 0 14.3680 11.7480 10.6910 C 0 0 0 0 0 15.8850 12.4260 10.0160 S 0 0 0 0 0 12.0190 9.2720 10.9280 N 0 0 0 0 0 10.6460 8.9910 11.4080 C 0 0 0 0 0 10.6540 8.7930 12.9190 C 0 0 0 0 0 11.6590 8.2960 13.4910 O 0 0 0 0 0 10.0570 7.7520 10.6820 C 0 0 0 0 0 9.8370 8.0180 8.9040 S 0 0 0 0 0 9.5610 9.1080 13.5630 N 0 0 0 0 0 9.4480 9.0340 15.0120 C 0 0 0 0 0 9.2880 7.6700 15.6060 C 0 0 0 0 0 ...
SMILES_TO_MOLECULE(molecule)SMILES_TO_MOLECULEconverts a molecule in SMILES format to a molecule in DEFAULT_TYPE format.mysql> SELECT SMILES_TO_MOLECULE(smiles_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> OpenBabel11190809142D 5 4 0 0 0 0 0 0 0 0999 V2000 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 1 3 1 0 0 0 0 3 4 2 0 0 0 0 3 5 1 0 0 0 0 M END
MOLECULE_TO_SMILES(molecule)MOLECULE_TO_SMILESconverts a molecule in DEFAULT_TYPE format to molecule in SMILES format.mysql> SELECT MOLECULE_TO_SMILES(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> C(C(=O)O)N
Note
If you need a single canonical form for any particular molecule, regardless of atom order, the use of the
MOLECULE_TO_CANONICAL_SMILESfunction should be preferred.MOLECULE_TO_CANONICAL_SMILES(molecule)MOLECULE_TO_CANONICAL_SMILESconverts a molecule in DEFAULT_TYPE format to a molecule in Canonical SMILES format.mysql> SELECT MOLECULE_TO_CANONICAL_SMILES(molecule_col) -> FROM tbl_name WHERE name='2-Aminoacetic acid'; -> NCC(=O)O
V3000_TO_MOLECULE(molecule)V3000_TO_MOLECULEconverts a molecule in MDL Molfile (V3000) format to a molecule in DEFAULT_TYPE format.mysql> SELECT V3000_TO_MOLECULE(V3000_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809082D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 M END
MOLECULE_TO_V3000(molecule)MOLECULE_TO_V3000converts a molecule in DEFAULT_TYPE format to a molecule in MDL Molfile (V3000) format.mysql> SELECT MOLECULE_TO_V3000(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809132D 0 0 0 0 0 999 V3000 M V30 BEGIN CTAB M V30 COUNTS 10 9 0 0 0 M V30 BEGIN ATOM M V30 1 H -0.1068 -1.0521 1.1509 0 M V30 2 C 0.0877 -0.0798 0.6477 0 M V30 3 H -0.287 0.7082 1.3331 0 M V30 4 N 1.5185 0.1919 0.3951 0 M V30 5 H 2.0168 0.1266 1.2568 0 M V30 6 H 1.889 -0.4693 -0.2544 0 M V30 7 C -0.6775 -0.0767 -0.6578 0 M V30 8 O -0.4455 -0.6968 -1.6798 0 M V30 9 O -1.7851 0.6991 -0.6705 0 M V30 10 H -2.2101 0.6489 -1.5212 0 M V30 END ATOM M V30 BEGIN BOND M V30 1 1 1 2 M V30 2 1 2 3 M V30 3 1 2 4 M V30 4 1 2 7 M V30 5 1 4 5 M V30 6 1 4 6 M V30 7 2 7 8 M V30 8 1 7 9 M V30 9 1 9 10 M V30 END BOND M V30 END CTAB M END
Helper Commands
This section details helper functions. They permit to get informations about the Mychem environment.
INCHI_VERSION()INCHI_VERSIONreturns the version of the InChI library.mysql> SELECT INCHI_VERSION(); -> 1.02
MYCHEM_VERSION()MYCHEM_VERSIONreturns the Mychem version.mysql> SELECT MYCHEM_VERSION(); -> 1.0.0
OPENBABEL_VERSION()OPENBABEL_VERSIONreturns the Open Babel version.mysql> SELECT OPENBABEL_VERSION(); -> 2.3.2
Modification Commands
This section describes modification functions that modify chemical structures. The following functions are fully working starting on version 0.6.0 of Mychem.
Note
If a function of this section fails, it returns an empty string.
ADD_HYDROGENS(molecule)ADD_HYDROGENSadds hydrogens to a molecule (makes explicit the hydrogen atoms).mysql> SELECT ADD_HYDROGENS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809242D 10 9 0 0 0 0 0 0 0 0999 V2000 -0.1068 -1.0521 1.1509 H 0 0 0 0 0 0 0 0 0 0 0 0 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.2870 0.7082 1.3331 H 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 2.0168 0.1266 1.2568 H 0 0 0 0 0 0 0 0 0 0 0 0 1.8890 -0.4693 -0.2544 H 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 -2.2101 0.6489 -1.5212 H 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 2 4 1 0 0 0 0 2 7 1 0 0 0 0 4 5 1 0 0 0 0 4 6 1 0 0 0 0 7 8 2 0 0 0 0 7 9 1 0 0 0 0 9 10 1 0 0 0 0 M END
REMOVE_HYDROGENS(molecule)REMOVE_HYDROGENSremoves the hydrogens from a molecule (makes implicit the hydrogen atoms).mysql> SELECT REMOVE_HYDROGENS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2-Aminoacetic acid OpenBabel11190809252D 5 4 0 0 0 0 0 0 0 0999 V2000 0.0877 -0.0798 0.6477 C 0 0 0 0 0 0 0 0 0 0 0 0 1.5185 0.1919 0.3951 N 0 0 0 0 0 0 0 0 0 0 0 0 -0.6775 -0.0767 -0.6578 C 0 0 0 0 0 0 0 0 0 0 0 0 -0.4455 -0.6968 -1.6798 O 0 0 0 0 0 0 0 0 0 0 0 0 -1.7851 0.6991 -0.6705 O 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 1 3 1 0 0 0 0 3 4 2 0 0 0 0 3 5 1 0 0 0 0 M END
STRIP_SALTS(molecule)STRIP_SALTSremoves all atoms except for the larger contiguous fragment.mysql> SELECT STRIP_SALTS(molecule_col) FROM tbl_name -> WHERE name='sodium 2-aminoacetate'; -> OpenBabel11190812342D 5 4 0 0 0 0 0 0 0 0999 V2000 0.0000 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 2 3 1 0 0 0 0 3 4 2 0 0 0 0 3 5 1 0 0 0 0 M CHG 1 5 -1 M END
Molmatch Commands
The molmatch functions permit to compare chemical structures.
BIT_FP_AND(fingerprint, fingerprint)BIT_FP_ANDoperates on two fingerprints (bit patterns) of equal length and performs the logical AND operation on each pair of corresponding bits. In each pair, the result is 1 if the both bits are 1. Otherwise, the result is 0. If the two fingerprints do not have the same length, the function returns NULL.mysql> SELECT BIT_FP_AND(fingerprint1, fingerprint2); -> binary fingerprint
Note
The
BIT_FP_ANDfunction is very useful when working with structure fingerprints. For example, if a molecule (with a fingerprint fp1) is a substructure of an other molecule (with a fingerprint fp2), the following property is observed:mysql> SELECT TANIMOTO(BIT_FP_AND(fp1, fp2), fp1); -> 1
BIT_FP_COUNT(fingerprint)BIT_FP_COUNTreturns the number of bits that are set in the fingerprint binary representation.mysql> SELECT BIT_FP_COUNT(fp_col) FROM tbl_name -> WHERE name='1H-indole'; -> 23
BIT_FP_OR(fingerprint, fingerprint)BIT_FP_ORoperates on two fingerprints (bit patterns) of equal length and performs the logical OR operation on each pair of corresponding bits. In each pair, if the first bit is 1 or the second bit is 1 (or both), the result is 1. Otherwise, the result is 0. If the two fingerprints do not have same length, the function returns NULL.mysql> SELECT BIT_FP_OR(fingerprint1, fingerprint2); -> binary fingerprint
MATCH_SUBSTRUCT(query_smarts, reference_obmol)MATCH_SUBSTRUCTchecks if a query_smarts fragment is a substructure of a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by theMOLECULE_TO_SERIALIZEDOBMOLfunction. If the query_smarts is a substructure of reference_obmol, the function returns 1, otherwise, it returns 0.mysql> SELECT MATCH_SUBSTRUCT('C=O', serializedobmol_col) -> FROM tbl_name WHERE name='2-Aminoacetic acid'; -> 1
Note
If the function encounters an error, it returns NULL.
SUBSTRUCT_ATOM_IDS(query_smarts, reference_obmol)SUBSTRUCT_ATOM_IDSreturns the atom ids of a reference_obmol molecule that are contained in substructures matching a query_smarts fragment. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by theMOLECULE_TO_SERIALIZEDOBMOLfunction. If a reference_obmol molecule contains several fragments matching a query_smarts fragment, a list of items is returned. Each item contains a fragment’s atom ids and is separated from the next item by a semicolon character.mysql> SELECT SUBSTRUCT_ATOM_IDS('C(=O)', serializedobmol_col) -> FROM tbl_name WHERE name='2-Aminoacetic acid'; -> 2 3 ;
Note
If the function encounter an error, it returns NULL.
SUBSTRUCT_COUNT(query_smarts, reference_obmol)SUBSTRUCT_COUNTreturns the number of query_smarts fragments founded in a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by theMOLECULE_TO_SERIALIZEDOBMOLfunction.mysql> SELECT SUBSTRUCT_COUNT('C(=O)', serializedobmol_col) -> FROM tbl_name WHERE name='2-Aminoacetic acid'; -> 2
Note
If the function encounter an error, it returns NULL.
TANIMOTO(first_fingerprint, second_fingerprint)TANIMOTOreturns the tanimoto coefficient between two fingerprints. Fingerprints are bit patterns and can be generated with theFINGERPRINTfunction. The returned value is comprised between 0 and 1. The higher the tanimoto coefficient is, the more the molecules are similar.mysql> SELECT TANIMOTO(molecule_fp, fp_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 0.8934
Note
The use of another Mychem functions (like
FINGERPRINTorFINGERPRINT2) within theTANIMOTOfunction makes the query slower. In order to get the best performance, you should use theSETfunction of MySQL:mysql> SET @fp = (SELECT FINGERPRINT2( -> SMILES_TO_MOLECULE('C(C(=O)O)N'))); mysql> SELECT id FROM tbl_name WHERE TANIMOTO(@fp, fp_col) -> FROM tbl_name > 0.7; -> list of id
Property Commands
This section describes several functions that calculate molecular properties.
EXACTMASS(molecule)EXACTMASSreturns the monoisotopic molecular weight of a molecule. The monoisotopic molecular weight is defined as the molecular weight calculated using the mass of the most abundant isotope for each element of a molecule. The unit of the returned value is g.mol-1.mysql> SELECT EXACTMASS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 75.032028
IS_2D(molecule)IS_2Dreturns 1 if a molecule has 2D coordinates.mysql> SELECT IS_2D(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 1
IS_3D(molecule)IS_3Dreturns 1 if a molecule has 3D coordinates.mysql> SELECT IS_3D(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 1
IS_CHIRAL(molecule)IS_CHIRALreturns 1 if a molecule is chiral.mysql> SELECT IS_CHIRAL(molecule_col) FROM tbl_name -> WHERE name='2S-Butan-2-ol'; -> 1
MOLFORMULA(molecule)MOLFORMULAreturns the molecular formula of a molecule.mysql> SELECT MOLFORMULA(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> C2H5NO2
MOLLOGP(molecule)MOLLOGPreturns the LogP of a molecule.mysql> SELECT MOLLOGP(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> -0.27
Note
Note that the result of this function is depending on the hydrogen atoms. If there is any doubt on the presence of hydrogen atoms in the molecule, it is recommended to use the
ADD_HYDROGENSfunction.MOLMR(molecule)MOLMRreturns the molar refractivity of a molecule. The unit of the returned value is J.mol-1.K-1.mysql> SELECT MOLMR(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 16.2072
Note
Note that the result of this function is depending on the hydrogen atoms. If there is any doubt on the presence of hydrogen atoms in the molecule, it is recommended to use the
ADD_HYDROGENSfunction.MOLPSA(molecule)MOLPSAreturns the topological polar surface area of a molecule. The unit of the returned value is Å2.mysql> SELECT MOLPSA(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 63.32
Note
Note that the result of this function is depending on the hydrogen atoms. If there is any doubt on the presence of hydrogen atoms in the molecule, it is recommended to use the
ADD_HYDROGENSfunction.MOLWEIGHT(molecule)MOLWEIGHTreturns the molecular weight of a molecule. The unit of the returned value is g.mol-1.mysql> SELECT MOLWEIGHT(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 75.066600
NUMBER_OF_ACCEPTORS(molecule)NUMBER_OF_ACCEPTORSreturns the number of hydrogen-bond acceptors in a molecule.mysql> SELECT NUMBER_OF_ACCEPTORS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 3
Note
Note that the result of this function is depending on the hydrogen atoms. If there is any doubt on the presence of hydrogen atoms in the molecule, it is recommended to use the
ADD_HYDROGENSfunction.NUMBER_OF_ATOMS(molecule)NUMBER_OF_ATOMSreturns the number of atoms in a molecule.mysql> SELECT NUMBER_OF_ATOMS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 10
NUMBER_OF_BONDS(molecule)NUMBER_OF_BONDSreturns the number of bonds in a molecule.mysql> SELECT NUMBER_OF_BONDS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 9
NUMBER_OF_DONORS(molecule)NUMBER_OF_DONORSreturns the numbers of hydrogen-bond donors in a molecule.mysql> SELECT NUMBER_OF_DONORS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 2
Note
Note that the result of this function is depending on the presence of hydrogen atoms in the molecule. If the hydrogen atoms are not described by the molecule, the
ADD_HYDROGENSfunction must be used. The two following examples described the effect of the ADD_HYDROGENS function:mysql> SELECT NUMBER_OF_DONORS( -> SMILES_TO_MOLECULE('O=C(O)CCN')); -> 0
mysql> SELECT NUMBER_OF_DONORS(ADD_HYDROGENS( -> SMILES_TO_MOLECULE('O=C(O)CCN'))); -> 2
NUMBER_OF_HEAVY_ATOMS(molecule)NUMBER_OF_HEAVY_ATOMSreturns the number of heavy atoms in a molecule (all atoms except hydrogen).mysql> SELECT NUMBER_OF_HEAVY_ATOMS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 5
NUMBER_OF_RINGS(molecule)NUMBER_OF_RINGSreturns the number of rings in a molecule.mysql> SELECT NUMBER_OF_RINGS(molecule_col) FROM tbl_name -> WHERE name='adenine'; -> 2
NUMBER_OF_ROTABLE_BONDS(molecule)NUMBER_OF_ROTABLE_BONDSreturns the number of rotable bonds in a molecule.mysql> SELECT NUMBER_OF_ROTABLE_BONDS(molecule_col) FROM tbl_name -> WHERE name='2-Aminoacetic acid'; -> 1
TOTAL_CHARGE(molecule)TOTAL_CHARGEreturns the total charge of a molecule.mysql> SELECT TOTAL_CHARGE(SMILES_TO_MOLECULE('NCC(=O)[O-]')); -> -1