Command Reference

This chapter describes all SQL commands provided by Mychem. These commands are grouped into five sections:

  • Conversion Commands - this section details functions that convert chemical files.
  • Helper Commands - this section details functions that return information about the Mychem environment.
  • Modification Commands - this section details functions that modify chemical structures.
  • Molmatch Commands - this section details functions that compare chemical structures.
  • Property Commands - this section details functions that compute molecular properties.

The Default Molecule Type

Many functions take or return a molecule in DEFAULT_TYPE format. Since version 0.6.0 of Mychem, the DEFAULT_TYPE format is MDL Molfile (V2000). In earlier versions, the DEFAULT_TYPE format was InChI.

The speed of several functions was measured with both the InChI and MDL Molfile formats, and the MDL Molfile format proved slightly faster than InChI. The MDL Molfile format can also store 2D or 3D coordinates, and it is widely supported by other software.

Conversion Commands

Conversion commands convert chemical data from one format to another. Mychem uses the Open Babel library for the conversion, but does not implement all of the 80 chemical file formats supported by Open Babel. Mychem supports only a few of them, which are detailed in the Molecule Formats appendix.

To avoid having too many functions, a set of two functions is available for each format: FORMAT_TO_MOLECULE and MOLECULE_TO_FORMAT. If you need other formats, please open a ticket on the feature tracker.

Note

If a function in this section fails, it returns an empty string.

  • CML_TO_MOLECULE(molecule)

    CML_TO_MOLECULE converts a molecule in CML format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT CML_TO_MOLECULE(cml_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809032D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_CML(molecule)

    MOLECULE_TO_CML converts a molecule in DEFAULT_TYPE format to a molecule in CML format.

    mysql> SELECT MOLECULE_TO_CML(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> <molecule id="id2-Aminoacetic acid">
     <name>2-Aminoacetic acid</name>
     <atomArray>
      <atom id="a1" elementType="H" x3="-0.106800" y3="-1.052100" z3="1.150900"/>
      <atom id="a2" elementType="C" x3="0.087700" y3="-0.079800" z3="0.647700"/>
      <atom id="a3" elementType="H" x3="-0.287000" y3="0.708200" z3="1.333100"/>
      <atom id="a4" elementType="N" x3="1.518500" y3="0.191900" z3="0.395100"/>
      <atom id="a5" elementType="H" x3="2.016800" y3="0.126600" z3="1.256800"/>
      <atom id="a6" elementType="H" x3="1.889000" y3="-0.469300" z3="-0.254400"/>
      <atom id="a7" elementType="C" x3="-0.677500" y3="-0.076700" z3="-0.657800"/>
      <atom id="a8" elementType="O" x3="-0.445500" y3="-0.696800" z3="-1.679800"/>
      <atom id="a9" elementType="O" x3="-1.785100" y3="0.699100" z3="-0.670500"/>
      <atom id="a10" elementType="H" x3="-2.210100" y3="0.648900" z3="-1.521200"/>
     </atomArray>
     <bondArray>
      <bond atomRefs2="a1 a2" order="1"/>
      <bond atomRefs2="a2 a3" order="1"/>
      <bond atomRefs2="a2 a4" order="1"/>
      <bond atomRefs2="a2 a7" order="1"/>
      <bond atomRefs2="a4 a5" order="1"/>
      <bond atomRefs2="a4 a6" order="1"/>
      <bond atomRefs2="a7 a8" order="2"/>
      <bond atomRefs2="a7 a9" order="1"/>
      <bond atomRefs2="a9 a10" order="1"/>
     </bondArray>
    </molecule>
    
  • FINGERPRINT(molecule, type)

    FINGERPRINT converts a molecule in DEFAULT_TYPE format to a fingerprint. The fingerprint type is specified by the second argument (FP2, FP3 or FP4). The SQL type of the result is a binary string for every fingerprint type.

    mysql> SELECT FINGERPRINT(molecule_col, "FP2") FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> binary fingerprint (type FP2)
    

    Note

    Tanimoto scoring can be improved by using a concatenation of different fingerprint types. The concatenation of fingerprints can be performed with the following query:

    mysql> SELECT CONCAT(FINGERPRINT(molecule_col,"FP2"),
        -> FINGERPRINT(molecule_col,"FP3")) FROM tbl_name
        -> WHERE id=9;
           -> binary fingerprint (type FP2 + type FP3)
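
    The effect of the concatenation can be sketched outside MySQL. The Python snippet below is an illustration only, not Mychem code: the 2-byte patterns are hypothetical stand-ins for real FP2 and FP3 fingerprints, and the tanimoto helper is an assumed plain definition of the coefficient (shared set bits divided by bits set in either pattern).

```python
def tanimoto(a: bytes, b: bytes) -> float:
    """Tanimoto coefficient of two bit patterns assumed to have equal length."""
    x, y = int.from_bytes(a, "big"), int.from_bytes(b, "big")
    union = bin(x | y).count("1")
    # Returning 0.0 for two all-zero patterns is an assumption of this sketch.
    return bin(x & y).count("1") / union if union else 0.0


# Hypothetical 2-byte stand-ins for the FP2 and FP3 fingerprints of molecules A and B.
fp2_a, fp3_a = bytes([0b11110000, 0b00001111]), bytes([0b10100000, 0b00000001])
fp2_b, fp3_b = bytes([0b11110000, 0b00000011]), bytes([0b10000000, 0b00000001])

# Byte-string concatenation, the counterpart of CONCAT() in the query above.
combined_a = fp2_a + fp3_a
combined_b = fp2_b + fp3_b

print(tanimoto(fp2_a, fp2_b))            # FP2 only
print(tanimoto(combined_a, combined_b))  # FP2 + FP3
```

    With these made-up patterns, the FP2-only score is 0.75, while the concatenated score pools the bits of both fingerprint types into a single ratio (8 shared bits out of 11 set bits).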
    
  • FINGERPRINT2(molecule)

    FINGERPRINT2 converts a molecule in DEFAULT_TYPE format to a FP2 fingerprint. The SQL type of FP2 fingerprints is a binary string.

    mysql> SELECT FINGERPRINT2(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> binary fingerprint (type FP2)
    
  • FINGERPRINT3(molecule)

    FINGERPRINT3 converts a molecule in DEFAULT_TYPE format to a FP3 fingerprint. The SQL type of FP3 fingerprints is a binary string.

    mysql> SELECT FINGERPRINT3(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> binary fingerprint (type FP3)
    
  • FINGERPRINT4(molecule)

    FINGERPRINT4 converts a molecule in DEFAULT_TYPE format to a FP4 fingerprint. The SQL type of FP4 fingerprints is a binary string.

    mysql> SELECT FINGERPRINT4(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> binary fingerprint (type FP4)
    
  • INCHI_TO_MOLECULE(molecule)

    INCHI_TO_MOLECULE converts a molecule in InChI format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT INCHI_TO_MOLECULE(inchi_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            ->
     OpenBabel11190809142D
    
      5  4  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      1  3  1  0  0  0  0
      2  4  2  0  0  0  0
      2  5  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_INCHI(molecule)

    MOLECULE_TO_INCHI converts a molecule in DEFAULT_TYPE format to a molecule in InChI format.

    mysql> SELECT MOLECULE_TO_INCHI(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> InChI=1S/C2H5NO2/c3-1-2(4)5/h1,3H2,(H,4,5)
    

    Note

    The INCHI_VERSION function reports which version of the InChI library is used.

  • MOL2_TO_MOLECULE(molecule)

    MOL2_TO_MOLECULE converts a molecule in Sybyl Mol2 format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT MOL2_TO_MOLECULE(mol2_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809032D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_MOL2(molecule)

    MOLECULE_TO_MOL2 converts a molecule in DEFAULT_TYPE format to a molecule in Sybyl Mol2 format.

    mysql> SELECT MOLECULE_TO_MOL2(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> @<TRIPOS>MOLECULE
    2-Aminoacetic acid
     10 9 0 0 0
    SMALL
    GASTEIGER
    
    @<TRIPOS>ATOM
          1 HA1        -0.1068   -1.0521    1.1509 H       1  GLY1        0.0537
          2 CA          0.0877   -0.0798    0.6477 C.3     1  GLY1        0.0918
          3 HA2        -0.2870    0.7082    1.3331 H       1  GLY1        0.0537
          4 N           1.5185    0.1919    0.3951 N.3     1  GLY1       -0.3209
          5 H1          2.0168    0.1266    1.2568 H       1  GLY1        0.1187
          6 H2          1.8890   -0.4693   -0.2544 H       1  GLY1        0.1187
          7 C          -0.6775   -0.0767   -0.6578 C.2     1  GLY1        0.3185
          8 O          -0.4455   -0.6968   -1.6798 O.2     1  GLY1       -0.2496
          9 OXT        -1.7851    0.6991   -0.6705 O.3     1  GLY1       -0.4797
         10 HXT        -2.2101    0.6489   -1.5212 H       1  GLY1        0.2951
    @<TRIPOS>BOND
         1     1     2    1
         2     2     3    1
         3     2     4    1
         4     2     7    1
         5     4     5    1
         6     4     6    1
         7     7     8    2
         8     7     9    1
         9     9    10    1
    
  • MOLECULE_TO_MOLECULE(molecule)

    MOLECULE_TO_MOLECULE converts a molecule in the old DEFAULT_TYPE format (InChI) to a molecule in the current DEFAULT_TYPE format (MDL Molfile). This function is useful when updating a database with a new version of Mychem.

    mysql> SELECT MOLECULE_TO_MOLECULE(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            ->
     OpenBabel11190809092D
    
      5  4  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      1  3  1  0  0  0  0
      2  4  2  0  0  0  0
      2  5  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_SERIALIZEDOBMOL(molecule)

    MOLECULE_TO_SERIALIZEDOBMOL converts a molecule in DEFAULT_TYPE format to a serialized OBMol object. The SQL type of the serialized object is a binary string.

    mysql> SELECT MOLECULE_TO_SERIALIZEDOBMOL(molecule_col)
        -> FROM tbl_name WHERE name='2-Aminoacetic acid';
            -> binary string
    
  • MOLFILE_TO_MOLECULE(molecule)

    MOLFILE_TO_MOLECULE converts a molecule in MDL Molfile (V2000) format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT MOLFILE_TO_MOLECULE(molfile_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel12091111263D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_MOLFILE(molecule)

    MOLECULE_TO_MOLFILE converts a molecule in DEFAULT_TYPE format to a molecule in MDL Molfile (V2000) format.

    mysql> SELECT MOLECULE_TO_MOLFILE(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809062D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • PDB_TO_MOLECULE(molecule)

    PDB_TO_MOLECULE converts a molecule in PDB format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT PDB_TO_MOLECULE(pdb_col) FROM tbl_name
        -> WHERE name='1CRN';
            ->
     OpenBabel08210907243D
    
    327337  0  0  0  0  0  0  0  0999 V2000
       17.0470   14.0990    3.6250 N   0  0  0  0  0
       16.9670   12.7840    4.3380 C   0  0  0  0  0
       15.6850   12.7550    5.1330 C   0  0  0  0  0
       15.2680   13.8250    5.5940 O   0  0  0  0  0
       18.1700   12.7030    5.3370 C   0  0  0  0  0
       19.3340   12.8290    4.4630 O   0  0  0  0  0
       18.1500   11.5460    6.3040 C   0  0  0  0  0
       15.1150   11.5550    5.2650 N   0  0  0  0  0
       13.8560   11.4690    6.0660 C   0  0  0  0  0
       14.1640   10.7850    7.3790 C   0  0  0  0  0
       14.9930    9.8620    7.4430 O   0  0  0  0  0
       12.7320   10.7110    5.2610 C   0  0  0  0  0
       13.3080    9.4390    4.9260 O   0  0  0  0  0
       12.4840   11.4420    3.8950 C   0  0  0  0  0
       13.4880   11.2410    8.4170 N   0  0  0  0  0
       13.6600   10.7070    9.7870 C   0  0  0  0  0
       12.2690   10.4310   10.3230 C   0  0  0  0  0
       11.3930   11.3080   10.1850 O   0  0  0  0  0
       14.3680   11.7480   10.6910 C   0  0  0  0  0
       15.8850   12.4260   10.0160 S   0  0  0  0  0
       12.0190    9.2720   10.9280 N   0  0  0  0  0
       10.6460    8.9910   11.4080 C   0  0  0  0  0
       10.6540    8.7930   12.9190 C   0  0  0  0  0
       11.6590    8.2960   13.4910 O   0  0  0  0  0
       10.0570    7.7520   10.6820 C   0  0  0  0  0
        9.8370    8.0180    8.9040 S   0  0  0  0  0
        9.5610    9.1080   13.5630 N   0  0  0  0  0
        9.4480    9.0340   15.0120 C   0  0  0  0  0
        9.2880    7.6700   15.6060 C   0  0  0  0  0
    ...
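
    The counts line of a V2000 Molfile uses fixed-width columns, which is why the atom and bond counts of a large structure appear fused in the output above (327337 reads as 327 atoms and 337 bonds). The following Python sketch reads that line by column position. The helper name is hypothetical and the snippet is not part of Mychem.

```python
from typing import Tuple


def parse_counts_line(line: str) -> Tuple[int, int, str]:
    """Read atom count, bond count and version tag from a V2000 counts line."""
    n_atoms = int(line[0:3])       # columns 1-3
    n_bonds = int(line[3:6])       # columns 4-6
    version = line[33:39].strip()  # columns 34-39, e.g. "V2000"
    return n_atoms, n_bonds, version


# Counts lines copied from the examples in this chapter (document indentation removed).
print(parse_counts_line("327337  0  0  0  0  0  0  0  0999 V2000"))
print(parse_counts_line(" 10  9  0  0  0  0  0  0  0  0999 V2000"))
```

    Splitting the first line on whitespace instead would yield a single number, 327337, so column slicing is the safer choice for structures with 100 atoms or more.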
    
  • SMILES_TO_MOLECULE(molecule)

    SMILES_TO_MOLECULE converts a molecule in SMILES format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT SMILES_TO_MOLECULE(smiles_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            ->
     OpenBabel11190809142D
    
      5  4  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      1  3  1  0  0  0  0
      3  4  2  0  0  0  0
      3  5  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_SMILES(molecule)

    MOLECULE_TO_SMILES converts a molecule in DEFAULT_TYPE format to a molecule in SMILES format.

    mysql> SELECT MOLECULE_TO_SMILES(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> C(C(=O)O)N
    

    Note

    If you need a single canonical form for any particular molecule, regardless of atom order, prefer the MOLECULE_TO_CANONICAL_SMILES function.

  • MOLECULE_TO_CANONICAL_SMILES(molecule)

    MOLECULE_TO_CANONICAL_SMILES converts a molecule in DEFAULT_TYPE format to a molecule in Canonical SMILES format.

    mysql> SELECT MOLECULE_TO_CANONICAL_SMILES(molecule_col)
        -> FROM tbl_name WHERE name='2-Aminoacetic acid';
            -> NCC(=O)O
    
  • V3000_TO_MOLECULE(molecule)

    V3000_TO_MOLECULE converts a molecule in MDL Molfile (V3000) format to a molecule in DEFAULT_TYPE format.

    mysql> SELECT V3000_TO_MOLECULE(V3000_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809082D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • MOLECULE_TO_V3000(molecule)

    MOLECULE_TO_V3000 converts a molecule in DEFAULT_TYPE format to a molecule in MDL Molfile (V3000) format.

    mysql> SELECT MOLECULE_TO_V3000(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809132D
      0  0  0     0  0            999 V3000
    
    M  V30 BEGIN CTAB
    M  V30 COUNTS 10 9 0 0 0
    M  V30 BEGIN ATOM
    M  V30 1 H -0.1068 -1.0521 1.1509 0
    M  V30 2 C 0.0877 -0.0798 0.6477 0
    M  V30 3 H -0.287 0.7082 1.3331 0
    M  V30 4 N 1.5185 0.1919 0.3951 0
    M  V30 5 H 2.0168 0.1266 1.2568 0
    M  V30 6 H 1.889 -0.4693 -0.2544 0
    M  V30 7 C -0.6775 -0.0767 -0.6578 0
    M  V30 8 O -0.4455 -0.6968 -1.6798 0
    M  V30 9 O -1.7851 0.6991 -0.6705 0
    M  V30 10 H -2.2101 0.6489 -1.5212 0
    M  V30 END ATOM
    M  V30 BEGIN BOND
    M  V30 1 1 1 2
    M  V30 2 1 2 3
    M  V30 3 1 2 4
    M  V30 4 1 2 7
    M  V30 5 1 4 5
    M  V30 6 1 4 6
    M  V30 7 2 7 8
    M  V30 8 1 7 9
    M  V30 9 1 9 10
    M  V30 END BOND
    M  V30 END CTAB
    M  END
    

Helper Commands

This section details helper functions. They return information about the Mychem environment.

  • INCHI_VERSION()

    INCHI_VERSION returns the version of the InChI library.

    mysql> SELECT INCHI_VERSION();
            -> 1.02
    
  • MYCHEM_VERSION()

    MYCHEM_VERSION returns the Mychem version.

    mysql> SELECT MYCHEM_VERSION();
            -> 1.0.0
    
  • OPENBABEL_VERSION()

    OPENBABEL_VERSION returns the Open Babel version.

    mysql> SELECT OPENBABEL_VERSION();
            -> 2.3.2
    

Modification Commands

This section describes functions that modify chemical structures. These functions are fully working as of version 0.6.0 of Mychem.

Note

If a function in this section fails, it returns an empty string.

  • ADD_HYDROGENS(molecule)

    ADD_HYDROGENS adds hydrogens to a molecule (makes the hydrogen atoms explicit).

    mysql> SELECT ADD_HYDROGENS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2-Aminoacetic acid
     OpenBabel11190809242D
    
     10  9  0  0  0  0  0  0  0  0999 V2000
       -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
        1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
       -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      2  4  1  0  0  0  0
      2  7  1  0  0  0  0
      4  5  1  0  0  0  0
      4  6  1  0  0  0  0
      7  8  2  0  0  0  0
      7  9  1  0  0  0  0
      9 10  1  0  0  0  0
    M  END
    
  • REMOVE_HYDROGENS(molecule)

    REMOVE_HYDROGENS removes the hydrogens from a molecule (makes the hydrogen atoms implicit).

    mysql> SELECT REMOVE_HYDROGENS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            ->  2-Aminoacetic acid
     OpenBabel11190809252D
    
      5  4  0  0  0  0  0  0  0  0999 V2000
        0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
        1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
       -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      1  3  1  0  0  0  0
      3  4  2  0  0  0  0
      3  5  1  0  0  0  0
    M  END
    
  • STRIP_SALTS(molecule)

    STRIP_SALTS removes all atoms except those of the largest contiguous fragment.

    mysql> SELECT STRIP_SALTS(molecule_col) FROM tbl_name
        -> WHERE name='sodium 2-aminoacetate';
            ->
     OpenBabel11190812342D
    
      5  4  0  0  0  0  0  0  0  0999 V2000
        0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  1  0  0  0  0
      2  3  1  0  0  0  0
      3  4  2  0  0  0  0
      3  5  1  0  0  0  0
    M  CHG  1   5  -1
    M  END
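
    The idea of keeping only the largest contiguous fragment can be illustrated with a connected-component search over the bond list. The Python sketch below shows the general technique under stated assumptions; it is not the Mychem or Open Babel implementation, and the atom numbering of the sodium salt (sodium as atom 6) is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def largest_fragment(n_atoms: int, bonds: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the sorted 1-based atom ids of the largest connected fragment."""
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    best: List[int] = []
    for start in range(1, n_atoms + 1):
        if start in seen:
            continue
        # Depth-first walk over every atom reachable from `start`.
        stack, fragment = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for other in neighbours[atom]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        if len(fragment) > len(best):  # on a tie, the first fragment found is kept
            best = fragment
    return sorted(best)


# Sodium 2-aminoacetate without hydrogens: N1-C2-C3(=O4)-O5 plus an unbonded Na6.
bonds = [(1, 2), (2, 3), (3, 4), (3, 5)]
print(largest_fragment(6, bonds))
```

    In this example the sodium atom forms a fragment of size one and is dropped, which leaves the five heavy atoms seen in the output above.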
    

Molmatch Commands

The molmatch functions compare chemical structures.

  • BIT_FP_AND(fingerprint, fingerprint)

    BIT_FP_AND operates on two fingerprints (bit patterns) of equal length and performs the logical AND operation on each pair of corresponding bits. In each pair, the result is 1 if both bits are 1. Otherwise, the result is 0. If the two fingerprints do not have the same length, the function returns NULL.

    mysql> SELECT BIT_FP_AND(fingerprint1, fingerprint2);
            -> binary fingerprint
    

    Note

    The BIT_FP_AND function is very useful when working with structure fingerprints. For example, if a molecule (with fingerprint fp1) is a substructure of another molecule (with fingerprint fp2), the following property is observed:

    mysql> SELECT TANIMOTO(BIT_FP_AND(fp1, fp2), fp1);
            -> 1
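
    The reason behind this property can be sketched in Python: when every bit set in fp1 is also set in fp2, the bitwise AND gives fp1 back unchanged, so comparing the result with fp1 yields a Tanimoto coefficient of 1. The function below only mirrors the behaviour described for BIT_FP_AND (None stands in for NULL); it is not Mychem code, and the bit patterns are hypothetical.

```python
from typing import Optional


def bit_fp_and(a: bytes, b: bytes) -> Optional[bytes]:
    """Bitwise AND of two equal-length bit patterns; None if the lengths differ."""
    if len(a) != len(b):
        return None
    return bytes(x & y for x, y in zip(a, b))


fp1 = bytes([0b00010010, 0b01000000])  # hypothetical substructure fingerprint
fp2 = bytes([0b10010110, 0b01100001])  # hypothetical superstructure fingerprint

print(bit_fp_and(fp1, fp2) == fp1)  # every bit of fp1 is also set in fp2
```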
    
  • BIT_FP_COUNT(fingerprint)

    BIT_FP_COUNT returns the number of bits that are set in the fingerprint binary representation.

    mysql> SELECT BIT_FP_COUNT(fp_col) FROM tbl_name
        -> WHERE name='1H-indole';
            -> 23
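
    Counting set bits is a plain population count over the bytes of the fingerprint. A minimal Python sketch of the idea follows; the function name mirrors the SQL function for readability only, and the 3-byte pattern is hypothetical.

```python
def bit_fp_count(fp: bytes) -> int:
    """Number of bits set to 1 in a binary fingerprint."""
    return sum(bin(byte).count("1") for byte in fp)


pattern = bytes([0b10110000, 0b00000001, 0b11111111])  # hypothetical fingerprint
print(bit_fp_count(pattern))
```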
    
  • BIT_FP_OR(fingerprint, fingerprint)

    BIT_FP_OR operates on two fingerprints (bit patterns) of equal length and performs the logical OR operation on each pair of corresponding bits. In each pair, if the first bit is 1 or the second bit is 1 (or both), the result is 1. Otherwise, the result is 0. If the two fingerprints do not have the same length, the function returns NULL.

    mysql> SELECT BIT_FP_OR(fingerprint1, fingerprint2);
            -> binary fingerprint
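
    The OR rule described above can be written in a few lines of Python. This is an illustrative sketch, not Mychem code: None stands in for the SQL NULL, and the patterns are hypothetical.

```python
from typing import Optional


def bit_fp_or(a: bytes, b: bytes) -> Optional[bytes]:
    """Bitwise OR of two equal-length bit patterns; None if the lengths differ."""
    if len(a) != len(b):
        return None
    return bytes(x | y for x, y in zip(a, b))


first = bytes([0b00001100, 0b00000001])   # hypothetical fingerprint
second = bytes([0b00001010, 0b00000000])  # hypothetical fingerprint
merged = bit_fp_or(first, second)
print(merged == bytes([0b00001110, 0b00000001]))
```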
    
  • MATCH_SUBSTRUCT(query_smarts, reference_obmol)

    MATCH_SUBSTRUCT checks if a query_smarts fragment is a substructure of a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL function. If query_smarts is a substructure of reference_obmol, the function returns 1; otherwise, it returns 0.

    mysql> SELECT MATCH_SUBSTRUCT('C=O', serializedobmol_col)
        -> FROM tbl_name WHERE name='2-Aminoacetic acid';
            -> 1
    

    Note

    If the function encounters an error, it returns NULL.

  • SUBSTRUCT_ATOM_IDS(query_smarts, reference_obmol)

    SUBSTRUCT_ATOM_IDS returns the atom ids of a reference_obmol molecule that are contained in substructures matching a query_smarts fragment. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL function. If a reference_obmol molecule contains several fragments matching a query_smarts fragment, a list of items is returned. Each item contains a fragment’s atom ids and is separated from the next item by a semicolon character.

    mysql> SELECT SUBSTRUCT_ATOM_IDS('C(=O)', serializedobmol_col)
        -> FROM tbl_name WHERE name='2-Aminoacetic acid';
            -> 2 3 ;
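
    A client program will usually want the returned string as a list of matches. The Python sketch below assumes the format described above (atom ids separated by spaces, items terminated by a semicolon); the helper name and the two-match input are hypothetical.

```python
from typing import List


def parse_atom_ids(result: str) -> List[List[int]]:
    """Split a SUBSTRUCT_ATOM_IDS-style string into one list of atom ids per match."""
    matches = []
    for item in result.split(";"):
        ids = [int(token) for token in item.split()]
        if ids:  # skip the empty piece after the trailing semicolon
            matches.append(ids)
    return matches


print(parse_atom_ids("2 3 ;"))        # the result shown above
print(parse_atom_ids("2 3 ; 7 8 ;"))  # hypothetical result with two matches
```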
    

    Note

    If the function encounters an error, it returns NULL.

  • SUBSTRUCT_COUNT(query_smarts, reference_obmol)

    SUBSTRUCT_COUNT returns the number of query_smarts fragments found in a reference_obmol molecule. The first argument is a SMARTS string, whereas the second argument is a serialized OBMol object. The second argument type is generated by the MOLECULE_TO_SERIALIZEDOBMOL function.

    mysql> SELECT SUBSTRUCT_COUNT('C(=O)', serializedobmol_col)
        -> FROM tbl_name WHERE name='2-Aminoacetic acid';
            -> 2
    

    Note

    If the function encounters an error, it returns NULL.

  • TANIMOTO(first_fingerprint, second_fingerprint)

    TANIMOTO returns the Tanimoto coefficient between two fingerprints. Fingerprints are bit patterns and can be generated with the FINGERPRINT function. The returned value lies between 0 and 1. The higher the Tanimoto coefficient, the more similar the molecules.

    mysql> SELECT TANIMOTO(molecule_fp, fp_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 0.8934
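
    For bit patterns, the Tanimoto coefficient is commonly defined as c / (a + b - c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both. The Python sketch below implements that textbook definition; it is not taken from the Mychem sources, and the handling of mismatched lengths and of two empty fingerprints is an assumption.

```python
def popcount(fp: bytes) -> int:
    """Number of bits set to 1 in a byte string."""
    return sum(bin(byte).count("1") for byte in fp)


def tanimoto_coefficient(fp_a: bytes, fp_b: bytes) -> float:
    """Tanimoto coefficient c / (a + b - c) of two equal-length bit patterns."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")  # assumed behaviour
    a = popcount(fp_a)
    b = popcount(fp_b)
    c = popcount(bytes(x & y for x, y in zip(fp_a, fp_b)))
    denominator = a + b - c
    return c / denominator if denominator else 0.0  # assumed value for two empty patterns


fp_x = bytes([0b11110000])  # hypothetical fingerprint, 4 bits set
fp_y = bytes([0b00111100])  # hypothetical fingerprint, 4 bits set, 2 shared with fp_x
print(tanimoto_coefficient(fp_x, fp_y))  # 2 / (4 + 4 - 2)
```

    Identical fingerprints give 1, and fingerprints with no bit in common give 0, which matches the range stated above.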
    

    Note

    Calling other Mychem functions (such as FINGERPRINT or FINGERPRINT2) inside the TANIMOTO function makes the query slower. For the best performance, compute the query fingerprint once and store it in a user variable with the MySQL SET statement:

    mysql> SET @fp = (SELECT FINGERPRINT2(
        -> SMILES_TO_MOLECULE('C(C(=O)O)N')));
    mysql> SELECT id FROM tbl_name
        -> WHERE TANIMOTO(@fp, fp_col) > 0.7;
            -> list of id
    

Property Commands

This section describes several functions that calculate molecular properties.

  • EXACTMASS(molecule)

    EXACTMASS returns the monoisotopic molecular weight of a molecule. The monoisotopic molecular weight is defined as the molecular weight calculated using the mass of the most abundant isotope for each element of a molecule. The unit of the returned value is g·mol⁻¹.

    mysql> SELECT EXACTMASS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 75.032028
    
  • IS_2D(molecule)

    IS_2D returns 1 if a molecule has 2D coordinates.

    mysql> SELECT IS_2D(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 1
    
  • IS_3D(molecule)

    IS_3D returns 1 if a molecule has 3D coordinates.

    mysql> SELECT IS_3D(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 1
    
  • IS_CHIRAL(molecule)

    IS_CHIRAL returns 1 if a molecule is chiral.

    mysql> SELECT IS_CHIRAL(molecule_col) FROM tbl_name
        -> WHERE name='2S-Butan-2-ol';
            -> 1
    
  • MOLFORMULA(molecule)

    MOLFORMULA returns the molecular formula of a molecule.

    mysql> SELECT MOLFORMULA(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> C2H5NO2
    
  • MOLLOGP(molecule)

    MOLLOGP returns the LogP of a molecule.

    mysql> SELECT MOLLOGP(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> -0.27
    

    Note

    The result of this function depends on the hydrogen atoms present in the molecule. If there is any doubt about whether the molecule contains its hydrogen atoms, it is recommended to use the ADD_HYDROGENS function.

  • MOLMR(molecule)

    MOLMR returns the molar refractivity of a molecule. The unit of the returned value is cm³·mol⁻¹.

    mysql> SELECT MOLMR(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 16.2072
    

    Note

    The result of this function depends on the hydrogen atoms present in the molecule. If there is any doubt about whether the molecule contains its hydrogen atoms, it is recommended to use the ADD_HYDROGENS function.

  • MOLPSA(molecule)

    MOLPSA returns the topological polar surface area of a molecule. The unit of the returned value is Å².

    mysql> SELECT MOLPSA(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 63.32
    

    Note

    The result of this function depends on the hydrogen atoms present in the molecule. If there is any doubt about whether the molecule contains its hydrogen atoms, it is recommended to use the ADD_HYDROGENS function.

  • MOLWEIGHT(molecule)

    MOLWEIGHT returns the molecular weight of a molecule. The unit of the returned value is g·mol⁻¹.

    mysql> SELECT MOLWEIGHT(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 75.066600
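
    The difference between the value returned here and the one returned by EXACTMASS comes from the atomic masses used: average atomic weights here, masses of the most abundant isotopes for EXACTMASS. The Python sketch below reproduces both values for C2H5NO2 from small hand-entered mass tables. The tables and the formula parser are assumptions made for this illustration and are not taken from Mychem or Open Babel.

```python
import re
from typing import Dict

# Masses of the most abundant isotopes (1H, 12C, 14N, 16O), in g/mol.
MONOISOTOPIC = {"H": 1.00782503207, "C": 12.0, "N": 14.0030740048, "O": 15.9949146196}
# Average atomic weights, in g/mol.
AVERAGE = {"H": 1.00794, "C": 12.0107, "N": 14.0067, "O": 15.9994}


def parse_formula(formula: str) -> Dict[str, int]:
    """Count atoms in a simple formula such as 'C2H5NO2' (no brackets or charges)."""
    counts: Dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def mass(formula: str, table: Dict[str, float]) -> float:
    """Sum the per-element masses of `table` over the atoms of `formula`."""
    return sum(table[symbol] * n for symbol, n in parse_formula(formula).items())


print(round(mass("C2H5NO2", MONOISOTOPIC), 6))  # compare with the EXACTMASS example
print(round(mass("C2H5NO2", AVERAGE), 4))       # compare with the MOLWEIGHT example
```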
    
  • NUMBER_OF_ACCEPTORS(molecule)

    NUMBER_OF_ACCEPTORS returns the number of hydrogen-bond acceptors in a molecule.

    mysql> SELECT NUMBER_OF_ACCEPTORS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 3
    

    Note

    The result of this function depends on the hydrogen atoms present in the molecule. If there is any doubt about whether the molecule contains its hydrogen atoms, it is recommended to use the ADD_HYDROGENS function.

  • NUMBER_OF_ATOMS(molecule)

    NUMBER_OF_ATOMS returns the number of atoms in a molecule.

    mysql> SELECT NUMBER_OF_ATOMS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 10
    
  • NUMBER_OF_BONDS(molecule)

    NUMBER_OF_BONDS returns the number of bonds in a molecule.

    mysql> SELECT NUMBER_OF_BONDS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 9
    
  • NUMBER_OF_DONORS(molecule)

    NUMBER_OF_DONORS returns the number of hydrogen-bond donors in a molecule.

    mysql> SELECT NUMBER_OF_DONORS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 2
    

    Note

    The result of this function depends on the presence of hydrogen atoms in the molecule. If the molecule does not describe its hydrogen atoms, the ADD_HYDROGENS function must be used. The following two examples show the effect of the ADD_HYDROGENS function:

    mysql> SELECT NUMBER_OF_DONORS(
        -> SMILES_TO_MOLECULE('O=C(O)CCN'));
            -> 0
    
    mysql> SELECT NUMBER_OF_DONORS(ADD_HYDROGENS(
        -> SMILES_TO_MOLECULE('O=C(O)CCN')));
            -> 2
    
  • NUMBER_OF_HEAVY_ATOMS(molecule)

    NUMBER_OF_HEAVY_ATOMS returns the number of heavy atoms in a molecule (all atoms except hydrogen).

    mysql> SELECT NUMBER_OF_HEAVY_ATOMS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 5
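
    The relation between the total atom count and the heavy atom count can be checked by hand on the atom block of the 2-aminoacetic acid Molfile shown earlier in this chapter. The Python sketch below assumes whitespace-separated atom lines with the element symbol in the fourth field; it illustrates the definition only and is not how Mychem computes the value.

```python
# Atom block of the 2-aminoacetic acid example (explicit hydrogens included).
ATOM_BLOCK = """\
   -0.1068   -1.0521    1.1509 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0877   -0.0798    0.6477 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2870    0.7082    1.3331 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.5185    0.1919    0.3951 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.0168    0.1266    1.2568 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.8890   -0.4693   -0.2544 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6775   -0.0767   -0.6578 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.4455   -0.6968   -1.6798 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7851    0.6991   -0.6705 O   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2101    0.6489   -1.5212 H   0  0  0  0  0  0  0  0  0  0  0  0
"""

# The element symbol follows the x, y and z coordinates on each atom line.
elements = [line.split()[3] for line in ATOM_BLOCK.splitlines() if line.strip()]
heavy_atoms = [symbol for symbol in elements if symbol != "H"]

print(len(elements))     # all atoms
print(len(heavy_atoms))  # all atoms except hydrogen
```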
    
  • NUMBER_OF_RINGS(molecule)

    NUMBER_OF_RINGS returns the number of rings in a molecule.

    mysql> SELECT NUMBER_OF_RINGS(molecule_col) FROM tbl_name
        -> WHERE name='adenine';
            -> 2
    
  • NUMBER_OF_ROTABLE_BONDS(molecule)

    NUMBER_OF_ROTABLE_BONDS returns the number of rotatable bonds in a molecule.

    mysql> SELECT NUMBER_OF_ROTABLE_BONDS(molecule_col) FROM tbl_name
        -> WHERE name='2-Aminoacetic acid';
            -> 1
    
  • TOTAL_CHARGE(molecule)

    TOTAL_CHARGE returns the total charge of a molecule.

    mysql> SELECT TOTAL_CHARGE(SMILES_TO_MOLECULE('NCC(=O)[O-]'));
            -> -1
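
    In a V2000 Molfile, formal charges such as the one in this example appear on M  CHG lines, as in the STRIP_SALTS output earlier in this chapter. The Python sketch below sums those entries to obtain a total charge. It assumes that all charges are listed on M  CHG lines and ignores the charge column of the atom block; the function name is hypothetical and the snippet is not Mychem code.

```python
def total_charge_from_chg_lines(molfile: str) -> int:
    """Sum the formal charges listed on the 'M  CHG' lines of a V2000 Molfile."""
    total = 0
    for line in molfile.splitlines():
        if line.startswith("M  CHG"):
            fields = line[6:].split()
            count = int(fields[0])               # number of (atom, charge) pairs
            pairs = fields[1:1 + 2 * count]
            total += sum(int(value) for value in pairs[1::2])  # every second field is a charge
    return total


# Property lines copied from the STRIP_SALTS example.
properties = "M  CHG  1   5  -1\nM  END"
print(total_charge_from_chg_lines(properties))
```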