Suitable Docking Protocol for the Design of Novel Coumarin Derivatives with Selective MAO-B Effects

MAO (monoamine oxidase) is an enzymatic system that comprises two isoenzymes: MAO-A and MAO-B, with more than 70% identity in their amino acid sequence1. The prominent role of MAO-A and MAOB is the oxidation of aliphatic and aromatic amines to the corresponding aldehydes. The inhibition of the latter is implemented in two substantial neurodegenerative diseases – Alzheimer’s and Parkinson’s diseases2. Administration of MAO-B inhibitors in treating the latter disorders has shown promising results in early and advanced stages of the conditions, considering the conservation of high dopamine levels in the synaptic cleft. Hence, both endogenous and exogenously administered dopamine concentrations could be retained after treatment with selective MAO-B inhibitors, and the effects are manifested3. Moreover, all monoamine oxidase inhibitors demonstrate additional neuroprotective effects arising from decreased neuronal toxic radicals and peroxides4. A significant breakthrough in the design and the optimization of novel MAO-B inhibitors has been made after the first resolved crystallographic MAO-B structure5. The study revealed three distinct domains in the active site of the receptor – entrance pocket, substrate cavity, and aromatic cage. It has also been postulated that the major ligand interactions in the binding gorge are the hydrophobic ones6. Moreover, four amino residues: Tyr-326, Leu-171, Ili-199, and Phe-168, are reported to act like a “gate” between the entrance and the substrate cavities. For an additional stabilization in the ligand-receptor complex, hydrogen bonds with Tyr-3987, Gln-2068, and FAD9 have been described. Ever since the computer-aided drug design (CADD) simulations were introduced in the drug discovery Suitable Docking Protocol for the Design of Novel Coumarin Derivatives with Selective MAO-B Effects


INTRODUCTION
MAO (monoamine oxidase) is an enzymatic system that comprises two isoenzymes: MAO-A and MAO-B, with more than 70% identity in their amino acid sequence 1 . The prominent role of MAO-A and MAO-B is the oxidation of aliphatic and aromatic amines to the corresponding aldehydes. The inhibition of the latter is implemented in two substantial neurodegenerative diseases -Alzheimer's and Parkinson's diseases 2 . Administration of MAO-B inhibitors in treating the latter disorders has shown promising results in early and advanced stages of the conditions, considering the conservation of high dopamine levels in the synaptic cleft. Hence, both endogenous and exogenously administered dopamine concentrations could be retained after treatment with selective MAO-B inhibitors, and the effects are manifested 3 . Moreover, all monoamine oxidase inhibitors demonstrate additional neuroprotective effects arising from decreased neuronal toxic radicals and peroxides 4 . A significant breakthrough in the design and the optimization of novel MAO-B inhibitors has been made after the first resolved crystallographic MAO-B structure 5 . The study revealed three distinct domains in the active site of the receptor -entrance pocket, substrate cavity, and aromatic cage. It has also been postulated that the major ligand interactions in the binding gorge are the hydrophobic ones 6 . Moreover, four amino residues: Tyr-326, Leu-171, Ili-199, and Phe-168, are reported to act like a "gate" between the entrance and the substrate cavities. For an additional stabilization in the ligand-receptor complex, hydrogen bonds with Tyr-398 7 , Gln-206 8 , and FAD 9 have been described. Ever since the computer-aided drug design (CADD) simulations were introduced in the drug discovery processes, the time required to develop and optimize new molecules was drastically reduced 10 . Molecular docking is one of the most utilized structure-based drug design (SBDD) techniques as it is emerging as frequently applied in the optimization step of active ligands. Furthermore, it could be utilized for the virtual screenings of novel and effective drug candidates 11,12 . However, several challenges regarding the accuracy of molecular docking are still to be resolved. The major issues are related to the estimative character of the scoring functions 13 and the inefficiency of the fully flexible simulations 14 . In order to acquire reliable docking results, the molecular docking protocol should be validated. Several validation methods such as re-docking, crossdocking, high enrichment factors, and enhanced correlation coefficients between experimental affinities and docking scores have been described 15 . The latest technique is often used to evaluate the correctness of the scoring function and the search algorithm through a created relationship with the experimentally acquired data. The correlation coefficients can vary considering the applied chemical dataset and the characteristics of the receptor 16 . However, an optimization of the docking protocol after altering the size of the binding pocket, the utilized scoring function, the flexibility of the side chain residues, and the presence of active waters often lead to superior correlation values 17 . This study aimed to examine the effects of various docking protocols on Pearson's correlation coefficient of a coumarins dataset. All docking settings included in GOLD 5.3 were incorporated in the optimization process with further correlation coefficient calculations at each stage. Moreover, the significant intermolecular interactions between the top-scored ligands and the active site of 1S3B were examined.

Hardware and Software
The docking simulations were carried out on an AMD Ryzen 5 3600 6-core 3.6GHz CPU, GeForce GTX 1060 3 GB GPU, 16 GB RAM installed memory, 64-bit Operating system on Windows 10 Pro. GOLD 5.3 (Genetic Optimization for Ligand Docking) from The Cambridge Crystallographic Data Centre (https://www.ccdc.cam.ac.uk/solutions/csddiscovery/Components/gold/) 18 was used for the current docking simulations. It comprises four scoring algorithms: ChemPLP, GoldScore, ASP, and ChemScore. GoldScore considers mainly Van der Waals interactions and hydrogen bonds. ChemScore is the empirical algorithm of GOLD which was calibrated from numerous complexes with known binding affinities. ASP represents the knowledgebased function, while ChemPLP used piecewise linear potential to score the contacts in the ligand-receptor complex 19 . In this study, all the latest scoring functions were utilized to evaluate the most prominent one for the current dataset. For the pre-docking procedures, the docking visualizer Hermes 18 was applied. The preparation of the ligands and the receptors were conducted in Hermes as well as ChemDraw and Systèmes, Pharmacopeia, Inc. (https://www.3ds.com/productsservices/biovia/products/molecular-modelingsimulation/biovia-discovery-studio/visualization/).

Ligands
Forty-four substituted coumarin derivatives were taken from a published paper by Pisani et al 7 . The ligands were grouped into two sets: a training and a test set. In the training set, we situated 35 compounds, while for the validation of the model, we applied nine structures. All coumarin derivatives used in our study were given in Table I with the corresponding pIC50 values. The drawing of the ligands and the conversion into the corresponding 3D structures were conducted in ChemDraw and Chem3D, respectively. The energy minimization procedures were also carried out in Chem3D with an early termination set at 2000 iterations and minimum root mean square (RMS) gradient fixed at 0.01000. During the docking simulations, the rotations of the ligands were set to "flexible". Monomer B from all receptors was removed with the co-crystallized ligands and the cofactors lying in the corresponding monomer. In the case of present covalent bonds between the cocrystallized ligands and the co-factors, they were removed.

Docking protocol
The GOLD wizard setup was utilized for rapid extraction of co-crystallized ligands and waters. Additional hydrogen bonds were added with the help of the former wizard. The search efficiency was set at 100% (default setting) with no early termination. During all dockings, the ligands were set to flexible, and initial energy minimization was carried out. The starting docking protocol was built out of ChemPLP as a scoring function, 6 Å binding grid, no protein flexibility, and no active waters. All default parameters were altered to obtain a higher correlation coefficient between the experimental data and the obtained fitness scores. Primarily, the scoring functions and the size of the binding space were varied. The scoring algorithm was chosen based on the lowest R 2 value and the shortest simulation time.
Analysis regarding the presence of active water molecules was performed, which examinations with and without waters in the active sites were conducted. After each simulation, the Person's correlation coefficient was calculated. After that, ten amino acids

Re-docking
Re-docking procedures were carried out to assess the ability of the docking software to correctly place the cocrystallized ligands back into the binding pocket of the receptors 24 . The reliability of GOLD 5.3 to correctly place the co-crystallized ligands of 1OJA, 1OJC, 1S3B, and 2V60 was unambiguous. The root-mean-square deviation (RMSD) for all receptors was under 2 Å, as shown in Table II.

Optimizations of the docking protocol
All four scoring algorithms were applied to evaluate which was the most prominent one for the current dataset 25 . The rest of the docking settings were set to default. After each docking simulation, Pearson's correlation coefficient was calculated. ChemPLP had displayed the ability to acquire the highest correlation value, and the former scoring function was employed in the forthcoming protocols, as shown in Table III.
Interestingly, GoldScore, ChemScore, and ASP showed significantly lowered accuracy when they were utilized to score the current dataset. GoldScore could not correctly score compound 16, which led to a significantly lowered Pairwise correlation coefficient of 0.2802. Moreover, when ASP was used, there was no correlation at all -R 2 = 0.0876. The scoring function with the fastest run time was ChemPLP, while GoldScore demonstrated the longest simulation periods (double the time of ChemPLP). Considering the lengthy docking times, together with the unacceptable correlation coefficients obtained with GoldScore, ChemScore, and ASP, further studies with the latter GOLD 5.3 scoring algorithms were not conducted. Subsequently, the size of the binding gorge was modified to 8, 10, and 12 Å, when ChemPLP was used as a scoring function. Overall, when the size of the grid box was expanded, the correlation coefficients increased 26 . The highest value (R 2 = 0.5929) was obtained when the grid size was set to 12 Å. For the magnitude of the applied dataset, the run times were relatively similar, with a slight increase after each expansion of the grid space. The results were presented in Table IV. value, when the waters were taken into consideration, was given in Table V. A drastic drop in the correlation coefficient was observed after the employment of active waters. In addition, the inability of the docking software to correctly score compound 34 was noticed. The former ligand received a false-positive fitness score of 121; thus, the correlation coefficient was calculated to be 0.4603. No further examinations with active waters were conducted considering the latter observations. It is important to note that in most cases, the active water molecules play an essential role in forming a stable complex 28 . However, in this work, the number of falsely scored results significantly increased when eight water molecules were included in the binding site. During the last optimization step, we considered the flexibility of the side chain residues. Ten amino acids: Thr-195, Ile-198, Ile-199, Tyr-326, Phe-343, Leu-345, Tyr-398, Thr-399, Tyr-435, and Met-436, located in the active site, were set to a freely flexible state during the docking simulations. It was noted that the R 2 value dropped significantly from 0.5829 to 0.2921. The latter observation disposes a concern into the positive effect of flexible side chains in the reliable representations and scoring of the occurring MAO-B/coumarins intermolecular interactions. Furthermore, papers discussing higher enrichment values after semiflexible receptor docking were reported 29,30 , which contrasted with this work. In addition to the unsatisfactory results obtained from the side chain flexible dockings, the ensemble docking simulations of the described coumarins into two superimposed complexes: 1S3B-2V60 and 1OJA-1OJC-1S3B was examined. The receptor 2V60 was used owing to the chemical similarity between the cocrystallized ligand and the utilized in this study dataset. At the same time, the second superimposed complex demonstrated the highest enrichment value in a recently conducted study (to be published). Correlation coefficients of 0.4238 and 0.4508 were obtained, as shown in Table VI. However, the described technique was not applicable for a future virtual screening due to lower Pairwise correlation coefficients than the protocol mentioned earlier. Overall, the most prominent GOLD 5.3 docking protocol of novel MAO-B inhibitors with coumarin moiety was composed of the scoring function ChemPLP, size of the binding site 12 Å, absence of active waters, and no partial protein flexibility. Furthermore, the utilization of ensemble docking did not achieve any enhancements in the correlation value.
In order to validate the docking protocol, a test set built of nine chemically similar to the training set ligands, with a wide range of experimentally acquired binding affinities, was applied. The calculated Pearson's correlation coefficient of the test set equaled 0.8217, as shown in Table VII. The latter value was statically significant; thus, the model could be applied for a future virtual screening of novel MAO-B inhibitors.

Visualizations of the major interactions
Two of the top-scored compounds located in the training set were visualized their major interactions with the active site of 1S3B. Both the 2D and 3D interaction diagrams of compounds 29 and 34 were provided in Figures 1 and 2, respectively. As demonstrated below, the poses of both ligands in the active site of MAO-B were exceptionally similar. The amide group in both cases was sandwiched in the aromatic cage, and a strong hydrogen bond was formed with FAD600. The core structure of coumarin was located in the substrate pocket, where it was stabilized by Van der Waals and hydrophobic interactions. The latter weak forces also occurred between the p-substituted phenyl moiety and the entrance cavity of the receptor. The absence of a delocalized cyclic system in the aromatic cage led to - stacking interactions between the benzene ring 31 and the amino residue Tyr-326. A -sulfur bond between Cys-172 and the coumarin's phenyl group in compound 29 was the only deviation in the interaction pattern between the two ligands. All the docking fitness scores and the major amino residues that participate in stabilizing the ligandreceptor complexes in the test set are given in Table  VIII. As discussed before, nine ligands were included in the test set, and an R 2 value of 0.8217 was achieved. Thus, all of the analyzed docking poses should be close to the actual poses of the ligands in the active site of MAO-B. Compound 4 displayed the highest fitness score of 92.11. The complex was stabilized with a hydrogen bond between FAD600 and the amide group. Moreover, a weaker carbon-hydrogen bond was present between Cys-172 and the pyran ring. Compound 1 showed the lowest score of 76.93, which was plausible considering the low number of stabilizing bonds.

CONCLUSION
Good correlation coefficients were achieved in this work between the pIC50 values of 44 coumarins derivatives with MAO-B activity and their fitness scores applying the docking software GOLD 5.3. After optimizing the docking protocol for scoring functions, grid spaces, and rotatable side residues, a pairwise correlation of 0.5929 for the training set and 0.8217 for the test set was obtained. The presence of active waters and the inclusion of partial protein flexibility that was examined did not lead to enhanced correlation coefficients. In addition, compounds 29 and 34 demonstrated strongly identical poses in the active site of 1S3B. Overall, a statistically significant Pearson's correlation coefficient was obtained between coumarins derivatives and ChemPLP docking scores -R 2 = 0.5929. This finding could be beneficial for future virtual screenings in search of novel MAO-B inhibitors.

CONFLICTS OF INTEREST
The authors have no conflicts of interest to declare that are relevant to the content of this article.

FUNDING
None.

DATA AVAILABILITY
All data are available from the authors.

ACKNOWLEDGMENTS
This work was supported by the Bulgarian Ministry of Education and Science under the National Program for Research "Young Scientists and Postdoctoral Students".