In Silico Study for Similar FDA Approved Drugs as Inhibitors of SARS-CoV-2 Spike and the Host Receptor Proteins

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread rapidly worldwide. The disease began in Wuhan, China, around December 2019, then spread to most countries. Social distancing is the best procedure to prevent infection. Computer-aided drug design (CADD) methods have made screening of databases containing millions of drug molecules or phytochemicals rapid and straightforward. In the present study, 300 phytochemicals and cellulose ether derivatives were screened through a docking study. Docking analysis showed that only four molecules (a-neohesperidin, quercetin 3-O-glucosylrutinoside, 14-ketostypodiol diacetate, and hydroxypropyl methylcellulose) were able to interact with the spike protein, and only two of them (quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate) could also interact with the host cell receptor of SARS-CoV-2 (ACE2). The binding affinity of all four compounds was high. Still, according to Lipinski's rule of five, only 14-ketostypodiol diacetate was selected as a drug-like molecule on the basis of its pharmacokinetic and ADMET properties. Screening for drug analogs of 14-ketostypodiol diacetate detected five approved drugs. Docking analysis of these drugs with the target proteins showed that all five interact with the host receptor protein and three also interact with the viral spike protein. Accordingly, we suggest that molecular docking and drug analog studies could support rapid drug development. In addition, further studies on the therapeutic application of 14-ketostypodiol diacetate are required before it can be used against SARS-CoV-2 infections.


INTRODUCTION
Coronaviruses cause respiratory infections associated with influenza-like illnesses ranging from the common cold to severe symptoms. The 2019-nCoV, like the other coronaviruses, has a long, positive-sense, single-stranded RNA genome that encodes two groups of proteins: structural proteins (spike (S), nucleocapsid, matrix, and envelope) and non-structural proteins (proteases and RNA-dependent RNA polymerase) 3-5. Coronaviruses depend on RNA-dependent RNA polymerase for their high frequency of RNA recombination, one of the main factors behind the phenotypic and genotypic diversity that allows coronaviruses to jump across species 6. The S-protein helps the virus initiate infection by attaching to the host cell receptor and entering the cell 7. The S-protein is a large type I transmembrane protein composed of two subunits. The S1 subunit mainly contains a receptor-binding domain (RBD) responsible for recognizing and binding to the host cell surface receptor angiotensin-converting enzyme 2 (ACE2). The second subunit (S2) contains the basic elements required for membrane fusion and entry into the host cell 8-10. The atomic-scale 3D structure of the SARS-CoV-2 S-protein was recently published, and structural evidence showed that it binds to ACE2 with 10- to 20-fold higher affinity than the SARS-CoV S-protein. This may explain the rapid human-to-human transmission of COVID-19 11, 12. Therefore, scientists have focused on the SARS-CoV-2 S-protein as a key target for vaccines, therapeutic antibodies, and diagnostics. However, discovering a new vaccine or therapeutic antibody takes many years 13. Bioinformatics analysis offers a fast way to find potential molecules among marketed drugs for developing a new drug against SARS-CoV-2. Once efficacy is determined, such a drug can be approved by the Green Channel or the hospital ethics committee for rapid clinical treatment 4, 14.
Using this approach, several molecules, including natural plant compounds, have been screened and confirmed to directly inhibit the viral proteins responsible for viral entry and replication, such as the S-protein of SARS or MERS coronaviruses 15, 16. Commercial antiviral molecules and chemical compounds extracted from traditional Chinese medicinal herbs have also been investigated 17. Pharmacokinetic studies and in silico ADME modeling are used to speed up drug approval, as they indicate whether new compounds have side effects on human health 18. In this work, a computational approach, supported by molecular docking, was used to predict the potential binding of the natural compounds a-neohesperidin (flavanone glycoside), quercetin 3-O-glucosylrutinoside (flavonoid), and 14-ketostypodiol diacetate (meroditerpenoid), as well as hydroxypropyl methylcellulose (cellulose ether derivative), to the SARS-CoV-2 S-protein and its host cell receptor. We also performed pharmacokinetic and ADME studies on the four compounds and screened for FDA-approved drugs similar to the compounds with the best pharmacokinetic properties. Furthermore, the selected structurally similar drugs were docked with the target proteins to predict their binding potential at the active sites.

Receptors
The structures of the SARS-CoV-2 S-protein and the ACE2 host receptor were downloaded from the Protein Data Bank (https://www.rcsb.org) under accession numbers 6VSB and 6M0J, respectively. For the pre-docking process, all water molecules were removed from the PDB structures of the proteins and ligands, and hydrogen atoms were added to the target proteins. The docking system was built using SAMSON 2020.
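The water-removal step can be illustrated with a minimal sketch. It assumes standard fixed-column PDB records, in which the residue name occupies columns 18 to 20, and treats residues named HOH or WAT as water. The function name and the two-record sample are illustrative and are not taken from 6VSB or 6M0J. Hydrogen addition is not reproduced here.

```python
# Illustrative sketch only: strips water records from PDB-format text.
# Assumes fixed-column PDB records (residue name in columns 18-20).
WATER_RESIDUE_NAMES = {"HOH", "WAT"}  # common water residue names


def strip_waters(pdb_text: str) -> str:
    """Return the PDB text with water ATOM/HETATM records removed."""
    kept = []
    for line in pdb_text.splitlines():
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        residue_name = line[17:20].strip()  # PDB columns 18-20
        if is_atom_record and residue_name in WATER_RESIDUE_NAMES:
            continue  # drop the water atom
        kept.append(line)
    return "\n".join(kept)


# Hypothetical two-atom sample: one protein atom and one water oxygen.
SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "HETATM 1001  O   HOH A 501      10.000  10.000  10.000  1.00 20.00           O",
    "END",
])

cleaned = strip_waters(SAMPLE)
print(cleaned)
```

Non-atom records such as END are passed through unchanged, so the output remains a usable PDB file for the later preparation steps.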

Ligands selection
Selected compound structures were converted to simplified molecular-input line-entry system (SMILES) notation and submitted to the SwissADME online server to identify their physicochemical features and to predict their absorption, distribution, metabolism, and excretion (ADME) parameters, drug-like nature, pharmacokinetic properties, and medicinal chemistry 20. Using SwissADME, the set of compounds ready for docking with the target proteins was reduced to 250 ligands on the basis of their solubility and cytotoxicity to humans.

Virtual screening and docking protocol
Virtual screening docks and scores each compound from the previous dataset. This technique predicts each compound's binding modes and binding affinities by docking it to the two experimentally determined protein structures 21. Docking parameters were set in SAMSON 2020, which can dock a library of ligands against a single protein.
Considering these aspects, diverse plant compounds and protein targets were evaluated. It was important to visualize the docked poses of high-scoring compounds because many ligands docked in different orientations. This kind of study becomes more complicated as the size of the dataset increases. Therefore, before docking, SwissADME was used to filter out unsuitable ligands, restricting the dataset to drug-like compounds with appropriate properties, sub-structural features, solubility, and toxicity for human use, so as to reduce the probability of side effects 22. The bound ligands were then analyzed with Discovery Studio Visualizer, which was used to examine the ligand properties and their interactions with the functional protein domains. The docking protocol followed the one reported in our previous study 23.

Pharmacokinetics properties
A compound must satisfy certain pharmacokinetic criteria to be considered a drug. Bioavailability, volume of distribution, and half-life are essential pharmacokinetic features that play a vital role in discovering a drug candidate 24. The pharmacokinetic properties of the four compounds, namely molecular weight (Mw, g/mol), the logarithm of the partition coefficient (log P), number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), number of rotatable bonds (ROT), and topological polar surface area (TPSA, Å²), were calculated using SwissADME. The percentage of absorption (%abs) was calculated using the formula suggested by Mitra et al. 25, as presented in Equation 1:
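A widely used form of this relation estimates the percentage of absorption from the topological polar surface area as %abs = 109 − 0.345 × TPSA. Assuming this is the form intended here, the calculation can be sketched as follows; the function name is illustrative.

```python
# Illustrative sketch, assuming the commonly used relation
# %abs = 109 - 0.345 * TPSA (TPSA in square angstroms).
def percent_absorption(tpsa: float) -> float:
    """Estimate the percentage of absorption from TPSA."""
    return 109.0 - 0.345 * tpsa


# Example at the TPSA cutoff of 140 used for oral bioavailability.
print(round(percent_absorption(140.0), 2))
```

Because the relation is linear in TPSA, compounds with a larger polar surface area are always predicted to have lower absorption.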

Toxicity prediction
In silico toxicity of the four compounds was predicted using PreADMET. Toxicity was assessed through the Ames test, carcinogenicity in animals, and hERG inhibition, simulating in vitro assays to determine whether the compounds interact with other human proteins. The aim was to maximize the therapeutic effect, reduce side effects, and ensure that the four ligands have no carcinogenic effect. PreADMET provides three levels of computational prediction (drug-likeness, ADME, and toxicity), so these three steps simulate the experimental workflow for the four compounds.

Ligand-protein docking
This work aims to find novel compounds targeting the viral S-protein and the host ACE2 receptor in order to predict a new drug against coronavirus. Coronaviruses use the homotrimeric spike glycoprotein to bind to ACE2 cellular receptors, leading to fusion between the cell and viral membranes for cell entry 16, 26. The S-protein consists of two subunits (S1 and S2) that mediate virus entry into host cells. The S1 subunit first binds to the host cell receptor, and the S2 subunit then fuses the viral and host membranes 12. The region involved in binding between the SARS-CoV-2 S-protein and ACE2 spans Arg-319 to Phe-541 within the S-protein and Ser-19 to Asp-615 within ACE2 9, 27-29. Because binding to the ACE2 receptor is a critical initial step for SARS-CoV-2 to enter target cells, the S-protein and ACE2 are targets for coronavirus therapeutics.
In this work, 300 phytochemicals and cellulose derivatives were virtually screened. These compounds were filtered to eliminate undesirable ones on the basis of their properties, substructural features, solubility, and toxicity, so that the remaining compounds are suitable for human use with a low probability of side effects. This process yielded four ligands, i.e., a-neohesperidin, quercetin 3-O-glucosylrutinoside, 14-ketostypodiol diacetate, and hydroxypropyl methylcellulose. The four ligands were docked to the target receptors (S-protein and ACE2 host cell receptor protein). The binding energy scores and the interacting residues are presented in Table I. All four ligands bound to the S-protein, although only two ligands (quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate) bound to the ACE2 protein. Both a-neohesperidin and quercetin 3-O-glucosylrutinoside exhibited better binding affinity with the S-protein (-15.2 and -16.7 kcal/mol, respectively) than hydroxypropyl methylcellulose and 14-ketostypodiol diacetate (both -13.7 kcal/mol). On the other hand, quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate bound to the ACE2 protein with -5.3 and -7.1 kcal/mol, respectively.
Each ligand adopts a different docking position and orientation, reflecting its pharmacophore. A-neohesperidin binds to the S-protein by forming conventional hydrogen bonds with Thr-547, Thr-549, Thr-573, Leu-587, and Phe-855 (Figure 1). Quercetin 3-O-glucosylrutinoside forms conventional hydrogen bonds with Tyr-756, Phe-970, and Thr-998, and one carbon-hydrogen bond with Gly-999 (Figure 2). 14-ketostypodiol diacetate shows van der Waals interactions with Leu-118, Val-120, Val-127, Lys-129, Phe-133, Leu-141, Phe-157, Val-159, Tyr-160, and Leu-241 (Figure 3). Hydroxypropyl methylcellulose forms two conventional hydrogen bonds with Thr-1027 and Arg-1039, and one carbon-hydrogen bond with Ala-1020 (Figure 4).
Quercetin 3-O-glucosylrutinoside interacts with the ACE2 protein by making alkyl/Pi-alkyl bonds with Leu-29, Ala-36, and Val-93; conventional hydrogen bonds with His-34, Glu-35, Glu-73, and Gln-96; amide-Pi stacked bonds with Phe-40 and Tyr-41; a carbon-hydrogen bond with Gly-352; and a Pi-lone pair interaction with Phe-32 (Figure 5). Lastly, 14-ketostypodiol diacetate forms alkyl/Pi-alkyl bonds with Leu-29, Ala-36, and Val-93; a Pi-Pi T-shaped bond with Phe-32; a Pi-Sigma bond with Leu-97; and van der Waals interactions with Asp-30, Asn-33, Gln-96, and Leu-391 (Figure 6). The four compounds showed an ability to bind to the S-protein in the active sites, but only two compounds (quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate) were able to bind to the ACE2 protein. Although a-neohesperidin and quercetin 3-O-glucosylrutinoside could bind to the S1 domain, the four ligands showed binding affinity to the S-protein far from the actual RBD. However, the four ligands could bind to the S2 ectodomain subunit (residues 686 to 1237) and prevent fusion of the viral membrane with the cellular membrane. These interactions are assumed to affect the virulence of the virus by reducing the activity of the S-protein 30. Docking with the ACE2 receptor showed that only quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate could interact with the sites on the host receptor ACE2 (residues 19 to 615) that bind the viral S-protein. This indicates that those ligands could prevent viral binding to the host receptor, although they do not interact with the viral RBD.
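The residue ranges quoted in the text (RBD at Arg-319 to Phe-541, S2 ectodomain at residues 686 to 1237) allow each reported S-protein contact to be assigned to a region. The following is a minimal sketch of that bookkeeping. The contact lists are copied from the text, while the function names and the catch-all label "other" for residues outside both ranges are assumptions made for illustration.

```python
# Illustrative sketch: assign reported S-protein contact residues to regions.
# Region boundaries are the ones quoted in the text; "other" is a
# hypothetical label for residues outside both ranges.
RBD = range(319, 542)              # Arg-319 to Phe-541
S2_ECTODOMAIN = range(686, 1238)   # residues 686 to 1237

S_PROTEIN_CONTACTS = {
    "a-neohesperidin": ["Thr-547", "Thr-549", "Thr-573", "Leu-587", "Phe-855"],
    "quercetin 3-O-glucosylrutinoside": ["Tyr-756", "Phe-970", "Thr-998", "Gly-999"],
    "14-ketostypodiol diacetate": [
        "Leu-118", "Val-120", "Val-127", "Lys-129", "Phe-133",
        "Leu-141", "Phe-157", "Val-159", "Tyr-160", "Leu-241",
    ],
    "hydroxypropyl methylcellulose": ["Thr-1027", "Arg-1039", "Ala-1020"],
}


def residue_number(label: str) -> int:
    """Extract the residue number from a label such as 'Thr-547'."""
    return int(label.split("-")[1])


def region(label: str) -> str:
    """Name the region that contains the given residue."""
    number = residue_number(label)
    if number in RBD:
        return "RBD"
    if number in S2_ECTODOMAIN:
        return "S2 ectodomain"
    return "other"


def regions_by_ligand(contacts: dict) -> dict:
    """Map each ligand to the sorted list of regions it contacts."""
    return {
        ligand: sorted({region(r) for r in residues})
        for ligand, residues in contacts.items()
    }


for ligand, regions in regions_by_ligand(S_PROTEIN_CONTACTS).items():
    print(ligand, "->", regions)
```

The same lookup could be applied to the ACE2 contacts by swapping in the range 19 to 615.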

Pharmacokinetics properties
Natural compounds must follow Lipinski's rule of five to be considered drug-like, using four criteria (Mw ≤500, log P ≤5, HBD ≤5, and HBA ≤10). Molecules violating more than one of these rules may have problems with oral bioavailability 25. Data presented in Table II show that 14-ketostypodiol diacetate violates one rule only (molecular weight >500). However, a-neohesperidin, quercetin 3-O-glucosylrutinoside, and hydroxypropyl methylcellulose violate all four rules. Thus, 14-ketostypodiol diacetate is not expected to have problems with oral bioavailability. A compound with good oral bioavailability should have ten or fewer ROT and a TPSA of 140 Å² or less 31. The ROT of a-neohesperidin, quercetin 3-O-glucosylrutinoside, and 14-ketostypodiol diacetate met this criterion, while hydroxypropyl methylcellulose did not, with a ROT of 40. Only 14-ketostypodiol diacetate has a TPSA value below 140 Å², together with a good absorption percentage of 82.61%.
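The two sets of criteria can be expressed as a short check. The sketch below uses the standard Lipinski thresholds (Mw ≤500, log P ≤5, HBD ≤5, HBA ≤10) and the ROT and TPSA cutoffs given above. The class and function names are illustrative, and the descriptor values in the example are hypothetical and are not taken from Table II.

```python
# Illustrative sketch of the drug-likeness checks described above.
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float    # molecular weight, g/mol
    logp: float  # logarithm of the partition coefficient
    hbd: int     # hydrogen bond donors
    hba: int     # hydrogen bond acceptors
    rot: int     # rotatable bonds
    tpsa: float  # topological polar surface area, square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria are violated."""
    return sum([d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def is_drug_like(d: Descriptors) -> bool:
    """At most one Lipinski violation is tolerated."""
    return lipinski_violations(d) <= 1


def passes_rot_tpsa(d: Descriptors) -> bool:
    """Ten or fewer rotatable bonds and a TPSA of 140 or less."""
    return d.rot <= 10 and d.tpsa <= 140


# Hypothetical compound that exceeds only the molecular weight limit.
example = Descriptors(mw=510.0, logp=4.5, hbd=0, hba=6, rot=5, tpsa=80.0)
print(lipinski_violations(example), is_drug_like(example), passes_rot_tpsa(example))
```

Keeping the Lipinski count separate from the ROT and TPSA check mirrors the way the two criteria are reported separately in the text.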

Toxicity prediction
The in silico toxicity of the four ligands was predicted using PreADMET, and the results are shown in Table III. In the Ames test, which assesses the mutagenicity of compounds, quercetin 3-O-glucosylrutinoside and hydroxypropyl methylcellulose were predicted to be mutagenic, while a-neohesperidin and 14-ketostypodiol diacetate were non-mutagenic. In the analysis of carcinogenicity in animals (mouse), all compounds were predicted as negative except hydroxypropyl methylcellulose, whereas in the carcinogenicity test in rat, all ligands were predicted as negative. In the hERG (potassium channel) inhibition test, a-neohesperidin presented a high risk, quercetin 3-O-glucosylrutinoside and hydroxypropyl methylcellulose presented an ambiguous risk, and 14-ketostypodiol diacetate presented a low risk.
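These predictions can be collected into a small lookup and filtered. The sketch below encodes the outcomes exactly as stated above. The filter itself (non-mutagenic, negative in both carcinogenicity tests, and low hERG risk) is an assumed, deliberately strict criterion chosen for illustration, and the key names are hypothetical.

```python
# Illustrative sketch: toxicity predictions as reported in the text.
TOXICITY = {
    "a-neohesperidin": {
        "ames": "non-mutagenic", "carcino_mouse": "negative",
        "carcino_rat": "negative", "herg": "high",
    },
    "quercetin 3-O-glucosylrutinoside": {
        "ames": "mutagenic", "carcino_mouse": "negative",
        "carcino_rat": "negative", "herg": "ambiguous",
    },
    "14-ketostypodiol diacetate": {
        "ames": "non-mutagenic", "carcino_mouse": "negative",
        "carcino_rat": "negative", "herg": "low",
    },
    "hydroxypropyl methylcellulose": {
        "ames": "mutagenic", "carcino_mouse": "positive",
        "carcino_rat": "negative", "herg": "ambiguous",
    },
}


def passes_toxicity_filter(profile: dict) -> bool:
    """Assumed strict filter: no mutagenicity, no carcinogenicity, low hERG risk."""
    return (
        profile["ames"] == "non-mutagenic"
        and profile["carcino_mouse"] == "negative"
        and profile["carcino_rat"] == "negative"
        and profile["herg"] == "low"
    )


clean = [name for name, profile in TOXICITY.items() if passes_toxicity_filter(profile)]
print(clean)
```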

Docking of FDA-approved drugs with target proteins
The main stumbling block to using phytochemical compounds in the medical community is obtaining approval from the US Food and Drug Administration (FDA) 32. Therefore, the similarity between 14-ketostypodiol diacetate and therapeutic drugs already approved by the FDA was determined. The drugs analogous to 14-ketostypodiol diacetate and their docking to the S-protein and ACE2 receptor are presented in Table IV. Results show that all five drugs docked with the ACE2 receptor and three of them also docked with the SARS-CoV-2 S-protein. Hydromorphone, oxycodone, and oxymorphone interacted with the S-protein with energy scores of -6.8, -7.1, and -6.9 kcal/mol, respectively, while hydromorphone, oxycodone, oxymorphone, nabilone, and hydrocodone bound to the ACE2 receptor with -5.0, -6.5, -5.4, -5.5, and -4.9 kcal/mol, respectively. These results indicate that three of the drugs could interact with both proteins, suggesting that they may interfere with the SARS-CoV-2 viral cycle. Overall, oxycodone was the best drug, as it had the best binding scores with both the SARS-CoV-2 S-protein and the ACE2 receptor, as presented in Figures 8 and 9.
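The comparison of the drugs by binding energy can be sketched as follows, using the scores quoted above (a more negative score means stronger predicted binding). The variable and function names are illustrative.

```python
# Illustrative sketch: rank the analog drugs by docking score (kcal/mol).
S_PROTEIN_SCORES = {
    "hydromorphone": -6.8,
    "oxycodone": -7.1,
    "oxymorphone": -6.9,
}
ACE2_SCORES = {
    "hydromorphone": -5.0,
    "oxycodone": -6.5,
    "oxymorphone": -5.4,
    "nabilone": -5.5,
    "hydrocodone": -4.9,
}


def rank_by_affinity(scores: dict) -> list:
    """Order drug names from the most negative score to the least negative."""
    return sorted(scores, key=scores.get)


def best_binder(scores: dict) -> str:
    """Return the drug with the most negative binding energy."""
    return min(scores, key=scores.get)


# Drugs with a reported score against both targets.
binds_both = sorted(set(S_PROTEIN_SCORES) & set(ACE2_SCORES))

print("S-protein ranking:", rank_by_affinity(S_PROTEIN_SCORES))
print("ACE2 ranking:", rank_by_affinity(ACE2_SCORES))
print("Best on both:", best_binder(S_PROTEIN_SCORES), best_binder(ACE2_SCORES))
print("Docked to both targets:", binds_both)
```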

CONCLUSION
Quercetin 3-O-glucosylrutinoside, a-neohesperidin, 14-ketostypodiol diacetate, and the hydroxypropyl methylcellulose derivative are predicted to be potent inhibitors of the S-protein, as they may prevent it from binding to and interacting with the host receptor of SARS-CoV-2. Only quercetin 3-O-glucosylrutinoside and 14-ketostypodiol diacetate can interact with ACE2, and of these only 14-ketostypodiol diacetate complied with the rule of five. Therefore, we suggest the potential of 14-ketostypodiol diacetate as a prophylactic medication for COVID-19 prevention. Moreover, the predicted drugs with a structure similar to 14-ketostypodiol diacetate bound to the target proteins. Three of the five drugs bound to both the S-protein and ACE2, while all five bound to ACE2.

CONFLICTS OF INTEREST
The authors have no conflicts of interest to declare that are relevant to the content of this article.

FUNDING
None.

DATA AVAILABILITY
All data are available from the authors.