Novel indole derivatives as hepatitis C virus NS5B polymerase inhibitors: Pharmacophore modeling and 3D QSAR studies

Hepatitis C virus (HCV) encodes its own RNA-dependent RNA polymerase (NS5B) in order to replicate its genome. An efficient pharmacophore was identified by structural analysis of a set of 49 indole-based inhibitors of the HCV NS5B polymerase. The identified pharmacophore comprises two hydrophobic regions and four aromatic rings (hypothesis HHRRRR.649). Ligand-based 3D-QSAR was performed using partial least squares regression analysis, which gave a regression coefficient R² of 0.98, a Q² of 0.88 and a Pearson-R of 0.96.

Article Info: Received: 14 May 2014; Accepted: 18 June 2014; Available Online: 18 July 2014; DOI: 10.3329/bjp.v9i3.18894

Cite this article: Varun G, Lokesh M, Sandeep M, Shahbazi S, Reddy GD. Novel indole derivatives as hepatitis C virus NS5B polymerase inhibitors: Pharmacophore modeling and 3D QSAR studies. Bangladesh J Pharmacol. 2014; 9: 290-97.

This work is licensed under a Creative Commons Attribution 3.0 License. You are free to copy, distribute and perform the work. You must attribute the work in the manner specified by the author or licensor.

G. Varun¹, M. Lokesh¹, M. Sandeep¹, Sajad Shahbazi² and G. Deepak Reddy³

¹ Medicinal Chemistry Research Division, Vishnu Institute of Pharmaceutical Education and Research, Narsapur, AP, India; ² Department of Biotechnology, Punjab University, Chandigarh, India; ³ Department of Pharmaceutical Chemistry, JNTUA-OTRI, Anantapur, AP, India.


Introduction
Hepatitis C virus (HCV) infection has evolved into a global pandemic, affecting about 3% of the world population (approximately 170 million people) (Clin et al., 2009). About 80% of cases are chronic, leading to liver cirrhosis and hepatocellular carcinoma (El-serag et al., 2004). HCV is a single-stranded, positive-sense RNA virus belonging to the Flaviviridae family (Verna et al., 2008). The non-structural 5B (NS5B) polymerase is responsible for replication of the viral genome (Behrens et al., 1996). It has therefore become a potential target for inhibiting replication of the HCV genome and perhaps for ending the prevalence of HCV disease.
Based on chemical composition and/or mechanism of action, NS5B inhibitors are categorized into four major classes: nucleoside or nucleotide analogs (which compete with NTPs during RNA synthesis), non-nucleoside inhibitors (which target NS5B allosterically) (Bressaneli et al., 1999; Lesburg et al., 1999), inhibitors that covalently modify residues near the active site of NS5B, and compounds that target cellular proteins needed for HCV polymerase function (Biswal et al., 2005; Love et al., 2003; Wang et al., 2003). Since there is still no effective, well-tolerated treatment for HCV infection, novel alternative therapies are needed. In the present investigation we focused on identifying and elucidating a common pharmacophore model from a previously published series of indole derivatives with significant inhibitory activity against HCV NS5B (Kevin et al., 2011). We also developed a 3D-QSAR model to validate the resulting pharmacophore model.

Dataset ligands
A series of 49 indole derivatives was used in this study (Kevin CX et al., 2011). Table I (A-E) shows the structures of the compounds used and their observed activity (pIC50). The in vitro biological activity data were stated in terms of IC50. These IC50 values were converted to pIC50 using the formula pIC50 = −log IC50. The pIC50 values for the whole data set range from 4.7 to 9.0. We divided the data set by randomly choosing 39 compounds for the QSAR training set and 10 compounds for the test set on the basis of the pIC50 threshold range.
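The conversion and the 39/10 split described above can be sketched in a few lines of Python. This is only an illustration: the text does not state the IC50 units, so molar concentrations are assumed, the ligand names are placeholders, and the purely random shuffle ignores whatever pIC50-based constraint the authors applied when choosing the test set.

```python
import math
import random

def pic50_from_ic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50); IC50 is assumed to be in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)

def random_split(ids, n_train, seed=0):
    """Shuffle ligand identifiers and return (training, test) lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# 49 placeholder ligand names (hypothetical, not the compound IDs of Table I)
ligand_ids = [f"lig{i:02d}" for i in range(1, 50)]
train_ids, test_ids = random_split(ligand_ids, n_train=39)

# Under the molar assumption, an IC50 of 1 nM corresponds to a pIC50 of about 9
example_pic50 = pic50_from_ic50(1e-9)
```

Fixing the seed makes the split reproducible, which helps when the same training and test sets must be reused across models.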

Ligand preparation
The ligand library was produced using the "LigPrep" module of the Schrodinger suite. In its simplest use, LigPrep converts each 2D input structure into a single, energy-minimized 3D structure with correct chiralities. At most 32 stereoisomers were generated for each ligand; of these, one low-energy ring conformer with the best ionization state was retained. Ionization states, tautomers, stereochemistries and ring conformations were handled using the OPLS 2005 force field for energy minimization and the Epik module for ionization and tautomeric states (Shelley et al., 2007). Energy minimization of the whole ligand library was performed with the same parameters (LigPrep, version 2.6).

Generation of common pharmacophore hypotheses
Pharmacophore hypothesis generation and 3D-QSAR were performed using the PHASE module. This work concerns pharmacophore perception, structural alignment and activity prediction. Given a set of 49 molecules with affinity for a particular target, the module uses fine-grained conformational sampling and a range of scoring techniques to identify common pharmacophore hypotheses, which convey the characteristics of the 3D chemical structures that are reported to be crucial for binding (PHASE, version 3.5; Dixon et al., 2006). The pharmacophore model was developed using a predefined set of pharmacophore features to generate sites for all the compounds. PHASE provides a standard set of six pharmacophore features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic group (H), negatively ionizable (N), positively ionizable (P) and aromatic ring (R). All the ligands were categorized as active (pIC50 > 7.5), inactive (pIC50 < 7) or intermediate (pIC50: 7-7.5) according to the activity thresholds. A maximum of six and a minimum of five sites were selected in order to obtain an efficient pharmacophore model. Hypotheses were generated by systematically varying the number of sites (nsites) and the number of matching active compounds (nact), starting with nact = nact_tot, where nact_tot is the total number of active compounds in the training set (Rajendra Prasad et al., 2013). The scoring protocol ranks the different hypotheses so that the most appropriate one can be chosen for further investigation. The larger the difference between the scores of the actives and the inactives, the better the hypothesis discriminates active from inactive molecules.

QSAR studies
For QSAR development, the pharmacophore models of the training set molecules were placed into a regular grid of cubes, with each cube allotted zero or more "bits" to account for the different types of pharmacophore features in the training set that occupy the cube. This representation gives rise to binary-valued occupation patterns that can be used as independent variables to create partial least-squares (PLS) factor 3D-QSAR models. The statistical correlation of predicted with actual activity data was evaluated for the hypothesis. Because our dataset is congeneric but has many rotatable bonds, we built a pharmacophore-based QSAR model. Pharmacophore-based QSAR models were generated for the hypothesis using the 39 training set ligands (80% of the dataset, selected randomly) and a grid spacing of 1.0 Å. QSAR models with one to nine PLS factors were generated, and the models were validated by predicting the activity of the test set ligands.
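The grid-of-cubes encoding can be illustrated with a small Python sketch. It is a simplification and not the PHASE implementation: each pharmacophore site is treated as a single point (a real site has a finite radius and may occupy several cubes), and the coordinates and the helper names `occupied_bits` and `descriptor_matrix` are hypothetical.

```python
import math

def occupied_bits(sites, spacing=1.0):
    """Map (feature, x, y, z) sites to a set of (feature, i, j, k) cube labels."""
    bits = set()
    for feature, x, y, z in sites:
        i, j, k = (math.floor(c / spacing) for c in (x, y, z))
        bits.add((feature, i, j, k))
    return bits

def descriptor_matrix(ligand_sites, spacing=1.0):
    """Build a binary matrix: one row per ligand, one column per occupied bit."""
    per_ligand = [occupied_bits(sites, spacing) for sites in ligand_sites]
    columns = sorted(set().union(*per_ligand))
    matrix = [[int(bit in bits) for bit in columns] for bits in per_ligand]
    return matrix, columns

# Two hypothetical aligned ligands, each with one hydrophobic (H) and one ring (R) site
ligand_sites = [
    [("H", 0.2, 0.4, 0.1), ("R", 2.5, 0.3, 0.9)],
    [("H", 0.7, 0.1, 0.8), ("R", 3.1, 0.3, 0.9)],
]
X, columns = descriptor_matrix(ligand_sites, spacing=1.0)
```

Each row of `X` is the binary occupation pattern of one ligand; a matrix of this kind, paired with the pIC50 values, is what a PLS regression would take as its independent variables.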
The predictive value of the models was evaluated by leave-one-out (LOO) and leave-half-out (LHO) cross-validation. The cross-validated coefficient, R2cv, was calculated using equation 1:

$$R^2_{cv} = 1 - \frac{\sum (Y_{predicted} - Y_{observed})^2}{\sum (Y_{observed} - Y_{mean})^2} \qquad (1)$$

Here, Ypredicted, Yobserved and Ymean are the predicted, observed and mean values of the target property (pIC50), respectively. $\sum (Y_{predicted} - Y_{observed})^2$ is the predictive residual sum of squares (PRESS). The predictive correlation coefficient (r2pred), based on the molecules of the test set, is defined as

$$r^2_{pred} = \frac{SD - PRESS}{SD} \qquad (2)$$

Here, SD is the sum of the squared deviations between the biological activities of the test set and the mean activity of the training set molecules, and PRESS is the sum of the squared deviations between predicted and actual activity values for every molecule in the test set. 3D-QSAR models were accepted if they satisfied the criteria given in the literature (Golbraikh et al., 2002; Lu et al., 2010; Basu et al., 2009).
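As a worked illustration of these two statistics, the following Python sketch computes them directly from their definitions, with PRESS taken as the sum of squared differences between predicted and observed activities. The activity values are invented for the example and are not taken from Table I.

```python
def r2_cv(observed, predicted_cv):
    """Cross-validated R2: 1 - PRESS / sum of squares about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    press = sum((p - o) ** 2 for o, p in zip(observed, predicted_cv))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total

def r2_pred(test_observed, test_predicted, train_mean):
    """Predictive r2 on the test set: (SD - PRESS) / SD."""
    sd = sum((o - train_mean) ** 2 for o in test_observed)
    press = sum((p - o) ** 2 for o, p in zip(test_observed, test_predicted))
    return (sd - press) / sd

# Hypothetical pIC50 values, for illustration only
train_obs = [5.0, 6.0, 7.0, 8.0]
train_cv_pred = [5.5, 6.0, 7.0, 7.5]
q2 = r2_cv(train_obs, train_cv_pred)          # PRESS = 0.5, SS = 5.0

test_obs = [6.0, 8.0]
test_pred = [6.5, 7.5]
r2p = r2_pred(test_obs, test_pred, train_mean=7.0)   # SD = 2.0, PRESS = 0.5
```

With these numbers the cross-validated coefficient is 1 − 0.5/5.0 = 0.9 and the predictive coefficient is (2.0 − 0.5)/2.0 = 0.75.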
We set one threshold for the active ligands and one for the inactive ligands. Ligands with activity greater than or equal to the active threshold were marked as active; ligands with activity less than the inactive threshold were marked as inactive and included in the pharm set. Ligands whose activity lies between the thresholds were treated as intermediate.
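The thresholding rule can be written as a short Python function. The default thresholds are the pIC50 values quoted earlier in the text (7.5 for actives, 7 for inactives); treating a value exactly equal to the active threshold as active is an assumption made here, and the function name `classify` is illustrative.

```python
def classify(pic50, active_threshold=7.5, inactive_threshold=7.0):
    """Label a ligand as active, inactive or intermediate from its pIC50."""
    if pic50 >= active_threshold:
        return "active"
    if pic50 < inactive_threshold:
        return "inactive"
    return "intermediate"

# The extremes of the reported pIC50 range plus one value between the thresholds
labels = {value: classify(value) for value in (9.0, 7.2, 4.7)}
```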

Results and Discussion
Pharmacophores from all conformations of the ligands in the active set are examined, and those pharmacophores that match identical sets of features with very similar spatial arrangements are grouped together.If a given group contains at least one pharmacophore from each ligand, then this group gives rise to a common pharmacophore.
We set the maximum number of sites to six because, if the number of sites is too large, it is too hard to find any common pharmacophores, whereas if it is too small, the common pharmacophores might not contain all the required features and therefore might not discriminate well between actives and inactives. The search starts from the highest number of sites and moves to lower numbers until it either finds common pharmacophores or reaches the minimum number of sites. We obtained 77 different variant lists, from which eight hypotheses were obtained: AHHRRR (37), HHRRRR (20), HHHRRR (40), AHHHRR (46), HNRRRR (15), AHNRRR (19), AHHNRR (8) and HHNRRR (9). A total of 25 common pharmacophore hypotheses were obtained from the three variant hypotheses HHRRRR.649, HHHRRR.731 and AHHRRR.355. We identified a six-feature pharmacophore model consisting of two hydrophobic groups and four aromatic ring systems (HHRRRR.649) and examined its structural features, the inter-pharmacophoric site distances (Table II) and the 3D spatial arrangement (Figure 1).
PHASE builds 3D-QSAR models for a set of ligands that are aligned to a selected hypothesis and visualizes these models along with the ligand structures and the hypotheses. The QSAR models are developed from a series of ligands that span a range of activities. The dataset was divided randomly, with 80% used as the training set and the remainder as the test set for internal validation.
Partial least squares (PLS) regression analysis was employed to build a QSAR model of the dataset ligands on the HHRRRR.649 hypothesis (Table III). Models with up to nine PLS factors were generated, of which the six-factor model (PLS-6) was considered the best, with r² = 0.98 (training set), q² = 0.88 (test set) and Pearson-R = 0.96. The predicted activities of all the dataset ligands, obtained from the PLS-6 QSAR model, are listed in Table I.
The QSAR result can also be validated using Craig's plot. In this plot, the actual (experimental) activities and the activities predicted by the PLS-based QSAR model were plotted against each other and correlated. The slope of the line represents the regression coefficient of the ligands considered. As the regression coefficient approaches one, the line passes nearer to the origin of the plot. The efficiency of the model was judged from the slope of the line and the distribution of the ligands around the line.
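A minimal Python sketch of this actual-versus-predicted comparison is given below. It fits an ordinary least-squares line and computes the Pearson correlation by hand; the activity values are hypothetical and do not reproduce the results in Table I, and the helper name `fit_line` is illustrative.

```python
import math

def fit_line(actual, predicted):
    """Return (slope, intercept, pearson_r) for predicted regressed on actual."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r

# Hypothetical pIC50 values, for illustration only
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 7.0, 7.5]
slope, intercept, pearson_r = fit_line(actual, predicted)
```

A perfect model would give a slope of one, an intercept of zero and a Pearson correlation of one, so the distance of the fitted values from these ideals is a quick check on the quality of the predictions.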
We conclude that the common pharmacophore derived through ligand-based 3D-QSAR, consisting of six pharmacophore features (HHRRRR), suggests possible structural modifications for the strategic design of more potent derivatives for the treatment of hepatitis C virus infection.