QSAR Analysis of Aminoquinoline Analogues as MCH 1 Receptor Antagonist

A quantitative structure-activity relationship (QSAR) has been established for 2-aminoquinoline-6-carboxamide melanin-concentrating hormone 1 receptor (MCH1R) antagonists. Multiple linear regression was used to relate biological activity to calculated descriptors. Of the 100 equations generated, models with r > 0.700 were developed further. The final model uses four descriptors (DMXC, KChiV4, Rcom, and IM1L), which belong to the topological, steric, spatial and electrotopological descriptor classes. The model was validated by cross-validation, randomization and external test set prediction. The binding pattern of the most active compound, B15, was postulated from its pharmacophoric features to explore the QSAR model further.


Introduction
The prevalence of obesity continues to increase throughout the world, and the burden of obesity and its related comorbidities is large. However, existing drug therapies for obesity are limited, and agents with high efficacy, safety and tolerability are expected to meet patient needs and lead to more substantial commercial success. In recent years, obesity has become a major health problem for many post-industrial societies. The number of deaths per year attributable to obesity is about 30,000 in the UK and nearly 400,000 in the United States, where obesity is set to overtake smoking as the main preventable cause of illness and premature death [1][2][3]. The total direct and indirect cost of obesity was estimated at approximately €32,800 million per year in the EU and $99.2 billion per year in the USA [3,4]. Obesity itself is not life threatening; however, it can significantly increase the risk of life-threatening conditions such as cardiovascular disease and neurological, respiratory, musculoskeletal, endocrine, gastrointestinal, genitourinary and psychological disorders [5]. For these reasons, the World Health Organization declared obesity a global epidemic [6][7][8], and obesity is now considered a disease that needs pharmacological treatment [9,10]. Therefore, effective and safe antiobesity drugs are needed to curb the worldwide obesity epidemic.
Melanin-concentrating hormone (MCH) is a cyclic 19-amino-acid peptide synthesized exclusively by neurons present in the lateral hypothalamus and is believed to be involved in energy homeostasis and feeding behavior [11]. MCH axons and receptors are found throughout the brain. MCH is expressed in the lateral hypothalamus and zona incerta and has been shown to be important for feeding and energy homeostasis in rodents [12,13]. Studies with MCH1R agonists and antagonists have increased interest in MCH1R as an important central nervous system G-protein-coupled receptor (GPCR) drug target. These biological and pharmacological results have generated great interest in developing MCH1R antagonists for the possible treatment of obesity [14] and of depression or anxiety. Many non-peptide MCH1R antagonists have appeared in the patent literature in recent years [15] in an effort to obtain compounds suitable for clinical validation of MCH1R antagonism as a therapeutic target.
Quantitative structure-activity relationship (QSAR) analysis is a useful ligand-based drug design method for designing bioactive compounds and for predicting activity from parameters calculated from the chemical structure of a compound. Many examples in the literature use QSAR models to screen compounds from chemical databases [16,17]. QSAR models can be developed by linearly correlating biological activity with descriptors, or by non-linear regression methods such as artificial neural networks (ANN) [18].
Although no conclusive therapeutic agent has yet been identified, the recent growth in knowledge of the molecular modeling processes involved in drug design allowed us, in the present work, to identify the molecular properties associated with MCH1 antagonistic activity and to exploit them to optimize that activity among compounds from the available chemical databases. The model can be used for virtual screening by first applying Lipinski's rule filters and then predicting activity with the QSAR model.

Materials and Methods
The inhibitory activities of the aminoquinoline analogues, expressed as IC50 values in nM, were taken from the literature [19]. The full set of 29 compounds was divided into a training set of 23 compounds and a test set of 6 compounds. The compounds were first divided into four groups according to activity range, and the training and test sets were then selected so that both contain compounds from all four groups. For clearer interpretation, the originally reported IC50 values in nM were converted to picomolar values so that the pIC50 values lie in the positive range.
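The activity-stratified selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function name `stratified_split`, the rank-based cut into four equally populated groups (the paper groups by activity range), the test fraction and the random seed are all hypothetical.

```python
import random


def stratified_split(activities, n_groups=4, test_fraction=0.2, seed=0):
    """Split compounds into training and test sets so that every
    activity group contributes to both sets.

    activities: dict mapping compound name -> pIC50 (hypothetical values).
    Each group needs at least two compounds to appear in both sets.
    """
    rng = random.Random(seed)
    # Rank compounds by activity and cut the ranking into n_groups bins
    # (assumption: equal-population bins rather than equal-width ranges).
    ranked = sorted(activities, key=activities.get)
    size = len(ranked) / n_groups
    groups = [ranked[round(i * size):round((i + 1) * size)]
              for i in range(n_groups)]

    train, test = [], []
    for group in groups:
        members = group[:]
        rng.shuffle(members)
        # At least one test compound per group, but never the whole group.
        n_test = max(1, round(len(members) * test_fraction))
        n_test = min(n_test, len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test), groups
```

Calling `stratified_split` on a dictionary of 29 compound activities returns disjoint training and test lists in which each of the four activity groups is represented. The exact 23/6 partition used in the study is not reproduced by this sketch.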
The structures of the compounds used in the study, along with their observed IC50 values, are given in Table 1. The molecular structures of all 29 compounds were sketched using the ChemDraw module of ChemOffice 2004 and energy minimized by the steepest descent, conjugate gradient and truncated Newton methods in sequence, using MMFF94 as the force field with a root-mean-square gradient tolerance of 0.001 kcal/mol and a maximum of 1000 iterations [20,21] in MOPAC 6.0. The following software options were employed for the AM1 studies: convergence = normal, optimization = full, state = singlet, net charge = 0 e.u., time limit = 3600 s, keyword = mmok. A conformational search of each energy-minimized structure was performed using the stochastic approach. The stochastic conformational search method is similar to the RIPS method: it generates new molecular conformations by randomly perturbing each coordinate of each atom in the molecule, followed by energy minimization. More than 900 descriptors were calculated using ChemOffice 2008, ADRIANA.Code [22] and the TSAR 3.3 software package [23]. A brief description of the descriptors used, which include topological descriptors, spatial descriptors, E-state indices, and thermodynamic, electronic and structural descriptors, is provided in Table 2.
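The perturb-then-minimize idea behind the stochastic conformational search can be illustrated with a toy example. Everything here is an assumption made for illustration: the energy function is a simple spring-plus-repulsion stand-in rather than MMFF94 or AM1, the minimizer is plain steepest descent with numerical gradients, and the names `energy`, `minimize` and `stochastic_search`, the perturbation range and the rule of keeping the lowest-energy conformer are hypothetical.

```python
import math
import random


def energy(coords, bond_length=1.5):
    """Toy energy: harmonic springs between consecutive atoms plus a soft
    repulsion between non-bonded pairs (illustrative only)."""
    e = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if j == i + 1:
                e += (d - bond_length) ** 2
            else:
                e += 1.0 / (d * d + 1e-9)  # small constant avoids division by zero
    return e


def minimize(coords, iters=200, h=1e-5):
    """Steepest descent with forward-difference gradients and step halving."""
    coords = [list(p) for p in coords]
    step = 0.1
    e = energy(coords)
    for _ in range(iters):
        grad = []
        for p in coords:
            g = []
            for k in range(3):
                old = p[k]
                p[k] = old + h
                g.append((energy(coords) - e) / h)
                p[k] = old
            grad.append(g)
        trial = [[x - step * gk for x, gk in zip(p, g)]
                 for p, g in zip(coords, grad)]
        e_trial = energy(trial)
        if e_trial < e:
            coords, e = trial, e_trial  # accept the downhill move
        else:
            step *= 0.5                 # overshoot: shrink the step
    return coords, e


def stochastic_search(start, n_trials=20, max_shift=0.5, seed=0):
    """Randomly perturb every coordinate, minimize, keep the best conformer."""
    rng = random.Random(seed)
    best, best_e = minimize(start)
    for _ in range(n_trials):
        trial = [[x + rng.uniform(-max_shift, max_shift) for x in p]
                 for p in best]
        trial, e = minimize(trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e
```

Each trial displaces all Cartesian coordinates by a random amount and then relaxes the structure, so the search can leave the local minimum reached by the initial minimization.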
From the calculated descriptors, those with a value of 0 for all compounds were rejected; a descriptor is 0 for every compound when no compound contains an atom corresponding to that descriptor. The inter-correlation of the descriptors was then taken into account: highly correlated descriptors were grouped manually by analyzing the correlation matrix, and only one descriptor from each group was retained. Only the remaining descriptors were considered for model development by the multiple regression method.

The multiple regression method works as follows: first, a few equations (set at 100 by default in the TSAR software) are generated randomly by MATLAB 7.0 [24]. The sequential multiple linear regression analysis method was employed. The ± values in parentheses are the standard deviations of the descriptor coefficients in the regression equations. The best model was selected from the statistically significant equations on the basis of the squared correlation coefficient (r²), the standard error of estimate (SE), the sequential Fisher test (F), the leave-one-out cross-validated squared correlation coefficient (r²cv), chance statistics (evaluated as the ratio of equivalent regression equations to the total number of randomized sets; a chance value of 0.001 corresponds to a 0.1% chance of fortuitous correlation), outliers (on the basis of residual pIC50 values), and the predictive squared correlation coefficient of the test set (r²pred) for the final selected model.

The training set was subjected to sequential multiple linear regression analysis to establish a correlation between physicochemical parameters and MCH1 receptor antagonist activity. Several significant equations with correlation coefficients (r) > 0.700 were obtained. A high correlation coefficient alone is not enough to select an equation as a model, so the internal consistency of the training set was confirmed using leave-one-out (LOO) and leave-many-out (LMO, 33%) cross-validation to ensure the robustness of the equations. Although a few equations showed good internal consistency (q² = 0.300-0.700), they may not be applicable to analogues that were not used to generate the correlation; therefore, the predictive power of Eq. 3 (model 4; Table 3) was further confirmed with a test set of six compounds.
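The descriptor reduction described above (dropping descriptors that are zero for every compound, then keeping one representative from each group of highly correlated descriptors) can be sketched as below. The grouping in the study was done manually from the correlation matrix; the greedy automatic filter, the cutoff `r_max = 0.9` and the function names used here are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def prune_descriptors(table, r_max=0.9):
    """table: dict mapping descriptor name -> list of values (one per compound).

    Returns the names of the descriptors kept for model building.
    """
    # Step 1: drop descriptors that are zero for every compound.
    nonzero = {k: v for k, v in table.items() if any(x != 0 for x in v)}
    # Step 2: keep a descriptor only if it is not highly correlated
    # with any descriptor that has already been kept.
    kept = []
    for name, values in nonzero.items():
        if all(abs(pearson(values, nonzero[k])) < r_max for k in kept):
            kept.append(name)
    return kept
```

Because the filter is greedy, the descriptor kept from each correlated group depends on the input order; a manual selection can instead favor the most interpretable member of each group.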
The goodness of each progeny equation is assessed by Friedman's lack-of-fit (LOF) score, which is given by the following formula:

$$\mathrm{LOF} = \frac{\mathrm{LSE}}{\left(1 - \dfrac{c + d\,p}{m}\right)^{2}}$$

where LSE is the least-squares error, c is the number of basis functions in the model, d is the smoothing parameter, p is the number of descriptors and m is the number of observations in the training set [12]. The smoothing parameter, which controls the scoring bias between equations of different sizes, was set to its default value of 1.0, and a new term was added with a probability of 50%. Only linear terms were used for model building.
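As a small worked illustration of how the LOF score penalizes larger equations, the sketch below evaluates it in the form commonly quoted for Friedman's score, using the symbols defined above. The function name and the numerical inputs are hypothetical and are not taken from the study.

```python
def lack_of_fit(lse, c, d, p, m):
    """Friedman's lack-of-fit score in its commonly quoted form:
    LOF = LSE / (1 - (c + d * p) / m) ** 2

    lse: least-squares error, c: number of basis functions,
    d: smoothing parameter, p: number of descriptors,
    m: number of training observations.
    """
    penalty = 1.0 - (c + d * p) / m
    if penalty <= 0:
        raise ValueError("model is too large for the number of observations")
    return lse / penalty ** 2


# Hypothetical inputs: the same error scored for a smaller and a larger model.
small = lack_of_fit(lse=0.10, c=2, d=1.0, p=2, m=23)
large = lack_of_fit(lse=0.10, c=4, d=1.0, p=4, m=23)
```

For the same least-squares error, the larger model receives the higher (worse) score, which is how the LOF measure resists overfitting when terms are added.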
The best of the 100 equations was selected on the basis of statistical parameters such as the correlation coefficient, the adjusted correlation coefficient, the cross-validated correlation coefficient and the F-test value.
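The statistics used to rank the equations can be computed as in the following sketch, which fits an ordinary least-squares model and reports r², adjusted r², F, SE and the leave-one-out q². It is a generic illustration with hypothetical function names, not the TSAR implementation; q² is taken here as 1 - PRESS/SS_total, a common convention that the paper does not spell out.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def regression_stats(X, y):
    """X: list of descriptor rows, y: list of activities (both plain lists)."""
    n, p = len(y), len(X[0])
    coef = fit(X, y)
    mean_y = sum(y) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    ss_res = sum((t - predict(coef, x)) ** 2 for x, t in zip(X, y))
    r2 = 1 - ss_res / ss_tot
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    f = (r2 / p) / ((1 - r2) / (n - p - 1))
    se = (ss_res / (n - p - 1)) ** 0.5
    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        c = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(c, X[i])) ** 2
    q2 = 1 - press / ss_tot
    return {"r2": r2, "r2_adj": r2_adj, "F": f, "SE": se, "q2": q2}
```

As a consistency check on the reported figures, inserting r² = 0.879, n = 23 and four descriptors into the same formulas gives an adjusted r² of about 0.852 and F of about 32.7, close to the reported 0.853 and 32.827 (the small differences are expected from rounding of r²).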

Results and Discussion
As a rule of thumb, the number of compounds in the data set should be approximately five times the number of descriptors used in the model [25]. Descriptor reduction was therefore carried out as described above. The results of the best QSAR models developed using the forward selection method with one to five descriptors are given in Table 3. Because the r² value can easily be increased by adding descriptors to the model, the cross-validated correlation coefficient (q²) was used to select the optimum number of descriptors. The variation of q² as a function of the number of descriptors is shown in Fig. 1. The best model according to the value of q² was obtained with four descriptors and is given as Eq. (3).

In Eq. (3), n is the number of compounds in the training set, LOF is the lack-of-fit score, r² is the squared correlation coefficient, r²adj is the adjusted squared correlation coefficient, F is a variance-related statistic that compares two models differing by one or more variables to see whether the more complex model is more reliable than the simpler one (a model is considered good if its F value is above a threshold), SE is the standard error, r is the correlation coefficient, q² is the cross-validated squared correlation coefficient, and r²pred is the predictive correlation coefficient calculated from the predicted activities of the test set compounds. The four descriptors selected by TSAR 3.3 to develop the model belong to four different descriptor classes.

DMXC: Electronic parameters are of critical importance in determining the types of intermolecular forces that underlie drug-receptor interactions. Extensive studies using electronic parameters reveal that the electronic attributes of molecules are intimately related to their chemical reactivities and biological activities. The extent to which a given reaction responds to electronic perturbation is a measure of the electronic demands of that reaction, which are determined by its mechanism. The introduction of substituent groups into the framework and the subsequent alteration of reaction rates help to delineate the overall mechanism of the reaction. Quantum chemical descriptors such as net atomic charges, the dipole moment and its components along specific axes, highest occupied and lowest unoccupied molecular orbital (HOMO-LUMO) energies, frontier orbital electron densities and superdelocalizabilities have been shown to correlate well with various biological activities. The dipole moment X component (DMXC) is an important molecular descriptor that reflects the directional electronic character of the compound and is very useful in postulating ligand interactions [26].

KChiV4: Hall and Kier developed molecular connectivity indices (Chi) that reflect atom identities, bonding environments and the number of bonded hydrogens. These Kier indices are consequently useful in a wide variety of applications. Hall and Kier defined four series of fragment categories: Path, Cluster, Path/Cluster and Ring. The spread and number of fragment memberships in each category are determined by the connectivity of the molecule. ChiV indices are based on these fragment categories and also incorporate information about the bonding environment. ChiV indices represent structure information that organizes molecular structures into chemically meaningful patterns. Based on this information, one can navigate through the structure space, in which the immediate neighborhood of any structure consists of similar structures. Library screening in this space takes advantage of this structure information and provides a basis for similarity screening [26].

RCom: This is a ring complexity descriptor calculated according to the approach of Gasteiger and Jochum [28]. For a given ring system (single, bridged or fused), the ring complexity is the ratio of the sum of the numbers of atoms in each individual ring of the ring system to the number of all atoms that belong to at least one ring in the entire ring system; it is derived from the 2D structure diagram of the molecule. It is a global molecular descriptor that captures the overall rigidity of the molecule for interaction with the receptor. Ring complexity has an important application in predicting the activity of the aminoquinoline series of analogues in inhibiting the MCH1 receptor.

IM1L: The moments of inertia and principal axes of inertia of a molecule are calculated from the inertia tensor using standard methods [21]. These descriptors are reported as Moment 1 Size, Moment 1 Length, etc. The volume defined by these values is calculated and reported as the Ellipsoid Volume. In addition, the molecule can be viewed together with its ellipsoid of inertia. The principal axes of the ellipsoid are aligned with the axes of the inertia tensor, and the length of each axis is inversely proportional to the moment of inertia around that axis. The resulting ellipsoid is then scaled so that the atom furthest from the centre of gravity of the molecule lies on the ellipsoid surface. IM1L expresses the steric parameter along a specific axis, in terms of the ellipsoid volume, to describe the biological interaction between ligand and receptor.

The inter-correlation of the descriptors used was checked and is given as a correlation matrix (Table 4). Eq. (3) (Table 3) was used to predict the activity of the test set compounds, and the predicted activities of the training and test set compounds are given in Tables 5 and 6, respectively. The correlation between predicted and observed activity is shown as a scatter plot in Fig. 2.

The randomization test was performed at the 95% and 99% confidence levels; the higher the confidence level, the more randomization trials are run. Twenty-two trials at 95% and 99 trials at 99% were run on permuted data to check the model. The r value of the original model was much higher than that of any trial using permuted data. Hence, the model is statistically significant and robust. The results of the randomization test at the various confidence levels are shown in Table 7. The model therefore satisfies all the validation criteria, i.e. leave-one-out cross-validation, the randomization test and external test set prediction, which are considered the optimum validation tests for a QSAR model. Good results were obtained with each validation technique.
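The randomization test can be illustrated with the sketch below, which permutes the activity values, rescores the permuted data and counts how often a permuted score matches or exceeds the original one. It is a simplified stand-in for the procedure used in the study: the scoring function is passed in as a parameter, and the example score (the absolute Pearson r of a single descriptor) and all names are assumptions made for illustration.

```python
import math
import random


def abs_pearson(x, y):
    """Example score: |r| between one descriptor column and the activity."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / math.sqrt(sxx * syy)


def randomization_test(x, y, score=abs_pearson, n_trials=99, seed=0):
    """Return the original score and a rank-based significance level."""
    rng = random.Random(seed)
    original = score(x, y)
    shuffled = list(y)
    n_better = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)  # break the structure-activity pairing
        if score(x, shuffled) >= original:
            n_better += 1
    # The original model counts as one of the n_trials + 1 scored models.
    p_value = (n_better + 1) / (n_trials + 1)
    return original, p_value
```

With 99 permuted trials, an original model that outperforms every permuted model corresponds to a significance level of 1/100, i.e. the 99% confidence level.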
From Table 3, the dipole moment X component (DMXC) is the most important molecular descriptor for predicting MCH1 receptor activity. DMXC enters the model in reciprocal form, and its contribution to the predicted biological activity is negative. The functional groups in compounds such as B11 and B15 provide favorable electronic features in a specific direction, and their DMXC values are correspondingly low. By contrast, in compounds such as A1, A7 and A10 the functional groups are aromatic and bulkier, but they do not support the activity. The Kier ChiV4 path index is also clearly an important descriptor for MCH1 receptor antagonistic activity, as it is present in all the equations, and so it gives an indication of the activity of the compounds. In all accepted models, the coefficient of KChiV4 is positive, which supports the validity of the equations and shows that activity increases with the value of KChiV4. Kier ChiV indices measure the atomic arrangement in the compound, which accounts for hydrogen-bond interactions with the receptor; functional groups that increase the KChiV4 value will therefore increase the MCH1 receptor antagonistic activity of the compound. Compounds with high KChiV4 values, such as A5, A15, B11 and B15, bind well to the receptor. The third most important molecular descriptor for predicting the MCH1 receptor binding efficiency of the aminoquinoline series is ring complexity (Rcom). This descriptor contributes positively, i.e. binding improves as its value increases. The inertial moment descriptor IM1L, which enters the model in reciprocal form, contributes negatively, so functional groups that increase this term reduce the activity. Overall, the descriptors DMXC, KChiV4, Rcom and IM1L show that the overall shape, size, rigidity and branching of the molecule are critical for activity and should be considered when designing drugs with MCH1 receptor antagonistic behavior.

Conclusion
A QSAR model of MCH1 antagonistic activity has been developed from the dipole moment X component (an electronic descriptor), the Kier ChiV index (a topological index), ring complexity and the inertial moment (a steric parameter) to estimate and predict the relative antagonistic activity of 29 aminoquinoline derivatives acting as MCH1 antagonists. The predictive ability of the model was demonstrated by LOO cross-validation, a randomization test and external test set prediction. The binding pattern and proposed pharmacophore features of compound B15 provide a good platform for designing new molecules of the series. These results show that the selected descriptors can describe the structure-activity relationship of MCH1 antagonists, that the model performs satisfactorily on the statistical parameters, and that it can guide the design of new MCH1 receptor antagonists of the series.

Figure. General structure of the aminoquinoline derivatives.
$$\mathrm{pIC_{50}} = -0.550(\pm 0.063)\,\frac{1}{\mathrm{DMXC}} + 0.711(\pm 0.242)\,\mathrm{KChiV4} + 12.009(\pm 2.662)\,\mathrm{Rcom} - 268.312(\pm 81.421)\,\frac{1}{\mathrm{IM1L}} - 8.083(\pm 0.892) \qquad (3)$$

n = 23; LOF = 0.193; r² = 0.879; r²adj = 0.853; F = 32.827; SE = 0.370; r = 0.938; q² = 0.597; r²pred = 0.535; r²cv (LOO) = 0.923; r²cv (LMO) = 0.8806.

The scatter plot in Fig. 2, for the training and test sets, shows that the predicted and observed pIC50 values are linearly related and fit the line plotted in the graph. The residual pIC50 values identify outliers, defined as compounds whose residual exceeds twice the standard deviation of the valid model. The model was further validated by a randomization test. The test was done by repeatedly permuting the activity values of the data set, using the permuted values to generate QSAR models, and comparing the resulting scores with the score of the original QSAR model generated from the non-randomized activity values. If the original QSAR model is statistically significant, its score should be significantly better than those obtained from permuted data [29].
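Eq. (3) and the outlier criterion can be applied as in the following sketch. The coefficients and the standard error (SE = 0.370) are those reported above; the function names, the use of SE as the standard deviation in the outlier rule, and the descriptor values in the example are assumptions made for illustration and do not correspond to any compound in the data set.

```python
def predict_pic50(dmxc, kchiv4, rcom, im1l):
    """Predicted pIC50 from Eq. (3); DMXC and IM1L enter as reciprocals."""
    if dmxc == 0 or im1l == 0:
        raise ValueError("DMXC and IM1L must be non-zero (reciprocal terms)")
    return (-0.550 / dmxc
            + 0.711 * kchiv4
            + 12.009 * rcom
            - 268.312 / im1l
            - 8.083)


def is_outlier(observed, predicted, sd=0.370):
    """Flag a compound whose residual exceeds twice the model's
    standard deviation (assumed here to be the reported SE)."""
    return abs(observed - predicted) > 2 * sd


# Hypothetical descriptor values, for illustration only.
example = predict_pic50(dmxc=1.0, kchiv4=2.0, rcom=1.0, im1l=100.0)
```

In a virtual screening setting, the same function would be applied to descriptors computed for each candidate, and compounds would be ranked by the predicted pIC50.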

Table 1.
Chemical and biological data set: training or test set assignment, IC50 and pIC50 of the compounds.

Table 2.
Description of the parameters used in the study.

Table 3.
Statistical assessment of equations with increasing number of descriptors.

Table 4.
Correlation matrix of descriptors used in Eq. 3.

Table 5.
Observed and predicted activity and residual variance of training set compounds.

Table 7.
Results of the randomization test performed to validate the model.