Sensor Based Classification and Evaluation Methods using Machine Learning Algorithm for the Evaluation of Indian Traditional Medicine (Siddha)

The present work analyses sensor-based classification and evaluation methods for the evaluation of churna, a powdered form of Siddha medicine. The churna is evaluated based on organoleptic and physicochemical parameters: the organoleptic parameter colour and the physicochemical parameters moisture content value and pH value are analysed in this work. The proposed methodology facilitates the development and integration of hardware and software modules for churna identification and classification. The proposed hardware setup comprises a Raspberry Pi camera, colour sensor, moisture sensor and pH sensor interfaced with a Raspberry Pi 3 B. Churnas are discriminated by classifying the colour values using two machine learning algorithms, the Support Vector Machine (SVM) and Random Forest (RF) classifiers, separately. The experimental results show that the RF classifier outperforms the SVM classifier in churna name identification, with greater accuracy, sensitivity and specificity.


Introduction
Siddha is one of India's traditional medication systems, based on ancient and authentic medicinal practices. Siddha medication is strong enough to combat disease. In ancient times, it was the practitioner's responsibility to prepare the medication. Nowadays, with the advent of science and technology, herbal drugs are commercially manufactured on a large scale using machinery and distributed across the country to fulfil consumers' preferences, changing lifestyles and present-day commercial factors in the global market.
Consequently, practitioners as well as consumers now seek assurances from the manufacturer about the quality, safety, consistency and efficacy of a ready-made herbal supplement or medication [1]. Churna (Siddha medication in powdered form) is a mixture of powders prepared from different parts of plants such as roots, stems, leaves, flowers or fruit extracts. A herb is chosen and used depending on its medicinal value and curative effects on the body. The basic processes involved in the preparation of churna are harvesting, drying and storage. All the ingredients are completely dried and powdered using either a crusher or a grinder. The thoroughly mixed and sieved powders constitute the churna, which is stored in a clean, airtight glass container [2]. Most Siddha medications are effective, but they are still not standardized. Consequently, there is a need for a standardization technique that ensures their safety. The evaluation of the drug is a key procedure in the quality assessment of churna [3]. The parameters used to evaluate the churna are organoleptic parameters such as colour [1] and physicochemical parameters such as pH and moisture content [4].
The churna prepared by different manufacturers varies slightly in colour shade. Colour identification helps to categorize the churna prepared by different manufacturers. To protect users from duplicate products, it is necessary to develop a standard protocol for the uniform manufacturing of the formulations with a standard consistency of the medicine [5]. Conventionally, the churna and its colour are identified by visual inspection.
The pH value of a 1 % w/v solution of the formulated churna is determined with a calibrated pH meter [4]. The moisture content of the churna is evaluated by weighing a 5 g sample in a tared evaporating dish, drying it at 105 °C, and continuing to dry and weigh it at 10 min intervals [2]. Another method of evaluating the moisture content is to use a halogen moisture-determining apparatus (Mettler Toledo) [6]. If the moisture value exceeds the standard level, fungi and bacteria can easily affect the churna.
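The arithmetic behind the oven-drying procedure can be sketched as follows. This is only an illustration under stated assumptions: the function names, the stopping tolerance of 0.001 g and the sample weighings are hypothetical and are not taken from the cited procedure, which specifies only the 5 g sample, the 105 °C temperature and the 10 min weighing interval.

```python
def loss_on_drying_percent(initial_mass_g, dried_mass_g):
    """Moisture content as the percentage of mass lost on drying."""
    if initial_mass_g <= 0:
        raise ValueError("initial mass must be positive")
    if not 0 <= dried_mass_g <= initial_mass_g:
        raise ValueError("dried mass must lie between 0 and the initial mass")
    return (initial_mass_g - dried_mass_g) / initial_mass_g * 100.0


def reached_constant_mass(weighings_g, tolerance_g=0.001):
    """True when the last two weighings agree within the tolerance.

    The tolerance is an assumed value, chosen only for this example.
    """
    return (len(weighings_g) >= 2
            and abs(weighings_g[-1] - weighings_g[-2]) <= tolerance_g)


# Hypothetical weighings (in grams) taken at 10 min intervals for a 5 g sample.
weighings = [4.80, 4.65, 4.60, 4.60]
if reached_constant_mass(weighings):
    print(round(loss_on_drying_percent(5.0, weighings[-1]), 2))  # 8.0
```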

Related Works
Deshmukh et al. undertook the standardization of shatapushpadyachurna [7]. Rao et al. prepared the formulation of panchasakarachurna as per the Ayurvedic Formulary of India (AFI) using a pharmacognostical study, a physicochemical study and a thin-layer chromatographic study [8]. Shulammithi et al. discussed spectroscopic, chromatographic and electrophoretic methods for quality assurance and standardization of herbal medications [9]. Usha et al. determined the standards for quality evaluation of vaishvanarachurna through pharmacognostic, phytochemical and photomicrographic studies as well as an analysis of the values of aqueous and alcoholic extractives [10]. Lekharu et al. undertook the standardization of the compound ayurvedic formulation Yograjchurna through a pharmacognostical and pharmaceutical evaluation [11]. Srunitha et al. discussed support vector machine-based classification of soil types and reported a classification accuracy of 95 % [12]. Percial et al. carried out the standardization and phytochemical screening of nimbadi churna with a phytochemical, physical-parameter and physicochemical evaluation [13]. Huanxue et al. used the random forest (RF) classifier to classify crops using multispectral RapidEye imagery over two study sites [14]. Kharkwal et al. examined standardization parameters, such as a physicochemical and phytochemical analysis with thin-layer chromatography (TLC) and high-performance thin-layer chromatography (HPTLC), of both Cyperus rotundus Linn. (Mustaka) and Cyperus procerus Rottb. (Nagarmustaka) [15].
Mittal et al. presented an automatic, real-time and cost-effective image-processing-based system for the classification of rice grains into various categories according to their inferred commercial value. The image of the rice grain was captured using a USB (Universal Serial Bus) camera attached to a Raspberry Pi 3 computer. Geometric features were extracted from the image and this feature set was given as input to the SVM (Support Vector Machine) classifier [16]. Tan et al. used an analytical model for estimating the moisture content, acidity and alkalinity of soil through on-site, real-time measurement of soil resistivity. In this method, the authors used a low-cost soil resistivity sensor, the YL-69, to characterize the dynamics of soil moisture content and pH value. The resulting prediction model estimates soil moisture content and soil pH value with a percentage error ranging from 0.9331 % to 1.7188 %. The approach allows the soil moisture content and pH value to be predicted simultaneously with a single sensor and minimal soil disturbance [17]. Patel et al. presented the effect of water absorption by iron ore samples on the performance of an SVM-based machine vision system. The authors examined the performance using images of the ore samples captured in both wet and dry conditions. They concluded that the model performance was degraded and that a separate feature subset should therefore be used for quality monitoring of dry and wet iron ore samples [18]. Hossain et al. checked whether probiotic pineapple juice showed a significant difference in physicochemical parameters and microbial status during storage.
The physicochemical parameters such as titratable acidity (expressed as per cent lactic acid), moisture, protein, total soluble solids (TSS) content and total sugar content were evaluated to check the physicochemical stability of the pineapple juice [19]. Sowmya et al. developed a deep intelligent system to facilitate vehicle detection and recognition for robust traffic management of heavy vehicles. The following mechanisms were used: Support Vector Machine (SVM), Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (R-CNN) and the You Only Look Once (YOLO) model [20]. Gupta et al. presented ML models followed by feature selection in consensus with domain experts and feature extraction to resolve multicollinearity issues in diagnosing heart disease. The ML models used are SVM, RF, BNB (Binomial NB), LoR (Logistic Regression) and kNN (k-Nearest Neighbour). The performance of the ML models was analysed using metrics such as weighted F score, precision, recall and accuracy [21].
Observations drawn from the existing works reveal that the standardization of churna is based only on the manual evaluation of organoleptic and physicochemical parameters.
It is clear that researchers accept the need for the evaluation and standardization of churna. In the present work, an automated system has been developed to evaluate the organoleptic (colour) and physicochemical (moisture content value and pH value) parameters to determine the quality of churna.
Thus, the present work combines hardware and software for the evaluation of churna. The study involves the integration of the Raspberry Pi camera, colour sensor, moisture sensor and pH sensor with the Raspberry Pi 3 B to evaluate the organoleptic parameter (colour) and the physicochemical parameters (moisture content value and pH value). Churna name identification is performed by evaluating the red (R), green (G) and blue (B) colour values. Machine learning techniques, namely the Support Vector Machine (SVM) and Random Forest (RF) classifiers, discriminate the churna based on the colour values.
The present work is organised in five sections. The basic needs and strategies of the research are given in the introductory section. Related research is reviewed in the section above. The hypothesis, research methodology and the significant features of the extraction and classification model are described in section 3. The results of the experiments are analysed in section 4. Section 5 summarises the findings of the present research and outlines the scope for future work in the field.

Materials and Methods
This section describes the features and methodologies used in this research. The work presents two phases of churna evaluation, namely (i) evaluation of the organoleptic parameter (colour) and (ii) evaluation of the physicochemical parameters (moisture content value and pH value).

Steps of the proposed algorithm
The following are the steps of the proposed algorithm:
i. The Pi camera, colour sensor, moisture sensor and pH sensor are connected to the Raspberry Pi 3 B through the Arduino Uno board.
ii. An image of the churna of 2592 x 1944 pixels is acquired by the Pi camera and processed in Raspbian OS via Python IDLE using the Python Imaging Library (PIL). By processing the image, the red (R), green (G) and blue (B) colour values are found for each churna.
iii. The colour sensor, moisture sensor and pH sensor are programmed with the Arduino IDE in Raspbian OS via Python IDLE. The RGB colour values, moisture content value and pH value are extracted for each churna.
iv. The RGB colour values extracted by the Pi camera and the colour sensor are given as input to the SVM (support vector machine) and RF (random forest) classifiers to identify the name of the churna.
v. The moisture content value and pH value are used to assure the quality of the churna.
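The colour-extraction step can be illustrated with a minimal sketch. The text does not state how the R, G and B values are derived from the image, so averaging over all pixels is an assumption made here for illustration. The function name and the sample pixels are hypothetical, and the pixel tuples are assumed to have already been read from the captured image (for instance with PIL).

```python
def mean_rgb(pixels):
    """Return the average (R, G, B) of an iterable of (r, g, b) tuples.

    Averaging is an assumed reduction; the paper does not specify one.
    """
    total_r = total_g = total_b = 0
    count = 0
    for r, g, b in pixels:
        total_r += r
        total_g += g
        total_b += b
        count += 1
    if count == 0:
        raise ValueError("no pixels supplied")
    return (total_r / count, total_g / count, total_b / count)


# Hypothetical pixel values standing in for a small region of a churna image.
sample_pixels = [(200, 150, 90), (202, 148, 94), (198, 152, 86)]
print(mean_rgb(sample_pixels))  # (200.0, 150.0, 90.0)
```

The resulting triple is the three-element feature vector that a classifier would receive for one image.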

Evaluation of organoleptic parameter
The present work concentrates on the evaluation of the organoleptic parameter colour.
The colour of the churna is identified by using the Raspberry Pi camera and the colour sensor as shown in Fig. 1. The proposed work attempts to identify the name of the churna from its colour by two different methods:
i. Processing the image of the churna captured by the Raspberry Pi camera.
ii. Extracting the colour values from the churna using a colour sensor programmed with the Arduino IDE on the Raspberry Pi.
In the first method, an image of the churna of 2592 x 1944 pixels is captured using the Raspberry Pi camera. The captured image is processed using the Python IDLE software on the Raspberry Pi 3 B. A total of 240 images of the 12 varieties of churna are captured and processed to extract the red (R), green (G) and blue (B) colour values as features. In the second method, the colour sensor is interfaced with the Raspberry Pi 3 B through the Arduino Uno board and programmed in the Arduino IDE. The colour sensor acquires the red (R), green (G) and blue (B) colour values of the 12 varieties of churna.
The red (R), green (G) and blue (B) colour values are given as input features to the Random Forest (RF) and Support Vector Machine (SVM) classifiers, which then identify the name of the churna and classify it. The two methods of identification and classification are significant for analysing the consistency of the prepared churna. A particular churna prepared by different manufacturers varies slightly in colour shade. The colour identification technique helps to identify and classify the churna prepared by different manufacturers in order to maintain consistency. Thus, the proposed colour identification method can be used for standardising and evaluating the churna even though it varies slightly in colour.

Classifiers
The red, green and blue colour values are extracted as features for the 12 varieties of churna by using the Raspberry Pi camera as well as the colour sensor.

Support Vector Machine (SVM) classifier
The SVM classifier is a supervised learning algorithm used for classification and regression. In the basic two-class case, the input data are two sets of vectors in an n-dimensional space. SVM constructs a separating hyperplane in that space which maximises the margin between the two data sets. To calculate the margin, two parallel hyperplanes are constructed, one on each side of the separating hyperplane. A good classification is achieved by the hyperplane that has the largest distance to the neighbouring data points of both classes. Classification of data is a common need in machine learning: given data points that each belong to one of two classes, the goal is to decide which class a new data point belongs to. In the SVM classifier, a data point is viewed as a p-dimensional vector (a list of p numbers), and such points are separated with a (p-1)-dimensional hyperplane. This classifier is also known as a maximum margin classifier [22].
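For reference, the maximum-margin idea described above is commonly written as the following optimisation problem for the linearly separable two-class case. The notation is introduced here for illustration and is not taken from the source: the training points are x_i in R^p with labels y_i in {-1, +1}, w is the normal vector of the hyperplane and b is its offset.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1, \qquad i = 1, \dots, m .
\end{aligned}
```

The two parallel hyperplanes are \mathbf{w}^{\top}\mathbf{x} + b = +1 and \mathbf{w}^{\top}\mathbf{x} + b = -1, and the distance between them is 2 / \lVert \mathbf{w} \rVert, so minimising \lVert \mathbf{w} \rVert maximises the margin. A kernel SVM with a cost parameter C, as configured later in this work, relaxes these constraints with slack variables. The hard-margin form is shown only because it matches the description in the text.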

Random forest (RF) classifier
Random forest is a machine learning algorithm proposed by Breiman for regression and classification. RF consists of multiple decision trees. For each tree, the method performs bootstrap sampling and enables the calculation of an error estimate based on the instances. The number of trees must be sufficiently large to capture all the variability of the training data and yield good classification accuracy. RF has several advantages over other algorithms. It is a non-parametric method and therefore does not require the values of the variables to follow a particular statistical distribution. Moreover, it is less sensitive to noise and overfitting than other classification methods [23].
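The bootstrap sampling performed for each tree can be sketched as follows. The sketch assumes that the error estimate mentioned above is the usual out-of-bag estimate, computed on the instances a tree did not draw. That is an interpretation rather than a statement from the source, and the function name and sample size are hypothetical.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample of indices and list the out-of-bag indices.

    Indices are drawn with replacement, so some repeat and some are never
    drawn; the latter form the out-of-bag set for this tree.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the example is repeatable
in_bag, out_of_bag = bootstrap_split(20, rng)
print(len(in_bag), len(set(in_bag)), len(out_of_bag))
```

Each tree is trained on its in-bag instances, and its predictions on the out-of-bag instances can be pooled across trees to estimate the error without a separate validation set.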

Evaluation of physicochemical parameters
The present work concentrates on two main physicochemical parameters: moisture content value and pH value. These parameters are evaluated by employing a moisture sensor and a pH sensor. Fig. 2 demonstrates the evaluation of the physicochemical parameters. The moisture value of the churna is evaluated to measure the loss on drying [8] during the period of preparation. If the moisture value exceeds the standard level, bacteria can easily affect the churna [2]. The moisture content of each churna is evaluated using the moisture sensor, which is interfaced with the Raspberry Pi 3 B through the Arduino Uno board as shown in Fig. 2. With the moisture sensor, the moisture content value of the 12 varieties of churna can be evaluated easily. The moisture sensor is programmed with the Arduino IDE. The moisture content value is found by dipping the moisture sensor into the churna. As per the Ayurvedic Pharmacopoeia of India, the moisture content should not exceed 10 %.
The pH value represents the acidity or alkalinity of an aqueous solution of the churna. In the Ayurvedic Pharmacopoeia, standard limits of pH are provided for particular substances in which hydrogen-ion activity plays a major role in the stability of the substance [24]. The pH value of the churna is evaluated using the pH sensor, which is interfaced with the Raspberry Pi 3 B through the Arduino Uno board. The pH sensor is programmed using the Arduino IDE as demonstrated in Fig. 2. The pH value of the churna is found by mixing 1 g of churna powder with 100 mL of drinking water [25] and dipping the pH sensor into the resulting aqueous solution. As per the Ayurvedic Pharmacopoeia of India, the pH should lie between 3.0 and 8.0. The pH sensor is used as an alkalinity checker for the solution. The moisture content value and pH value are found for the 12 varieties of churna to assure their quality.
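The two limits quoted above lend themselves to a simple acceptance check, sketched below. The function and constant names are hypothetical, the readings are assumed to have already been converted to per cent moisture and pH units, and treating both limits as inclusive is an assumption.

```python
MAX_MOISTURE_PERCENT = 10.0   # upper limit quoted in the text
PH_RANGE = (3.0, 8.0)         # lower and upper pH limits quoted in the text


def check_quality(moisture_percent, ph):
    """Compare one churna's readings against the quoted limits."""
    moisture_ok = moisture_percent <= MAX_MOISTURE_PERCENT
    ph_ok = PH_RANGE[0] <= ph <= PH_RANGE[1]
    return {
        "moisture_ok": moisture_ok,
        "ph_ok": ph_ok,
        "passes": moisture_ok and ph_ok,
    }


# Hypothetical readings for a single sample.
print(check_quality(6.5, 5.2))
# {'moisture_ok': True, 'ph_ok': True, 'passes': True}
```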

Specifications of the sensors
The specifications of the sensors are presented in Table 1.

Results and Discussion
The proposed work uses machine learning algorithms to evaluate the churna based on colour. In this section, the performance of the proposed techniques is evaluated. The robustness of the two classifiers, SVM (support vector machine) and RF (random forest), is evaluated and compared. In addition, the moisture content value and pH value are evaluated by sensors. The experimental setup and the datasets used in this research are described in section 4.1. The evaluation metrics are explained in section 4.2. Lastly, section 4.3 presents the experimental results, performance analysis and comparisons.

Experimental set-up and datasets
The experimental dataset I and dataset II consist of the red, green and blue colour values extracted for the 12 varieties of churna by using the Raspberry Pi camera and the colour sensor respectively.
The following hyper-parameters are selected in the proposed algorithm. For the random forest classifier, the number of trees is 10, the maximum number of considered features is 3 and the maximal tree depth is 3. The algorithm stops splitting a node when it holds at most 5 instances. For the SVM classifier, the kernel is 'rbf', C (the cost parameter) is 1.0, gamma is 0.1, the numerical tolerance is 0.001 and the iteration limit is 100.
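As a rough illustration, the hyper-parameters listed above could be expressed with scikit-learn as sketched below. The source does not name the software used for classification, so the choice of scikit-learn and the mapping of each setting to a parameter are assumptions. In particular, the stopping rule is read here as "a node needs at least 6 instances to be split". The training data are hypothetical placeholder values, not the paper's datasets.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def build_classifiers(random_state=0):
    """Return (svm, rf) configured with the hyper-parameters quoted in the text."""
    svm = SVC(kernel="rbf", C=1.0, gamma=0.1, tol=0.001, max_iter=100)
    rf = RandomForestClassifier(
        n_estimators=10,       # number of trees
        max_features=3,        # features considered at each split
        max_depth=3,           # maximal tree depth
        min_samples_split=6,   # assumed reading of the stopping rule
        random_state=random_state,
    )
    return svm, rf


# Hypothetical (R, G, B) rows for two placeholder churna names.
X = [
    [200, 150, 90], [205, 148, 95], [198, 152, 88],
    [202, 151, 92], [199, 149, 91], [203, 150, 89],
    [80, 60, 40], [85, 58, 45], [78, 62, 38],
    [82, 61, 42], [79, 59, 41], [83, 60, 39],
]
y = ["churna_A"] * 6 + ["churna_B"] * 6

svm, rf = build_classifiers()
svm.fit(X, y)
rf.fit(X, y)
print(rf.predict([[201, 150, 90]]))
```

With an RBF kernel and gamma of 0.1, raw 0 to 255 colour values produce very small kernel values between distinct points, so scaling the features before fitting the SVM may be worth considering. The source does not say whether scaling was applied.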

Evaluation metrics
It is essential to evaluate the performance of the classification algorithms in identifying churna from colour values. Accordingly, the performance measures used are (i) Accuracy, (ii) Specificity, (iii) Sensitivity or Recall, (iv) Precision and (v) F1_score.
In the proposed method, the classification of churna based on colour values is carried out using the random forest classifier and the support vector machine classifier. The performance measures of the classifiers are computed from the confusion matrix. Classification accuracy is the ratio of correct predictions to the total number of samples evaluated. It also reflects the ability of the model to correctly predict a new or previously unseen churna. Sensitivity, also called the True Positive Rate, is the proportion of positive data points that are correctly classified as positive out of all positive data points. Specificity, also called the True Negative Rate, is the proportion of negative data points that are correctly classified as negative out of all negative data points. Precision is the proportion of positive predictions of a churna name that are correct. F1_score is the harmonic mean of precision and sensitivity and summarises the model's accuracy on a dataset in a single value.
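These measures can be computed from a confusion matrix as in the sketch below, which treats each class one-vs-rest. The matrix values are hypothetical and the function names are introduced for illustration. The source does not state how the per-class values were combined across the 12 churna classes, so no averaging is shown.

```python
def accuracy(cm):
    """Overall accuracy from a square confusion matrix cm[true][predicted]."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_metrics(cm, k):
    """One-vs-rest sensitivity, specificity, precision and F1 for class k."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                         # class k predicted as another
    fp = sum(cm[i][k] for i in range(n)) - tp    # another class predicted as k
    tn = total - tp - fn - fp

    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
cm = [
    [19, 1, 0],
    [0, 20, 0],
    [0, 0, 20],
]
print(round(accuracy(cm), 4))
print(per_class_metrics(cm, 0))
```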

Experimental results, performance analysis and comparisons
In this section, the experimental results are presented to analyse the performance of the proposed algorithm. The experiments are divided into two analyses. The first analysis is the identification of churna names by the SVM and RF classifiers. For this analysis, experimental dataset I and dataset II are created and given as input to the SVM and RF classifiers to identify the churna name. The main objective of this evaluation is to exhibit the robustness of SVM and RF in identifying the name of the churna. The performances of the classifiers are analysed and compared using the metrics mentioned in section 4.2. The second analysis is the computation of the moisture content value and pH value of each churna.
Table 2 shows sample intensity values of red (R), green (G) and blue (B) extracted from different standard churnas by processing the image captured by the Pi camera and by the colour sensor respectively. The performance measures of the classifiers are given in Tables 3 and 4. Table 4 shows the performance measures of the RF classifier. For dataset I and dataset II respectively, the accuracy is 99.16 % and 100 %, the sensitivity is 95 % and 100 %, and the specificity is 99.54 % and 100 %. The precision and F1_score are both 95 % for dataset I and 100 % for dataset II.
The performances of the SVM and RF classifiers are compared based on accuracy, sensitivity, specificity, precision and F1_score. Fig. 6 shows the comparison graph of the SVM and RF classifiers for dataset I and Fig. 7 shows the comparison graph for dataset II. In the proposed algorithm, the RF classifier shows superior generalization performance for the two datasets. The comparison graphs indicate that the RF classifier performed better than the SVM classifier, with higher accuracy, sensitivity, specificity, precision and F1_score.
The experimental results for the physicochemical parameters such as moisture content value and pH value are highlighted in Table 5. The moisture content value and pH value are evaluated using the moisture sensor and pH sensor interfaced with Raspberry Pi 3 B through the Arduino Uno board. The moisture content value represents the loss on drying. The pH sensor is used as the alkalinity checker. Moisture content value and pH value are used to ensure the quality of churna.

Conclusion
The findings of the present research reveal the effectiveness and accuracy of the novel sensor-based quality assurance method for evaluating each churna based on colour, moisture content value and pH value. The technique developed evaluates organoleptic and physicochemical parameters more effectively than manual evaluation. A hardware setup is developed by interfacing the moisture sensor, pH sensor, colour sensor and camera module with the Raspberry Pi 3 B. The SVM (support vector machine) and RF (random forest) classifiers are used to classify churna based on colour values, and the RF classifier outperformed the SVM classifier. The future scope is to evaluate the organoleptic parameter (colour) and the physicochemical parameters (moisture content and pH) for other types of Siddha medicine such as Kudineer, Parpam, Chenduram and Thylam.