A Hybrid Approach of Preprocessing and Segmentation Techniques in Automatic Rice Variety Identification System

2021

Abstract
Image processing techniques, including image acquisition, preprocessing, clustering, segmentation, and classification, play an important role in handling many kinds of images, such as rice grain, wheat, fruit, medical, vehicle, and digital text images. Preprocessing and segmentation techniques are used in object detection and image classification applications. This paper examines automated segmentation processes in detail, focusing on images used for rice variety identification and classification. The aim is to discuss the issues that arise when segmenting digital images and the relative merits and drawbacks of the currently available methods for preprocessing and segmenting images. In this paper, we propose a hybrid approach of Preprocessing and Segmentation techniques to develop an automatic rice variety identification


Introduction
Digital image processing is a term used in imaging science to describe the mathematical operations used to improve images or derive useful information from them. This paper surveys image processing techniques for different kinds of images. In the agriculture industry, image processing is used to recognize diseases in crop plant images and to distinguish varieties of crop plants such as rice and wheat, so that different varieties can be identified automatically using computer vision [1]. New advancements and developments in computer technologies are contributing immensely to rice variety identification, classification, quality inspection, and rice quality grading. Digital images help recognize and classify defects in crop plants, such as rice, which vary in size, color, and form [2]. Another option is to segment the image with a back-propagation (BP) neural network or a support vector machine. Object shape and color features were used to train and integrate a classifier, a hybrid classifier, and a K-means clustering segmentation [3]. Histogram equalization has been used to enhance edge definition in the object region, and binary segmentation has been performed with a Bayberry image segmentation approach based on salient object detection and homomorphic filtering. This approach can segment a variety of crop plants, including rice, wheat, and spices. It uses preprocessing to enhance the image for human viewing and to make subsequent machine processing of the resulting images easier. It also provides precise details about the vein pattern in the captured image [4,5]. To predict nitrogen deficiency in rice, a convolutional neural network (CNN) model was employed in which raw images were preprocessed to subtract white backgrounds [6]. Transfer learning is a machine learning approach in which a model developed for one task is reused as the starting point for a different task, drawing on the knowledge gathered by the established model.
One study improved on such an established model by using pre-trained CNN models with transfer learning on rice leaf disease images [7]. The severity of panicle blast in Oryza sativa has been classified using an automatic methodology built on a small and robust CNN model [8]. Moisture content, which indicates the quantity of water in the rice seed, is also considered an important feature [9]. Rice chalkiness has also been evaluated with image processing techniques, using the whole area and chalky rice grains [10]. A deep analysis of rice seed classification performance using different textural and geometric features found that the texture features exhibit similarity, whereas the geometric features prove distinct [11].
The following are some typical image processing steps, but not all image processing systems would need all of them simultaneously [12].
1. Image Acquisition is "A sensor, such as a scanner or camera, captures and digitizes an image."
2. Image Preprocessing is "the first phase in preparing an image for higher-level processing." The goal is twofold: to remove undesirable features that will obstruct further processing and to extract desirable features that reflect valuable information in the image. In this stage, the computer reduces noise and, in some cases, improves certain object features that are critical to comprehending the picture [13].
3. Image Segmentation is "used to minimize the amount of data to make analysis easier. Image Analysis and Image Compression also benefit from segmentation" [14]. "The process of breaking down an image into its constituent regions or objects is known as segmentation." To read the image correctly and accurately identify its content, image segmentation must separate the object from the background [15].
4. Representation and Description: "Representation is the process of converting raw data into a form that can be processed by a computer [16]. Description, also known as feature extraction, is the process of extracting features that result in quantitative information of interest or features that are important for distinguishing one object type from another".
5. Recognition and Interpretation: "Recognition is the process of assigning a label to an object based on its descriptors' details [17]. The process of assigning meaning to a group of recognized objects is known as interpretation."
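As an illustration of step 3, separating an object from its background can be sketched with global thresholding. The sketch below uses Otsu's method, which picks the grey level that maximizes the between-class variance. It is not taken from the paper: the function names `otsu_threshold` and `binarize`, the nested-list image representation, and the sample pixel values are all assumptions chosen to keep the example dependency-free. In practice an array library would normally be used.

```python
def otsu_threshold(image, levels=256):
    """Return the grey level that maximizes the between-class variance."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg = 0  # number of pixels at or below the candidate threshold
    sum_bg = 0     # sum of the grey levels of those pixels
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        weight_fg = total - weight_bg
        if weight_bg == 0 or weight_fg == 0:
            continue  # one class is empty, so the variance is undefined
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image, threshold):
    """Mark pixels brighter than the threshold as object (1), others as 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical 4x5 image: a bright 2x2 object on a dark background.
image = [
    [20, 22, 21, 20, 23],
    [21, 200, 205, 22, 20],
    [22, 210, 198, 21, 22],
    [20, 21, 23, 22, 21],
]
t = otsu_threshold(image)
mask = binarize(image, t)
```

Here `mask` marks only the four bright pixels as object. A single global threshold of this kind assumes the object and background have clearly different grey levels, which may not hold under uneven lighting.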

Pre-processing
Pre-processing [18] is critical because raw data are often noisy, unreliable, and incomplete. It is one of the preliminary steps that must be completed in order to achieve high accuracy in the subsequent steps. Object identification and classification use different preprocessing techniques [19]. Although geometric transformations of images (e.g., rotation, scaling, and translation) are listed among preprocessing methods, the aim of preprocessing is to improve the image data by suppressing unwanted distortions or enhancing certain image features necessary for further processing [20]. Image preprocessing methods exploit the significant redundancy of images. Different denoising methods have been suggested.
Advantages:
1) When images are normalized before being endorsed, the size and position of the endorsements are consistent across the data set's pages.
2) If images are printed, using normalized images avoids printing issues caused by changes in page size and orientation.
Disadvantages:
1) Image normalization is a time-consuming process that, in large cases, can add considerable time to the e-Discovery export process.
2) Using normalization software that is poorly built will degrade the overall image quality.

Histogram Equalization
Guzman and Peralta [20]
1) Simple and effective in enhancing picture contrasts.
1) Minimizing mean square error is a time-consuming operation.
2) Ability to deal with both deterioration and noise.
1) A fair degradation function estimation is ineffective.
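A minimal sketch of histogram equalization may help make the contrast-enhancement claim concrete. It remaps grey levels through the cumulative distribution function (CDF) of the image histogram so that the output levels spread over the full range. The function name `equalize_histogram`, the nested-list image format, and the sample values are assumptions for illustration, not the authors' implementation.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as rows of integer pixels."""
    pixels = [p for row in image for p in row]

    # Histogram of grey levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function of the histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Lookup table mapping each input level to its equalized output level.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast image whose values span only 100..103.
low_contrast = [
    [100, 100, 101, 101],
    [101, 102, 102, 102],
    [102, 103, 103, 103],
    [103, 103, 103, 103],
]
equalized = equalize_histogram(low_contrast)
```

After equalization the darkest input level maps to 0 and the brightest to 255, while the ordering of grey levels is preserved.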

Segmentation
Segmentation divides an image into regions with different properties, such as color, form, brightness, contrast, and grey level. The input to the process is a digital grayscale image (for example, an image containing some object), and its output is the set of anomalies found in the image. Segmentation is used to provide more detail than is directly available in images. Various techniques such as neural networks, decision trees, rule-based algorithms, and Bayesian networks are used. Various methods for segmenting object detection and classification image regions have been proposed: "Pixel-based methods, Edge-based methods, Region-based methods, Model-based methods, Texture-based methods, ANN-based methods, Fuzzy Theory-based methods, and Genetic Algorithm-based methods [22]."
Advantages:
1) This algorithm works well when the region homogeneity criterion is simple to define.
Disadvantages:
1) Costly in terms of computational time and memory.
2) There may be under- and over-segmentation in the picture, as well as holes in the segmented region.
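The region homogeneity criterion mentioned above can be illustrated with seeded region growing, a common region-based technique. Whether this is the exact method the row refers to is an assumption. In the sketch, a pixel joins the region when its grey level lies within `tolerance` of the seed value. The function name `region_grow`, the 4-connectivity choice, and the sample image are hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance=10):
    """Grow a 4-connected region from `seed`.

    A neighbour is accepted when its grey level differs from the seed's
    grey level by at most `tolerance` (the homogeneity criterion).
    Returns the set of (row, col) coordinates in the region.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Hypothetical image: dark area in the upper left, bright area elsewhere.
image = [
    [10, 12, 11, 90],
    [11, 13, 95, 92],
    [12, 94, 93, 91],
]
dark_region = region_grow(image, seed=(0, 0), tolerance=10)
```

The set of visited coordinates is held in memory for the whole traversal, which is consistent with the computational and memory cost listed as a drawback. A tolerance that is too small or too large leads to under- or over-grown regions.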

Edge Based
Anami et al. [16]
Advantages:
1) It does not necessitate any prior awareness of the image's material.
2) It has a low computational time.
Disadvantages:
1) If there are too many edges, it is a time-consuming operation.
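As a rough illustration of edge-based segmentation, the sketch below computes the gradient magnitude with 3x3 Sobel kernels and thresholds it to obtain an edge map. The Sobel operator is one common choice and is an assumption here, as are the function names `sobel_magnitude` and `edge_map`, the threshold value, and the synthetic step-edge image.

```python
def sobel_magnitude(image):
    """Gradient magnitude from 3x3 Sobel kernels.

    Border pixels are left at 0 to keep the sketch short.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    p = image[r + i - 1][c + j - 1]
                    gx += kx[i][j] * p
                    gy += ky[i][j] * p
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_map(image, threshold):
    """Binary edge map: 1 where the gradient magnitude reaches the threshold."""
    magnitude = sobel_magnitude(image)
    return [[1 if m >= threshold else 0 for m in row] for row in magnitude]


# Hypothetical image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_map(image, threshold=200)
```

No prior knowledge of the image content is required, only a gradient threshold. The nested loops visit every interior pixel once, so the cost of the gradient itself is fixed by image size; it is the later linking of detected edges into boundaries that grows with the number of edges.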

Region Based
Chaugule and Mali [17]
Advantages:
1) It's easier to describe and put into practice.
Disadvantages:
1) It's difficult to figure out how many clusters there are.
2) It's difficult to make use of spatial data.
3) Feature selection is difficult to comprehend.
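The drawbacks listed for this row refer to clusters, which suggests a clustering method such as the K-means segmentation mentioned in the Introduction; that reading is an assumption. Under it, a minimal sketch of K-means on pixel grey levels looks as follows. The function name `kmeans_intensity`, the deterministic initialisation, and the sample image are hypothetical.

```python
def kmeans_intensity(image, k=2, iterations=20):
    """Cluster grey levels with 1-D k-means.

    Returns (label image, cluster centres). Requires k >= 2.
    """
    pixels = [p for row in image for p in row]
    lo, hi = min(pixels), max(pixels)
    # Deterministic start (an assumption, for reproducibility): centres are
    # spread evenly over the observed intensity range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def nearest(p):
        # Index of the centre closest to grey level p.
        return min(range(k), key=lambda i: abs(p - centres[i]))

    for _ in range(iterations):
        sums, counts = [0.0] * k, [0] * k
        for p in pixels:
            i = nearest(p)
            sums[i] += p
            counts[i] += 1
        # Empty clusters keep their previous centre.
        updated = [sums[i] / counts[i] if counts[i] else centres[i]
                   for i in range(k)]
        if updated == centres:
            break  # assignments no longer change
        centres = updated

    labels = [[nearest(p) for p in row] for row in image]
    return labels, centres


# Hypothetical image: a bright 2x2 object on a dark background.
image = [
    [20, 22, 21, 20, 23],
    [21, 200, 205, 22, 20],
    [22, 210, 198, 21, 22],
    [20, 21, 23, 22, 21],
]
labels, centres = kmeans_intensity(image, k=2)
```

The sketch shows two of the listed drawbacks directly: `k` has to be supplied by the caller, and each pixel is assigned by grey level alone, so spatial information is ignored.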

Deformable Models Approach
Devi et al. [18]
Advantages:
1) This method is resistant to noise and erroneous edges.
2) Ability to create a surface from a collection of images.
Disadvantages:
1) To position the initial model, a manual approach is needed.
2) Choosing parameters is difficult.

Texture Based
Ghatkamble [19]
Advantages:
1) Using training data, it is simple to solve complex problems.
2) Error detection is a simpler procedure.
Disadvantages:
1) Training is expensive and subject to human error.
2) It is important to examine different image forms.

Artificial Neural Network Based
Guzman and Peralta [20]
Advantages:
1) It does not necessitate the development of time-consuming services.
2) It can fully take advantage of the parallel nature of neural networks.
Disadvantages:
1) A longer period of training.
2) The initialization can have an effect on the result.

Fuzzy theory Based
Hobson et al. [21]
Advantages:
1) A single fuzzy rule was used to emphasize the importance of the image's feature-based and spatial details.
2) The membership functions' structure and related parameters were derived automatically.
Disadvantages:
1) Determining fuzzy membership is a difficult task.

Genetic Algorithm Based

Kambo and Yerpude [22]
Advantages:
1) It proves to be good at breaking out of local optima while still providing much versatility.
2) It's good at boosting contrast and delivering natural-looking images.
3) A straightforward method for tackling complex optimization problems.
Disadvantages:
1) Simple GA can converge extremely slowly or fail in complex designs due to convergence to an inappropriate local optimum.

Proposed Methodology
In this research paper, we propose a hybrid approach of Preprocessing and Segmentation techniques to develop an automatic rice variety identification system that is capable of identifying the variety or quality of food grains based on features extracted from the rice images. The proposed system preprocesses the images from datasets, extracts the features, segments these images, and classifies the images based on the extracted features. The proposed system is capable of identifying rice grains as well as rice seeds. In the phases of preprocessing and segmentation, we propose the following algorithm to gain better results in the variety identification process of rice images. Using the proposed algorithm, we have succeeded in removing impurities present in the image samples and obtaining clean images for the classification process.
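The stages described above (segment, remove impurities, extract features) can be sketched generically as follows. This is not the authors' algorithm, which is not reproduced in this section. It is an assumed pipeline that thresholds the image, labels connected components, discards components smaller than `min_area` as impurities, and computes simple geometric features per grain. All function names, parameters, and pixel values are hypothetical, and the classification stage is omitted.

```python
from collections import deque


def label_components(mask):
    """4-connected component labelling; returns one pixel list per component."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def grain_features(pixels):
    """Simple geometric descriptors of one segmented grain."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "area": len(pixels),
        "height": height,
        "width": width,
        "aspect_ratio": max(height, width) / min(height, width),
    }


def segment_grains(image, threshold=128, min_area=3):
    """Threshold, drop small components (impurities), and describe the rest."""
    mask = [[p > threshold for p in row] for row in image]
    grains = [c for c in label_components(mask) if len(c) >= min_area]
    return [grain_features(c) for c in grains]


# Hypothetical image: one 2x4 grain and a single-pixel impurity.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 200, 200, 200, 200, 0, 0],
    [0, 200, 200, 200, 200, 0, 0],
    [0, 0, 0, 0, 0, 0, 255],
    [0, 0, 0, 0, 0, 0, 0],
]
features = segment_grains(image)
```

In this example the isolated bright pixel is rejected by the area filter, and a single feature record remains for the grain. The resulting feature dictionaries are the kind of input a downstream classifier would consume.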

Conclusion
Various preprocessing and segmentation approaches are examined in this paper. Each approach is effective for particular purposes. The benefits and drawbacks of these approaches are discussed in the table. "Preprocessing filters such as Gabor and Histogram Equalization have been used to enhance different imaging modalities and diagnoses. The simplest segmentation methods are based on grey level techniques like thresholding, i.e., pixel-based and region-based methods, but they have limited implementations". However, combining these methods with other techniques can increase overall performance. Segmentation using a Genetic Algorithm is used to find approximate solutions to optimization problems. In this paper, we have also proposed an algorithm of preprocessing and segmentation techniques to gain better results in the identification and recognition of images in an automatic object detection and recognition system.