Correlation and path analysis in black and brown seeded soybean [ Glycine max (L.)

Sixty-four soybean genotypes were evaluated to study the association among yield and related traits and to determine the direction of these associations. The field experiment was conducted during the 2019 main cropping season at Jimma and Bonga, Southwestern Ethiopia. The experiment was laid out in a simple lattice design with two replications. Data were collected on quantitative traits. Analysis of variance showed significant to highly significant differences among genotypes for all of the studied traits. Correlation analysis showed that grain yield was positively and significantly associated with harvest index and the number of pods per plant. Harvest index exerted the maximum positive direct effect on grain yield, followed by pods per plant and seed yield per plant at the genotypic level; these traits could therefore be used for selection to improve grain yield in soybean.


Introduction
Soybean [Glycine max (L.) Merrill] is one of the most valuable and widely cultivated grain legumes. It occupies an important position among grain legumes and is rich in protein and oil. It has been called a miracle crop of the 20th century, containing 40% high-quality protein and 20% oil. It is also rich in lysine (6.4%) and vitamins A, B, and C (Shruti and Basavaraja, 2019). In Ethiopia, soybean production and productivity have increased over the last decade. A total of 38,072.7 hectares of land were covered by soybean (CSA, 2017), and the crop ranks first in yield per hectare among pulse and oil crops and 5th in coverage among oil crops grown in the country. In 2017, the national average productivity of soybean was 2.27 t ha⁻¹ (CSA, 2017), which falls short of the productivity attained in research fields, which might reach up to 3 t ha⁻¹ (Abush, 2012). However, no research has been conducted specifically on black and brown-seeded soybeans in the country, and the traits contributing to yield in black and brown-seeded soybean genotypes have not been identified. To address this gap, this study was initiated to estimate the genotypic and phenotypic associations between seed yield and related traits and to partition these associations into direct and indirect effects for black and brown-seeded soybean genotypes.

Description of experimental sites
The experiment was conducted at two locations, Melko and Modio, located 365 km and 471 km from Addis Ababa, respectively. Melko is the site of Jimma Agricultural Research Center, whereas Modio is a testing site of Bonga Agricultural Research Center (BARC). The two locations represent two different soybean-growing agroecologies of Southwestern Ethiopia. The Jimma experimental site is located at 7°46′ North, 36°00′ East at 1753 m.a.s.l. Its average annual rainfall is 1561 mm, with minimum and maximum temperatures of 9 and 28 °C, respectively. The two common soil types are Nitisol and Cambisol, with a pH of 5.6. The Modio testing site of BARC is located at 7°11′ North, 36°17′ East at an altitude of 1775 m.a.s.l., and the average annual temperature ranges from 14.3 to 27.5 °C. The soil type is sandy loam, and the pH is 4.9-5.5. The area receives maximum rainfall from June to September; the mean annual rainfall is 1839 mm.

Experimental design and trial management
The experiment was laid out in an 8 × 8 simple lattice design with two replications. Each replication consisted of sixty-four genotypes in eight blocks. The plot size was 9.6 m² (4.0 m × 2.4 m). Each plot consisted of four rows with 60 cm inter-row and 5 cm intra-row spacing; the first and fourth rows of each plot were border rows. The spacing between plots, blocks, and replications was 0.6 m, 1.0 m, and 1.5 m, respectively. Data were collected from the middle two rows (4.8 m²) of each plot. NPS-B fertilizer was applied at a rate of 122 kg ha⁻¹ at the time of sowing. Weeding and other agronomic practices were conducted according to the production package.
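As an illustration of the design, the sketch below builds one possible 8 × 8 simple lattice layout. It assumes the usual construction in which the genotypes are arranged in a square, the rows form the incomplete blocks of the first replication, and the columns form those of the second. The randomization procedure actually used in the trial is not described, so the function name, the seed, and the genotype numbering are hypothetical.

```python
import random


def simple_lattice(k=8, seed=0):
    """Return {replication: [block, ...]} for a k x k simple lattice (2 reps).

    Assumed construction: genotypes 1..k*k are placed in a k x k square;
    replication 1 uses the rows as incomplete blocks, replication 2 the columns.
    """
    rng = random.Random(seed)  # seed is arbitrary, only for reproducibility
    square = [[r * k + c + 1 for c in range(k)] for r in range(k)]
    rep1 = [list(row) for row in square]        # blocks = rows of the square
    rep2 = [list(col) for col in zip(*square)]  # blocks = columns of the square
    layout = {}
    for name, blocks in (("rep1", rep1), ("rep2", rep2)):
        for block in blocks:
            rng.shuffle(block)   # randomize plots within each incomplete block
        rng.shuffle(blocks)      # randomize block order within the replication
        layout[name] = blocks
    return layout


layout = simple_lattice()
for name, blocks in layout.items():
    print(name, "first block:", blocks[0])
```

With this construction, every genotype appears once per replication, and any two genotypes share an incomplete block at most once across the two replications.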

Data collected
The following quantitative data were collected on a plot or plant basis:
Plant height (PH, cm): measured on five randomly selected plants from each plot, from the base to the tip of the plant, at physiological maturity.
Number of pods per plant (PPP): the number of pods on five randomly selected plants was counted.
Number of seeds per plant (SPP): the average number of seeds from five randomly selected plants was counted.
Number of primary branches per plant (PBPP): the average number of primary branches from five randomly selected plants was counted.
Pod length (PL, cm): the length of five pods from each sample was measured in cm.
Days to 50% flowering (DF): recorded on a plot basis as the number of days from planting until 50% of the plants had at least one open flower.
Days to maturity (DM): recorded as the number of days from sowing until 95% of the pods turned yellow.
Hundred seed weight (HSW, g): 100 seeds were counted, and their weight was measured with a sensitive balance.
Biological yield (BY, g): the total above-ground biomass, in grams, of all plants in the two middle rows of each plot was recorded at harvest after sun-drying.
Harvest index (HI): estimated as the ratio of economic yield to biomass yield.
Grain yield (GYLD, kg ha⁻¹): dried plants from each plot were threshed, and the seeds obtained were weighed to get the seed yield per plot in grams. The grain yield in grams from each plot was then adjusted to a moisture content of 13% and converted to kilograms per hectare.
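The conversions described for harvest index and grain yield can be sketched as follows. The moisture adjustment formula is a commonly used one and is an assumption here, since the text does not state which formula was applied. The plot weight, measured moisture content, and biomass in the example are hypothetical values, not data from the trial.

```python
def adjust_to_moisture(weight_g, measured_mc_pct, target_mc_pct=13.0):
    """Adjust grain weight to a target moisture content (assumed standard formula)."""
    return weight_g * (100.0 - measured_mc_pct) / (100.0 - target_mc_pct)


def plot_yield_to_kg_ha(weight_g, plot_area_m2=4.8):
    """Convert grams per harvested plot area to kilograms per hectare."""
    return (weight_g / 1000.0) * (10_000.0 / plot_area_m2)


def harvest_index(grain_g, biomass_g):
    """Ratio of economic (grain) yield to biological yield."""
    return grain_g / biomass_g


# Hypothetical plot: 1000 g of grain at 15% moisture, 2500 g total biomass,
# harvested from the two middle rows (4.8 m2).
adjusted_g = adjust_to_moisture(1000.0, measured_mc_pct=15.0)
yield_kg_ha = plot_yield_to_kg_ha(adjusted_g)
hi = harvest_index(adjusted_g, 2500.0)
print(round(yield_kg_ha, 1), round(hi, 3))
```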

Statistical analysis
Before the combined analysis of variance across locations, all collected data were checked for the assumptions of analysis of variance (ANOVA), such as homogeneity and normality, using Hartley's F-max test. The homogeneity of error variances was tested using Bartley's test (Bartley et al., 1955) with an F-test, i.e., the ratio of the largest error mean square to the smallest error mean square. The ratio was 2.142, which is less than three; therefore, data from the two locations were combined (Gomez and Gomez, 1984). Traits that showed homogeneity (ratio < 3) were considered for the combined analysis. The ANOVA model for the simple lattice design is as follows:

$$P_{ijks} = \mu + g_i + b_{k(j)(s)} + r_{j(s)} + l_s + (gl)_{is} + e_{ijks}$$

where $P_{ijks}$ = phenotypic value of the i-th genotype under the j-th replication at the s-th location and the k-th incomplete block within replication j and location s; $\mu$ = grand mean; $g_i$ = the effect of the i-th genotype; $b_{k(j)(s)}$ = the effect of incomplete block k within replication j and location s; $r_{j(s)}$ = the effect of replication j within location s; $l_s$ = the effect of location s; $(gl)_{is}$ = the interaction effect between genotype i and location s; and $e_{ijks}$ = residual or effect of random error. Mean separation was done using the ANOVA-protected least significant difference (LSD) test at the 5% probability level.
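The homogeneity check described above reduces to a simple ratio of error mean squares. The sketch below shows that rule under the stated threshold of three. The two error mean squares are hypothetical values chosen only so that their ratio matches the reported 2.142; the per-location error mean squares themselves are not given in the text.

```python
def error_variance_ratio(error_mean_squares):
    """Ratio of the largest to the smallest error mean square."""
    values = list(error_mean_squares)
    return max(values) / min(values)


def can_combine(error_mean_squares, threshold=3.0):
    """Rule of thumb used in the text: combine locations if the ratio is below 3."""
    return error_variance_ratio(error_mean_squares) < threshold


# Hypothetical error mean squares for one trait at the two locations.
ems = {"Melko": 2142.0, "Modio": 1000.0}
ratio = error_variance_ratio(ems.values())
print(ratio, can_combine(ems.values()))
```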

Analysis of phenotypic and genotypic correlation
The correlation analysis was performed using SAS software version 9.0 (SAS Institute, 2002). To estimate the phenotypic and genotypic correlation coefficients, the covariance estimates between all pairs of traits were calculated from the mean sums of cross-products, where MSPe = mean sum of cross-products for error, MSPg = mean sum of cross-products for genotypes, and r = number of replications.
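For illustration, the sketch below computes genotypic and phenotypic correlation coefficients from mean squares and mean cross-products. It assumes the widely used single-environment estimators, in which the genotypic component is (MSPg − MSPe)/r and the phenotypic component is the genotypic component plus the error component; the exact estimators applied in the combined analysis may differ. All input values are hypothetical.

```python
import math


def components(ms_genotype, ms_error, reps):
    """Return (genotypic, phenotypic) (co)variance components.

    Assumed estimators: genotypic = (MS_g - MS_e) / r,
    phenotypic = genotypic + MS_e.
    """
    genotypic = (ms_genotype - ms_error) / reps
    phenotypic = genotypic + ms_error
    return genotypic, phenotypic


def correlations(msp_g, msp_e, ms_gx, ms_ex, ms_gy, ms_ey, reps=2):
    """Genotypic and phenotypic correlation between traits x and y."""
    cov_g, cov_p = components(msp_g, msp_e, reps)      # from cross-products
    var_gx, var_px = components(ms_gx, ms_ex, reps)    # trait x mean squares
    var_gy, var_py = components(ms_gy, ms_ey, reps)    # trait y mean squares
    r_g = cov_g / math.sqrt(var_gx * var_gy)
    r_p = cov_p / math.sqrt(var_px * var_py)
    return r_g, r_p


# Hypothetical mean squares and cross-products for two traits.
r_g, r_p = correlations(msp_g=30.0, msp_e=2.0,
                        ms_gx=50.0, ms_ex=10.0,
                        ms_gy=40.0, ms_ey=8.0)
print(round(r_g, 3), round(r_p, 3))
```

A genotypic variance estimate can be zero or negative when the genotype mean square does not exceed the error mean square; the sketch does not handle that case.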

Path coefficient analysis
The direct and indirect effects of yield-related traits on yield per plot were worked out through path coefficient analysis. The analysis followed the method suggested by Dewey and Lu (1959). The formula was described as follows:

$$r_{ij} = P_{ij} + \sum_{k \neq i} r_{ik} P_{kj}$$
where $r_{ij}$ = mutual association between the independent trait (i) and the dependent trait (j), as measured by the correlation coefficient; $P_{ij}$ = component of the direct effect of the independent trait (i) on the dependent trait (j), as measured by the path coefficient; and $\sum_{k \neq i} r_{ik} P_{kj}$ = summation of the components of the indirect effects of a given independent trait (i) on the dependent trait (j) via all other independent traits (k).
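The path coefficient decomposition can be sketched numerically as follows. The direct effects are obtained by solving the system formed by the correlations among the independent traits and their correlations with yield; each indirect effect of trait i via trait k is then the product of their correlation and the direct effect of trait k. The three traits and all correlation values are hypothetical and are not taken from the tables of this study; a small Gaussian elimination routine is used only to keep the example free of external libraries.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def path_analysis(r_xx, r_xy):
    """Return direct effects, the full effects matrix, and the residual effect.

    effects[i][k] is the effect of trait i on yield via trait k;
    the diagonal holds the direct effects, and each row sums to r_xy[i].
    """
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    effects = [[r_xx[i][k] * direct[k] for k in range(n)] for i in range(n)]
    residual = math.sqrt(1.0 - sum(p * r for p, r in zip(direct, r_xy)))
    return direct, effects, residual


# Hypothetical correlations among three traits (HI, PPP, PH) and with yield.
traits = ["HI", "PPP", "PH"]
r_xx = [[1.0, 0.3, 0.2],
        [0.3, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
r_xy = [0.7, 0.6, 0.4]

direct, effects, residual = path_analysis(r_xx, r_xy)
for name, p in zip(traits, direct):
    print(name, "direct effect:", round(p, 3))
print("residual effect:", round(residual, 3))
```

In this layout, the off-diagonal entries of each row are the indirect effects, so comparing the diagonal entry with the row total shows how much of a trait's correlation with yield is due to its direct effect.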

Results and Discussion
Analysis of variance (ANOVA) indicated significant to highly significant (P < 0.05 and P < 0.01) differences among genotypes for all of the studied traits (Table 3). This indicates considerable genetic variability among the tested soybean genotypes for the various traits, meaning that the genotypes were genetically diverse and that there is a chance to select elite genotypes.

Association among yield and yield-related traits
Grain yield is a complex trait and is highly affected by environmental factors. Moreover, its inheritance is complicated and may involve several related traits. Hence, correlation coefficient analysis is commonly used to measure the magnitude and direction of relationships between various traits and grain yield. In the current investigation, genotypic and phenotypic correlation coefficients between eleven traits were calculated (Table 4). Highly significant positive genotypic correlations (P ≤ 0.01) were identified for grain yield with plant height, number of pods per plant, number of primary branches per plant, hundred seed weight, and harvest index. Similar results were reported by Berhanu et al. (2021), Mili et al. (2017), and Koraddi et al. (2015). Such genotypic correlations may result from pleiotropic effects or from linkage of the genes governing the inheritance of these characters.
Moreover, the traits correlated with grain yield were highly heritable and highly correlated with each other. Thus, if they prove to be controlled by a small number of genes, selection for their combination would be simple (Tigga and Nag, 2021). Phenotypically, highly significant positive correlation coefficients (P ≤ 0.01) were observed for grain yield with plant height, number of pods per plant, number of primary branches per plant, number of seeds per plant, hundred seed weight, biological yield, and harvest index. An equivalent result was reported by Pawar et al. (2020). This indicates that grain yield could be improved by improving the traits that are positively correlated with it.

Genotypic direct and indirect effects of different traits on grain yield
The genotypic direct and indirect effects of traits on grain yield are presented in  (Balla and Ibrahim, 2017). This indicated that the traits with positive direct effects on grain yield are important and that selection based on them would indirectly increase grain yield; therefore, these traits should be considered during selection when recommending elite varieties for soybean-producing agroecologies.

Conclusion
Grain yield had a highly significant and positive association with plant height, number of pods per plant, number of seeds per plant, and harvest index at the genotypic level. Path coefficient analysis with grain yield as the dependent variable revealed that harvest index and plant height were the significant contributors to grain yield. The positive direct effects of these traits on grain yield indicate their importance in determining this complex trait. Therefore, simple selection based on these agronomic traits could indirectly improve grain yield in soybean.