Article information

2016, Volume 21, Special issue, pp. 104-115

Cheshkova A.F., Aleynikov A.F., Stepochkin P.I.

Application of graphical features of the R programming environment for analysis of experimental data on the breeding of triticale

Purpose. The paper describes an application of the R environment to visualization and statistical analysis in breeding studies. The research material consists of data from field experiments conducted by the GNU SibNIIRS on spring and winter triticale samples from the VIR world collection in 2009 (51 samples of spring triticale and 103 samples of winter triticale) and in 2011 (120 samples of winter triticale). Twelve morphological characters were used to evaluate the samples. The aim was to identify groups of triticale samples that differ from one another across a range of traits, and to detect patterns in the correlated variability of triticale quantitative traits.

Methodology. We used standard methods of multivariate statistical analysis (Pearson correlation, principal component analysis, cluster analysis by Ward's method), as implemented in the R software environment.
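As an illustration of the first of these methods, the Pearson correlation between two traits can be sketched as follows. This is a minimal sketch in plain Python rather than the authors' R code; the function name `pearson_r` and the measurements are made up for the example and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each observation from its sample mean
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical measurements for five plants (not data from the study)
spike_length = [9.1, 10.4, 11.0, 12.3, 13.2]
grains_per_spike = [38, 41, 47, 52, 55]
r = pearson_r(spike_length, grains_per_spike)
print(f"r = {r:.3f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship between the two traits; a value near 0 indicates little linear relationship.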

Findings. First, data exploration was carried out to check that the conditions required by the chosen statistical techniques were met. The exploration protocol includes detecting outliers and collinearity, testing homogeneity of variance and normality, and determining the nature of the relationships between variables. We then applied cluster analysis to classify the triticale samples and divide them into several groups. Choosing samples from different clusters for hybridization makes it possible to achieve greater genetic diversity. The validity of the division into clusters was checked by bootstrap methods.
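The clustering step can be sketched as agglomerative clustering with Ward's criterion, which at each step merges the two clusters whose union adds the least within-cluster sum of squares. The sketch below is plain Python, not the authors' R code; the function name `ward_clusters` and the sample points are hypothetical, and it assumes the traits have already been brought to a common scale.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion.

    Returns k sorted lists of point indices.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Each cluster is stored as (member indices, centroid)
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged
        na, nb = len(a[0]), len(b[0])
        d2 = sum((u - v) ** 2 for u, v in zip(a[1], b[1]))
        return na * nb / (na + nb) * d2

    while len(clusters) > k:
        # Pick the pair of clusters that is cheapest to merge
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        # Centroid of the merged cluster is the size-weighted mean
        centroid = [(na * u + nb * v) / (na + nb) for u, v in zip(a[1], b[1])]
        merged = (a[0] + b[0], centroid)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)

# Hypothetical samples described by two standardized traits (not study data)
points = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
groups = ward_clusters(points, 2)
print(groups)
```

In a breeding context, parents for hybridization would then be drawn from different groups. The pairwise search makes this sketch slow for large collections; it is meant only to show the merge rule.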

Finally, principal component analysis (PCA) was carried out to identify correlated variability of triticale quantitative traits. The variability of the quantitative traits was found to be determined by 3-4 principal components, which accounted for 70 to 80% of the total variance. The traits “length of spike”, “number of spikelets per ear”, “number of grains per spike” and “grain weight per spike” have large loadings on the first component and form a correlation group, which makes it possible to interpret this component as “spike productivity”.
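The share of total variance attributed to the first component can be illustrated with a small sketch: the leading eigenvalue of the trait correlation matrix, divided by the number of traits. The code below is plain Python using power iteration, not the authors' R code; the function names and the three-trait data set are hypothetical. It assumes the starting vector is not orthogonal to the leading eigenvector, which holds for positively correlated traits such as these.

```python
import math

def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (a list of equal-length records)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    # Standardize each column, then average the cross-products
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)] for a in range(p)]

def first_component(corr, iterations=500):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix by power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p)) for a in range(p))
    return eigenvalue, v

# Hypothetical plants: spike length, grains per spike, plant height (not study data)
rows = [
    (9.1, 38, 102),
    (10.4, 41, 95),
    (11.0, 47, 110),
    (12.3, 52, 98),
    (13.2, 55, 105),
    (10.0, 43, 100),
]
eigenvalue, weights = first_component(correlation_matrix(rows))
# The trace of a correlation matrix equals the number of traits
share = eigenvalue / len(weights)
print(f"first component explains {share:.0%} of total variance")
```

In this made-up data the two spike traits carry larger weights on the first component than plant height, which mirrors how a group of correlated traits comes to define a component.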

Conclusion. The R software environment made it possible to carry out data exploration, cluster analysis and principal component analysis, and to present their results.

Keywords: R environment, statistical analysis, triticale, breeding

Author(s):
Cheshkova Anna Fedorovna
PhD.
Position: Leading research officer
Office: Siberian Federal Scientific Center of Agro-BioTechnologies
Address: 630501, Russia, Krasnoobsk
Phone Office: (383) 3484493
E-mail: anna.cheshkova@sorashn.ru
SPIN-code: 8241-4271

Aleynikov Alexander Fedorovich
Dr., Professor
Position: General Scientist
Office: Siberian federal scientific center of biotechnologies of the Russian Academy of Sciences
Address: 630501, Russia, Novosibirsk
Phone Office: (383) 3483460
E-mail: fti2009@yandex.ru
SPIN-code: 6885-7554

Stepochkin Petr Ivanovich
Dr.
Position: Leading research officer
Office: Siberian Institute of Plant Growing and Breeding - Branch of the Institute of Cytology and Genetics SB RAS
Address: 630501, Russia, Krasnoobsk
Phone Office: (383) 3481947
E-mail: petstep@ngs.ru
SPIN-code: 7015-5315

References:
[1] Mastitskiy, S.E., Shitikov, V.K. Statistical analysis and data visualization with R. Available at: http://r-analytics.blogspot.com (accessed 26.09.2016). (In Russ.)
[2] Zuur, A.F., Ieno, E.N., Elphick, C.S. A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution. 2010; 1(1):3–14.
[3] Chang, W. R Graphics Cookbook. O’Reilly Media; 2012: 413.
[4] Ephimov, V.M., Kovaleva, V.Yu. Mnogomernyy analiz biologicheskikh dannykh. Uchebnoe posobie [Textbook on multivariate analysis of biological data]. Sankt-Peterburg: VIZR; 2008: 98. (In Russ.)
[5] Kim, J.O., Mueller, C.W., Klecka, W.R., Aldenderfer, M.S., Blashfield, R.K. Factor, discriminant and cluster analysis. Moscow: Finance and Statistics; 1989: 215. (In Russ.)
[6] Shitikov, V.K., Rozenberg, G.S. Randomizatsiya i butstrep: statisticheskiy analiz v biologii i ekologii s ispol'zovaniem R [Randomization and bootstrap: statistical analysis in biology and ecology using R]. Tol'yatti: Kassandra; 2013: 314. (In Russ.)
[7] Smiryaev, A.V., Martynov, S.P., Kil'chevskiy, A.V. Biometriya v genetike i selektsii rasteniy [Biometrics in genetics and plant breeding]. Moscow: MSKhA; 1992: 269. (In Russ.)
[8] Cheshkova, A.F., Aleynikov, A.F., Stepochkin, P.I. Analysis of Covariation of Quantitative Characters of Triticale. Achievements of Science and Technology of AIC. 2016; 30(5):50–52. (In Russ.)

Bibliography link:
Cheshkova A.F., Aleynikov A.F., Stepochkin P.I. Application of graphical features of the R programming environment for analysis of experimental data on the breeding of triticale // Computational technologies. 2016. V. 21. Special issue: Information technologies, systems and equipment in agroindustrial complex. P. 104-115
ISSN 1560-7534
© 2024 FRC ICT