Maize is a globally cultivated staple crop and one of the most successful examples of heterosis utilization in food production. The development of elite inbred lines is critical for breeding hybrid varieties and achieving sustained yield improvements. However, efficient breeding of inbred lines faces significant challenges, including the broad origins of germplasm resources, complex and diverse genetic structures, and low accuracy in phenotypic prediction. Advances in modern genomics and artificial intelligence technologies provide powerful tools for gaining deeper insights into the genetic backgrounds of germplasm resources and improving the accuracy of trait prediction.
Recently, scientists from the Institute of Crop Science, Chinese Academy of Agricultural Sciences, published a research paper titled "Genomic analysis of modern maize inbred lines reveals diversity and selective breeding effects" in SCIENCE CHINA Life Sciences. The research team collected 2,430 inbred lines derived from elite commercial hybrids promoted across different geographical regions, along with 503 inbred lines from natural populations, and constructed a broadly sourced and genetically diverse inbred population. The study began with resequencing this large-scale population of maize inbred lines. After variant calling and quality control, 437,081 high-quality SNP markers covering the entire genome were identified.
Based on these data, the researchers analyzed the distribution of genomic variation, genetic diversity, heterotic group classification, and population differentiation within the population. The findings highlighted the impact of artificial selection on shaping the maize genome and identified two potential new heterotic groups. A selective sweep analysis comparing the 2,430 modern inbred lines with the 503 natural-population inbred lines uncovered numerous loci potentially under selection during breeding. These loci contained genes associated with flowering time, root development, stress resistance, yield, and plant architecture, which underscores their relevance to breeding.
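The article does not say which statistic the team used to scan for selection signals. As a minimal sketch of the general idea, assuming a per-SNP fixation index (Wright's F_ST) between the two groups and genotypes coded as 0/1/2 copies of the alternate allele, differentiation could be computed as follows. The function name and the toy matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def per_snp_fst(geno_a, geno_b):
    """Wright's F_ST per SNP for two groups.

    geno_a, geno_b: arrays of shape (lines, SNPs), coded 0/1/2
    (assumed coding; the article does not describe the data format).
    """
    geno_a = np.asarray(geno_a, dtype=float)
    geno_b = np.asarray(geno_b, dtype=float)
    n_a, n_b = geno_a.shape[0], geno_b.shape[0]

    # Alternate-allele frequency in each group
    p_a = geno_a.mean(axis=0) / 2.0
    p_b = geno_b.mean(axis=0) / 2.0

    # Weight each group by its sample size
    w_a = n_a / (n_a + n_b)
    w_b = 1.0 - w_a
    p_t = w_a * p_a + w_b * p_b

    # Expected heterozygosity within groups (h_s) and in the pooled sample (h_t)
    h_s = w_a * 2 * p_a * (1 - p_a) + w_b * 2 * p_b * (1 - p_b)
    h_t = 2 * p_t * (1 - p_t)

    # Monomorphic SNPs (h_t == 0) carry no signal; report 0 for them
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)

# Toy data: SNP 0 is fixed for opposite alleles in the two groups,
# SNP 1 has the same allele frequency in both.
modern = np.array([[2, 0], [2, 2]])
natural = np.array([[0, 0], [0, 2]])
fst = per_snp_fst(modern, natural)
```

In practice, per-SNP values like these are usually averaged over sliding windows, and the highest-ranking windows are treated as candidate selected regions.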
Using the identified selected loci, the research team employed deep learning-based genomic prediction algorithms to develop predictive models for eight traits related to plant architecture and yield. These models demonstrated high accuracy in predicting target traits, confirming the reliability of the selected loci.
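The article does not describe the network architecture or how accuracy was measured. As a rough illustration of the genomic prediction workflow only, the sketch below swaps in ridge regression on marker genotypes (a standard linear baseline, not the deep learning method used in the study) and scores it by the Pearson correlation between predicted and observed phenotypes in held-out lines. All data are simulated, and every name and parameter value is an assumption made for the example.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    n_snps = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_snps), X.T @ y)

def simulate_and_predict(n_lines=300, n_snps=200, n_causal=20,
                         h2=0.7, lam=50.0, seed=0):
    rng = np.random.default_rng(seed)

    # Inbred lines are assumed fully homozygous here: genotypes are 0 or 2
    X = 2.0 * rng.integers(0, 2, size=(n_lines, n_snps))

    # A subset of SNPs carries true effects; the rest are neutral
    effects = np.zeros(n_snps)
    causal = rng.choice(n_snps, size=n_causal, replace=False)
    effects[causal] = rng.normal(size=n_causal)

    # Phenotype = genetic value + noise scaled to the target heritability
    g = X @ effects
    noise_sd = np.sqrt(g.var() * (1 - h2) / h2)
    y = g + rng.normal(scale=noise_sd, size=n_lines)

    # Hold out the last 20% of lines for validation
    n_train = int(0.8 * n_lines)
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]

    # Center with training statistics only, then fit and predict
    col_mean = X_tr.mean(axis=0)
    beta = ridge_marker_effects(X_tr - col_mean, y_tr - y_tr.mean(), lam)
    y_hat = (X_te - col_mean) @ beta + y_tr.mean()

    # Prediction accuracy as the Pearson correlation on held-out lines
    return float(np.corrcoef(y_hat, y_te)[0, 1])

accuracy = simulate_and_predict()
```

Restricting the marker matrix to candidate selected loci, as the study did, would correspond to passing only those columns of `X` to the model.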
Additionally, the study introduced the concept of the selection proportion to explore the relationship between the size of validation populations and breeding efficiency in genomic selection. Simulation analyses revealed that the selection proportion is a critical factor influencing genetic gains for yield-related traits.
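The article does not give the study's simulation design. One textbook way to see why the selection proportion matters, assuming it means the fraction of candidates retained under truncation selection on a normally distributed predicted value, is the breeder's equation: expected gain equals selection intensity times prediction accuracy times additive genetic standard deviation. The sketch below is an illustration under those assumptions, with hypothetical function names and example values.

```python
from statistics import NormalDist

def selection_intensity(proportion):
    """Standardized selection differential for truncation selection.

    proportion: fraction of candidates retained, strictly between 0 and 1.
    """
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Truncation point that leaves `proportion` of the distribution above it
    z = std_normal.inv_cdf(1.0 - proportion)
    # Mean of the selected tail, in standard deviation units
    return std_normal.pdf(z) / proportion

def expected_gain(proportion, accuracy, sigma_a):
    """Breeder's equation: gain = intensity * accuracy * additive SD."""
    return selection_intensity(proportion) * accuracy * sigma_a

# Example values (assumed): prediction accuracy 0.6, additive SD 1.0
gain_top_10_percent = expected_gain(0.10, accuracy=0.6, sigma_a=1.0)
gain_top_50_percent = expected_gain(0.50, accuracy=0.6, sigma_a=1.0)
```

Keeping a smaller proportion raises the expected gain per cycle, but it also leaves fewer parents, so the choice is a trade-off between short-term gain and retained genetic diversity.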
In summary, this research established a diverse maize germplasm resource pool, identified two potential new heterotic groups, and uncovered numerous elite breeding loci. These findings provide valuable genetic resources for maize breeding, and the genomic prediction models developed using deep learning algorithms can guide the utilization of this population in breeding programs.