Cassava improvement in the era of “agrigenomics”

Ismail Yusuf Rabbi, Melaku Gedil, Morag Ferguson, and Peter Kulakow
I. Rabbi, Postdoctoral Fellow (Molecular Genetics); M. Gedil, Head, Bioscience Center, IITA, Ibadan, Nigeria; M. Ferguson, Molecular Geneticist, IITA, Nairobi, Kenya; and P. Kulakow, Cassava Breeder, IITA, Ibadan, Nigeria

Pro-vitamin A 'yellow root' cassava developed by the IITA cassava breeding program. Photo by IITA.
In the last 45 years, IITA has played a pivotal role in the genetic improvement of cassava for resource-poor farmers in sub-Saharan Africa (SSA). More than 400 cassava varieties have been developed that are not only high yielding but also resistant to diseases and pests. Many of these improved varieties have been extensively deployed in SSA and have helped to avert humanitarian crises caused by the viral disease pandemics that devastated local landraces in East and Central Africa.

The cassava breeding program in Ibadan has a collection of more than 750 elite cassava clones representing current and historical materials accumulated over the last 45 years. These materials, referred to as the genetic gain collection (GGC), are accompanied by extensive field evaluation (phenotypic) data. In addition, the active breeding collection contains over 1000 African landraces and more than 400 new advanced breeding clones that are also accompanied by phenotypic data, including observations of disease and pest resistance, plant architecture, flowering ability, and storage root yield.

The most recent success of the conventional cassava breeding program was the release of three vitamin A cassava varieties by the Government of Nigeria. These varieties (IITA TMS I011368, IITA TMS I011371, and IITA TMS I011412) were first cloned from seedlings in Ibadan in 2001 and have been subjected to extensive field testing throughout Nigeria. While almost all cassava in Nigeria is currently white fleshed, vitamin A cassava produces yellow-fleshed roots with nutritionally significant concentrations of carotenoids, which the human body converts to vitamin A when the roots are consumed as yellow gari or fufu. In cooperation with HarvestPlus, IITA and partners will distribute vitamin A cassava planting materials to more than 25,000 farmers in 2013. New yellow-fleshed genotypes in the pipeline promise continued improvement in pro-vitamin A content, yield, and dry matter in the coming years.

Preparation of cassava DNA for genotyping by sequencing. Photo by IITA.
As the vitamin A cassava illustrates, the genetic improvement of cassava has mostly been achieved through conventional breeding methods based on phenotypic selection. The only known direct application of molecular markers in cassava breeding is selection for resistance to cassava mosaic disease and cassava green mite. Recent advances in next-generation sequencing technologies, and the fall in their cost, now promise to usher in a new era for cassava breeding that will combine the success of conventional hybridization, selection, and multilocational yield trials with the latest genomic resources.

Setting the stage for “next-generation cassava breeding”
Cognizant of the potential of marker technologies to improve the efficiency and effectiveness of cassava breeding, IITA, in collaboration with partners, embarked on the development and deployment of molecular markers¹. With the recent accumulation of genomic resources in cassava research, including the first full cassava genome sequence², our emphasis at IITA has shifted towards the application of these resources in molecular breeding³. One recent achievement is the identification and validation of nearly 1500 single nucleotide polymorphism (SNP) markers through an international collaboration led by IITA geneticist Morag Ferguson⁴. These SNPs have been converted to a highly parallel hybridization-based genotyping system that has been shared with the international cassava research community through partnership with the Generation Challenge Program (GCP).

An example of SNP genotyping data plotted with KBioscience’s SNPviewer software. Inset: raw SNP genotyping data from Illumina’s GoldenGate® assay.
In addition, the first SNP-based genetic linkage map of cassava has been developed by IITA in collaboration with Heneriko Kulembeka of the Agricultural Research Institute (ARI), Ukiriguru, Tanzania. A linkage map is analogous to landmarks (SNP markers in this case) placed along chromosomes that guide researchers to genes or genomic regions controlling traits of interest. Such a linkage map is an indispensable tool for marker-assisted selection (MAS). SNP and SSR markers have also been applied to uncover quantitative trait loci (QTL) associated with resistance to cassava brown streak disease (CBSD), which is ravaging cassava production in Eastern and Southern Africa, in a collaboration between IITA, CIAT, and ARI-Tanzania. Another dramatic development in cassava genomics is the recently completed sequencing of the cassava genome through the partnership of the US Department of Energy’s Joint Genome Institute and 454 Life Sciences².

Progress in next-generation technologies has drastically reduced the cost of DNA sequencing, so that genotyping-by-sequencing (GBS) is now feasible for species such as cassava, ushering in a new era of agricultural genomics⁵. This will revolutionize the application of genomic tools for cassava improvement. GBS involves cutting genomic DNA into short pieces at specific locations using a restriction enzyme. The ends of these pieces are then sequenced using techniques that allow many samples to be sequenced at the same time. The beauty of this method is the use of adaptors containing barcodes (unique tags) that are enzymatically joined to the digested DNA fragments, enabling the simultaneous sequencing (multiplexing) of up to 384 samples in one sequencing reaction. This economy of scale reduces the cost of processing each DNA sample to less than $10. Approximately 200,000 markers can be identified and mapped in a very short time. With this powerful tool, breeders may conduct genomics-based research that was inconceivable a couple of years ago. Some of the exciting new research applications include polymorphism discovery, high-density genotyping for QTL detection and fine mapping, genome-wide association studies, genomic selection, improving reference genome assembly, and kinship estimation.
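To make the role of the barcodes concrete, the following minimal Python sketch shows how pooled reads could be sorted back to their source samples after a multiplexed run. It is an illustration only: the barcode sequences, sample names, and reads are invented, and production GBS pipelines also handle sequencing errors, variable barcode lengths, and quality filtering, none of which is attempted here.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table; real barcodes and sample names will differ.
BARCODES = {
    "ACGT": "clone_A",
    "TGCA": "clone_B",
    "GATC": "clone_C",
}

def demultiplex(reads, barcodes):
    """Group reads by the sample whose barcode starts the read.

    Returns (per_sample, unassigned). per_sample maps each sample name to its
    reads with the barcode trimmed off; unassigned holds reads with no match.
    """
    per_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                per_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched the start of this read.
            unassigned.append(read)
    return dict(per_sample), unassigned

# Invented reads from one pooled sequencing reaction.
pooled_reads = [
    "ACGTCAGCTTAGGA",
    "TGCACAGCTTAGGC",
    "ACGTCAGCAATTGA",
    "GATCCAGCTTAGGA",
    "NNNNCAGCTTAGGA",
]
per_sample, unassigned = demultiplex(pooled_reads, BARCODES)
for sample, reads in sorted(per_sample.items()):
    print(sample, reads)
print("unassigned:", unassigned)
```

Because every read carries its sample's tag, hundreds of samples can share one sequencing reaction and still be told apart afterwards, which is where the saving per sample comes from.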

High-density QTL mapping and fine mapping
In the past, a limitation for QTL mapping was the number of markers on a genetic linkage map. With new SNP-based technologies this is no longer a limitation. This allows for fine mapping of QTLs so long as a sufficient number of individuals in the mapping population can be developed. IITA, in collaboration with national partners [ARI-Tanzania and National Crops Resources Research Institute (NaCRRI), Uganda], is using SNPs to discover QTLs associated with sources of tolerance for CBSD.

Preparation of gari, the most popular food product from cassava. Photo by IITA.
The next frontier for cassava genomics
Using the genotyping by sequencing approach, scientists from IITA and Cornell University, USA, are currently genotyping more than 2000 accessions of cassava, including released varieties, advanced breeding lines, and landraces from Africa. This is a pilot study of genomic selection funded by the Bill & Melinda Gates Foundation to explore the potential for using the IITA breeding collection, including genetic gain, local germplasm, and current advanced breeding lines, as the base population to begin genomic selection for West Africa. The IITA breeding collection has been extensively characterized in many locations and over many years. The convergence of high-density SNP data and extensive phenotypic data in IITA’s cassava collection sets the stage for the implementation of genome-wide association studies (GWAS) and genomic selection (GS) in breeding. The aim of GWAS is to pinpoint the genetic polymorphisms underlying agriculturally important traits. In GWAS, the whole genome is scanned for significant marker-trait associations, using a sample of individuals from the germplasm collections, such as a breeder’s collection. This approach of “allele mining” overcomes the limitations of traditional gene mapping by (a) providing higher resolution, (b) uncovering more genetic variants from broad germplasm, and most importantly, (c) creating the possibility of exploiting historical phenotypic data for future advances in breeding cassava.
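The genome scan at the heart of GWAS can be sketched in a few lines of Python. The example below is a deliberately simplified illustration, not the analysis used in the pilot study: it scores each marker by the squared correlation between allele dosage (0, 1, or 2 copies) and a trait value, then ranks the markers. The marker names, dosages, and trait values are invented, and a real GWAS would additionally correct for population structure, kinship, and multiple testing.

```python
from math import sqrt

def marker_trait_r(dosages, phenotypes):
    """Pearson correlation between allele dosage (0, 1, 2) and a trait value."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))
    var_g = sum((g - mean_g) ** 2 for g in dosages)
    var_y = sum((y - mean_y) ** 2 for y in phenotypes)
    if var_g == 0 or var_y == 0:
        return 0.0  # monomorphic marker or constant trait: no association signal
    return cov / sqrt(var_g * var_y)

def scan(genotypes, phenotypes):
    """Rank markers by squared correlation with the trait, strongest first."""
    scores = {name: marker_trait_r(d, phenotypes) ** 2
              for name, d in genotypes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data, invented for illustration: six clones scored for one trait.
phenotypes = [2.0, 4.1, 6.0, 2.2, 3.9, 6.1]
genotypes = {
    "snp_1": [0, 1, 2, 0, 1, 2],   # dosage tracks the trait closely
    "snp_2": [1, 1, 0, 2, 0, 1],   # weaker relationship
    "snp_3": [1, 1, 1, 1, 1, 1],   # monomorphic, carries no information
}
ranking = scan(genotypes, phenotypes)
for name, r_squared in ranking:
    print(name, round(r_squared, 3))
```

Markers at the top of such a ranking are candidates for further validation rather than confirmed causal variants.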

A schema of genomic selection (GS) processes, starting from phenotyping and genotyping of the training population and selection of parental candidates via genomic estimated breeding value (GEBV)-based selection. Note that selection model improvement can be performed iteratively as new phenotype and marker data accumulate.
GS is a breeding strategy that seeks to predict phenotypes from high-density genotypic data alone, using a statistical model based on both phenotypic and genotypic information from a “training population”. For cassava, phenotyping is the slowest and most expensive phase of the breeding cycle because of the crop’s low multiplication ratio of between 5 and 10 cuttings/plant. Thus, it takes several cycles of propagation (up to 6 years) to carry out a proper multilocational field trial evaluation. The implementation of GS at the seedling stage should (a) dramatically reduce the length of the breeding cycle, (b) increase the number of crosses and selections per unit time, and (c) increase the number of seedlings that can be accurately evaluated. The reduced breeding cycle means that the “engine of evolution,” i.e., recombination and selection, can proceed at a rate three times as fast as phenotype-based selection, while saving resources.

In conclusion, cassava breeding at IITA is being redefined, thanks to the increasing availability and deployment of genomic resources. Combining these resources with IITA’s long-standing conventional breeding pipeline means that the best days of cassava improvement lie ahead. These efforts should ultimately help to meet the increasing need for more healthy and nutritious food produced in environmentally sustainable ways.

1 Lokko et al. 2007. Cassava. In: Kole et al. (ed.). Genome mapping and molecular breeding in plants, Vol. 3. Pulses, Sugar and Tuber Crops. Springer-Verlag, Berlin Heidelberg.
2 Prochnik S., P.R. Marri, B. Desany, P.D. Rabinowicz, et al. 2011. Tropical Plant Biol. doi:10.1007/s12042-011-9088-z.
3 Ferguson M., I.Y. Rabbi, D-J. Kim, M. Gedil, L.A.B. Lopez-Lavalle, and E. Okogbenin. 2011a. Tropical Plant Biol. doi:10.1007/s12042-011-9087-0.
4 Ferguson M.E., S.J. Hearne, T.J. Close, S. Wanamaker, W.A. Moskal, C.D. Town, J. de Young, P.R. Marri, I.Y. Rabbi, and E.P. de Villiers. 2011b. Theor Appl Genet. doi:10.1007/s00122-011-1739-9.
5 Elshire R., J. Glaubitz, Q. Sun, J. Poland, K. Kawamoto, et al. 2011. PLoS ONE 6:e19379.

Leveraging “agrigenomics” for crop improvement

Melaku Gedil and Ismail Rabbi
M. Gedil, Head, Bioscience Center; I. Rabbi, Postdoctoral Fellow (Molecular Genetics), IITA, Ibadan, Nigeria

Harnessing state-of-the-art genomics technologies
The potential of “omics” technologies to alleviate the many constraints on agricultural production, already suggested by the steadily growing impact of the biosciences, is rapidly becoming a reality with the advent of next-generation DNA sequencing and genotyping technologies, high-throughput (HTP) metabolomics and transcriptomics, informatics, and decision-making tools. Together with rapidly evolving bio-computational tools, these technologies are accelerating the discovery of genes and closely linked molecular markers underlying important traits. The result is a rapid accumulation of the genomic resources needed to devise efficient and effective breeding strategies for the faster development of varieties of choice.

Researchers in IITA's Bioscience Center. Photo by L. Kumar.
State-of-the-art technologies, including next-generation sequencing (NGS) for genome and transcriptome analysis and genotyping-by-sequencing (GBS), are being adopted in R4D programs at IITA. Examples include NGS through outsourcing and multi-partner collaboration; RNAseq for HTP expression studies in cassava; Illumina’s GoldenGate assay for HTP single nucleotide polymorphism (SNP) genotyping in cassava, soybean, and maize; and GBS in maize and cassava. Data generated by these techniques are being applied to marker-assisted recurrent selection (MARS) for drought-tolerant maize and to genomic selection (GS) for high-yielding, disease-resistant cassava.

Development of an integrated molecular breeding platform
The new technologies, however, are very data-intensive and demand advanced computational and communication technologies and infrastructure for data acquisition, analysis, and management. For the effective integration of genomics technologies in our breeding schemes, we are building capacity (connectivity to the internet, the necessary hardware/software, and skilled personpower) to acquire, store, and analyze terabytes of data.

The Generation Challenge Program (GCP) of the CGIAR is developing an integrated breeding platform (IBP) to build a comprehensive and integrated crop information system enabling linkages among molecular, phenotypic, and pedigree data. The maize version of the International Crop Information System (ICIS), dubbed the International Maize Information System (IMIS), has been expanded to include all pedigrees of IITA maize under the Drought Tolerant Maize for Africa (DTMA) project. It offers some molecular data storage, but this is limited, and we are now generating data sets of hundreds of thousands of markers per line that require different storage solutions. The GCP is consulting with other initiatives such as iPlant and DArT and is working collaboratively on solutions for the needs of several use cases, including the DTMA, Tropical Legumes (TL)-I, and TL-II projects. In the IBP initiative, IITA is the leading crop center to host the main web-accessible databases of cassava, cowpea, yam, and soybean. The form and functionality of the databases are still a work in progress, although activities are ongoing in the application of current versions of ICIS to cassava, yam, and cowpea.
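As a rough picture of what linking molecular, phenotypic, and pedigree data means in practice, the Python sketch below keeps all three under a single accession identifier so that one query can cross between them. It is purely illustrative and does not represent the ICIS, IMIS, or IBP schema; the class, function names, and records are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class Accession:
    """One line or clone, holding all three kinds of data under a single ID."""
    accession_id: str
    parents: tuple = ()                               # pedigree links (accession IDs)
    phenotypes: dict = field(default_factory=dict)    # trait -> list of observed values
    markers: dict = field(default_factory=dict)       # marker name -> genotype call

def ancestors(db, accession_id):
    """Collect every recorded ancestor of an accession by walking pedigree links."""
    found = set()
    stack = list(db[accession_id].parents)
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            if current in db:
                stack.extend(db[current].parents)
    return found

def trait_mean_by_genotype(db, marker, trait):
    """Average a trait within each genotype class at one marker."""
    groups = {}
    for acc in db.values():
        if marker in acc.markers and trait in acc.phenotypes:
            groups.setdefault(acc.markers[marker], []).extend(acc.phenotypes[trait])
    return {call: mean(values) for call, values in groups.items()}

# Invented records for illustration only.
db = {
    "P1": Accession("P1", markers={"snp_1": "AA"}, phenotypes={"yield": [20.0, 22.0]}),
    "P2": Accession("P2", markers={"snp_1": "GG"}, phenotypes={"yield": [30.0]}),
    "F1": Accession("F1", parents=("P1", "P2"), markers={"snp_1": "AG"},
                    phenotypes={"yield": [26.0, 28.0]}),
    "F2": Accession("F2", parents=("F1", "P2"), markers={"snp_1": "GG"},
                    phenotypes={"yield": [32.0]}),
}
print(sorted(ancestors(db, "F2")))
print(trait_mean_by_genotype(db, "snp_1", "yield"))
```

A dictionary of marker calls per accession is workable for a handful of markers, but it would not scale to the hundreds of thousands of markers per line mentioned above, which is one reason different storage solutions are needed.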

In view of the IBP initiative, we are developing a bioinformatics capacity to (a) manage the newly generated genomic resources of IITA’s research crops, particularly those clonally propagated, (b) use the genomic resources in the public sector for soybean and maize, (c) use comparative genomics techniques for other African orphan crops of high importance, such as cassava, yam, and cowpea, and (d) create a bioinformatics center of excellence to train and provide access for African research scientists.

HTP phenotyping and informatics support tools
The increasing affordability of the NGS technologies has shifted critical consideration from genotyping to phenotyping. According to leading experts, it is now cheaper to genotype than to phenotype a plant. Quality phenotypic data are essential for the interpretation and use of the deluge of genomic data to identify the changes in DNA sequences that influence important traits. The fact that priority agronomic traits are complex and polygenic and interact with the environment necessitates conducting extensive and precise multi-environment evaluations of candidate breeding materials (over several years and in several locations). Therefore, there is a need to invest in precision phenotyping of traits and data capture (from electronic sample tracking to non-invasive HTP) through the use of hand-held devices such as barcode readers and near-infrared spectroscopy. Efforts are being made to develop rapid and accurate phenotyping protocols to integrate with genomic tools in establishing breeding schemes at IITA.

A wide array of techniques and tools is being deployed to associate molecular markers with desirable phenotypic traits. Associated markers can be used to accelerate germplasm enhancement in several ways: marker-assisted backcrossing for the introgression of disease resistance and other simple traits, which can bypass the need to evaluate breeding materials in the field; MARS for rapid-cycle population improvement in bi-parental crosses based on genomic estimated breeding value; and GS based on a model developed with a training population to select untested samples.
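The idea of training a model on one population and using it to select untested samples can be illustrated with a small Python sketch. Ridge regression is used here as one common choice of statistical model for genomic selection; the source does not say which model is used at IITA, so this is an assumption, as are the penalty value `lam`, the function names, and all of the data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def train_ridge(genotypes, phenotypes, lam=1.0):
    """Estimate marker effects from the training population (centred data)."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    x = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    y = [v - y_mean for v in phenotypes]
    # Normal equations with a ridge penalty on the diagonal: (X'X + lam*I) b = X'y
    xtx = [[sum(x[i][a] * x[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    xty = [sum(x[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(xtx, xty), col_means, y_mean

def gebv(model, dosages):
    """Predict a genomic estimated breeding value from marker dosages alone."""
    effects, col_means, y_mean = model
    return y_mean + sum(e * (g - m) for e, g, m in zip(effects, dosages, col_means))

# Invented training population: marker dosages (0/1/2) and a measured trait.
train_genotypes = [[0, 1, 2], [1, 0, 2], [2, 1, 0], [2, 2, 1], [0, 0, 1], [1, 2, 0]]
train_phenotypes = [11.2, 12.9, 17.1, 17.8, 10.1, 15.0]
model = train_ridge(train_genotypes, train_phenotypes)

# Untested seedlings that have been genotyped but never phenotyped.
candidates = {"seedling_1": [2, 2, 0], "seedling_2": [0, 0, 2], "seedling_3": [1, 1, 1]}
predictions = {name: gebv(model, g) for name, g in candidates.items()}
ranking = sorted(predictions, key=predictions.get, reverse=True)
print(ranking)
```

The seedlings are ranked without ever being grown in a field trial. In practice the model would be retrained as new phenotype and marker data accumulate.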

Our efforts to harness the unparalleled scientific progress in the fields of genomics and bioinformatics are expected to find solutions to the recalcitrant problems confronting small-holder farmers in sub-Saharan Africa.