Some of my favorite foods are truffles, and perhaps the best tasting truffle – in my humble opinion – is the famous Périgord Black Truffle, Tuber melanosporum, a prized delicacy capable of fetching a pretty penny.
Tuber melanosporum is an important ectomycorrhizal fungus that can be cultivated with crop trees such as Hazelnut, and other truffles can be cultivated with other nut trees such as Pecan. Despite a concerted effort to understand the biology of T. melanosporum – through a genome sequence and other molecular tools for studying population biology, as well as government efforts to promote cultivation with nut trees – harvests of the Périgord Black Truffle have been declining since the 1970s. Researchers have not agreed on what is causing this decline.
In a brief report entitled “Drought-Induced Decline in Mediterranean Truffle Harvest” in the journal Nature Climate Change, Büntgen et al. recently described how climate change may be affecting truffle production, either directly, or by affecting the biology of the truffle’s host trees. Such measurements are challenging in numerous regards; inspecting climate data is difficult enough, but reports of truffle harvesting are scarce for many reasons, one of which is the fact that many successful truffle collectors are reluctant to give information about their productive grounds.
The authors correlated climate details from 12 climate models with truffle harvests from various parts of Europe (namely Aragón in Spain, Périgord in southern France, and Piedmont and Umbria in Northern Italy). They observed that tree ring growth in Oak trees and truffle production were correlated and showed that increased measurements of summer evapotranspiration could explain both the reduction in plant growth and truffle production.
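To make this kind of comparison concrete, here is a minimal sketch of a Pearson correlation between a climate variable and two response variables. The yearly values and variable names below are invented purely for illustration; they are not data from Büntgen et al., and the authors' actual statistical approach may well differ.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical yearly values, invented for illustration only.
summer_evapotranspiration = [410, 425, 440, 455, 470, 490]  # mm per summer
truffle_harvest = [38, 35, 30, 31, 24, 20]                  # tonnes per year
tree_ring_width = [1.9, 1.8, 1.6, 1.7, 1.4, 1.2]            # mm per year

r_harvest = pearson(summer_evapotranspiration, truffle_harvest)
r_rings = pearson(summer_evapotranspiration, tree_ring_width)

print(f"evapotranspiration vs. harvest:    r = {r_harvest:.2f}")
print(f"evapotranspiration vs. ring width: r = {r_rings:.2f}")
```

With toy numbers like these, both coefficients come out negative, which is the pattern the paper describes: drier summers go with both narrower tree rings and smaller truffle harvests. A correlation alone does not say which causes which, of course.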
The authors hypothesize that tree and fungus competition for summer soil moisture may be reducing the production of truffle sporocarps. Unless the present course of climate change is reversed, it is expected that truffle harvests in Europe will continue to decline. This is bad news not just for the truffles and trees, but also for the people who enjoy both.
UPDATE: The New York Times has posted an article (December 20th) entitled “$1,200 a Pound, Truffles Suffer in the Heat”
The Cucurbitaceae is an agriculturally important family of plants (think melons, pumpkins, cucumbers, squashes, etc.) and one of the most popular species in this family is Watermelon. Watermelon has been cultivated for more than 4,000 years and was most probably spread by nomadic people as a portable source of both water and pre-packaged nutrients. The estimated center of diversity of the Cucurbits is in Southern Africa. Watermelon has many cultivars – more than 200 in production worldwide – with a wide range of phenotypic diversity and a wide area of production that accounts for 7% of land grown for vegetables.
Unfortunately, Cucurbits are generally susceptible to pathogens – most typically bacterial and fungal pathogens. The genomes in this group are starting to pile up, which makes the family an interesting group for comparative genomics studies – particularly in the development of model species for plant pathogen studies.
The recently published paper “The draft genome of watermelon (Citrullus lanatus) and resequencing of 20 diverse accessions” by Guo et al. in the journal Nature Genetics describes the draft genome for the Citrullus lanatus East Asian cultivar 97103. The authors then re-sequenced 20 different watermelon accessions – representing three different sub-species – in order to observe genetic diversity in the wild.
Almost 47 Gb of sequence data was generated using Illumina’s sequencing platforms to give 108X coverage of the relatively small C. lanatus genome, estimated at 426 Mb, while the draft assembly is approximately 353 Mb, or 83.2% of the estimated genome size. Unmapped reads, totaling almost 20% of the sequencing data, could not be accurately assembled into contigs because of regions of genome duplication.
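The arithmetic behind these figures is simple enough to sketch. The snippet below uses the rounded numbers quoted in this post (47 Gb of sequence, a 426 Mb estimated genome, a 353 Mb assembly), so the results land close to, but not exactly on, the 108X and 83.2% reported; I assume the small differences come from rounding of the inputs. The function names are my own.

```python
def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: total bases sequenced divided by genome size."""
    return total_bases / genome_size


def assembled_fraction(assembly_size, genome_size):
    """Fraction of the estimated genome represented in the draft assembly."""
    return assembly_size / genome_size


# Rounded figures as quoted in this post (not the exact values from the paper).
sequenced_bases = 47e9          # "almost 47 Gb"
estimated_genome_size = 426e6   # 426 Mb
draft_assembly_size = 353e6     # ~353 Mb

coverage = fold_coverage(sequenced_bases, estimated_genome_size)
fraction = assembled_fraction(draft_assembly_size, estimated_genome_size)

print(f"approximate coverage: {coverage:.0f}X")
print(f"assembled fraction:   {fraction:.1%}")
```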
The authors estimated 23,440 genes in the watermelon genome – very close to both the cucumber genome (no surprise) and the human genome (surprise). About 85% of the genes from watermelon could be predicted on the basis of homology to other plant genes. The authors did a thorough assessment of transposable elements and various repeats, and classified functional RNAs from ribosomal RNA subunits to microRNAs. Like other plants, watermelon shows gene enrichment in subtelomeric regions. On the basis of comparison to other genome sequences, watermelon possesses the seven paleotriplications shared with the eudicots.
The authors assessed genetic diversity across varieties of C. lanatus by sequencing 20 representative accessions at between 5X and 16X coverage. The estimated diversity of these accessions was considerably lower than that of similar arrays of accessions in maize, soybean, and rice. One explanation of the disease susceptibility of the Cucurbitaceae is this low level of genetic diversity. As a result, one objective of breeding programs for watermelon is to introduce more diversity from wild accessions.
Lastly, the authors assessed a number of key features of the C. lanatus genome (along with the other Cucurbitaceae): vascular transport of water and nutrients along vine-like stems, sugar content and accumulation, and the presence of an interesting non-essential amino acid – originally described from watermelons – called Citrulline.
The watermelon genome database is located both here and here.
I recently returned from the Mycological Society of America annual meeting – this year held at Yale University in New Haven. There were lots of great talks about fungal genomics, systematics, and ecology – and it’s always good to see old mycological friends and make new ones.
Håvard Kauserud of the University of Oslo, who spoke about recent research from his laboratory, gave one of my favorite talks of the meeting. His talk took place during a very rewarding afternoon session on fungal ecology. The Kauserud lab was already highly prolific, and the flood of papers coming out of it has only increased over the last year. Just this month, there’s a nice commentary on the phenomenon of metagenomic tag switching during amplicon sequencing published in the journal Fungal Ecology.
Another paper published this month, in the journal New Phytologist, is the study “Seasonal trends in the biomass and structure of bryophyte-associated fungal communities explored by 454 pyrosequencing”, authored by Davey et al., a group of researchers who are members and affiliates of the Kauserud laboratory, and it is this paper I will address here.
Bryophytes represent a portion of the dominant vegetation in boreal forests, but very little is understood about the taxonomy, seasonality, or biomass of the fungi associated with them. Additionally, microbes associated with mosses may be responsible for nitrogen fixation and nutrient immobilization as epiphytes or on forest soils. A previous study from the Kauserud lab reported high levels of fungal biomass and active plant cell wall degrading enzymes identified from moss-associated fungi.
As I have mentioned here numerous times, fungi are notoriously hard to identify by cultural and morphological means and are extremely diverse. To understand this diversity, the authors performed 454 pyrosequencing of the ITS2 region of the ribosomal DNA operon for molecular taxonomic identification against a database of known fungal sequences. This sequencing was done in concert with an ergosterol HPLC assay that is used to estimate living fungal biomass.
The authors identified a large number of fungi, some presumably moss associated, and the total amount of fungi recognized was comparable to that found in forest soils. The majority of fungi were identified as Ascomycetes, which agrees with other studies investigating vascular plant phyllosphere communities using the primer pair ITS3 and ITS4. Additionally, this study identified a taxonomic profile consistent with that of a previous study from the Kauserud laboratory that used a cloning strategy and Sanger sequencing approach. Not surprisingly, this study reports orders of magnitude more fungi but identified roughly the same groups of fungi (Helotiales, Chaetothyriales, Agaricales, and Tremellales).
The researchers addressed seasonal variation by sampling every eight weeks between April and January over the course of a year. Quite interestingly, this study agrees strongly with other research providing evidence that fungi not only survive under snowpack, but also continue to grow during the winter months. While the researchers found consistent trends with regard to season, there were fluctuations in fungal biomass when considering the host bryophyte. By using principal component analyses, the authors show that the fungal communities are structured mainly by host plant and secondarily by the type of bryophyte tissue that was sampled. This paper is an important contribution to the growing literature that shows that plant-associated fungi are extremely diverse, dynamic, and show complex relationships with host plants.
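For readers unfamiliar with how a principal component analysis can separate communities by host, here is a minimal sketch in plain Python. The abundance table is invented for illustration (two samples each from two hypothetical hosts, three hypothetical fungal groups), and the power-iteration approach is just one simple way to find the first component; it is not the authors' pipeline, which would normally rely on a statistics package.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component of `rows`.

    `rows` is a list of samples, each a list of per-taxon abundances.
    """
    n, p = len(rows), len(rows[0])
    # Center each taxon (column) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between taxa.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration. Start along the first taxon rather than [1, 1, ...]:
    # relative abundances sum to 1, so the all-ones vector lies in the
    # null space of the covariance matrix and would collapse to zero.
    vec = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# Hypothetical relative abundances, invented for illustration only.
abundances = [
    [0.7, 0.2, 0.1],  # host A, sample 1
    [0.6, 0.3, 0.1],  # host A, sample 2
    [0.1, 0.3, 0.6],  # host B, sample 1
    [0.2, 0.2, 0.6],  # host B, sample 2
]

loadings, scores = first_principal_component(abundances)
for label, score in zip(["A1", "A2", "B1", "B2"], scores):
    print(f"{label}: PC1 score = {score:+.3f}")
```

In this toy table the two hosts fall on opposite sides of the first component, which is what "structured mainly by host plant" looks like in an ordination. The sign of a principal component is arbitrary, so only the separation matters, not which host ends up positive.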
You’d have to be living under a rock – as some amphibians do – to not be aware of the massive extinction facing our vertebrate friends living within aquatic habitats. Researchers still don’t fully understand what is causing the amphibian mass-extinction – stress from habitat loss, increased chemical concentrations in the environment, and an auto-immune degrading infection have all been proposed. What is known is that the chytrid fungus Batrachochytrium dendrobatidis – opportunistic or not – is infecting and killing a large number of amphibians.
What is not fully understood about B. dendrobatidis is its pathogenicity and what mechanisms it employs to cause infection. A recent paper, “Species-Specific Chitin-Binding Module 18 Expansion in the Amphibian Pathogen Batrachochytrium dendrobatidis”, published in the journal mBio by John Abramyan & Jason Stajich at UC Riverside, begins to address this pathogenicity. As the authors point out, more than 100,000 species of fungi have been described to date and very few of them are pathogenic. This means that the ability to be pathogenic is derived from somewhere: genome expansion events, gene family duplication and diversification events – and we’re only starting to understand horizontal gene transfer events in fungi. This paper addresses the expansion, across two B. dendrobatidis genomes, of a gene family that is associated with pathogenicity.
When comparing the genomes of B. dendrobatidis with the genomes from other chytrid fungi, there has been an expansion of genes within the family Carbohydrate-Binding Module Family 18 (CBM18). The CBM18 family is a large group of proteins that have been implicated in other fungal pathogenic infections on both plants and animals. The authors here question whether this interesting lineage-specific expansion of CBM18 in B. dendrobatidis could be associated with its virulence on amphibians.
The authors used the CBM18 protein family domain HMM to search across the B. dendrobatidis genomes and found an increase in the number of domains when comparing it to the genome of its closest relative. When constructing phylogenetic trees of the CBM18 gene family, three monophyletic and strongly supported clades emerged. When focusing on divergence of the protein domains, the authors determined that individual domain groups were monophyletic and showed a general pattern with regard to their genome locations.
More specifically, clades of the CBM18 family appear to possess different gene functions, some of which appear to be similar to lectins (LL), tyrosinase/catechol oxidases (TL), and chitin deacetylases (DL). The function of these genes has yet to be experimentally determined, but the authors make some deductions based on DNA sequences. The lectin-like genes may be involved in the sequestering of chitin, which could then be disrupting the amphibian immune response. The tyrosinase/catechol oxidase gene family is associated with melanin synthesis, which could be disrupting the electron transport of the infected amphibians. Lastly, chitin deacetylases may be involved in suppressing the host defense mechanisms that would otherwise limit the fungal infection. The authors plan to continue to elucidate the pathogenicity of B. dendrobatidis in an attempt to understand the ecology and evolution of its infection on amphibians.
The average person in the United States eats more than 10 kilograms of tomatoes a year – underscoring the fact that the fruit is one of the most important plant crops in cultivation. To improve taste, texture, and disease resistance – just to name a few traits – a large consortium of researchers has initiated and provided a draft tomato genome. In fact, the research consortium has published the genome sequence from two varieties of tomatoes: the domesticated inbred Solanum lycopersicum strain Heinz 1706 – the variety famous for ketchup – and the wild breeding Peruvian ancestor, Solanum pimpinellifolium.
The consortium published the draft genome sequences with a paper entitled “The tomato genome sequence provides insights into fleshy fruit evolution” in the journal Nature. The consortium started sequencing the genome officially in 2003, but heterozygosity and duplication events made assembling the genome difficult. The tomato genome is approximately 900 Mb – smaller than the human genome – but certainly not small by eukaryotic standards. Genetically and phenotypically diverse, the genus Solanum is one of the largest in the angiosperms.
The genomes of Solanum lycopersicum and S. pimpinellifolium only show 0.6% divergence and there is evidence of recent hybridization between the two species. Both species show approximately 8% genome divergence when compared with their close relative potato, Solanum tuberosum. Across the genus Solanum there have been two genome triplications with subsequent gene loss: one genome triplication is ancient and shared with the rosid clade, and another triplication is shared within the Solanaceae, whose genomes appear to be highly syntenic across the family. The genomes were completed with both Sanger- and Illumina-derived sequences and assembled with the help of physical and genetic maps developed from a long history of tomato breeding efforts.
There are 34,727 and 35,004 genes identified across the genomes of Solanum lycopersicum and S. pimpinellifolium respectively. These findings are similar to other plant genomes as 8,615 of these genes are found to be common to tomato, potato, rice, grape, and Arabidopsis. Expression was assessed by replicated RNA-Seq of root, leaf, flower, and fruit tissues. A total of 18,320 orthologous gene pairs were found in tomato and potato indicating diversifying selection between the two species of Solanum.
The consortium specifically compared tomato to grape in this study, as grape and tomato shared a common ancestor approximately 100 million years ago, before the first whole genome triplication event that preceded the rosid-asterid divergence. Additionally, both grape and tomato have similar molecular fruit maturation mechanisms. When comparing the genomes of tomato and grape, approximately 73% of gene models are orthologous. By dating genome triplication events, the researchers conclude that the genome triplication within the Solanaceae occurred roughly 71 million years ago, well before the tomato-potato divergence approximately 7 million years ago.
Having a draft genome sequence is an important step toward understanding the molecular biology of the tomato plant. Genome duplication events gave rise to the diversification of genes responsible for enhanced fruit physiological and chemical development – such as lycopene synthesis – and include photoreceptors and transcription factors that influence fruit ripening. Additionally, tomato has had a contraction in the number of gene families associated with toxic alkaloid synthesis – the chemical hallmarks of many members of the Solanaceae. One interesting question not answered by this research is the genomic mechanism by which the tomato regulates nutrient investment in above-ground fruits while the potato regulates starch investment in below-ground tubers.
These two tomato genomes, along with the genomes of fellow Nightshades completed or in the works (potato, pepper, tobacco, petunia, eggplant, etc.), will help breeders to develop traits desired by producers, like long shelf life, and fruit quality traits desired by tomato consumers, such as taste, color, and texture. In addition to these benefits, the draft tomato genomes will provide insights into the biology and nutrition of the Solanaceous plants, and provide more information for comparative genomics within this economically important group of plants.
The understanding of effector proteins has advanced by leaps and bounds in the last few years. Secreted by microorganisms interacting with plants, these small proteins enter a host cell and modify its physiology, most notably by suppressing or activating host-directed immunity. Fungal effector proteins have been characterized in the pathogen infection process as well as in the suppression of host defenses in mutualistic associations such as mycorrhizae. There is a recently published book, edited by Francis Martin & Sophien Kamoun, which addresses the current state of knowledge on the biology of microbial effector proteins.
Published on January 6th in the journal PLOS One, the paper “Using Hierarchical Clustering of Secreted Protein Families to Classify and Rank Candidate Effectors of Rust Fungi” authored by Saunders et al., seeks to describe unknown effector proteins by exploring the diversity of secreted proteins of rust fungi. The Kamoun Lab has been at the forefront of understanding effector biology in the fungi, and this paper is a significant contribution to understanding how rust fungi invoke pathogenicity on their host plants.
Rust fungi are a monophyletic group of pathogens which cause damage on many economically important crop plants. In this study, the authors investigated two pathogens with sequenced genomes: Puccinia graminis f. sp. tritici, the cause of wheat stem rust, and Melampsora larici-populina, the cause of poplar leaf rust. By developing an analysis pipeline, the authors inspected the secretome of both fungi to search for putative effector proteins and describe their structure and possible mode of entry into a plant. Few plant defense mechanisms have been identified for rust fungi. This study was considered a preliminary step to identify candidate rust effectors in the eventual selection of new resistance (R) genes in plant breeding programs and genome sequencing initiatives.
Eight families of putative effector proteins were identified in the secretomes of P. graminis f. sp. tritici and M. larici-populina, and a total of 6663 proteins were identified by the pipeline, with 2826 proteins containing secretion signal peptide regions. Analysis of the protein motifs identified several conserved cysteine motifs common to other effector proteins previously characterized from fungal and oomycete plant pathogens. Not surprisingly, the authors identified many previously unrecognized proteins with domains that exhibited similarity to known pathogenicity-related or haustorial-expressed fungal proteins. Both P. graminis f. sp. tritici and M. larici-populina showed differences in the types of effectors secreted and the numbers of each putative effector tribe.
As I mentioned before, this study should be considered a first step in the identification of pathogenicity-related effector proteins from rust fungi. Next steps would include wet lab characterization and experimental validation for putative effectors identified in this paper. Additional studies will be needed to address the functional expression of these proteins, as well as the R genes expressed in planta, during the infection process initiated by the rust fungus. This paper provides an interesting priority list for further studies in this rapidly advancing area of research on the intimacies of the plant-fungus interaction.
One of the most exciting aspects of the genomic revolution in biology is understanding the genetic mechanisms behind an organism’s natural history. Within the fungi it has been fascinating to begin to tease apart how seemingly similar life histories can be explained by disparate genomic landscapes, or conversely, how completely different modes of survival can be explained by a common set of genes. There has been a steady increase in research over the last few years documenting a large array of tools in the genomic toolbox.
A recently published paper in the journal PLOS One, authored by Andrew et al., entitled “Evidence for a Common Toolbox Based on Necrotrophy in a Fungal Lineage Spanning Necrotrophs, Biotrophs, Endophytes, Host Generalists and Specialists” comes from the Kohn Lab at the University of Toronto, which has long studied the fungal family Sclerotiniaceae. The Sclerotiniaceae is a group of Ascomycetes known for being typically necrotrophic pathogens which are host generalists or specialists.
This is a great group of fungi to address questions like: why is one species necrotrophic while another is just biotrophic? …and why are some fungi specialists, while others generally infect plants without regard to taxonomy? To address these questions the authors sampled across 52 strains of fungi in the Sclerotiniaceae representing 30 taxa covering the spectrum of host specificity/generality and trophic types. They chose a suite of genes, some responsible for general cell housekeeping (used as controls) and others associated with pathogenicity, and constructed phylogenies of these genes to observe relationships between these 52 strains. Evidence of positive selection acting on these genes, as well as site-specific selection, was also assessed. Lastly, the authors assessed the pathogenicity of these strains by initiating infection studies on Arabidopsis thaliana plants.
The authors found that there are at least two origins of biotrophy from a necrotrophic ancestor in the Sclerotiniaceae and that there is evidence of selection on all the genes associated with pathogenicity tested in this study. The housekeeping genes in this study served as controls in the phylogenetic analysis and, unlike the pathogenicity genes, showed no evidence of positive selection. Furthermore, likelihood analyses showed no statistical differences in the genes of strains from different trophic lifestyles, or between host generalists and specialists.
Within this study on the Sclerotiniaceae, it appears that there is a common toolbox of genes shared by the fungal strains studied here. The level of expression of genes differed in this study, which could explain trophic and host specificity differences exhibited by these fungi. It will be interesting when we have more genome sequences from this family to see how genome structure and rearrangements have contributed to the expression and diversity of genes associated with necrotrophy in this family of fungi.
I recently wrote about a paper that surveyed the diversity of bacteria in public restrooms using metagenomic techniques. While that paper focused on bacteria on bathroom surfaces, another recent paper – “Widespread Occurrence of Diverse Human Pathogenic Types of the Fungus Fusarium Detected in Plumbing Drains”, authored by Dylan Short and colleagues – focused specifically on probing the diversity of the large Ascomycete genus Fusarium found in sink drains, with specific focus on isolates that are human pathogens.
The authors sampled 471 drains – more than 95% of which were from public bathroom sinks – from 131 buildings throughout the mid-eastern to southern United States (and California too). They selectively isolated Fusarium species from sink drains using cotton swabs and then streaked petri plates of Nash-Snyder Agar, which is a semi-selective medium containing the fungicide pentachloronitrobenzene. The plates were inspected after the fungi had some time to grow, were propagated, and then verified as Fusarium species using microscopic morphology and DNA sequencing.
Six different loci – translation elongation factor (TEF), the internal transcribed spacer region (ITS) into the large ribosomal subunit (LSU), the nuclear rDNA intergenic spacer region (IGS), the RNA polymerase II large subunit (RPB2), a portion of the alpha-tubulin (TUB) gene, and calmodulin (CAM) – were sequenced using Sanger sequencing to assess the diversity of Fusarium in the sink drains. The sequence data was compared to an extensive database of the genus Fusarium maintained by the Geiser Lab and others.
Fusarium species were extremely common in sink drains; 66% of the sink samples – and 82% of all the buildings sampled – yielded at least one isolate. These isolates could largely be placed within three Fusarium species complexes: the Fusarium solani species complex (62% of samples), the F. oxysporum species complex (28%), and the F. dimerum species complex (8.5%). Sink drains from 91% of private residences and 80% of public buildings yielded Fusarium isolates. Of all the buildings that yielded Fusarium within sink drains, approximately 80% contained one of the six major isolates recognized from human infections.
It is interesting to note that human infections from Fusarium species are rare, but the six most common Fusarium isolates found in sink drains are also the six most common involved in human infection. The authors note that it’s apparent that people are in constant contact with these fungi within indoor environments. It’s also notable that novel species complexes were identified using these techniques and that there was a wide phylogenetic breadth to the Fusarium isolates that were sampled from sink drains.
This paper is a substantial contribution to the growing literature documenting fungi in the indoor environment. The next step would be to use metagenomic techniques – and marker loci for fungi to encompass a meta-taxonomic assessment – to identify all the fungi found in sink drains.
I recently came across an article I found interesting about a widespread geological mystery I was not previously aware of: the presence of filamentous microfossils found worldwide in sediments from the Permian-Triassic boundary transition. It has long been debated whether these fossils represent filamentous Ascomycete fungi or freshwater Zygnematacean algae, based on their morphology and chemistry. Either choice represents a drastically different scenario for the environmental change that occurred 250 million years ago on a global scale. Could the predominance of this organism be the cause of massive plant destruction or the effect of plant destruction from flooding, which is also characteristically found at the end of the Permian period?
In a paper entitled “Fungal virulence at the time of the end-Permian biosphere crisis?”, published in the journal Geology, a group of researchers pushes the argument toward identifying these fossils – the morphospecies named Reduviasporonites stoschianus – as ancient relatives of the asexually reproducing fungus Rhizoctonia.
Levels of 13C in the fossils do not exclude them from being either fungi or algae, but nitrogen isotope composition would point to a fungal lifestyle. Cellulosic walls of known filamentous green algae are usually not geologically preserved as well as those identified as Reduviasporonites. Since there have been no conclusive chemical studies of these microfossils, the authors here rely on microscopic morphological comparisons.
Reduviasporonites stoschianus is found in more than 90% of many geological formations at the time of the Permian–Triassic boundary. These organisms formed characteristic “barrel”-shaped filaments anywhere between 10 and 90 μm, which look like the monilioid hyphae typified by Rhizoctonia. This article states that Rhizoctonia “are mostly Basidiomycota, but some represent Ascomycota”, which is incorrect. Rhizoctonia are placed in the family Ceratobasidiaceae, which is in turn placed in the order Cantharellales of the Basidiomycetes.
It’s certainly difficult to tell the extent of pathogenicity on a host from fossilized material, as there are few real observations of the invasion of plant tissues. Furthermore, by observing fossils you cannot be certain whether the presence of these putative fungi is the cause of plant death or a symptom of decline. While it’s difficult to determine virulence based on fossil evidence, this paper introduces some interesting speculative evidence.
Legumes are a very successful lineage of plants which have developed associations with soil microbes, most notably endosymbiotic nitrogen fixing bacteria. Nitrogen fixation takes place in specialized plant root structures called nodules. Published online on November 16th in the journal Nature was the article “The Medicago genome provides insight into the evolution of rhizobial symbioses” by Young et al. (Another paper concerning the Medicago genome recently appeared in the journal PNAS.) Medicago truncatula, the plant sequenced in this paper, is related to the economically important crop alfalfa (Medicago sativa) and is a commonly used model plant to study above and below ground plant biology, most notably interactions with symbiotic microorganisms.
The Medicago genome (like most genomes) is still in the draft stage. Through the use of bacterial artificial chromosomes (BACs) and direct sequencing of genomic DNA, the researchers estimate the genome of Medicago is upwards of 350 Mb in length. As an estimation of the completeness of the M. truncatula genome, approximately 94% of expressed genes (as ESTs) map to the draft genome. The estimated number of genes for M. truncatula is 62,388, with an average gene size of 2,211 base pairs and an average of 4 exons per gene. These numbers seem to be in the same “ballpark” as, or perhaps larger than, those of the genomes of Poplar, Rice, and Arabidopsis.
The sequencing of numerous plant genomes, including M. truncatula here, indicates a whole genome duplication event which occurred prior to the split of the rosids from the asterids approximately 150 million years ago. Another whole genome duplication event occurred approximately 60 million years ago in the Legumes, which yielded several subclades, with Medicago being placed in the Hologalegina clade.
Significant synteny is shared between Medicago and the genomes of other sequenced legumes, Glycine max and Lotus japonicus. A common ancestor of the legumes underwent a whole genome duplication event, occurring approximately 58 million years ago, and as a result, specific euchromatic regions of Medicago share synteny with numerous regions in each of the Lotus and Glycine genomes, as well as other regions of the Medicago genome. Additionally, due to a pre-Rosid whole genome duplication event, the genome of Medicago shows synteny to the grape genome in at least three elongated regions.
There has been a high rate of local gene duplication events – some by tandem duplication – in the Medicago genome, and these events are approximately three fold higher than Glycine and one and a half times greater than both Populus and Arabidopsis. Gene duplication events in Medicago could explain the average to above average number of genes observed in the genome. Based on the estimated time of origin for the legumes, Medicago has undergone synonymous substitutions at a rate almost twice that of the average rate of vascular plants.
Production of a specialized organ, the root nodule, in many members of the legumes is a trait with both ecological importance and human agricultural interest. Through the structure of the root nodule, leguminous plants harbor anaerobic actinorhizal bacteria which are capable of fixing atmospheric nitrogen. It appears that the trait of nodulation has evolved numerous times in the Fabales, and was reliant on whole genome duplication events which allowed the emergence of novel gene functions from redundant genes.
There are numerous plant genomic features present in the Legumes with regard to signaling with symbiotic microorganisms, such as nitrogen fixing bacteria and mycorrhizal fungi. Duplicated genes have evolved roles in nodule formation (the genes NFP and ERN1) and mycorrhizal colonization (the genes LYR1 and ERN2). The researchers used RNA-Seq data from six different plant organs to differentiate gene expression of putative whole genome duplicated paralogs. Not surprisingly for Medicago, roots had the highest amount of differential expression of paralogous genes, followed by flower, nodule, leaf, seed, and flower bud. Transcription factors, putatively responsible for tissue differentiation in gene expression, were estimated to be 6% of all Medicago genes.
I generally don’t work with Arabidopsis, but have used it as a model plant in previous experiments. I’m currently contributing to a genome sequencing project for a member of the Brassicales, so I’ve been getting up to speed on some of the recent research. Just in the last year, numerous papers have touched on genome evolution across this important plant family, and I’ll be highlighting a few of these papers in a series of posts on genomics of the Brassicales.
Two recent papers in the journal Science have highlighted the use of Arabidopsis thaliana in understanding genomic responses to climatic and local adaptation. The paper by Fournier-Level et al., “A Map of Local Adaptation in Arabidopsis thaliana”, and the paper by Hancock et al., “Adaptation to Climate Across the Arabidopsis thaliana Genome”, are novel because they correlate well-characterized genome wide analysis of markers, such as SNPs, to estimates of fitness. Both papers were summarized by Outi Savolainen in a commentary titled “The Genomic Basis of Local Climatic Adaptation” in the same issue.
Fournier-Level et al. estimated survival and reproductive fitness from 150 different reciprocal Arabidopsis transplants from four research sites across Europe. By measuring SNPs in plants from each environment, the researchers found different polymorphisms associated with fitness for differing environments. This finding suggests that environmentally induced genomic variation is plastic and that natural selection has a strong effect across these genomes.
In a separate but complementary study, Hancock et al. observed more than 200,000 SNPs for 1000 Arabidopsis accessions isolated from a wide selection of unique geographic locations. Using two approaches, the researchers found that, one, there was a correlation between polymorphisms within general protein-coding regions and climate-associated SNPs, and, two, that the climate-associated SNPs could explain some level of fitness in field experiments.
These papers addressed SNP data, but an interesting angle would also be to determine if genomic rearrangements are found in locally adapted populations of Arabidopsis (for a recent study of genomic rearrangements see here). These types of studies are vital to understanding the genomic foundations underpinning survival in the environment. With greater and more refined genome-wide resources, especially tied to environmental phenotypic plasticity, researchers are poised to make great contributions to our understanding of the interactions of the genome and environment.
One of the hurdles to the production of cellulosic biofuel is the economical breakdown of plant biomass. Currently, fungi used to break down plant biomass operate at, or slightly above, room temperature. Chemical reactions at room temperature proceed slowly, are less efficient, and may be riddled with contaminating fungi which lower the efficiency of the breakdown process. One scientific goal is to increase the heat in bioreactors with the hopes of speeding up the degradation using efficient fungal enzymes that operate at higher temperatures.
In an effort to find thermostable fungal degradative enzymes, researchers have sequenced the genomes of two fungi, Thielavia terrestris and Myceliophthora thermophila, known for their ability to survive at high temperatures, namely 40 °C to 75 °C. A report entitled “Comparative Genomic Analysis of the Thermophilic Biomass-Degrading Fungi Myceliophthora thermophila and Thielavia terrestris” has been published online on October 2nd in the journal Nature Biotechnology.
M. thermophila (38.7 Mbp genome) and T. terrestris (36.9 Mbp genome) are the first thermophilic eukaryotes to have their genomes sequenced, and the assemblies contain seven and six complete chromosomes, respectively. The genome of M. thermophila contains 9,110 protein-coding genes and there are 9,813 such genes in the genome of T. terrestris. Both filamentous Ascomycetes – placed in the class Sordariomycetes and family Chaetomiaceae – have a similar level of genomic organization, barring numerous translocations and inversions. When considering the three species with sequenced genomes in the Chaetomiaceae, large portions of the genomes, some of which are greater than 6000 contiguous genes, are shared in syntenic blocks.
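As a quick way to compare the two genomes, here is a minimal Python sketch that turns the genome sizes and gene counts quoted above into a gene density (genes per Mbp). The function and variable names are my own, and gene density is just a convenient summary I am assuming is useful here, not a statistic taken from the paper.

```python
# Genome sizes (Mbp) and protein-coding gene counts as quoted in the post.
GENOMES = {
    "Myceliophthora thermophila": {"size_mbp": 38.7, "genes": 9110},
    "Thielavia terrestris": {"size_mbp": 36.9, "genes": 9813},
}


def gene_density(size_mbp: float, genes: int) -> float:
    """Return the number of protein-coding genes per Mbp of genome."""
    return genes / size_mbp


# Compute the density for each fungus from the quoted figures.
densities = {name: gene_density(**stats) for name, stats in GENOMES.items()}

for name, density in densities.items():
    print(f"{name}: about {density:.0f} genes per Mbp")
```

On these numbers T. terrestris packs somewhat more genes into a slightly smaller genome, roughly 266 versus 235 genes per Mbp.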
Enzymes for the breakdown of plant matter – which can include a wide array of materials from agricultural and forestry waste, recycled pulp and paper products, leaves, etc. – were discovered across the genomes of both T. terrestris and M. thermophila. These enzymes include numerous carbohydrate-active proteins (CAZymes) which include enzymes in the glycoside hydrolase, polysaccharide lyase, carbohydrate esterase, and glycosyl transferase families. With some slight differences in regard to the breakdown of specific plant polysaccharides, such as pectin, both fungi can be categorized as general decomposers with regards to their enzyme repertoire.
The researchers then tested the expression of some enzymes identified in these newly sequenced fungal genomes, as well as comparing their diversity to well characterized enzymes from Trichoderma reesei. Differing from T. reesei, both M. thermophila and T. terrestris have exhibited a proliferation in the GH61 enzyme family, responsible for the degradation of plant cell wall polysaccharides, as well as the GH10 and GH11 xylanase gene families. The researchers used RNA-Seq to compare the expression of these enzymes on differing plant materials, such as alfalfa and barley straw, which represented characteristic dicot and monocot plants, respectively. While there are noticeable differences to the degradation of plant material from dicots and monocots by both T. terrestris and M. thermophila, orthologs from both fungal genomes show similar patterns of gene expression, particularly when growing on complex plant substrates.
Research commentaries on this publication can be found here and here.
The October issue of the journal New Phytologist contains a commentary article by a group of plant scientists who conducted a survey to identify the 100 most pressing scientific questions facing plant biologists. The article “One Hundred Important Questions Facing Plant Science Research” is very thought provoking.
I’ve replicated the questions here for you to read and ponder. I know the list is heavy on the text, but I think these questions are worthy of the space. You should definitely then read their article (and supplementary commentary) and see how they have collectively addressed these questions. They may have addressed these questions in their commentary, but these questions are far from answered and may demand many careers to answer fully.
Most important questions relating to plants and society:
1. How do we feed our children’s children?
2. Which crops must be grown and which sacrificed, to feed the billions?
3. When and how can we simultaneously deliver increased yields and reduce the environmental impact of agriculture?
4. What are the best ways to control invasive species including plants, pests and pathogens?
5. Considering two plants obtained for the same trait, one by genetic modification and one by traditional plant breeding techniques, are there differences between those two plants that justify special regulation?
6. How can plants contribute to solving the energy crisis and ameliorating global warming?
7. How do plants contribute to the ecosystem services upon which humanity depends?
8. What new scientific approaches will be central to plant biology in the 21st Century?
9. (a) How do we ensure that society appreciates the full importance of plants? (b) How can we attract the best young minds to plant science so that they can address Grand Challenges facing humanity such as climate change, food security, and fossil fuel replacement?
10. How do we ensure that sound science informs policy decisions?
11. How can we translate our knowledge of plant science into food security?
12. Which plants have the greatest potential for use as biofuels with the least effects on biodiversity, carbon footprints and food security?
13. Can crop production move away from being dependent on oil-based technologies?
14. How can we use plant science to prevent malnutrition?
15. How can we use knowledge of plants and their properties to improve human health?
16. How do plants and plant communities (morphology, color, fragrance, sound, taste etc.) affect human well-being?
17. How can we use plants and plant science to improve the urban environment?
18. How do we encourage and enable the interdisciplinarity that is necessary to achieve the UN’s Millennium Development Goals which address poverty and the environment?
Most important questions relating to environment and adaptation:
1. How can we test if a trait is adaptive?
2. What is the role of epigenetic processes in modulating response to the environment during the life span of an individual?
3. Are there untapped potential benefits to developing perennial forms of currently annual crops?
4. Can we generate a step-change in C3 crop yield through incorporation of a C4 or intermediate C3/C4 or crassulacean acid metabolism (CAM) mechanism?
5. How do plants regulate the proportions of storage reserves laid down in various plant parts?
6. What is the theoretical limit of productivity of crops and what are the major factors preventing this being realized?
7. What determines seed longevity and dormancy?
8. How can we control flowering time?
9. How do signaling and cross-talk between the different plant hormones operate?
10. Can we develop salt/heavy metal/drought-tolerant crops without creating invasive plants?
11. Can plants be better utilized for large-scale remediation and reclamation efforts on degraded and/or toxic land?
12. How can we translate our knowledge of plants and ecosystems into ‘clever farming’ practices?
13. Can alternatives to monoculture be found without compromising yields?
14. Can plants be bred to overcome dry land salinity or even reverse it?
15. Can we develop crops that are more resilient to climate fluctuation without yield loss?
16. Can we understand (explain and predict) the succession of plant species in any habitat, and crop varieties in any location, under climate change?
17. To what extent are the stress responses of cultivated plants appropriate for current and future environments?
18. Are endogenous plant adaption mechanisms enough to keep up with the pace of man-made environmental change?
19. How can we improve our cultivated plants to make better use of finite resources?
20. How do we grow plants in marginal environments without encouraging invasiveness?
21. How can we use the growing of crops to limit deserts spreading?
Most important questions relating to plant species interactions:
1. What are the best ways to control invasive species including plants, pests and pathogens?
2. Can we provide a solution to intractable plant pest problems in order to meet increasingly stringent pesticide restrictions?
3. Is it desirable to eliminate all pests and diseases in cultivated plants?
4. What is the most sustainable way to control weeds?
5. How can we simultaneously eradicate hunger and conserve biodiversity?
6. How can we move nitrogen-fixing symbioses into non-legumes?
7. Why is symbiotic nitrogen fixation restricted to relatively few plant species?
8. How can the association of plants and mycorrhizal fungi be improved or extended towards better plant and ecosystem health?
9. How do plants communicate with each other?
10. How can we use our knowledge of the molecular biology of disease resistance to develop novel approaches to disease control?
11. What are the mechanisms for systemic acquired resistance to pathogens?
12. When a plant resists a pathogen, what stops the pathogen growing?
13. How do pathogens overcome plant disease resistance, and is it inevitable?
14. What are the molecular mechanisms for uptake and transport of nutrients?
15. Can we use non-host resistance to deliver more durable resistance in plants?
Most important questions relating to the understanding and utilization of plant cells:
1. How do plant cells maintain totipotency and how can we use this knowledge to improve tissue culture and regeneration?
2. How are growth and division of individual cells coordinated to form genetically programmed structures with specific shapes, sizes and compositions?
3. How do different genomes in the plant talk to one another to maintain the appropriate complement of organelles?
4. How and why did multicellularity evolve in plants?
5. How can we improve our understanding of programmed developmental gene regulation from a genome sequence?
6. How do plants integrate multiple environmental signals and respond?
7. How do plants store information on past environmental and developmental events?
8. To what extent do epigenetic changes affect heritable characteristics of plants?
9. Why are there millions of short RNAs in plants and what do they do?
10. What is the array of plant protein structures?
11. How do plant cells detect their location in the organism and develop accordingly?
12. How do plant cells restrict signaling and response to specific regions of the cell?
13. Is there a cell wall integrity surveillance system in plants?
14. How are plant cell walls assembled, and how are their strength and composition determined?
15. Can we usefully implant new synthetic biological modules in plants?
16. To what extent can plant biology become predictive?
17. What is the molecular/biochemical basis of heterosis?
18. How do we achieve high-frequency targeted homologous recombination in plants?
19. What factors control the frequency and distribution of genetic crossovers during meiosis?
20. How can we use our knowledge about photosynthesis and its optimization to better harness the energy of the sun?
21. Can we improve algae to better capture CO2 and produce higher yields of oil or hydrogen for fuel?
22. How can we use our knowledge of carbon fixation at the biochemical, physiological and ecological levels to address the rising concentrations of atmospheric CO2?
23. What is the function of the phenomenal breadth of secondary metabolites?
24. How can we use plants as the chemical factories of the future?
25. How do we translate our knowledge of plant cell walls to produce food, fuel and fibre more efficiently and sustainably?
Most important questions relating to plant diversity:
1. How much do we know about plant diversity?
2. How can we better exploit a more complete understanding of plant diversity?
3. Can we increase crop productivity without harming biodiversity?
4. Can we define objective criteria to determine when and where intensive or extensive farming practices are appropriate?
5. How do plants contribute to ecosystem services?
6. How can we ensure the long-term availability of genetic diversity within socio-economically valuable gene pools?
7. How do specific genetic differences result in the diverse phenotypes of different plant species? That is, why is an oak tree an oak tree and a wheat plant a wheat plant?
8. Which genomes should we sequence and how can we best extract meaning from the sequences?
9. What is the significance of variation in genome size?
10. What is the molecular and cellular basis of plants’ longevity and can plant life spans be manipulated?
11. Why is the range of life spans in the plant kingdom so much greater than in animals?
12. What is a plant species?
13. Why are some clades of plants more species-rich than others?
14. What is the answer to Darwin’s ‘abominable mystery’ of the rapid rise and diversification of angiosperms?
15. How has polyploidy contributed to the evolutionary success of flowering plants?
16. What are the closest fossil relatives of the flowering plants?
17. How do we best conserve phylogenetic diversity in order to maintain evolutionary potential?
I already mentioned (here and here) the New Phytologist Symposium on Bioenergy Trees, but I’d like to let you know that my meeting commentary has been published in the journal. I highly recommend attending one of the many New Phytologist Symposia based on their intimate size and the excellent quality of speakers.
A paper published in the recent issue of PNAS “Insights into the oxidative degradation of cellulose by a copper metalloenzyme that exploits biomass components” by Quinlan et al. succeeds in characterizing an important aspect of the breakdown of cellulose by enzymes. I’m interested in the use of cellulose for bioenergy purposes, but one of the major problems in its use is the extreme recalcitrance of the polysaccharide. By fully understanding the enzymatic mechanisms of the breakdown of cellulose we can surpass a major scientific and economic challenge for the effective release of bioenergy from biomass.
Cellulose is typically broken down by fungi employing a suite of different enzymes. These enzymes are traditionally placed into two classes: endoglucanases and cellobiohydrolases. In this paper, the authors identify the enzymatic abilities of a newly recognized enzyme class, called the GH61 glycoside hydrolases (see Harris et al. for more information on GH61 glycoside hydrolases). The GH61 glycoside hydrolases greatly increase the efficiency of the endoglucanases and cellobiohydrolases, and recent genome sequencing of brown rot fungi, such as Postia placenta, has revealed numerous GH61 glycoside hydrolases.
The authors describe the 3D structure of a GH61 glycoside hydrolase from Thermoascus aurantiacus, identifying the active site details and catalytic activity of the enzyme. The GH61 glycoside hydrolase enzymes were found to be oxidizing agents, and the authors show that they degrade cellulose directly. Furthermore, the authors identify copper as the metal cofactor of the enzyme and show a unique methyl modification of a metal-coordinating histidine residue.
See here for commentary on the paper.
I found the Li et al. paper – “Structural Variation in Two Human Genomes Mapped by Whole Genome de novo Assembly” – published in the August issue of Nature Biotechnology interesting for a number of reasons. As someone mainly interested in fungal and plant genomics this paper is somewhat outside my research focus, but I found both the novel approach to de novo genome assembly and the emphasis on structural genome variation over single nucleotide polymorphisms (SNPs) in explaining genetic diversity to be very interesting.
By using short read sequencing technology from the Illumina platform, the researchers began by sequencing the genomes of two individuals, one person of African descent (NA18507) and one of Asian descent (YH). As with many genome sequencing studies, there were numerous problems during the assembly process, such as alignment accuracy, recovery of long contiguous stretches of nucleotides, stretches of low or no coverage, and identifying sequencing background noise. The authors tried to eliminate these issues by developing a strategy focusing on de novo assembly instead of mapping reads to reference genomes.
The novel pipeline was able to identify structural variants – such as insertions, deletions, rearrangements, inversions, etc. – in each of the homozygous assembled genomes, some of which were upwards of 23,000 base pairs in length. The researchers then validated the structural variations using both experimental and computational methods, and, using data generated for the 1000 Human Genomes Project, they mapped their identified structural variations in the genomes of 106 other individuals.
While SNPs are easier to observe (perhaps the reason why they have been emphasized so much in recent years?) it seems that structural rearrangements are perhaps the major form of variation in human genomes, and maybe in all genomes. Structural variations were less common than SNPs, but are more individual specific and appear to be associated with phenotypic characteristics. A next research direction would be to observe the association of structural variations to disease traits or susceptibility.
This paper also suggests that accurately assembling long genomic regions is very important to understanding structural variation. This can be accomplished by either using technologies that naturally generate longer reads (e.g. Sanger or PacBio sequencing) or ensuring that short reads can be accurately assembled by computational methods.
As an aside: this group at BGI (formerly the Beijing Genomics Institute) also sequenced the Giant Panda genome.
This week, it’s been hard to miss the new paper, “How many species are there on Earth and in the Ocean?” published by Mora et al. in the August 2011 issue of the journal PLOS Biology. There have been commentaries or news articles printed in the New York Times, The Economist, The Guardian, Damian Carrington’s Guardian Blog, National Geographic, Yahoo News, AlterNet, MSNBC, Reuters, UNEP, NewsDaily and Ed Yong has posted a commentary on his Google+ page. Furthermore, some well respected scientists who study biological diversity have joined the debate too: Jonathan Eisen has devoted two blog posts to the paper (one about the actual paper in PLOS Biology and another on the National Geographic commentary) and there is a commentary from Robert May in PLOS Biology about the study and its significance. Since there is ample information on the study elsewhere, let me communicate a brief summary of the study and some of my feelings about the paper.
It’s quite embarrassing that we have really no clue how much biological diversity is found on this planet. Adding insult to injury is the fact that we have no concept of the current magnitude of the loss of diversity due to human induced mass extinctions. This paper seeks to predict total global biological diversity by documenting current taxonomic numbers and extrapolating consistent patterns to estimate the number of species that have yet to be identified.
The methods of the authors essentially consisted of three parts. First, the authors compiled a list of approximately 1.2 million species pulled from numerous biological databases. Second, they surveyed a little over 500 taxonomists who were asked to identify the validity of current scientific names and comment on the intensity of current taxonomic efforts to describe new species. Third, the authors analyzed this data to find the estimated global numbers of biological taxa for each phylum.
The authors show a predictable pattern in the classification of species (at the phylum, class, order, family, and genus level), at least consistently for animals. They evaluated these patterns using regression and validated the approach by closely examining 18 taxonomic groups whose total biological diversity we think we understand. By doing this, the authors come up with a total estimate of 7.7 million species of animals (mostly insects), close to 300,000 species of plants, more than 600,000 species of fungi, and a total estimate of roughly 9 million eukaryotes on Earth. The authors estimate that 86% of species on Earth and 91% of species in the oceans still have not been formally described. Previous estimates of species diversity have ranged widely: anywhere from 3 million to 100 million species.
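To see how the headline percentage hangs together, here is a small Python sketch using only the rounded figures quoted in this post: the catalogue of roughly 1.2 million species and the estimate of roughly 9 million eukaryotes. The names are my own and the inputs are rounded, so the result only approximates the authors' 86%.

```python
# Rounded figures quoted in the post (not the paper's exact values).
CATALOGUED_SPECIES = 1.2e6   # species compiled from existing databases
ESTIMATED_TOTAL = 9.0e6      # rough estimate of all eukaryotic species


def undescribed_fraction(catalogued: float, estimated_total: float) -> float:
    """Return the share of estimated species that lacks a formal description."""
    return 1.0 - catalogued / estimated_total


fraction = undescribed_fraction(CATALOGUED_SPECIES, ESTIMATED_TOTAL)
print(f"undescribed: {fraction:.1%}")
```

With these rounded inputs the undescribed share comes out near 87%, in line with the 86% reported for species on Earth.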
This paper is a novel and worthwhile attempt to determine the total amount of species diversity on this planet. Despite this, I think – and the authors have their own reservations – that there are some serious problems with some of their calculations.
One problem is that the study is based mainly on using animals – and vertebrates for that matter, which are the best described of any phylum – as the baseline for measuring the completeness of species diversity. I would argue that plants and fungi, and obviously bacteria, archaea, and “the protists”, are clearly not well known enough to extrapolate any serious estimate of species numbers, especially when the baseline is vertebrate animals, whose numbers are largely skewed.
Another problem is in our collective definition of species, as well as taxonomic subjectivity in the categorization of other taxonomic hierarchies, which are based on the homology of shared characters and, I would argue, are largely incomparable outside of each phylum. For example, what one taxonomist calls an order in one grouping may not be equivalent to what another taxonomist calls a completely different order in another completely different grouping.
I should point out that the authors don’t ignore these caveats, but they still exist in their study. In any event, this paper is important because it adds to the dialogue concerning species diversity, the need to estimate, inventory and preserve the massive amount of diversity we share on the planet.
UPDATE: More commentaries in the news can be found here, here, and here.
With next-generation sequencing technologies dropping in price and increasing in throughput, it’s not surprising to find multiple genomes published every week in scientific journals. Most of these articles don’t qualify for publication in the top tier of journals like they did at the onset of the next-generation sequencing boom, but some genome sequencing projects, such as the potato genome, are high profile enough to warrant publication in top tier journals.
In the July 14th issue of the journal Nature, a draft of the potato (Solanum tuberosum) genome was described in a paper authored by the Potato Genome Sequencing Consortium – a huge group of researchers from 26 institutions.
The potato is the world’s fourth most consumed food crop, the most commonly grown vegetable crop, and a member of the economically important Solanaceae family – otherwise known as the nightshades – which includes tomato, peppers, aubergine (eggplant if you live in the United States), tobacco, and petunia. Widely distributed in western South America, tuber forming Solanum species are highly morphologically diverse and easily cross with other varieties for breeding purposes.
It’s been a bumpy road sequencing the potato genome since the project was started in 2006. The potato genome is an extremely heterozygous autotetraploid, which translates to four highly variable copies of each of the 12 chromosomes. It’s also the first sequenced Eudicot genome in the Asterid clade, so there are no close genetic relatives to provide the basis for a guided genome assembly.
The consortium began the sequencing by creating a bacterial artificial chromosome (BAC) library of 78,000 clones from a well studied diploid line providing high quality potatoes, named RH89-039-16. The group used the BAC library and 10,000 AFLP markers to create more than 7000 contigs which were constructed into a physical map. The group then identified up to 150 BACs for every chromosome on the potato genome, and verified their locations using fluorescent in situ hybridization.
Heterozygosity was so high in the RH line that after thorough sequencing the group hit an impasse with the assembly of the genome. In an attempt to complement the sequencing of the RH line, the consortium began sequencing a doubled monoploid potato clone, DM1-3 516R44, derived from a diploid wild South American accession. The DM line has a simpler genome than the RH line and is highly homozygous.
Using both the Illumina Genome Analyzer II and Roche 454 pyrosequencing platforms, and supplementing this data with traditional Sanger sequencing, approximately 96 Gb of data was acquired for the DM line. The group then used the SOAPdenovo computer program to assemble the reads with a final assembly of 727 Mb for the DM line and a final estimation of 844 Mb for the genome.
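For a feel of what these numbers imply, here is a back-of-the-envelope Python sketch of the sequencing depth and the share of the genome captured in the assembly. It assumes that "Gb" and "Mb" mean gigabases and megabases and that all 96 Gb counts toward coverage; the function names are my own and this is not a calculation taken from the paper.

```python
# Figures quoted in the post for the DM potato line.
RAW_DATA_GB = 96.0           # sequence data generated (assumed gigabases)
ASSEMBLY_MB = 727.0          # final assembly size (assumed megabases)
GENOME_ESTIMATE_MB = 844.0   # estimated genome size (assumed megabases)


def fold_coverage(raw_gb: float, genome_mb: float) -> float:
    """Average number of times each base was sequenced (raw data / genome size)."""
    return raw_gb * 1000.0 / genome_mb


def assembled_fraction(assembly_mb: float, genome_mb: float) -> float:
    """Share of the estimated genome that ended up in the assembly."""
    return assembly_mb / genome_mb


coverage = fold_coverage(RAW_DATA_GB, GENOME_ESTIMATE_MB)
fraction = assembled_fraction(ASSEMBLY_MB, GENOME_ESTIMATE_MB)

print(f"approximate coverage: {coverage:.0f}x")
print(f"assembled fraction: {fraction:.1%}")
```

Under those assumptions the DM line was sequenced to a depth of more than 100x, and the assembly covers roughly 86% of the estimated genome.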
The consortium generated more than 31 Gb of transcriptome data from both the DM and RH line libraries. These 48 libraries represented major tissue types, developmental stages, and included various responses to abiotic and biotic stresses. All the reads from the RNA-Seq libraries were mapped to the assembled DM genome. Using gene prediction methods, along with protein and EST data, the potato genome was predicted to contain 39,000 protein coding genes, an amount which is in agreement with other plant genomes. Within these genes, there were an estimated 2,642 asterid-specific and 3,372 potato-lineage-specific genes. Some of the predicted asterid-specific genes include many novel transcription factors, self-incompatibility factors, and defence-related proteins. The draft assembly of the genome consists of more than 60% repeated elements. The largest class of the transposable elements is the long terminal repeat retrotransposons (LTRs) which are estimated at 30% of the potato genome.
The potato is notorious for being susceptible to many pathogens and pests. This well known susceptibility was one of the priorities for sequencing the genome and determining genes responsible for disease resistance and pathogen defense. The DM genome assembly contains more than 800 putative R genes, responsible for conferring disease resistance, including 408 NBS-LRR-encoding genes, 57 Toll/interleukin-1 receptor (TIR) domains, and 351 non-TIR type resistance genes. An extreme number of pseudogenes – attributed to indels, frameshift mutations, and misplaced stop codons – were identified within known R gene motifs, which possibly explains the potato’s inability to fight off some specific diseases.
One such well known disease, Late Blight, caused by Phytophthora infestans, was responsible for the Irish Potato Famine in the 1840s. Using information from this genome sequencing project and other studies, we now know the variety brought to Europe in the late 16th century happens to lack specific disease resistance genes for Phytophthora infestans. One could speculate that unbridled transposon jumping caused the inactivation of many R genes in this potato variety.
Unique for the potato is the formation of tubers (the actual potatoes) through the modification of a stolon. The tomato is very closely related to potato, but does not produce stolons or modified tubers. The group used transcript data from both potato and tomato to address genetic regulation of the formation of stolons and the transition of stolons to tubers. Quite interestingly, the formation of stolons and tubers coincides with an up-regulation of genes associated with starch biosynthesis, protein storage, and Kunitz protease inhibitor genes associated with pests and pathogens.
Possibly due to extremely high levels of heterozygosity, it has been difficult to improve the potato through traditional breeding efforts. It’s estimated that diseases cause a worldwide economic loss of 4.5 billion US dollars to potato crops each year. Just to suppress these diseases, copious amounts of pesticides and fungicides are applied to potato crop land each year. The potato cyst nematode, for example, is an important pest that researchers hope to improve resistance to via breeding initiatives. Having this draft potato genome sequence will aid in the characterization of existing germplasm collections and description of allelic variance in breeding efforts to avoid diseases. The potato genome will also serve as a resource for breeders wanting to improve the quality of other economically important Solanaceous plants such as tomato, pepper, eggplant, and tobacco.
Published in the June 2011 issue of the journal Nature Biotechnology was a paper reporting the genome sequence of the date palm, Phoenix dactylifera. This paper, authored by Al-Dous et al., covered the genome sequencing and de novo assembly of this agriculturally important monocot tree, along with comparative genomics with other plants.
Dates have been found in the tombs of pharaohs estimated at 8,000 years old. Fields of agriculturally planted trees, estimated to be older than 5,000 years, suggest the date palm is one of the oldest cultivated plants in the world. Dates are the most important agricultural crop in the hot and arid regions surrounding the Arabian Gulf and their global production is close to 7 million tons yearly.
Despite this long history of cultivation, there are a few problems to deal with if you are a date grower. As is typical of tree crops, there is a long generation time from seedling to fruit harvest. Additionally, only the female date palm bears fruit, and it takes at least 5 years after seed germination to tell whether you have a male or female plant. To make it even harder for a date grower, there are more than 2,000 date varieties, each with its own color, flavor, size, shape, and ripening schedule, and they are very difficult to tell apart using conventional techniques.
In an effort to provide genetic resources for date growers and breeders, the authors of this study – who were mainly located in Qatar – sequenced and assembled 380 Mb of the estimated 658 Mb genome of the Khalas cultivar, which is known for high fruit quality. Generated using short reads from the Illumina Genome Analyzer IIx platform, this partial sequence excludes numerous large repeated regions, includes a predicted 28,890 genes, and represents 18 pairs of chromosomes. The authors estimate that this draft genome represents roughly 90% of the total genes and 60% of the total genome.
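As a quick sanity check on those coverage figures, here is a small back-of-the-envelope sketch. It uses only the numbers quoted above; the variable names are my own, and the implied total gene count is simply what follows from the authors' 90% estimate, not a figure reported in the paper.

```python
# Figures quoted in the post (Al-Dous et al., draft Khalas assembly)
assembled_mb = 380          # assembled sequence, in Mb
estimated_genome_mb = 658   # estimated full genome size, in Mb
predicted_genes = 28_890    # genes predicted in the draft
gene_fraction = 0.90        # authors' estimate of the share of genes captured

# Share of the genome covered by the assembly
genome_fraction = assembled_mb / estimated_genome_mb
print(f"Assembled fraction of genome: {genome_fraction:.1%}")

# Implied total gene count IF the 90% estimate holds (my extrapolation)
implied_total_genes = round(predicted_genes / gene_fraction)
print(f"Implied total gene count: {implied_total_genes:,}")
```

The assembled fraction works out to a little under 58%, which is consistent with the "roughly 60%" the authors give.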
This genome resource also serves a comparative genomics purpose as the first sequenced member of the widespread monocot order Arecales. To date, the only monocots with sequenced genomes – for example, corn, rice, and sorghum – have all been in the grass order, the Poales.
This report is missing some vital information: in addition to an incomplete genome assembly, the paper provides no metabolic, developmental, or gene network pathway reconstruction for the date palm (and unfortunately it also includes some glaring typos in the citation section). In place of these expected analyses, the authors conducted a thorough survey of SNPs in the Khalas cultivar, along with eight additional cultivars common in date palm breeding programs. Across these nine cultivars, 3,518,029 SNPs were identified but, quite interestingly, a total of 32 SNPs could be used to differentiate the cultivars.
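To give a feel for how a small number of SNPs can tell cultivars apart, here is a minimal toy sketch. It is not the authors' method: the cultivar names, the four-SNP genotype strings, and both function names are made up for illustration, and each genotype is simplified to one allele per site. The idea is just to search for the smallest set of SNP positions at which every cultivar shows a unique combination of alleles.

```python
from itertools import combinations

def distinguishes_all(genotypes, snp_indices):
    """True if every cultivar has a unique allele profile at the given SNPs."""
    profiles = [tuple(g[i] for i in snp_indices) for g in genotypes.values()]
    return len(set(profiles)) == len(profiles)

def smallest_discriminating_set(genotypes):
    """Brute-force search for the smallest SNP subset separating all cultivars.

    Fine for a toy example; it would not scale to millions of SNPs.
    """
    n_snps = len(next(iter(genotypes.values())))
    for size in range(1, n_snps + 1):
        for subset in combinations(range(n_snps), size):
            if distinguishes_all(genotypes, subset):
                return subset
    return None  # some cultivars are identical at every SNP

# Hypothetical data: four cultivars, four SNP sites, one allele per site
toy_genotypes = {
    "cultivar_A": "AAGT",
    "cultivar_B": "AAGC",
    "cultivar_C": "GAGT",
    "cultivar_D": "GAGC",
}

best = smallest_discriminating_set(toy_genotypes)
print("SNP positions needed:", best)
```

In this toy case two of the four sites carry no information at all, and the remaining two are enough to give each cultivar a unique fingerprint. A panel built this way is what makes cheap cultivar identification practical.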
In addition to the thorough SNP analysis, the researchers carried out a full parentage analysis of the cultivars used in this study, which include famous date varieties such as Deglet Noor, Dayri, and Medjool. An article in Nature Middle East discusses the importance of this parentage and gender analysis.
Although this is a draft genome that is still being completed and resequenced, the tools provided by the authors, namely the SNP and parentage analyses, should give date palm breeders many resources for improving fruit quality, and this genome represents an exciting piece of the monocot evolutionary puzzle.