Diverse evolution in 111 plant genomes reveals purifying and dosage balancing selection models for F-Box genes

F-box proteins function as substrate receptors that determine the specificity of Skp1-Cul1-F-box ubiquitin ligases. Genomic studies have revealed that the F-box gene superfamily is large and varies widely in size across plant species. Our previous studies suggested that the plant F-box gene superfamily is under genomic drift evolution promoted by epigenomic programming. However, how the size of the superfamily drifts across plant genomes is currently unknown. Through a large-scale genomic and phylogenetic comparison of the F-box gene superfamily covering 110 green plants and one red algal species, I discovered four distinct groups of plant F-box genes with diverse evolutionary processes. While the members of Clusters 1 and 2 are species/lineage-specific, those of Clusters 3 and 4 are present in over 46 plant genomes. Statistical modeling suggests that F-box genes from the former two groups are skewed toward fewer species and more paralogs compared with those of the latter two groups, whose presence frequency and sizes in plant genomes follow a random statistical model. The enrichment of known Arabidopsis F-box genes in Clusters 3 and 4, along with comprehensive biochemical evidence showing that Arabidopsis members of Cluster 4 interact with Arabidopsis Skp1-like 1 (ASK1), demonstrates an over-representation of active F-box genes in these two groups. Collectively, I propose purifying and dosage balancing selection models to explain the lineage/species-specific duplications and expansions of F-box genes in plant genomes. The purifying selection model suggests that most, if not all, lineage/species-specific F-box genes are detrimental and are thus kept at low frequencies in plant genomes.