Arabidopsis thaliana became the model organism for plant studies because of its small diploid genome, rapid lifecycle and short adult size. Its genome was the first among plants to be sequenced, becoming the reference in plant genomics. However, the Arabidopsis genome is characterized by an inherently complex organization, since it has undergone ancient whole genome duplications, followed by gene reduction, diploidization events and extended rearrangements, which relocated and split up the retained portions. These events, together with probable chromosome reductions, dramatically increased the genome complexity, limiting its role as a reference. The identification of paralogs and single copy genes within a highly duplicated genome is a prerequisite for understanding its organization and evolution and for improving its exploitation in comparative genomics. This identification is still controversial, even in the widely studied Arabidopsis genome, partly because there is no reference bioinformatics pipeline that can exhaustively identify paralogs and singleton genes. We describe here a complete computational strategy to detect both duplicated and single copy genes in a genome, discussing all the methodological issues that may strongly affect the results, their quality and their reliability.