Background: Chinese fir (Cunninghamia lanceolata) is an important timber species that accounts for 20–30% of the total commercial timber production in China. However, the genomic information available for Chinese fir is limited, which severely hampers functional genomic analysis and molecular breeding in this species. Recent advances in transcriptome sequencing provide fast and cost-effective ways to generate large expression datasets, which have proven to be powerful tools for profiling the transcriptomes of non-model organisms whose genomes have not been determined.

Results: In this study, the transcriptomes of nine tissues from Chinese fir were analyzed using the Illumina HiSeq™ 2000 sequencing platform. Approximately 40 million paired-end reads were obtained, generating 3.62 gigabase pairs of sequencing data. These sequences were assembled into 83,248 unigenes with an average length of 449 bp, amounting to 37.40 Mb. This is 112-fold more than all the Chinese fir sequences in GenBank (as of March 2012). Of the unigenes, 45,501 (54.66%) had homologs in the NCBI non-redundant and Swiss-Prot protein databases, corresponding to 28,617 unique protein entries. Of these unigenes, 18,229 were assigned to Gene Ontology classes, and 15,609 were clustered into orthologous groups. A total of 22,910 unigenes (27.52%) were mapped to 119 pathways by BLAST comparison against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The majority of the genes encoding enzymes in the cellulose and lignin biosynthetic pathways were identified in the unigene dataset by targeted searches of their annotations, and a number of candidate Chinese fir genes in these two metabolic pathways were discovered for the first time. Eighteen genes related to cellulose and lignin biosynthesis were cloned to validate the transcriptome data experimentally. In total, 49 unigenes covering different regions of these selected genes were found by alignment. Their expression patterns in different tissues were analyzed by qRT-PCR to explore their putative functions.

Conclusions: A substantial fraction of transcript sequences was obtained from the deep sequencing of Chinese fir. The assembled unigene dataset was used to discover candidate genes of cellulose and lignin biosynthesis. This transcriptome dataset will provide a comprehensive sequence resource for molecular genetics research on C. lanceolata.
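
The proportions quoted in the Results can be re-derived from the raw counts. The following is a minimal sketch for that sanity check, not part of the authors' pipeline; the function and constant names are hypothetical, and only the numbers are taken from the abstract. Because the mean unigene length is reported as a rounded value, the product of count and mean length is expected to match the reported 37.40 Mb only approximately.

```python
# Sanity check of summary statistics quoted in the abstract.
# All names here are illustrative assumptions; only the numbers
# come from the text.

TOTAL_UNIGENES = 83_248          # assembled unigenes
MEAN_UNIGENE_LENGTH_BP = 449     # reported (rounded) mean length
NR_SWISSPROT_HITS = 45_501       # unigenes with nr / Swiss-Prot homologs
KEGG_MAPPED = 22_910             # unigenes mapped to KEGG pathways


def percent_of_unigenes(count: int, total: int = TOTAL_UNIGENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def approx_assembly_size_mb(n_unigenes: int, mean_length_bp: float) -> float:
    """Approximate total assembly size in megabases from count and mean length."""
    return n_unigenes * mean_length_bp / 1e6


if __name__ == "__main__":
    print("nr/Swiss-Prot annotated (%):", percent_of_unigenes(NR_SWISSPROT_HITS))
    print("KEGG mapped (%):", percent_of_unigenes(KEGG_MAPPED))
    print("Approx. assembly size (Mb):",
          round(approx_assembly_size_mb(TOTAL_UNIGENES, MEAN_UNIGENE_LENGTH_BP), 2))
```

The two percentages reproduce the 54.66% and 27.52% given in the text, which suggests both are expressed relative to the full set of 83,248 unigenes.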