The first insight into the Salvia (Lamiaceae) genome via BAC library construction and high-throughput sequencing of target BAC clones

Da Cheng Hao, Sonia Vautrin, Chi Song, Yingjie Zhu, Helene Berges, Chao Sun, Shi Lin Chen

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

Salvia is a representative genus of Lamiaceae, a eudicot family with significant species diversity and population adaptability. One of the key goals of Salvia genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of medicinal plants to increase their health and productivity. Large-insert genomic libraries are a fundamental tool for achieving this purpose. We report herein the construction, characterization and screening of a gridded BAC library for Salvia officinalis (sage). The S. officinalis BAC library consists of 17,764 clones with an average insert size of 107 kb, corresponding to ~3 haploid genome equivalents. Seventeen positive clones (average insert size 115 kb) containing five terpene synthase (TPS) genes were identified by PCR screening, and 12 of them were subjected to Illumina HiSeq 2000 sequencing, which yielded 28,097,480 90-bp raw reads (2.53 Gb). Scaffolds containing sabinene synthase (Sab), a Sab homolog, TPS3 (kaurene synthase-like 2), copalyl diphosphate synthase 2 and one cytochrome P450 gene were retrieved via de novo assembly and annotation; these scaffolds also carry flanking noncoding sequences, including predicted promoters and repeat sequences. Among the 2,638 repeat sequences, 330 are amplifiable microsatellites. This BAC library provides a new resource for Lamiaceae genomic studies, including microsatellite marker development, physical mapping, comparative genomics and genome sequencing. Characterization of positive clones provided insights into the structure of the Salvia genome. These sequences will be used in the assembly of a future genome sequence for S. officinalis.
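
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the genome size is back-calculated from the stated ~3x coverage rather than reported by the authors, and the recovery probability uses the standard Poisson approximation to the Clarke-Carbon formula, P = 1 - exp(-coverage).

```python
import math

# Values quoted in the abstract.
CLONES = 17_764
AVG_INSERT_BP = 107_000        # 107 kb average insert
STATED_COVERAGE = 3.0          # "~3 haploid genome equivalents"
READS = 28_097_480
READ_LEN_BP = 90


def library_bases(clones: int, avg_insert_bp: int) -> int:
    """Total cloned DNA in the library, in base pairs."""
    return clones * avg_insert_bp


def implied_genome_size(total_bases: int, coverage: float) -> float:
    """Haploid genome size implied by the stated coverage (an inference,
    not a value given in the abstract)."""
    return total_bases / coverage


def raw_yield(reads: int, read_len_bp: int) -> int:
    """Raw sequencing yield in base pairs."""
    return reads * read_len_bp


def recovery_probability(coverage: float) -> float:
    """Poisson approximation of the chance that a given locus is present
    in at least one clone of a library with this coverage."""
    return 1.0 - math.exp(-coverage)


total = library_bases(CLONES, AVG_INSERT_BP)
print(f"Cloned DNA:          {total / 1e9:.2f} Gb")
print(f"Implied genome size: {implied_genome_size(total, STATED_COVERAGE) / 1e9:.2f} Gb")
print(f"Raw read yield:      {raw_yield(READS, READ_LEN_BP) / 1e9:.2f} Gb")
print(f"P(locus recovered):  {recovery_probability(STATED_COVERAGE):.3f}")
```

Under these assumptions the read count and read length reproduce the 2.53 Gb reported, and a ~3x library would be expected to contain any given single-copy locus with roughly 95% probability.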

Original language: English (US)
Pages (from-to): 1347-1357
Number of pages: 11
Journal: Pakistan Journal of Botany
Volume: 47
Issue number: 4
State: Published - Aug 18 2015
Externally published: Yes

Fingerprint

Salvia
Lamiaceae
Salvia officinalis
clones
genome
genomics
microsatellite repeats
sabinene
genes
physical chromosome mapping
genomic libraries
cytochrome P-450
terpenoids
haploidy
medicinal plants
promoter regions
screening
species diversity
genetic variation

Keywords

  • Genome characterization
  • Gridded BAC library
  • High-throughput sequencing
  • Microsatellite
  • Salvia officinalis
  • Terpene synthase

ASJC Scopus subject areas

  • Plant Science

Cite this

The first insight into the Salvia (Lamiaceae) genome via BAC library construction and high-throughput sequencing of target BAC clones. / Hao, Da Cheng; Vautrin, Sonia; Song, Chi; Zhu, Yingjie; Berges, Helene; Sun, Chao; Chen, Shi Lin.

In: Pakistan Journal of Botany, Vol. 47, No. 4, 18.08.2015, p. 1347-1357.

@article{d3e7c670c7714197a7fef3c5c08e83eb,
title = "The first insight into the Salvia (Lamiaceae) genome via BAC library construction and high-throughput sequencing of target BAC clones",
keywords = "Genome characterization, Gridded BAC library, High-throughput sequencing, Microsatellite, Salvia officinalis, Terpene synthase",
author = "Hao, {Da Cheng} and Sonia Vautrin and Chi Song and Yingjie Zhu and Helene Berges and Chao Sun and Chen, {Shi Lin}",
year = "2015",
month = "8",
day = "18",
language = "English (US)",
volume = "47",
pages = "1347--1357",
journal = "Pakistan Journal of Botany",
issn = "0556-3321",
publisher = "Pakistan Botanical Society",
number = "4",

}

TY - JOUR

T1 - The first insight into the Salvia (Lamiaceae) genome via BAC library construction and high-throughput sequencing of target BAC clones

AU - Hao, Da Cheng

AU - Vautrin, Sonia

AU - Song, Chi

AU - Zhu, Yingjie

AU - Berges, Helene

AU - Sun, Chao

AU - Chen, Shi Lin

PY - 2015/8/18

Y1 - 2015/8/18

KW - Genome characterization

KW - Gridded BAC library

KW - High-throughput sequencing

KW - Microsatellite

KW - Salvia officinalis

KW - Terpene synthase

UR - http://www.scopus.com/inward/record.url?scp=84939555494&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84939555494&partnerID=8YFLogxK

M3 - Article

VL - 47

SP - 1347

EP - 1357

JO - Pakistan Journal of Botany

JF - Pakistan Journal of Botany

SN - 0556-3321

IS - 4

ER -