Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines

Rupp O, Becker J, Brinkrolf K, Timmermann C, Borth N, Pühler A, Noll T, Goesmann A (2014)
PLoS ONE 9(1): e85568.

Journal Article | Original Article | Published | English

Abstract / Notes
Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. 
Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines, which could be clustered by sequence identity into 17,598 gene clusters.
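The redundancy-filtering step of the de novo pipeline can be illustrated with a small sketch. The Python code below is not cd-hit-est and does not reproduce its algorithm or thresholds. It is a toy, greedy longest-first filter that assumes exact k-mer overlap as a stand-in for sequence identity. The function names, the k-mer size of 8, and the 0.9 cutoff are illustrative assumptions, not values from the study.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_redundant(contigs, k=8, min_shared=0.9):
    """Greedy longest-first redundancy filter (toy stand-in, NOT cd-hit-est).

    contigs: dict mapping contig name -> nucleotide sequence.
    A contig is dropped when at least `min_shared` of its k-mers already
    occur in a single, longer contig that was kept as a representative.
    Returns the names of the kept contigs, longest first.
    """
    kept = []  # list of (name, k-mer set) for each representative
    ordered = sorted(contigs.items(), key=lambda item: len(item[1]), reverse=True)
    for name, seq in ordered:
        query = kmers(seq, k)
        # Contigs shorter than k have no k-mers and are never called redundant.
        redundant = bool(query) and any(
            len(query & rep) / len(query) >= min_shared for _, rep in kept
        )
        if not redundant:
            kept.append((name, query))
    return [name for name, _ in kept]


if __name__ == "__main__":
    # Hypothetical example sequences; "b" is fully contained in "a".
    example = {
        "a": "ACGTACGTTAGCCGATAGGCTA",
        "b": "ACGTTAGCCGATAG",
        "c": "TTTTGGGGCCCCAAAATTGG",
    }
    print(filter_redundant(example))
```

Processing contigs from longest to shortest means a shorter fragment is compared only against longer representatives, which mirrors the general idea of keeping the longest sequence of a redundant group. A real run would rely on the published tool rather than this sketch.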

Cite this

Rupp, O., Becker, J., Brinkrolf, K., Timmermann, C., Borth, N., Pühler, A., Noll, T., et al. (2014). Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE, 9(1), e85568. doi:10.1371/journal.pone.0085568

20 Citations in Europe PMC

Data provided by Europe PubMed Central.

Integrative analysis of DNA methylation and gene expression in butyrate-treated CHO cells.
Wippermann A, Rupp O, Brinkrolf K, Hoffrogge R, Noll T., J Biotechnol 257, 2017
PMID: 27890772
Transcriptome profiling of the Australian arid-land plant Eremophila serrulata (A.DC.) Druce (Scrophulariaceae) for the identification of monoterpene synthases.
Kracht ON, Ammann AC, Stockmann J, Wibberg D, Kalinowski J, Piotrowski M, Kerr R, Brück T, Kourist R., Phytochemistry 136, 2017
PMID: 28162767
Ultra-deep next generation mitochondrial genome sequencing reveals widespread heteroplasmy in Chinese hamster ovary cells.
Kelly PS, Clarke C, Costello A, Monger C, Meiller J, Dhiman H, Borth N, Betenbaugh MJ, Clynes M, Barron N., Metab Eng 41, 2017
PMID: 28188893
Transcriptomic changes in CHO cells after adaptation to suspension growth in protein-free medium analysed by a species-specific microarray.
Shridhar S, Klanert G, Auer N, Hernandez-Lopez I, Kańduła MM, Hackl M, Grillari J, Stralis-Pavese N, Kreil DP, Borth N., J Biotechnol 257, 2017
PMID: 28302587
Draft genome sequence of the potato pathogen Rhizoctonia solani AG3-PT isolate Ben3.
Wibberg D, Genzel F, Verwaaijen B, Blom J, Rupp O, Goesmann A, Zrenner R, Grosch R, Pühler A, Schlüter A., Arch Microbiol 199(7), 2017
PMID: 28597196
Linking secondary metabolites to biosynthesis genes in the fungal endophyte Cyanodermella asteris: The anti-cancer bisanthraquinone skyrin.
Jahn L, Schafhauser T, Wibberg D, Rückert C, Winkler A, Kulik A, Weber T, Flor L, van Pée KH, Kalinowski J, Ludwig-Müller J, Wohlleben W., J Biotechnol 257, 2017
PMID: 28647529
Listeria monocytogenes Induces a Virulence-Dependent microRNA Signature That Regulates the Immune Response in Galleria mellonella.
Mannala GK, Izar B, Rupp O, Schultze T, Goesmann A, Chakraborty T, Hain T., Front Microbiol 8, 2017
PMID: 29312175
The use of 'Omics technology to rationally improve industrial mammalian cell line performance.
Lewis AM, Abu-Absi NR, Borys MC, Li ZJ., Biotechnol Bioeng 113(1), 2016
PMID: 26059229
Effect of Temperature Downshift on the Transcriptomic Responses of Chinese Hamster Ovary Cells Using Recombinant Human Tissue Plasminogen Activator Production Culture.
Bedoya-López A, Estrada K, Sanchez-Flores A, Ramírez OT, Altamirano C, Segovia L, Miranda-Ríos J, Trujillo-Roldán MA, Valdez-Cruz NA., PLoS One 11(3), 2016
PMID: 26991106
Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots.
Franta Z, Vogel H, Lehmann R, Rupp O, Goesmann A, Vilcinskas A., Biomed Res Int 2016, 2016
PMID: 27119084
Engineering the supply chain for protein production/secretion in yeasts and mammalian cells.
Klein T, Niklas J, Heinzle E., J Ind Microbiol Biotechnol 42(3), 2015
PMID: 25561318
The DNA methylation landscape of Chinese hamster ovary (CHO) DP-12 cells.
Wippermann A, Rupp O, Brinkrolf K, Hoffrogge R, Noll T., J Biotechnol 199, 2015
PMID: 25701679
Global insights into the Chinese hamster and CHO cell transcriptomes.
Vishwanathan N, Yongky A, Johnson KC, Fu HY, Jacob NM, Le H, Yusufi FN, Lee DY, Hu WS., Biotechnol Bioeng 112(5), 2015
PMID: 25450749
Towards next generation CHO cell biology: Bioinformatics methods for RNA-Seq-based expression profiling.
Monger C, Kelly PS, Gallagher C, Clynes M, Barron N, Clarke C., Biotechnol J 10(7), 2015
PMID: 26058739
The structure of the Cyberlindnera jadinii genome and its relation to Candida utilis analyzed by the occurrence of single nucleotide polymorphisms.
Rupp O, Brinkrolf K, Buerth C, Kunigo M, Schneider J, Jaenicke S, Goesmann A, Pühler A, Jaeger KE, Ernst JF., J Biotechnol 211, 2015
PMID: 26150016
Advancing biopharmaceutical process science through transcriptome analysis.
Vishwanathan N, Le H, Le T, Hu WS., Curr Opin Biotechnol 30, 2014
PMID: 25014889
Discovery of transcription start sites in the Chinese hamster genome by next-generation RNA sequencing.
Jakobi T, Brinkrolf K, Tauch A, Noll T, Stoye J, Pühler A, Goesmann A., J Biotechnol 190, 2014
PMID: 25086342
Cross-species transcriptomic approach reveals genes in hamster implantation sites.
Lei W, Herington J, Galindo CL, Ding T, Brown N, Reese J, Paria BC., Reproduction 148(6), 2014
PMID: 25252651

49 References

The Sequence Analysis and Management System -- SAMS-2.0: data management and sequence analysis adapted to changing requirements from traditional sanger sequencing to ultrafast sequencing technologies.
Bekel T, Henckel K, Kuster H, Meyer F, Mittard Runte V, Neuweger H, Paarmann D, Rupp O, Zakrzewski M, Puhler A, Stoye J, Goesmann A., J. Biotechnol. 140(1-2), 2009
PMID: 19297685
POAVIZ: a Partial order multiple sequence alignment visualizer.
Grasso C, Quist M, Ke K, Lee C., Bioinformatics 19(11), 2003
PMID: 12874062

TopHat: discovering splice junctions with RNA-Seq.
Trapnell C, Pachter L, Salzberg SL., Bioinformatics 25(9), 2009
PMID: 19289445
Full-length transcriptome assembly from RNA-Seq data without a reference genome.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A., Nat. Biotechnol. 29(7), 2011
PMID: 21572440
Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels.
Schulz MH, Zerbino DR, Vingron M, Birney E., Bioinformatics 28(8), 2012
PMID: 22368243
GMAP: a genomic mapping and alignment program for mRNA and EST sequences.
Wu TD, Watanabe CK., Bioinformatics 21(9), 2005
PMID: 15728110

CAP3: A DNA sequence assembly program.
Huang X, Madan A., Genome Res. 9(9), 1999
PMID: 10508846
BLAT--the BLAST-like alignment tool.
Kent WJ., Genome Res. 12(4), 2002
PMID: 11932250
Biological evaluation of d2, an algorithm for high-performance sequence comparison.
Hide W, Burke J, Davison DB., J. Comput. Biol. 1(3), 1994
PMID: 8790465
RAPYD--rapid annotation platform for yeast data.
Schneider J, Blom J, Jaenicke S, Linke B, Brinkrolf K, Neuweger H, Tauch A, Goesmann A., J. Biotechnol. 155(1), 2011
PMID: 21040748
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ., Nucleic Acids Res. 25(17), 1997
PMID: 9254694

KEGG: Kyoto Encyclopedia of Genes and Genomes.
Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M., Nucleic Acids Res. 27(1), 1999
PMID: 9847135
The COG database: an updated version includes eukaryotes.
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA., BMC Bioinformatics 4, 2003
PMID: 12969510
eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges.
Powell S, Szklarczyk D, Trachana K, Roth A, Kuhn M, Muller J, Arnold R, Rattei T, Letunic I, Doerks T, Jensen LJ, von Mering C, Bork P., Nucleic Acids Res. 40(Database issue), 2012
PMID: 22096231
The Pfam protein families database.
Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A., Nucleic Acids Res. 38(Database issue), 2010
PMID: 19920124

Sources

PMID: 24427317