Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines

Rupp, Oliver; Becker, Jennifer; Brinkrolf, Karina; Timmermann, Christina; Borth, Nicole; Pühler, Alfred; Noll, Thomas; Goesmann, Alexander

Rupp O, Becker J, Brinkrolf K, Timmermann C, Borth N, Pühler A, Noll T, Goesmann A (2014)
PLoS ONE 9(1): e85568.

Journal article | Published | English

DOI

https://doi.org/10.1371/journal.pone.0085568

Authors

Rupp, Oliver (UniBi); Becker, Jennifer (UniBi); Brinkrolf, Karina (UniBi); Timmermann, Christina (UniBi); Borth, Nicole; Pühler, Alfred (UniBi); Noll, Thomas (UniBi); Goesmann, Alexander (UniBi)

Department

Technische Fakultät > AG Fermentationstechnik
Centrum für Biotechnologie > Arbeitsgruppe A. Pühler
Centrum für Biotechnologie > Arbeitsgruppe A. Goesmann
Centrum für Biotechnologie > Technologieplattformen > Bioinformatics Resource Facility
Centrum für Biotechnologie > Graduate Center > Graduate Cluster Industrial Biotechnology
Centrum für Biotechnologie > Arbeitsgruppe A. Tauch
Centrum für Biotechnologie > Arbeitsgruppe T. Noll
Technische Fakultät > AG Zellkulturtechnik
Technische Fakultät > Computational Genomics

Abstract / Note

Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines, which could be clustered by sequence identity into 17,598 gene clusters.
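
The redundancy-filtering and identity-clustering steps mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline and not the cd-hit-est implementation: it is a toy greedy clustering in Python that processes contigs from longest to shortest and assigns each one to the first representative it resembles. The similarity measure (fraction of shared k-mers), the function names, the threshold of 0.9, and k = 4 are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_fraction(short: str, long: str, k: int) -> float:
    """Fraction of the shorter sequence's k-mers that also occur in the longer one.

    Hypothetical stand-in for an alignment-based identity score.
    """
    short_kmers = kmers(short, k)
    if not short_kmers:
        return 0.0
    return len(short_kmers & kmers(long, k)) / len(short_kmers)


def greedy_cluster(contigs: Dict[str, str],
                   threshold: float = 0.9,
                   k: int = 4) -> Dict[str, List[str]]:
    """Greedy incremental clustering, longest contig first.

    A contig joins the first representative it is similar enough to;
    otherwise it becomes the representative of a new cluster.
    """
    clusters: Dict[str, List[str]] = {}
    by_length = sorted(contigs, key=lambda name: len(contigs[name]), reverse=True)
    for name in by_length:
        for rep in clusters:
            if shared_fraction(contigs[name], contigs[rep], k) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Made-up example sequences: c2 is a fragment of c1, c3 is unrelated.
example = {
    "c1": "ATGGCTAGCTAGGATCCGATCGTACG",
    "c2": "GCTAGCTAGGATCCGATCG",
    "c3": "TTTTCCCCGGGGAAAATTTTCCCC",
}
print(greedy_cluster(example))
```

Keeping only the cluster representatives (the dictionary keys) corresponds to dropping redundant contigs, while the full cluster membership corresponds to grouping transcripts by sequence similarity.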

Year of publication

2014

Journal title

PLoS ONE

Volume

9

Issue

1

Article number

e85568

ISSN

1932-6203

eISSN

1932-6203

Page URI

https://pub.uni-bielefeld.de/record/2656773

Cite

Rupp O, Becker J, Brinkrolf K, et al. Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE. 2014;9(1): e85568.

Rupp, O., Becker, J., Brinkrolf, K., Timmermann, C., Borth, N., Pühler, A., Noll, T., et al. (2014). Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE, 9(1), e85568. doi:10.1371/journal.pone.0085568

Rupp, Oliver, Becker, Jennifer, Brinkrolf, Karina, Timmermann, Christina, Borth, Nicole, Pühler, Alfred, Noll, Thomas, and Goesmann, Alexander. 2014. “Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines”. PLoS ONE 9 (1): e85568.

Rupp, O., Becker, J., Brinkrolf, K., Timmermann, C., Borth, N., Pühler, A., Noll, T., and Goesmann, A. (2014). Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE 9:e85568.

Rupp, O., et al., 2014. Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE, 9(1): e85568.

O. Rupp, et al., “Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines”, PLoS ONE, vol. 9, 2014, e85568.

Rupp, O., Becker, J., Brinkrolf, K., Timmermann, C., Borth, N., Pühler, A., Noll, T., Goesmann, A.: Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines. PLoS ONE. 9, e85568 (2014).

Rupp, Oliver, Becker, Jennifer, Brinkrolf, Karina, Timmermann, Christina, Borth, Nicole, Pühler, Alfred, Noll, Thomas, and Goesmann, Alexander. “Construction of a Public CHO Cell Line Transcript Database Using Versatile Bioinformatics Analysis Pipelines”. PLoS ONE 9.1 (2014): e85568.

Data provided by the European Bioinformatics Institute (EBI)

23 citations in Europe PMC

Data provided by Europe PubMed Central.

A cross-species whole genome siRNA screen in suspension-cultured Chinese hamster ovary cells identifies novel engineering targets.
Klanert G, Fernandez DJ, Weinguny M, Eisenhut P, Bühler E, Melcher M, Titus SA, Diendorfer AB, Gludovacz E, Jadhav V, Xiao S, Stern B, Lal M, Shiloach J, Borth N., Sci Rep 9(1), 2019
PMID: 31213643

Three previously unrecognised classes of biosynthetic enzymes revealed during the production of xenovulene A.
Schor R, Schotte C, Wibberg D, Kalinowski J, Cox RJ., Nat Commun 9(1), 2018
PMID: 29773797

Binning enables efficient host genome reconstruction in cnidarian holobionts.
Celis JS, Wibberg D, Ramírez-Portilla C, Rupp O, Sczyrba A, Winkler A, Kalinowski J, Wilke T., Gigascience 7(7), 2018
PMID: 29917104

Integrative analysis of DNA methylation and gene expression in butyrate-treated CHO cells.
Wippermann A, Rupp O, Brinkrolf K, Hoffrogge R, Noll T., J Biotechnol 257, 2017
PMID: 27890772

Transcriptome profiling of the Australian arid-land plant Eremophila serrulata (A.DC.) Druce (Scrophulariaceae) for the identification of monoterpene synthases.
Kracht ON, Ammann AC, Stockmann J, Wibberg D, Kalinowski J, Piotrowski M, Kerr R, Brück T, Kourist R., Phytochemistry 136, 2017
PMID: 28162767

Ultra-deep next generation mitochondrial genome sequencing reveals widespread heteroplasmy in Chinese hamster ovary cells.
Kelly PS, Clarke C, Costello A, Monger C, Meiller J, Dhiman H, Borth N, Betenbaugh MJ, Clynes M, Barron N., Metab Eng 41, 2017
PMID: 28188893

Transcriptomic changes in CHO cells after adaptation to suspension growth in protein-free medium analysed by a species-specific microarray.
Shridhar S, Klanert G, Auer N, Hernandez-Lopez I, Kańduła MM, Hackl M, Grillari J, Stralis-Pavese N, Kreil DP, Borth N., J Biotechnol 257, 2017
PMID: 28302587

Draft genome sequence of the potato pathogen Rhizoctonia solani AG3-PT isolate Ben3.
Wibberg D, Genzel F, Verwaaijen B, Blom J, Rupp O, Goesmann A, Zrenner R, Grosch R, Pühler A, Schlüter A., Arch Microbiol 199(7), 2017
PMID: 28597196

Linking secondary metabolites to biosynthesis genes in the fungal endophyte Cyanodermella asteris: The anti-cancer bisanthraquinone skyrin.
Jahn L, Schafhauser T, Wibberg D, Rückert C, Winkler A, Kulik A, Weber T, Flor L, van Pée KH, Kalinowski J, Ludwig-Müller J, Wohlleben W., J Biotechnol 257, 2017
PMID: 28647529

Listeria monocytogenes Induces a Virulence-Dependent microRNA Signature That Regulates the Immune Response in Galleria mellonella.
Mannala GK, Izar B, Rupp O, Schultze T, Goesmann A, Chakraborty T, Hain T., Front Microbiol 8, 2017
PMID: 29312175

The use of 'Omics technology to rationally improve industrial mammalian cell line performance.
Lewis AM, Abu-Absi NR, Borys MC, Li ZJ., Biotechnol Bioeng 113(1), 2016
PMID: 26059229

Precision control of recombinant gene transcription for CHO cell synthetic biology.
Brown AJ, James DC., Biotechnol Adv 34(5), 2016
PMID: 26721629

Effect of Temperature Downshift on the Transcriptomic Responses of Chinese Hamster Ovary Cells Using Recombinant Human Tissue Plasminogen Activator Production Culture.
Bedoya-López A, Estrada K, Sanchez-Flores A, Ramírez OT, Altamirano C, Segovia L, Miranda-Ríos J, Trujillo-Roldán MA, Valdez-Cruz NA., PLoS One 11(3), 2016
PMID: 26991106

Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots.
Franta Z, Vogel H, Lehmann R, Rupp O, Goesmann A, Vilcinskas A., Biomed Res Int 2016, 2016
PMID: 27119084

Engineering the supply chain for protein production/secretion in yeasts and mammalian cells.
Klein T, Niklas J, Heinzle E., J Ind Microbiol Biotechnol 42(3), 2015
PMID: 25561318

The DNA methylation landscape of Chinese hamster ovary (CHO) DP-12 cells.
Wippermann A, Rupp O, Brinkrolf K, Hoffrogge R, Noll T., J Biotechnol 199, 2015
PMID: 25701679

Global insights into the Chinese hamster and CHO cell transcriptomes.
Vishwanathan N, Yongky A, Johnson KC, Fu HY, Jacob NM, Le H, Yusufi FN, Lee DY, Hu WS., Biotechnol Bioeng 112(5), 2015
PMID: 25450749

Towards next generation CHO cell biology: Bioinformatics methods for RNA-Seq-based expression profiling.
Monger C, Kelly PS, Gallagher C, Clynes M, Barron N, Clarke C., Biotechnol J 10(7), 2015
PMID: 26058739

The structure of the Cyberlindnera jadinii genome and its relation to Candida utilis analyzed by the occurrence of single nucleotide polymorphisms.
Rupp O, Brinkrolf K, Buerth C, Kunigo M, Schneider J, Jaenicke S, Goesmann A, Pühler A, Jaeger KE, Ernst JF., J Biotechnol 211, 2015
PMID: 26150016

Transcriptome analyses of CHO cells with the next-generation microarray CHO41K: development and validation by analysing the influence of the growth stimulating substance IGF-1 substitute LongR(3.).
Becker J, Timmermann C, Rupp O, Albaum SP, Brinkrolf K, Goesmann A, Pühler A, Tauch A, Noll T., J Biotechnol 178, 2014
PMID: 24613301

Advancing biopharmaceutical process science through transcriptome analysis.
Vishwanathan N, Le H, Le T, Hu WS., Curr Opin Biotechnol 30, 2014
PMID: 25014889

Discovery of transcription start sites in the Chinese hamster genome by next-generation RNA sequencing.
Jakobi T, Brinkrolf K, Tauch A, Noll T, Stoye J, Pühler A, Goesmann A., J Biotechnol 190, 2014
PMID: 25086342

Cross-species transcriptomic approach reveals genes in hamster implantation sites.
Lei W, Herington J, Galindo CL, Ding T, Brown N, Reese J, Paria BC., Reproduction 148(6), 2014
PMID: 25252651
