Reference-based QUantification Of gene Dispensability (QUOD) - test dataset

Sielemann K, Weisshaar B, Pucker B (2020)
Bielefeld University.

Data publication
 
Download
OA 10.97 MB
OA Input BAM file 1 95.85 MB
OA Input BAM file 2 206.79 MB
Abstract / Note
# Test set for QUOD (Reference-based QUantification Of gene Dispensability)

## Background:

Dispensability of genes in a phylogenetic lineage, e.g. a species, genus, or higher-level clade, is gaining relevance as most genome sequencing projects move to a pangenome level. Most analyses classify genes as core genes, which are present in (almost) all investigated individual genomes, and dispensable genes, which only occur in a single or a few investigated genomes. The binary classification as ‘core’ or ‘dispensable’ is often based on arbitrary cutoffs of presence/absence in the analysed genomes. Instead of classifying a gene as core or dispensable, QUOD assigns a dispensability score to each gene. Hence, QUOD facilitates the identification of candidate dispensable genes which often underlie lineage-specific adaptation to varying environmental conditions.

## Test set:

The test dataset for QUOD comprises genomic reads of four randomly selected accessions of the *Arabidopsis thaliana* Nordborg set. The reads were retrieved from the Sequence Read Archive (SRA) and mapped against the AthNd1_v2c reference genome sequence [1] using bowtie2 [2]. To reduce the size of the files, the first Mbp of Chr1 was extracted. All BAM files provided here are already sorted and should be used as input for QUOD which is available on GitHub: https://github.com/ksielemann/QUOD.

A dispensability score is calculated for each gene. Optionally, the results can be visualized as a colored histogram and a box plot.
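Because the BAM files are described as already sorted, it can be useful to confirm this before passing them to QUOD. The following is a minimal sketch, not part of QUOD itself, that reads the text header of a BAM file using only the Python standard library. It relies on the BAM layout from the SAM/BAM specification (BGZF compression, the `BAM\1` magic bytes, then a length-prefixed header text). The function names are hypothetical, and a missing or `unknown` `SO` tag does not prove that a file is unsorted.

```python
import gzip
import struct


def read_bam_header_text(path):
    """Return the plain-text SAM header stored at the start of a BAM file."""
    # BAM is BGZF-compressed, i.e. a series of gzip members, so the
    # standard gzip module can decompress it.
    with gzip.open(path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError(f"{path} does not look like a BAM file")
        # Little-endian 32-bit length of the header text that follows.
        (text_length,) = struct.unpack("<i", handle.read(4))
        text = handle.read(text_length)
    # The header text may be NUL-padded; strip trailing padding if present.
    return text.decode("utf-8", errors="replace").rstrip("\x00")


def sort_order(header_text):
    """Return the SO value of the @HD line, or None if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None
```

Calling `sort_order(read_bam_header_text(path))` on a downloaded file would be expected to return `coordinate` for a coordinate-sorted BAM, provided the mapping or sorting tool recorded the sort order in the header.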

##### References:

[1] Pucker B, et al. A chromosome-level sequence assembly reveals the structure of the Arabidopsis thaliana Nd-1 genome and its gene set. PLoS ONE. 2019, 14(5):e0216233.
[2] Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012, 9:357-359.
Keywords
pangenomics; genomics; dispensability; bioinformatics; bioinformatic tool; presence/absence variations
Year of publication
2020
Page URI
https://pub.uni-bielefeld.de/record/2946079

Cite

Sielemann K, Weisshaar B, Pucker B. Reference-based QUantification Of gene Dispensability (QUOD) - test dataset. Bielefeld University; 2020.
Sielemann, K., Weisshaar, B., & Pucker, B. (2020). Reference-based QUantification Of gene Dispensability (QUOD) - test dataset. Bielefeld University. doi:10.4119/unibi/2946079
Sielemann, K., Weisshaar, B., and Pucker, B. (2020). Reference-based QUantification Of gene Dispensability (QUOD) - test dataset. Bielefeld University.
Sielemann, K., Weisshaar, B., & Pucker, B., 2020. Reference-based QUantification Of gene Dispensability (QUOD) - test dataset, Bielefeld University.
K. Sielemann, B. Weisshaar, and B. Pucker, Reference-based QUantification Of gene Dispensability (QUOD) - test dataset, Bielefeld University, 2020.
Sielemann, K., Weisshaar, B., Pucker, B.: Reference-based QUantification Of gene Dispensability (QUOD) - test dataset. Bielefeld University (2020).
Sielemann, Katharina, Weisshaar, Bernd, and Pucker, Boas. Reference-based QUantification Of gene Dispensability (QUOD) - test dataset. Bielefeld University, 2020.
All files are available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC BY 4.0)
Full text(s)
Title
Annotation file for the QUOD test dataset.
Description
Annotation file for the AthNd1_v2c reference genome sequence, covering only the first Mbp of NdCChr1.
Access Level
OA Open Access
Last uploaded
2020-09-25T17:57:18Z
MD5 checksum
2ecd808ab0f31ccf578866ba7fb732cd
Name
Title
Input BAM file 1
Description
BAM file 1, which can be used to test the tool QUOD. The reads were mapped to the AthNd1_v2c reference genome sequence. The first Mbp of NdCChr1 was extracted to reduce the file size.
Access Level
OA Open Access
Last uploaded
2020-09-25T17:57:18Z
MD5 checksum
2c8bad8b504e71faae3d72f1c25e3652
Name
Title
Input BAM file 2
Description
BAM file 2, which can be used to test the tool QUOD. The reads were mapped to the AthNd1_v2c reference genome sequence. The first Mbp of NdCChr1 was extracted to reduce the file size.
Access Level
OA Open Access
Last uploaded
2020-09-25T17:57:18Z
MD5 checksum
8fbb32b0082fbadbead686018871a61b
Name
Title
Input BAM file 3
Description
BAM file 3, which can be used to test the tool QUOD. The reads were mapped to the AthNd1_v2c reference genome sequence. The first Mbp of NdCChr1 was extracted to reduce the file size.
Access Level
OA Open Access
Last uploaded
2020-09-25T17:57:18Z
MD5 checksum
148081bfc5bc7f1401bcd3bf547ed163
Name
Title
Input BAM file 4
Description
BAM file 4, which can be used to test the tool QUOD. The reads were mapped to the AthNd1_v2c reference genome sequence. The first Mbp of NdCChr1 was extracted to reduce the file size.
Access Level
OA Open Access
Last uploaded
2020-09-25T17:57:18Z
MD5 checksum
5d1c539a41f5f1f24c4fa0e28c4fcef3
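
The MD5 checksums listed for each file can be used to confirm that a download is complete and unmodified. The following is a minimal sketch using Python's standard `hashlib` module. The dictionary keys are the record titles rather than actual file names (the record does not state the file names), and the helper names are hypothetical.

```python
import hashlib

# MD5 checksums copied from the file records of this dataset.
# Keys are the record titles, not file names on disk.
EXPECTED_MD5 = {
    "Annotation file": "2ecd808ab0f31ccf578866ba7fb732cd",
    "Input BAM file 1": "2c8bad8b504e71faae3d72f1c25e3652",
    "Input BAM file 2": "8fbb32b0082fbadbead686018871a61b",
    "Input BAM file 3": "148081bfc5bc7f1401bcd3bf547ed163",
    "Input BAM file 4": "5d1c539a41f5f1f24c4fa0e28c4fcef3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use low for files of ~100-200 MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, title):
    """Return True if the file at `path` has the checksum listed for `title`."""
    return md5_of_file(path) == EXPECTED_MD5[title].lower()
```

For example, `matches_record(local_path, "Input BAM file 1")` should return `True` for an intact copy of the first BAM file, where `local_path` is wherever the download was saved.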

Material in PUB:
Publication that contains this PUB record
Reference-based QUantification Of gene Dispensability (QUOD)
Frey K, Weisshaar B, Pucker B (2020)
bioRxiv.
