qckitfastq: A comprehensive quality control R package for Next Generation Sequencing FASTQ data
This R package contains tools for comprehensive quality control of FASTQ format data. We aim to replicate existing tools for FASTQ quality control and to improve FASTQ metrics for which the data are truncated during analysis. We enable efficient processing of FASTQ format data by implementing the core functions in C++ using
The metrics that qckitfastq provides are as follows:
1. data dimension
2. per base sequence content
3. per base quality score statistics
4. per read GC content
5. per read mean quality score
6. overrepresented sequence
7. per base kmer count
8. overrepresented kmer
Each of the above metrics comes with both a results table and a visualization of the results.
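To make a few of the metrics listed above concrete, here is a minimal base R sketch of per read GC content, per read mean quality score, and per base sequence content on made-up toy reads. The helper names (`gc_content`, `mean_quality`, `base_content`) are hypothetical and are not part of the qckitfastq API; the sketch also assumes Phred+33 quality encoding and reads of equal length.

```r
# Hypothetical helpers for illustration only; NOT the qckitfastq API.
# Toy reads and Phred+33 quality strings (made-up data).
reads <- c("ACGT", "GGCC", "ATAT")
quals <- c("IIII", "!!!!", "I!I!")

# Per read GC content: fraction of bases in each read that are G or C.
gc_content <- function(seqs) {
  vapply(strsplit(seqs, ""),
         function(b) mean(b %in% c("G", "C")),
         numeric(1))
}

# Per read mean quality: decode Phred+33 characters, then average.
mean_quality <- function(q) {
  vapply(q,
         function(s) mean(utf8ToInt(s) - 33L),
         numeric(1),
         USE.NAMES = FALSE)
}

# Per base sequence content: count of each base at each read position.
# Assumes all reads have the same length.
base_content <- function(seqs) {
  m <- do.call(rbind, strsplit(seqs, ""))   # reads x positions
  bases <- c("A", "C", "G", "T")
  counts <- sapply(bases, function(b) colSums(m == b))
  t(counts)                                 # bases x positions
}

gc <- gc_content(reads)
mq <- mean_quality(quals)
bc <- base_content(reads)
```

In the package itself these computations run on real FASTQ files rather than in-memory character vectors, so this sketch only illustrates what the metrics measure.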
qckitfastq has dependencies on both CRAN packages and Bioconductor packages. Commands to install all prerequisites from R are given below:
install.packages(c('magrittr','ggplot2','dplyr','testthat','data.table','reshape2','grDevices','graphics','stats','utils','Rcpp','kableExtra','rlang','knitr','rmarkdown'))
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("RSeqAn","seqTools","zlibbioc"))
The release version of qckitfastq is on Bioconductor. To install it from Bioconductor, follow the instructions on the package page.
From GitHub repo
This repository contains the development version. You will need devtools to install it.
The simplest and intended way to use qckitfastq is to execute run_all, a single command that writes a report of all of the included metrics to a user-provided directory, using default parameters and default filenames. These defaults cannot be changed. An example using tempdir() and an example fq.gz file is given below:
library(qckitfastq)
infile <- system.file("extdata","10^5_reads_test.fq.gz",package="qckitfastq")
testfolder <- tempdir()
run_all(infile,testfolder)
However, each metric can also be run separately for closer examination, for parameter tuning, or to save reports under a different filename. In those cases, we recommend taking a look at the qckitfastq vignette to get started. The vignette can also be viewed in RStudio with the following commands:
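As a hedged sketch, the standard `utils` helper `browseVignettes` can be used for this, assuming qckitfastq is installed and ships at least one vignette. The exact commands intended here may differ.

```r
# Assumption: qckitfastq is installed. The guard makes the snippet a no-op otherwise.
if (requireNamespace("qckitfastq", quietly = TRUE)) {
  # Open an index of the package's vignettes (HTML/PDF, source, R code).
  browseVignettes("qckitfastq")
}
```

`vignette(package = "qckitfastq")` lists the available vignette names instead of opening the index.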
See NEWS for changes.
- August Guang, creator and maintainer.
- Wenyue Xing, creator.