Linking to and configuring RSeqAn
Introduction
RSeqAn was created to allow easy integration of the SeqAn biological sequence analysis C++ library into R packages. This vignette describes how to link to RSeqAn from another R package and how to configure RSeqAn for your own build, for example by enabling zlib or bzip2 support.
Linking
Dependencies for linking
The prerequisites for linking to RSeqAn are:
- The compiler must support the C++14 standard, which is the default standard from GCC 6 on. You also need to tell the build system to use C++14, either by setting the SystemRequirements field of the DESCRIPTION file:
    SystemRequirements: C++14
  or (preferred) by specifying it in src/Makevars:
    CXX_STD = CXX14
- Rcpp must be installed and imported in the DESCRIPTION file:
    Imports: Rcpp
  and also in the NAMESPACE file:
    importFrom(Rcpp, sourceCpp)
  Note: if you generate your NAMESPACE with roxygen2, you do not need to edit the NAMESPACE file by hand.
Linking to RSeqAn
Once the prerequisites are satisfied, linking to RSeqAn is simple. Add RSeqAn to the Imports field of the DESCRIPTION file, and add the following line to the same file:
    LinkingTo: Rcpp, RSeqAn
In your C++ code, use #include <seqan/$filename.h> as usual, together with the // [[Rcpp::depends(RSeqAn)]] attribute. For an example, see the source code of the qckitfastq package.
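Putting the linking steps together, the relevant files of a client package might look like the sketch below. The package name mypkg is hypothetical, the lines starting with ---- are labels rather than file content, and the useDynLib() directive is an assumption that is not part of the instructions above (packages with compiled code normally need it so that R loads the shared library).

```text
---- DESCRIPTION (excerpt) ----
Package: mypkg
Imports: Rcpp, RSeqAn
LinkingTo: Rcpp, RSeqAn

---- NAMESPACE ----
importFrom(Rcpp, sourceCpp)
useDynLib(mypkg, .registration = TRUE)  # assumption: loads the compiled code of mypkg

---- src/Makevars ----
CXX_STD = CXX14
```

If you use roxygen2, the NAMESPACE entries would be generated from the corresponding roxygen tags instead of being written by hand.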
Configuring RSeqAn
By default, SeqAn, and thus RSeqAn, is not set up to use libraries such as zlib and bzip2, although it is capable of doing so. To enable these libraries and set options for them (assuming the libraries are installed), set the corresponding preprocessor flags in src/Makevars (preferred) or with Sys.setenv(). For example, to enable zlib:
- In src/Makevars, write:
    PKG_CXXFLAGS=-DSEQAN_HAS_ZLIB
- Using Sys.setenv():
    Sys.setenv("PKG_CXXFLAGS"="-DSEQAN_HAS_ZLIB")
Other preprocessor defines that can be set are listed in the SeqAn documentation.
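As a rough sketch, a src/Makevars that enables both compression libraries could look like the fragment below. SEQAN_HAS_BZIP2 is the SeqAn define for bzip2 support. The PKG_LIBS line is an assumption that is not part of the instructions above: it asks the linker for the system zlib and bzip2 libraries, which may or may not be necessary on your platform, and the library names may differ.

```make
# src/Makevars (sketch, not a definitive configuration)
CXX_STD = CXX14

# Preprocessor defines that switch on SeqAn's zlib and bzip2 support
PKG_CXXFLAGS = -DSEQAN_HAS_ZLIB -DSEQAN_HAS_BZIP2

# Assumption: link explicitly against the system zlib and bzip2 libraries
PKG_LIBS = -lz -lbz2
```

Keeping these settings in src/Makevars rather than in Sys.setenv() means they travel with the package source and apply to every build.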
Example script
The example script below follows the SeqAn SAM and BAM I/O tutorial and uses Sys.setenv() to set the preprocessor defines:
    Sys.setenv("PKG_CXXFLAGS"="-DSEQAN_HAS_ZLIB -std=c++14")

    // [[Rcpp::depends(RSeqAn)]]
    #include <seqan/bam_io.h>
    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    int readBam() {
        // toy.bam is in the vignettes folder
        seqan::CharString bamFileName = "toy.bam";

        // Open input file, BamFileIn can read SAM and BAM files.
        seqan::BamFileIn bamFileIn(toCString(bamFileName));

        // Open output file, BamFileOut also accepts an ostream and a format tag.
        // Note the usage of Rcout instead of std::cout
        seqan::BamFileOut bamFileOut(context(bamFileIn), Rcout, seqan::Sam());

        // Copy header.
        seqan::BamHeader header;
        seqan::readHeader(header, bamFileIn);
        seqan::writeHeader(bamFileOut, header);

        // Copy records.
        seqan::BamAlignmentRecord record;
        while (!atEnd(bamFileIn)) {
            seqan::readRecord(record, bamFileIn);
            seqan::writeRecord(bamFileOut, record);
        }
        return 0;
    }

    readBam()

    ## @SQ SN:ref LN:45
    ## @SQ SN:ref2 LN:40
    ## r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
    ## r002 0 ref 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA *
    ## r003 0 ref 9 30 5H6M * 0 0 AGCTAA * SA:Z:ref,29,-,6H5M,17,0;
    ## r004 0 ref 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC *
    ## r003 16 ref 29 30 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
    ## r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT * NM:i:1
    ## r005 4 * 0 0 8X * 0 8 AAAAAAAA *
    ## [1] 0