Fast reverse complement of DNA and RNA sequences in R, implemented in C++ via Rcpp.
fastrc uses a static lookup table for O(1) per-base complement mapping with full IUPAC ambiguity code support. It is especially useful for reverse complementing many short sequences (e.g. primers, probes, k-mers, short reads), where per-call overhead dominates: in the benchmark below (100 sequences x 30 bp), fastrc is about 95x faster than Biostrings.
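To illustrate the lookup-table idea, here is a minimal sketch in plain standard C++ (no Rcpp). It is not the package's actual source: the names `kComplement` and `reverse_complement_sketch` are hypothetical, only uppercase DNA codes are mapped, and passing unrecognized bytes through unchanged is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical 256-entry complement table, built once at program start.
// Every byte initially maps to itself, so unrecognized characters pass
// through unchanged (an assumption, not necessarily what fastrc does).
static const std::array<unsigned char, 256> kComplement = [] {
    std::array<unsigned char, 256> t{};
    for (std::size_t i = 0; i < t.size(); ++i) {
        t[i] = static_cast<unsigned char>(i);
    }
    // Uppercase DNA bases and IUPAC ambiguity codes with their complements.
    const std::string from = "ACGTMRWSYKVHDBN";
    const std::string to   = "TGCAKYWSRMBDHVN";
    for (std::size_t i = 0; i < from.size(); ++i) {
        t[static_cast<unsigned char>(from[i])] = static_cast<unsigned char>(to[i]);
    }
    return t;
}();

// Hypothetical function: reverse and complement in a single pass.
// Each base costs one array index, with no branching on the base value.
std::string reverse_complement_sketch(const std::string& seq) {
    const std::size_t n = seq.size();
    std::string out(n, '\0');
    for (std::size_t i = 0; i < n; ++i) {
        out[n - 1 - i] =
            static_cast<char>(kComplement[static_cast<unsigned char>(seq[i])]);
    }
    return out;
}
```

Indexing the table by the raw byte value is what keeps the per-base cost constant: adding the ambiguity codes only changes the table contents, not the loop.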
Features
- DNA (A↔︎T) and RNA (A↔︎U) modes
- Full IUPAC ambiguity code support (M↔︎K, R↔︎Y, S↔︎S, W↔︎W, V↔︎B, H↔︎D, N↔︎N)
- Case preservation
- NA handling
- Vectorized over character vectors
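The features above can be sketched in plain standard C++ as follows. This is an illustration under stated assumptions, not fastrc's implementation: `make_table`, `revcomp_one`, and `revcomp_all` are hypothetical names, the `rna` flag stands in for whatever mode switch the package exposes, and characters outside the table (including T in RNA mode and U in DNA mode) are passed through unchanged. NA handling is left out because it depends on R's string representation.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

using Table = std::array<unsigned char, 256>;

// Hypothetical helper: build a complement table for DNA (rna = false)
// or RNA (rna = true). Lowercase entries give case preservation.
static Table make_table(bool rna) {
    Table t{};
    for (std::size_t i = 0; i < t.size(); ++i) {
        t[i] = static_cast<unsigned char>(i);  // default: identity
    }
    std::string from = "ACGTMRWSYKVHDBN";
    std::string to   = "TGCAKYWSRMBDHVN";
    if (rna) {
        from[3] = 'U';  // U -> A instead of T -> A
        to[0]   = 'U';  // A -> U instead of A -> T
    }
    for (std::size_t i = 0; i < from.size(); ++i) {
        const unsigned char f = static_cast<unsigned char>(from[i]);
        const unsigned char c = static_cast<unsigned char>(to[i]);
        t[f] = c;
        // Mirror the mapping for lowercase input so case is preserved.
        t[static_cast<unsigned char>(std::tolower(f))] =
            static_cast<unsigned char>(std::tolower(c));
    }
    return t;
}

// Hypothetical helper: reverse one sequence, then complement in place.
static std::string revcomp_one(const std::string& seq, const Table& t) {
    std::string out(seq.rbegin(), seq.rend());
    for (char& ch : out) {
        ch = static_cast<char>(t[static_cast<unsigned char>(ch)]);
    }
    return out;
}

// Hypothetical vectorized entry point: one table for the whole input.
std::vector<std::string> revcomp_all(const std::vector<std::string>& seqs,
                                     bool rna = false) {
    const Table t = make_table(rna);
    std::vector<std::string> out;
    out.reserve(seqs.size());
    for (const std::string& s : seqs) {
        out.push_back(revcomp_one(s, t));
    }
    return out;
}
```

Under these assumptions, `revcomp_all({"acGT"})` yields `"ACgt"` and `revcomp_all({"AUGC"}, true)` yields `"GCAU"`.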
Benchmarks
Benchmarks were run on a 12th Gen Intel i7-1270P using R’s default compilation flags (-O2). See inst/benchmarks/benchmark_revc.R to reproduce.
100 sequences x 30 bp
This is the case fastrc targets: many short sequences, where per-call overhead matters most.
| Method | Median | vs fastrc |
|---|---|---|
| fastrc | 17 µs | 1x |
| spgs | 619 µs | 36x slower |
| insect | 1,031 µs | 61x slower |
| Biostrings | 1,613 µs | 95x slower |
| tktools | 3,277 µs | 193x slower |