Introduction

mSigPlot creates publication-quality plots for mutational signatures and mutational spectra. It supports single base substitutions (SBS), doublet base substitutions (DBS), and small insertions and deletions (indels) across 10 classification systems.

SBS96 – single base substitutions in trinucleotide context

The 96-channel catalog has one row per trinucleotide mutation context, organized into 6 mutation classes (C>A, C>G, C>T, T>A, T>C, T>G). If row names are present, they are checked against the order given by catalog_row_order().

sbs96_file <- system.file("extdata", "sbs96_example.csv", package = "mSigPlot")
sbs96_df <- read.csv(sbs96_file)
catalog_sbs96 <- data.frame(
  sample1 = sbs96_df[, 3],
  row.names = catalog_row_order()$SBS96
)
plot_SBS96(catalog_sbs96, plot_title = "HepG2 sample -- SBS96")
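
To see where the 96 channels come from, the base-R sketch below enumerates 6 mutation classes times 16 combinations of flanking bases. The label format "A[C>A]A" is an assumption chosen for illustration; the row names returned by catalog_row_order()$SBS96 may be formatted and ordered differently.

```r
# Illustrative only: the label format is hypothetical, not mSigPlot's own.
bases   <- c("A", "C", "G", "T")
classes <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")

# expand.grid varies its first argument fastest, so the 3' base cycles
# first, then the 5' base, then the mutation class.
grid <- expand.grid(three_prime = bases, five_prime = bases,
                    class = classes, stringsAsFactors = FALSE)
labels <- paste0(grid$five_prime, "[", grid$class, "]", grid$three_prime)

length(labels)         # 6 classes x 16 flanking contexts
anyDuplicated(labels)  # 0 means every label is unique
head(labels, 4)
```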

Row names (or, for a numeric vector, names) are not required.

If there are no names or row names, make sure the rows are already in the order expected for plotting. The example below shuffles an unnamed vector to show what happens otherwise.

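# sample() without replacement shuffles the counts, so they no longer
# line up with the expected channel order.
plot_SBS96(sample(sbs96_df[, 3, drop = TRUE], replace = FALSE),
  plot_title = "HepG2 sample, mixed up row order -- SBS96")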

Because the counts no longer line up with their channels, the plot above is unrecognizable as the original spectrum.
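
When a vector does carry names, a small base-R helper can put it into a known order before plotting. This is a minimal sketch: reorder_to_expected() is a hypothetical helper, not part of mSigPlot, and in practice `expected` would be something like catalog_row_order()$SBS96.

```r
# Hypothetical helper (not an mSigPlot function): reorder a named vector
# so that its elements follow `expected`, failing loudly on gaps.
reorder_to_expected <- function(x, expected) {
  if (is.null(names(x))) {
    stop("x has no names, so its order cannot be checked")
  }
  absent <- setdiff(expected, names(x))
  if (length(absent) > 0) {
    stop("missing channels: ", paste(absent, collapse = ", "))
  }
  x[expected]  # indexing by name returns the values in the expected order
}

# Toy example with placeholder channel names
shuffled <- c(chan_b = 2, chan_c = 3, chan_a = 1)
fixed <- reorder_to_expected(shuffled, c("chan_a", "chan_b", "chan_c"))
fixed
```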

ID83 – COSMIC indel classification

The 83-channel indel catalog uses the COSMIC classification with single-base and multi-base deletions, insertions, and microhomology deletions.

id83_file <- system.file("extdata", "id83_cosmic_v3.5.tsv", package = "mSigPlot")
id83_sigs <- read.table(id83_file, header = TRUE, sep = "\t",
                        row.names = 1, check.names = FALSE)
plot_ID83(id83_sigs[, "ID1", drop = FALSE], plot_title = "COSMIC ID1 signature")

ID89 – 89-channel indel classification

The 89-channel system (Koh et al.) provides a finer decomposition of indel types, including optional complex indels.

id89_file <- system.file("extdata", "type89_liu_et_al_sigs.tsv",
                         package = "mSigPlot")
id89_sigs <- read.table(id89_file, header = TRUE, sep = "\t",
                        row.names = 1, check.names = FALSE)
plot_ID89(id89_sigs[, 1, drop = FALSE], plot_title = "ID89 signature")

For bar-chart-like plots, num_peak_labels adds arrows that label the tallest peaks:

plot_ID89(id89_sigs[, 1, drop = FALSE], plot_title = "ID89 signature",
          num_peak_labels = 5)
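
To check by hand which channels such labels would point at, the tallest peaks can be found with order(). The sketch below uses a simulated 89-channel vector with placeholder names; it illustrates the idea of picking the top n values and is not a description of how num_peak_labels is implemented.

```r
# Simulated signature; channel names are placeholders, not real ID89 labels.
set.seed(7)
sig <- runif(89)
sig <- sig / sum(sig)
names(sig) <- paste0("channel_", seq_along(sig))

n_peaks <- 5
# Indices of the n_peaks largest values, tallest first
top_idx <- head(order(sig, decreasing = TRUE), n_peaks)
sig[top_idx]
```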

ID476 – 476-channel indel classification

The 476-channel system adds flanking base context to the indel classification, producing a detailed profile.

id476_file <- system.file("extdata", "type476_liu_et_al_sigs.tsv",
                          package = "mSigPlot")
id476_sigs <- read.table(id476_file, header = TRUE, sep = "\t",
                         row.names = 1, check.names = FALSE)
plot_ID476(id476_sigs[, 1, drop = FALSE], plot_title = "ID476 signature")

ID476 right panel

The right portion of the 476-channel profile (positions 343–476) can be plotted separately for a closer look at multi-base indels.

plot_ID476_right(id476_sigs[, 1, drop = FALSE],
                 plot_title = "ID476 right panel")

DBS78 – doublet base substitutions

The 78-channel DBS catalog covers all dinucleotide substitution classes, organized into 10 reference dinucleotide groups.

dbs78_file <- system.file("extdata", "dbs78_example.csv", package = "mSigPlot")
dbs78_df <- read.csv(dbs78_file)
catalog_dbs78 <- data.frame(
  sample1 = dbs78_df[, 3],
  row.names = paste0(dbs78_df$Ref, dbs78_df$Var)
)
plot_DBS78(catalog_dbs78, plot_title = "HepG2 sample -- DBS78")
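
The counts of 78 channels and 10 reference groups can be reproduced in base R by enumerating every doublet substitution and merging each one with its reverse complement. This is a sketch for checking the arithmetic only: it picks whichever of the two equivalent labels sorts first as the representative, which is an arbitrary choice and may not match the reference dinucleotides mSigPlot uses in its row names.

```r
bases <- c("A", "C", "G", "T")

# Reverse complement of each string in a character vector
revcomp <- function(s) {
  vapply(strsplit(s, ""), function(ch) {
    paste(rev(chartr("ACGT", "TGCA", ch)), collapse = "")
  }, character(1))
}

dinucs <- as.vector(outer(bases, bases, paste0))  # all 16 dinucleotides
pairs  <- expand.grid(ref = dinucs, alt = dinucs, stringsAsFactors = FALSE)

# A doublet substitution changes both bases
both_differ <- substr(pairs$ref, 1, 1) != substr(pairs$alt, 1, 1) &
               substr(pairs$ref, 2, 2) != substr(pairs$alt, 2, 2)
pairs <- pairs[both_differ, ]
n_stranded <- nrow(pairs)  # 16 references x 9 alternatives

# Merge each substitution with its reverse complement
fwd   <- paste(pairs$ref, pairs$alt, sep = ">")
rc    <- paste(revcomp(pairs$ref), revcomp(pairs$alt), sep = ">")
canon <- ifelse(fwd <= rc, fwd, rc)

n_channels   <- length(unique(canon))                # strand-agnostic classes
n_ref_groups <- length(unique(substr(canon, 1, 2)))  # reference dinucleotides
c(stranded = n_stranded, channels = n_channels, ref_groups = n_ref_groups)
```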

SBS192 – SBS with transcription strand

The 192-channel catalog pairs each of the 96 trinucleotide contexts with transcribed and untranscribed strand information.

sbs192_file <- system.file("extdata", "regress.cat.sbs.192.csv",
                           package = "mSigPlot")
sbs192_df <- read.csv(sbs192_file)
catalog_sbs192 <- data.frame(
  sample1 = sbs192_df[, 4],
  row.names = catalog_row_order()$SBS192
)
plot_SBS192(catalog_sbs192, plot_title = "HepG2 -- SBS192")

DBS144 – DBS with transcription strand

The 144-channel DBS catalog adds transcription strand context to the 78 dinucleotide substitution types.

dbs144_file <- system.file("extdata", "regress.cat.dbs.144.csv",
                           package = "mSigPlot")
dbs144_df <- read.csv(dbs144_file)
catalog_dbs144 <- data.frame(
  sample1 = dbs144_df[, 3],
  row.names = paste0(dbs144_df$Ref, dbs144_df$Var)
)
plot_DBS144(catalog_dbs144, plot_title = "HepG2 -- DBS144")

DBS136 – DBS heatmap

The 136-channel DBS catalog is displayed as a heatmap of 10 panels (4 × 4 grids) rather than a bar chart.

dbs136_file <- system.file("extdata", "regress.cat.dbs.136.csv",
                           package = "mSigPlot")
dbs136_df <- read.csv(dbs136_file, row.names = 1)
plot_DBS136(dbs136_df[, 1, drop = FALSE], plot_title = "HepG2 -- DBS136")

SBS1536 – SBS pentanucleotide context

The 1536-channel catalog extends trinucleotide context to pentanucleotide context, displayed as a faceted heatmap.

sbs1536_file <- system.file("extdata", "regress.cat.sbs.1536.csv",
                            package = "mSigPlot")
sbs1536_df <- read.csv(sbs1536_file)
catalog_sbs1536 <- data.frame(
  sample1 = sbs1536_df[, 3],
  row.names = catalog_row_order()$SBS1536
)
plot_SBS1536(catalog_sbs1536, plot_title = "HepG2 -- SBS1536")

SBS288 – SBS with three-strand context

The 288-channel catalog adds three strand categories (transcribed, untranscribed, non-transcribed/intergenic) to the 96 SBS channels.

sbs288_file <- system.file("extdata", "SBS288_De-Novo_Signatures.txt",
                           package = "mSigPlot")
sbs288_df <- read.table(sbs288_file, header = TRUE, sep = "\t",
                        row.names = 1, check.names = FALSE)
plot_SBS288(sbs288_df[, 1, drop = FALSE], plot_title = "SBS288A")

ID166 – indel genic/intergenic

The 166-channel indel catalog adds genic/intergenic context to the 83-channel COSMIC classification.

set.seed(42)
sig_id166 <- runif(166)
sig_id166 <- sig_id166 / sum(sig_id166)
names(sig_id166) <- catalog_row_order()$ID166
plot_ID166(sig_id166, plot_title = "Simulated ID166 signature")

SBS12 – strand bias summary

The SBS12 plot collapses a 192-channel catalog to 12 bars (6 mutation classes × 2 strands) to visualize transcription strand bias.

plot_SBS12(catalog_sbs192, plot_title = "HepG2 -- SBS12 strand bias")
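
The collapse from 192 channels to 12 bars amounts to summing counts within each mutation class and strand. The sketch below shows that aggregation on simulated counts; the `channels` table and its column names are assumptions made for illustration, and the code does not parse mSigPlot's SBS192 row names or reproduce plot_SBS12() itself.

```r
# Simulated 192-channel table: 16 contexts x 6 classes x 2 strands.
# Column names and strand labels are hypothetical.
set.seed(1)
classes  <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
channels <- expand.grid(context = seq_len(16),
                        class   = classes,
                        strand  = c("transcribed", "untranscribed"),
                        stringsAsFactors = FALSE)
channels$count <- rpois(nrow(channels), lambda = 20)

# Sum counts over the 16 contexts within each class/strand combination
sbs12 <- tapply(channels$count,
                list(class = channels$class, strand = channels$strand),
                sum)
sbs12
```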

Auto-dispatch with plot_guess()

If you don’t know (or don’t want to specify) the catalog type, plot_guess() detects it from the number of rows:

plot_guess(catalog_sbs96, plot_title = "Auto-detected SBS96")
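
Dispatching on the number of rows can be pictured as a lookup table. The sketch below is a hypothetical helper, guess_catalog_type(), built only from the channel counts named in this vignette; it is not the implementation of plot_guess(). Note that a 192-row catalog can be drawn as either SBS192 or SBS12; this sketch simply reports SBS192.

```r
# Hypothetical helper: map a row count to a catalog type name.
guess_catalog_type <- function(catalog) {
  n <- if (is.null(dim(catalog))) length(catalog) else nrow(catalog)
  known <- c("78" = "DBS78", "83" = "ID83", "89" = "ID89", "96" = "SBS96",
             "136" = "DBS136", "144" = "DBS144", "166" = "ID166",
             "192" = "SBS192", "288" = "SBS288", "476" = "ID476",
             "1536" = "SBS1536")
  type <- known[as.character(n)]
  if (is.na(type)) {
    stop("no catalog type with ", n, " rows")
  }
  unname(type)
}

guess_catalog_type(numeric(96))        # a plain vector of length 96
guess_catalog_type(matrix(0, 83, 2))   # a two-sample matrix with 83 rows
```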

Multi-sample PDF export

Every plot function has a _pdf() variant that writes a multi-page PDF with 5 plots per page. The auto-dispatch version is plot_guess_pdf():

sbs96_mat <- as.matrix(sbs96_df[, 3:6])
rownames(sbs96_mat) <- catalog_row_order()$SBS96
colnames(sbs96_mat) <- paste0("Sample_", 1:4)

plot_guess_pdf(sbs96_mat, file.path(tempdir(), "sbs96_samples.pdf"))
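
For intuition about the 5-plots-per-page layout, the sketch below writes a multi-page PDF with base graphics, using par(mfrow = c(5, 1)) so that a new page starts after every fifth plot. It uses simulated counts and plain barplot() calls; the file name and page size are arbitrary, and this is not how the _pdf() functions are implemented.

```r
# Simulated 96-channel counts for 7 samples, so the output spans 2 pages.
set.seed(1)
mat <- matrix(rpois(96 * 7, lambda = 10), nrow = 96,
              dimnames = list(NULL, paste0("Sample_", 1:7)))

out <- file.path(tempdir(), "sketch_multi_page.pdf")
pdf(out, width = 8.5, height = 11)
# 5 rows x 1 column per page; the device advances to a new page
# automatically once the grid is full.
par(mfrow = c(5, 1), mar = c(2, 4, 2, 1))
for (j in seq_len(ncol(mat))) {
  barplot(mat[, j], main = colnames(mat)[j], border = NA)
}
invisible(dev.off())

file.exists(out)
```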