callPeaks
callPeaks.Rd
This is a R-based implementation of the general MACS2 strategy (Zhang et al., Genome Biology 2008), taking some freedom here and there in comparison to the original. The function is still under development, especially with respect to single-end reads, where some optimization might still be needed. For paired-end reads, the results are nearly identical with those of MACS2, with two main differences: 1) the p-values are more conservative (and arguably better calibrated) and 2) because the implementation does not rely on sliding windows, with default settings the peaks are narrower.
Usage
callPeaks(
bam,
ctrl = NULL,
paired = FALSE,
type = c("narrow", "broad"),
nullH = c("local", "global.nb", "global.bin"),
blacklist = NULL,
binSize = 10L,
fragLength = 200L,
minPeakCount = 5L,
minFoldChange = 1.3,
pthres = 10^-3,
maxSize = NULL,
bgWindow = c(1, 5, 10) * 1000,
pseudoCount = 1L,
useStrand = TRUE,
outFormat = c("custom", "narrowPeak"),
verbose = TRUE,
...
)
Arguments
- bam
A signal bam file
- ctrl
An optional (but highly recommended) control bam file
- paired
Logical, whether the reads are paired
- type
The type of peaks to identify ('narrow' or 'broad').
- blacklist
An optional `GRanges` of regions to be excluded (or the path to such a file). Since the blacklisted regions are removed from both the signal and control peaks, this also has an important impact on the empirical FDR (when `ctrl` is given).
- binSize
Binsize used to estimate peak shift
- fragLength
Fragment length. Ignored if `paired=TRUE`. This is only used for the initial candidate region identification, and sizes are adjusted after, so it doesn't need to be very precise.
- minPeakCount
The minimum summit count for a region to be considered. Decreasing this can substantially increase the running time.
- minFoldChange
The minimum fold-change for a region to be considered. Decreasing this can substantially increase the running time.
- pthres
The p-value threshold to use
- bgWindow
The windows to consider (in addition to the peak itself) for local background.
- outFormat
The output format ('custom' or 'narrowPeak')
- verbose
Logical; whether to output progress messages
- ...
Passed to
bamChrChunkApply
Details
`callPeaks` takes about twice as long to run as MACS2, and uses more memory. If dealing with very large files (or a very low memory system), consider increasing the number of processing chunks, for instance with `nChunks=10`.
The function uses bamChrChunkApply
to obtain the coverages,
and can accept any argument of that function. This means that the
`mapqFilter` and bam `flgs` arguments can be used to restrict the reads used.