Peaks with hichipper
To call peaks from HiChIP data directly, hichipper aggregates read density from either all samples or each sample individually. Additionally, users can specify whether all read density is used or only self-ligation reads. To specify these options, put the appropriate string of the form {COMBINED,EACH},{ALL,SELF} in the peaks slot of the .yaml file.
For example, to replicate the peak calling performed in Mumbach et al., one would use the following .yaml:
peaks:
- COMBINED,SELF
resfrags:
- hg19_MboI_resfrag.bed.gz
hicpro_output:
- hicpro
Alternatively, we can call peaks from the HiChIP data for each sample individually, using all reads, with this specification:
peaks:
- EACH,ALL
resfrags:
- hg19_MboI_resfrag.bed.gz
hicpro_output:
- hicpro
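As a quick sanity check before running anything, the {COMBINED,EACH},{ALL,SELF} form described above can be validated with a few lines of Python. This is only an illustrative sketch, not hichipper's own parser: the function name parse_peak_spec is hypothetical, and it assumes the string is case-sensitive and comma-separated exactly as written in the examples.

```python
# Hypothetical helper: not part of hichipper, just a check on the string form.
VALID_SAMPLE_MODES = {"COMBINED", "EACH"}  # aggregate all samples, or one at a time
VALID_READ_MODES = {"ALL", "SELF"}         # all read density, or self-ligation only


def parse_peak_spec(spec):
    """Split a string such as 'COMBINED,SELF' into (sample_mode, read_mode)."""
    parts = [part.strip() for part in spec.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected two comma-separated values, got: {spec!r}")
    sample_mode, read_mode = parts
    if sample_mode not in VALID_SAMPLE_MODES:
        raise ValueError(f"first value must be COMBINED or EACH, got: {sample_mode!r}")
    if read_mode not in VALID_READ_MODES:
        raise ValueError(f"second value must be ALL or SELF, got: {read_mode!r}")
    return sample_mode, read_mode


if __name__ == "__main__":
    print(parse_peak_spec("COMBINED,SELF"))
    print(parse_peak_spec("EACH,ALL"))
```

Anything that fails this check is presumably meant to be a path to a file of pre-determined peaks rather than an inference option.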
The figure below shows all options for peak specification in hichipper, including every option for inferring peaks; these are noted in the table.
Alternatively, users can pre-specify a set of peaks to be used. In this case, a “connectome” will be inferred between the peaks specified in the .bed file. Of note, pre-specified peaks will still be padded either by fixed amounts or to the edges of the restriction fragment pads (or both) unless the user specifies these flags differently (see below).
peaks:
- predeterminedPeaks.bed
resfrags:
- hg19_MboI_resfrag.bed.gz
hicpro_output:
- hicpro
Note: the input of pre-determined peaks does not have to explicitly be a .bed file. Rather, any file name is acceptable so long as the first three columns indicate appropriate genomic loci as if it were a .bed file. For example, .narrowPeak files from macs2 should be fine.
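To check whether an arbitrary peak file meets that requirement, one could scan its first three columns before handing it to hichipper. The sketch below is an assumption-laden illustration rather than anything hichipper does internally: the helper names are hypothetical, fields are assumed to be whitespace-separated, and comment, track, and browser lines are skipped.

```python
# Hypothetical pre-flight check; hichipper itself may validate input differently.
def is_bed_like(line):
    """True if the first three fields look like chromosome, start, end."""
    fields = line.split()
    if len(fields) < 3:
        return False
    start, end = fields[1], fields[2]
    if not (start.isdigit() and end.isdigit()):
        return False
    return int(start) <= int(end)


def find_bad_lines(lines):
    """Return 1-based numbers of lines whose first three columns are not BED-like."""
    bad = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip blank lines and the header styles commonly found in BED files.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        if not is_bed_like(stripped):
            bad.append(number)
    return bad


if __name__ == "__main__":
    example = [
        "chr1\t100\t200\tpeak1\t55\n",  # extra columns are fine
        "chr1\tabc\t300\n",             # non-numeric start
        "# a comment\n",
        "chr2\t5\t10\n",
    ]
    print(find_bad_lines(example))
```

For a real file, pass the open file handle directly, for example find_bad_lines(open("predeterminedPeaks.bed")).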
Multiple ChIP-Seq peaks as input
As raised in this issue, if you have multiple samples and multiple ChIP-Seq or related high-quality peak definitions to use as input, create two or more .yaml files, each one specifying its own bed file of peaks. Then, execute hichipper so that the analysis for each bed file is restricted to the sample you want, using the --keep-samples or --ignore-samples flags. Thanks to user sb5169 for bringing this up.
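The bookkeeping for this can be scripted. The following sketch generates one .yaml per sample and pairs it with a command line that uses --keep-samples. The sample names, peak file names, and helper functions are all hypothetical, and the command is deliberately incomplete: only the --keep-samples flag comes from the text above, so any other options your hichipper version needs (check hichipper --help) have to be added by hand.

```python
# Hypothetical mapping of HiChIP sample name -> its ChIP-Seq peak file.
PLAN = {
    "sampleA": "sampleA_peaks.bed",
    "sampleB": "sampleB_peaks.narrowPeak",
}


def yaml_for_peaks(peak_file,
                   resfrags="hg19_MboI_resfrag.bed.gz",
                   hicpro_output="hicpro"):
    """Build .yaml text with the same three slots used in the examples above."""
    return (
        "peaks:\n"
        f" - {peak_file}\n"
        "resfrags:\n"
        f" - {resfrags}\n"
        "hicpro_output:\n"
        f" - {hicpro_output}\n"
    )


def build_runs(plan):
    """Return (yaml_path, yaml_text, command) for each sample in the plan."""
    runs = []
    for sample, peak_file in sorted(plan.items()):
        yaml_path = f"{sample}.yaml"  # hypothetical naming scheme
        # Only --keep-samples is taken from the documentation text;
        # other options are intentionally left out of this sketch.
        command = ["hichipper", "--keep-samples", sample, yaml_path]
        runs.append((yaml_path, yaml_for_peaks(peak_file), command))
    return runs


if __name__ == "__main__":
    for yaml_path, yaml_text, command in build_runs(PLAN):
        with open(yaml_path, "w") as handle:
            handle.write(yaml_text)
        print(" ".join(command))
```

Keeping one .yaml per sample means each run sees exactly one peak file, which is the arrangement described above.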
HiChIP-Specific Bias Correction
A key difference of HiChIP data compared to ChIA-PET, ChIP-Seq, and related immunoprecipitation assays is a notable bias in which greater read density accumulates near the motif used in the restriction enzyme digestion. The plot below shows the ratio of the treatment to the background (the statistic used in macs2 to call peaks) as a function of distance to the nearest restriction fragment locus.
A more detailed description of this bias and our analysis is contained in this writeup.
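To make the x-axis of that plot concrete, here is a minimal sketch of how a distance to the nearest restriction site could be computed and how treatment-to-background ratios could then be averaged by distance bin. It is not hichipper's implementation: the function names, the single-chromosome input, and the bin width are assumptions chosen for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def distance_to_nearest_site(position, sites):
    """Distance from position to the closest entry in a sorted list of cut sites."""
    if not sites:
        raise ValueError("sites must not be empty")
    index = bisect_left(sites, position)
    candidates = []
    if index < len(sites):
        candidates.append(sites[index] - position)      # nearest site at or to the right
    if index > 0:
        candidates.append(position - sites[index - 1])  # nearest site to the left
    return min(candidates)


def mean_ratio_by_distance(points, sites, bin_width=100):
    """Average ratio per distance bin; points are (position, ratio) pairs."""
    grouped = defaultdict(list)
    for position, ratio in points:
        distance = distance_to_nearest_site(position, sites)
        grouped[(distance // bin_width) * bin_width].append(ratio)
    return {start: sum(vals) / len(vals) for start, vals in sorted(grouped.items())}


if __name__ == "__main__":
    cut_sites = [100, 500, 900]                      # made-up coordinates on one chromosome
    signal = [(120, 4.0), (110, 2.0), (700, 1.0)]    # made-up (position, ratio) pairs
    print(mean_ratio_by_distance(signal, cut_sites))
```

Under the bias described above, one would expect the averaged ratio to be higher in the bins closest to a restriction site.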