hichipper parameters¶
Running
hichipper --help
shows the parameters that can be used in this software package as reproduced below.
Usage: hichipper [OPTIONS] MANIFEST
A preprocessing and QC pipeline for HiChIP data.
Options:
--out TEXT Output directory name; must not be an already existing directory [Required]
--min-dist TEXT Minimum distance; default = 5000
--max-dist TEXT Maximum distance; default = 2000000
--macs2-string TEXT String of arguments to pass to MACS2; only is
called when peaks are set to be called;
default = "-q 0.01 --extsize 147 --nomodel"
--macs2-genome TEXT Argument to pass to the -g variable in MACS2
(mm for mouse genome; hs for human genome);
default = "hs"
--peak-pad TEXT Peak padding width (applied on both left and
right); default = 500
--merge-gap TEXT Merge nearby peaks (after all padding is
complete); default = 500
--keep-temp-files Keep temporary files?
--skip-background-correction Skip restriction fragment aware background
correction?
--skip-resfrag-pad Skip restriction fragment aware padding
--skip-qc Skip QC report generation?
--skip-diffloop Skip analyses in diffloop (e.g. Mango loop
calling; .rds generation)
--make-ucsc Make additional output files that can support
viewing in UCSC genome browser; requires tabix
and htslib tools.
--keep-samples TEXT Comma separated list of sample names to keep;
ALL (special string) by default
--ignore-samples TEXT Comma separated list of sample names to
ignore; NONE (special string) by default
--read-length TEXT Length of reads from sequencing runs; default = 75
--version Show the version and exit.
--help Show this message and exit.
Running
hichipper --version
will show the version of this package currently installed.
hichipper, version 0.6.0
Check the badge up top to see if a newer version is available or try directly through pip:
pip install hichipper --upgrade
Unless the --version or --help flag is supplied, the pipeline will attempt to run. Minimally sufficient parameters include
the --out flag and a .yaml file as shown in the example executions. Below are some explanations of the
additional parameters that can be configured when executing the pipeline.
Restriction-fragment aware padding¶

Parameter explanations¶
Most of the parameter options are fairly straightforward. Running hichipper --version or hichipper --help
doesn’t run the tool but supplies the information noted above. Otherwise, the default run mode requires
a .yaml file supplied in addition to the --out parameter, which specifies the output directory of the run.
Users can decide to customize final output by using boolean flags or supply variable text input. The following
cartoon shows a graphical overview of important parameters to consider when running hichipper.

As noted in orange, defined peaks are automatically padded by some integer width from the --peak-pad flag. By default,
this pad extends 500 base pairs in either direction. Padding the peaks boosts the number of PETs that can be mapped to loops.
For example, PET II would not be considered part of a loop without padding, since the left end of the read does not overlap with the called peak (black).
However, it does overlap with the padded peak, so it is retained with padding. When two peaks are close to one another, they may
be merged using the --merge-gap flag. As suggested in the figure, the padded peaks B and C may be sufficiently close to be merged into a single anchor.
Note that this can turn some PETs into self-ligations (e.g. I-III). The --merge-gap flag is equivalent to running
bedtools merge -d on the padded anchors.
By default, the merge gap is 500 base pairs. Setting it to 0 can cause issues, particularly when the width of a PET spans multiple peaks.
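The padding and merging steps described above can be sketched in a few lines of Python. This is a minimal illustration and not hichipper's own code: the function name `pad_and_merge` and the example coordinates are hypothetical, peaks are assumed to lie on a single chromosome, and the merge rule (join anchors whose gap is at most `merge_gap`) is assumed to match the behaviour of bedtools merge -d.

```python
def pad_and_merge(peaks, peak_pad=500, merge_gap=500):
    """Pad (start, end) peaks on both sides, then merge nearby padded peaks.

    Hypothetical helper: assumes all peaks are on one chromosome and that
    two anchors merge when the gap between them is <= merge_gap.
    """
    # Step 1: pad every peak on the left and right (never below position 0).
    padded = sorted((max(0, start - peak_pad), end + peak_pad)
                    for start, end in peaks)

    # Step 2: sweep left to right, merging anchors that are close enough.
    anchors = []
    for start, end in padded:
        if anchors and start - anchors[-1][1] <= merge_gap:
            # Close to the previous anchor: extend it instead of adding a new one.
            anchors[-1] = (anchors[-1][0], max(anchors[-1][1], end))
        else:
            anchors.append((start, end))
    return anchors


# Made-up coordinates: the first two peaks end up 200 bp apart after padding.
peaks = [(10_000, 10_400), (11_600, 12_000), (20_000, 20_300)]
print(pad_and_merge(peaks))               # defaults: the first two peaks merge
print(pad_and_merge(peaks, merge_gap=0))  # no merging across the 200 bp gap
```

With the defaults, the first two padded peaks collapse into one anchor, so a PET connecting them would become a self-ligation, as noted above.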
The distance (dist) between two peaks, noted in black, is measured between the centers of the two peaks. For a pair of peaks to be
considered a loop, this distance must be at least --min-dist and at most --max-dist. The defaults
are 5 kb and 2 Mb, since PETs spanning shorter distances are likely self-ligations whereas PETs spanning longer distances are unlikely to be biologically real loops.
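As a rough illustration of the distance rule, the sketch below tests whether two anchors fall within the allowed range. The function names and coordinates are hypothetical, and treating both bounds as inclusive is an assumption; the text only states that the distance must fall between the two values.

```python
def anchor_center(anchor):
    """Return the integer midpoint of a (start, end) anchor."""
    start, end = anchor
    return (start + end) // 2


def within_loop_distance(anchor_a, anchor_b, min_dist=5000, max_dist=2_000_000):
    """Check whether the center-to-center distance lies in [min_dist, max_dist].

    Hypothetical helper; inclusive bounds are an assumption.
    """
    dist = abs(anchor_center(anchor_b) - anchor_center(anchor_a))
    return min_dist <= dist <= max_dist


a = (9_500, 12_500)            # center at 11,000
print(within_loop_distance(a, (19_500, 20_800)))        # 9,150 bp apart
print(within_loop_distance(a, (12_600, 13_400)))        # 2,000 bp apart, too close
print(within_loop_distance(a, (3_000_000, 3_001_000)))  # about 2.99 Mb apart, too far
```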
From our inspection of the HiChIP data, we determined that reads for putative loops localize to the edges of restriction fragments and that
extending the padding to the edges of fragments can prove beneficial for maximizing the read density associated with loops.
By default, hichipper adds additional padding to the edges of restriction fragments unless the --skip-resfrag-pad flag is supplied.
To give an example, consider peak C. Under the default options, the anchor corresponding to this peak would span restriction
fragments 7 and 9 (assuming that the merge-gap is small enough such that B and C are not merged). This is because the
additional restriction fragment padding occurs after the original padding (orange), and hichipper pads to the edges of whichever
fragments overlap the padded peak. However, if the user specified --peak-pad 0, the anchor corresponding to C would only span
restriction motifs 8 and 9 since the peak sits on that single fragment only. While we have found the default options to be
sensible, the user can adjust these padding parameters to potentially increase the precision of anchor calls, possibly at the expense of PET density, or vice versa.
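The interaction between --peak-pad and the restriction-fragment padding can be illustrated with a small sketch. It is not taken from hichipper: `pad_to_fragment_edges` is a hypothetical helper, the cut-site positions are invented, and the rule used (extend each anchor end outward to the nearest restriction motif) is an assumption based on the description above.

```python
from bisect import bisect_left, bisect_right


def pad_to_fragment_edges(anchor, cut_sites):
    """Extend a (start, end) anchor outward to the nearest restriction motifs.

    Hypothetical helper: cut_sites is a sorted list of motif positions on
    one chromosome. Ends with no motif beyond them are left unchanged.
    """
    start, end = anchor
    i = bisect_right(cut_sites, start) - 1  # last motif at or left of start
    j = bisect_left(cut_sites, end)         # first motif at or right of end
    new_start = cut_sites[i] if i >= 0 else start
    new_end = cut_sites[j] if j < len(cut_sites) else end
    return (new_start, new_end)


cut_sites = [1_000, 4_000, 7_000, 9_000, 12_000]  # invented motif positions
peak = (7_400, 7_800)                             # sits on one fragment

# Equivalent of --peak-pad 0: the anchor covers only the peak's own fragment.
print(pad_to_fragment_edges(peak, cut_sites))

# With 500 bp of peak padding first, the padded peak spills into the
# neighbouring fragment, so the anchor grows to include it.
padded = (peak[0] - 500, peak[1] + 500)
print(pad_to_fragment_edges(padded, cut_sites))
```

The second call yields a wider anchor than the first, which mirrors the peak C example: the order of operations (peak padding first, fragment padding second) determines how many fragments an anchor covers.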
Finally, we note the --macs2-genome and --macs2-string flags, whose default values we felt were appropriate for processing HiChIP data.
However, users can modify these when calling peaks directly from HiChIP data.
User parameter recommendations¶
- If R is not in the system or if the R package dependencies could not be installed, the following flags should be added:
--skip-resfrag-pad --skip-diffloop --skip-qc --skip-background-correction
- In the current version of hichipper, the novel background correction implementation is quite memory intensive. Thus, users running hichipper on a laptop or other low-RAM machine should consider skipping the adaptive background correction:
--skip-background-correction
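For scripted runs, these recommendations could be wrapped in a small helper that assembles the command line. The sketch below only builds and prints the command and does not execute it. The function name, its keyword arguments, and the `output1` and `example.yaml` names are hypothetical; the flags themselves are those listed in the help text above.

```python
import shlex


def build_hichipper_command(out_dir, manifest, r_available=True, low_memory=False):
    """Assemble a hichipper command as a list of arguments (hypothetical helper)."""
    cmd = ["hichipper", "--out", out_dir]
    if not r_available:
        # Without R, skip every step that the recommendations above list.
        cmd += ["--skip-resfrag-pad", "--skip-diffloop", "--skip-qc",
                "--skip-background-correction"]
    elif low_memory:
        # R is present but memory is limited: skip only background correction.
        cmd.append("--skip-background-correction")
    cmd.append(manifest)  # the MANIFEST (.yaml) argument goes last
    return cmd


print(shlex.join(build_hichipper_command("output1", "example.yaml", low_memory=True)))
print(shlex.join(build_hichipper_command("output1", "example.yaml", r_available=False)))
```

The resulting list can be passed to `subprocess.run` once the output directory name is known not to exist yet, since --out must not point to an existing directory.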