cooltools

TAFFISH wrapper for cooltools 0.7.1, the Open2C command-line and Python toolkit for analyzing Hi-C and Micro-C contact maps stored in .cool / .mcool format.

Package Identity

  • name: cooltools
  • command: taf-cooltools
  • kind: tool
  • TAFFISH version: 0.7.1-r2
  • container image: ghcr.io/taffish/cooltools:0.7.1-r2
  • upstream: open2c/cooltools tag v0.7.1
  • runtime: Python 3.10
  • app license: Apache-2.0
  • upstream license: MIT

What Is Included

The image installs cooltools 0.7.1 from the PyPI source distribution with a verified SHA256 checksum. The runtime includes the upstream cooltools CLI, the cooler CLI, and the Python libraries needed by cooltools, including cooler, bioframe, NumPy, pandas, SciPy, scikit-learn, scikit-image, numba, matplotlib, click, joblib, and multiprocess.

The packaged upstream command surface includes:

  • coverage
  • expected-cis, expected-trans
  • insulation
  • eigs-cis, eigs-trans
  • saddle
  • dots
  • pileup
  • virtual4c
  • random-sample
  • rearrange
  • genome binnify, genome digest, genome gc, genome fetch-chromsizes, genome genecov

A tiny synthetic .cool generator is installed at /opt/cooltools/share/testdata/cooltools_smoke.py for smoke validation and debugging examples.

Usage

Show TAFFISH package help:

taf-cooltools --help

Show upstream cooltools help or version:

taf-cooltools -- --help
taf-cooltools -- --version
taf-cooltools cooltools coverage --help

Run the default upstream command, cooltools, with a subcommand:

taf-cooltools cooltools coverage --ignore-diags 0 sample.cool -o sample.coverage.tsv

Because TAFFISH command mode is enabled, commands inside the same container can also be called directly:

taf-cooltools cooler info sample.cool
taf-cooltools python -c '"import cooltools; print(cooltools.__version__)"'

For cooltools subcommands, use the form taf-cooltools cooltools <subcommand> [options]. Do not write taf-cooltools coverage ...: coverage is a cooltools subcommand, not a standalone executable in the container. When passing options that begin with - to the default command, put -- before them so that the TAFFISH wrapper forwards them to upstream cooltools instead of interpreting them itself.
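
When the wrapper is driven from a script, the calling rules above can be captured in a small helper. The following is only a sketch: the function names cooltools_argv and default_option_argv are hypothetical and not part of TAFFISH or cooltools, and the code builds argument lists without running anything.

```python
import shlex

WRAPPER = "taf-cooltools"  # TAFFISH command named in this README


def cooltools_argv(subcommand, *args):
    """Build argv for an upstream cooltools subcommand in command mode.

    The explicit "cooltools" element is required because subcommands such as
    "coverage" are not standalone executables in the container.
    """
    if subcommand.startswith("-"):
        raise ValueError("expected a subcommand name, got an option")
    return [WRAPPER, "cooltools", subcommand, *args]


def default_option_argv(*options):
    """Build argv that forwards dash-prefixed options to the default command.

    The "--" separator keeps the wrapper from consuming the options itself.
    """
    return [WRAPPER, "--", *options]


if __name__ == "__main__":
    print(shlex.join(cooltools_argv(
        "coverage", "--ignore-diags", "0", "sample.cool",
        "-o", "sample.coverage.tsv")))
    print(shlex.join(default_option_argv("--version")))
```

The resulting lists can be passed to subprocess.run as they are, which avoids an extra layer of shell quoting.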

Common Workflows

Calculate raw bin coverage from a cooler:

taf-cooltools cooltools coverage --ignore-diags 0 sample.cool -o sample.coverage.tsv

Calculate cis expected contacts:

taf-cooltools cooltools expected-cis \
  sample.mcool::resolutions/10000 \
  -o sample.expected-cis.tsv
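
To make the idea of "expected" contacts concrete, here is a minimal pure-Python sketch that averages a small dense, symmetric matrix along each diagonal, that is, at each genomic separation. It is an illustration under simplifying assumptions, not the upstream implementation: the function name expected_by_diagonal is hypothetical, and the sketch leaves out regions, masking of bad bins, and balancing weights, all of which matter in a real analysis.

```python
def expected_by_diagonal(matrix):
    """Average contact value at each diagonal offset of a square matrix.

    matrix is a list of equal-length rows; offset 0 is the main diagonal,
    offset d holds pairs of bins that are d bins apart.
    """
    n = len(matrix)
    averages = []
    for offset in range(n):
        # Walk the upper diagonal at this offset.
        values = [matrix[i][i + offset] for i in range(n - offset)]
        averages.append(sum(values) / len(values))
    return averages


if __name__ == "__main__":
    toy = [
        [4, 2, 1],
        [2, 4, 2],
        [1, 2, 4],
    ]
    for offset, value in enumerate(expected_by_diagonal(toy)):
        print(offset, value)
```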

Calculate insulation scores:

taf-cooltools cooltools insulation \
  sample.cool \
  100000 \
  -o sample.insulation.tsv

Create fixed-width genome bins from chrom sizes:

taf-cooltools cooltools genome binnify --all-names chrom.sizes 10000 > bins.bed
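
For readers unfamiliar with fixed-width binning, the sketch below shows the intervals such a step is expected to produce: each chromosome is cut into consecutive bins of the requested size, and the last bin is truncated at the chromosome end. This is an illustration only. The names read_chromsizes and binnify are defined here and are not the upstream functions, and the sketch assumes a two-column, whitespace-separated chrom-size table.

```python
def read_chromsizes(lines):
    """Parse (name, length) pairs from lines of a chrom-size table."""
    sizes = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            sizes.append((fields[0], int(fields[1])))
    return sizes


def binnify(chromsizes, binsize):
    """Return (chrom, start, end) bins of width binsize, half-open intervals."""
    if binsize <= 0:
        raise ValueError("binsize must be positive")
    bins = []
    for chrom, length in chromsizes:
        for start in range(0, length, binsize):
            # The final bin of a chromosome may be shorter than binsize.
            bins.append((chrom, start, min(start + binsize, length)))
    return bins


if __name__ == "__main__":
    sizes = read_chromsizes(["chr1\t25000", "chr2\t10000"])
    for chrom, start, end in binnify(sizes, 10000):
        print(chrom, start, end, sep="\t")
```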

Run a Python API check:

taf-cooltools python -c '"import cooltools, cooler; print(cooltools.__version__)"'

Inputs And Outputs

Cooltools expects contact matrices in .cool or .mcool format. Multi-resolution coolers use the standard cooler group syntax, for example sample.mcool::resolutions/10000.
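
The file::group syntax can be taken apart with plain string handling, which is handy when a script needs the file path (for example, to check that it exists) separately from the resolution. The sketch below is an assumption-laden illustration: split_cooler_uri is a hypothetical helper, not part of cooler or cooltools, and it only recognizes groups of the form resolutions/<integer>.

```python
def split_cooler_uri(uri):
    """Split a cooler URI into (file_path, group, resolution).

    A plain .cool path has no "::" part; its group is reported as "/" and
    its resolution as None because the path alone does not encode it.
    """
    path, sep, group = uri.partition("::")
    if not sep:
        group = "/"
    resolution = None
    parts = group.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "resolutions" and parts[1].isdigit():
        resolution = int(parts[1])
    return path, group, resolution


if __name__ == "__main__":
    print(split_cooler_uri("sample.mcool::resolutions/10000"))
    print(split_cooler_uri("sample.cool"))
```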

Depending on the subcommand, additional inputs may include BED/view files, chrom-size tables, FASTA files, expected tables, phasing tracks, or feature tables. Outputs are subcommand-specific: TSV, BED-like, or BEDPE-like tables, bigWig tracks, or updates to tables stored in the cooler.

Boundaries

Release 0.7.1-r2 is an offline cooltools analysis runtime once user input files exist. It does not map sequencing reads, create .pairs files, build .cool files from alignments, convert .hic files, run an end-to-end Hi-C pipeline, or provide an interactive visualization service. Pair generation, matrix construction, large workflow orchestration, and visualization remain separate tool apps or flows.

The upstream cooltools genome fetch-chromsizes and cooltools genome genecov helpers can fetch information from public genome resources. They are included because they are part of the upstream CLI, but production workflows that must be offline should pass already prepared chrom-size, BED, FASTA, and annotation files instead.

Some analyses require balanced coolers and a weight column. The app does not silently balance user data; prepare or balance coolers explicitly with cooler or your chosen Hi-C workflow before running those subcommands. For raw/unbalanced coverage calculations, pass an explicit upstream setting such as --ignore-diags 0, as shown above.

The smoke tests generate a tiny synthetic .cool file and validate command availability, Python imports, cooler info, cooltools coverage, and cooltools genome binnify. They verify the packaged runtime and a minimal real data path, not biological correctness on production Hi-C datasets.

Platform Notes

The image is intended to be built natively for linux/amd64 and linux/arm64. It uses a Python 3.10 virtual environment and pins the major numerical dependencies to avoid drift from future NumPy/pandas releases. No GPU, device, service, or special container runtime arguments are required. Release 0.7.1-r2 keeps upstream cooltools 0.7.1 unchanged and adds the missing TBB runtime library needed by amd64 Python wheels during dynamic-library validation.

Documentation

Citation

Open2C, Abdennur N, Abraham S, Fudenberg G, Flyamer IM, Galitsyna AA, Goloborodko A, Imakaev M, Oksuz BA, Venev SV. Cooltools: enabling high-resolution Hi-C analysis in Python. bioRxiv (2022). doi: 10.1101/2022.10.31.514564.
