HuBMAP Pipelines and Documentation

HuBMAP CODEX Data

The HuBMAP Consortium CODEX pipeline uses Cytokit to process CODEX datasets from raw data to OME-TIFF compliant segmentation results and compiled antigen fluorescence images.

Learn more about our progress here.

SPRM – Spatial Process & Relationship Modeling

SPRM is a statistical modeling program that calculates a range of descriptors from multichannel images. It can be applied to any type of multichannel 2D or 3D image (e.g., CODEX, IMS).

Learn more about our progress here.

  • Imaging QA/QC: The HIVE image analysis pipeline (SPRM) computes several quality control measures. These are written to output files as described in the following documentation. Learn more about Imaging QA/QC here.

Single-cell RNA sequencing

HuBMAP single-cell RNA-seq data sets are processed with a two-stage pipeline, using Salmon (https://combine-lab.github.io/salmon/) for transcript quantification and Scanpy (https://icb-scanpy.readthedocs-hosted.com/en/stable/) for secondary analysis. This pipeline is implemented in CWL, calling command-line tools encapsulated in Docker containers, and is available at https://github.com/hubmapconsortium/salmon-rnaseq.

Learn more about our progress here.

  • Single-cell RNA sequencing QA/QC: The HIVE scRNA-seq pipeline computes some quality control measures itself and defers to the Scanpy method scanpy.pp.calculate_qc_metrics for the others. QA/QC outputs are written to the files qc_results.json and qc_results.hdf5. Learn more here.
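
To give a feel for the kind of per-cell measures involved, the sketch below recomputes two quantities that scanpy.pp.calculate_qc_metrics reports for each cell, total_counts and n_genes_by_counts, in plain Python. It is an illustration only and not the pipeline's code: the function name basic_qc_metrics and the toy count matrix are hypothetical, and nothing is assumed here about the layout of qc_results.json or qc_results.hdf5.

```python
def basic_qc_metrics(counts):
    """Compute two simple per-cell QC measures.

    counts: list of rows, one row per cell, each row a list of per-gene counts.
    Returns one dict per cell with the total count and the number of genes
    that have at least one count.
    """
    metrics = []
    for row in counts:
        total = sum(row)
        detected = sum(1 for value in row if value > 0)
        metrics.append({"total_counts": total, "n_genes_by_counts": detected})
    return metrics


# Hypothetical toy matrix: 3 cells (rows) by 4 genes (columns).
toy_counts = [
    [0, 3, 1, 0],
    [5, 0, 0, 0],
    [2, 2, 2, 2],
]

for cell_index, cell_metrics in enumerate(basic_qc_metrics(toy_counts)):
    print(cell_index, cell_metrics)
```

Cells with very low total counts or very few detected genes are the usual candidates for closer inspection. In practice these values would come from Scanpy rather than from a hand-written loop.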

Single-cell ATAC-seq

Likewise, the HuBMAP Consortium uses a three-stage pipeline for scATAC-seq data sets, composed of SnapTools (https://github.com/r3fang/SnapTools), SnapATAC (https://github.com/r3fang/SnapATAC), and chromVAR (https://bioconductor.org/packages/release/bioc/html/chromVAR.html). This pipeline is written in CWL, calling command-line tools encapsulated in Docker containers, and is available at https://github.com/hubmapconsortium/sc-atac-seq-pipeline. 

Learn more about our progress here.

  • Single-cell ATAC-sequencing QA/QC: Processing of scATAC-seq data is performed with the SnapTools and SnapATAC packages, with short read alignment performed by BWA. Learn more here.

Cells: An API For Molecular and Cellular Queries of HuBMAP Data

The Cells API allows users to navigate the output of the HIVE’s data processing pipelines by querying the aggregated pipeline output. Currently available data include cell-by-gene and cell-by-protein quantification data, cluster assignments, and differential expression (DE) analysis results computed for organs and clusters.

Learn more here.

  • Base URL for the production API endpoint: https://cells.api.hubmapconsortium.org/
  • Server code GitHub repo: https://github.com/hubmapconsortium/cross_modality_query
  • Python client: https://github.com/hubmapconsortium/hubmap-api-py-client
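
A query through the Python client might look like the sketch below. It follows the usage pattern shown in the client's README as best understood, so treat the details as assumptions to check against that README: the "/api/" path appended to the base URL, the select_cells arguments, and the "&" intersection of result sets. The helper names expression_filter and count_cells_expressing are hypothetical and exist only for this illustration.

```python
# Assumed base URL: the production endpoint listed above plus an "/api/" path,
# as used in the client's README; adjust if the deployment differs.
CELLS_API_URL = "https://cells.api.hubmapconsortium.org/api/"


def expression_filter(gene_symbol, minimum):
    """Build a 'GENE > value' filter string such as 'VIM > 0.5'."""
    return f"{gene_symbol} > {minimum}"


def count_cells_expressing(gene_symbol, minimum, organ):
    """Count cells in an organ whose expression of a gene exceeds a minimum."""
    # Imported lazily so this module loads without the third-party client.
    from hubmap_api_py_client import Client

    client = Client(CELLS_API_URL)
    with_gene = client.select_cells(
        where="gene",
        has=[expression_filter(gene_symbol, minimum)],
        genomic_modality="rna",
    )
    in_organ = client.select_cells(where="organ", has=[organ])
    # Assumption: result sets can be intersected with "&" and sized with len().
    return len(with_gene & in_organ)
```

Calling count_cells_expressing("VIM", 0.5, "Kidney") requires network access and the client package installed from the repository above. The gene, threshold, and organ are example values, not recommendations.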