General-use scripts#

File Processing, handling HPC runs, etc#

cleanHiM_run.py#

Cleans the directories and log files created by pyHiM in previous runs.

Usage: clean_him_run [-F ROOTFOLDER] [-P PARAMETERS] [-A ALL]

optional arguments:
  -F ROOTFOLDER, --rootFolder ROOTFOLDER
                        Folder where the analysis has been performed
  -P PARAMETERS, --fileParameters PARAMETERS
                        Parameters file. Default: parameters.json
  -A ALL, --all ALL
                        Delete all folders and all created files

lndir.py#

Creates link for files in a second directory (useful to analyze data in a new folder without copying raw data files).

Usage: lndir "/user_home/Repositories/pyHiM/\*py" ~/Downloads/test

Use quotation marks in the first argument if using wildcards.

zipHiM_run.py#

Zip all output files from a pyHiM run. It excludes .npy and .tif files. Useful to retrieve results from a run from an HPC cluster.

Usage: zip_him_run [-F ROOTFOLDER] [-P PARAMETERS] [-R RECURSIVE]

optional arguments:
  -F ROOTFOLDER, --rootFolder ROOTFOLDER
                        Folder where the analysis has been performed
  -P PARAMETERS, --fileParameters PARAMETERS
                        Parameters file. Default: parameters.json
  -R RECURSIVE, --recursive RECURSIVE
                        Zip files inside folders of current directory

unzipHiM_run.py#

Unzips HiM_run.tar.gz recursively. Useful to unzip the results from several folders retrieved from a run in an HPC cluster.

Usage: unzip_him_run [-F ROOTFOLDER] [-R RECURSIVE]

optional arguments:
  -F ROOTFOLDER, --rootFolder ROOTFOLDER
                        Folder where the HiM_run.tar.gz is located
  -R RECURSIVE, --recursive RECURSIVE
                        Unzip files inside folders of current directory

runHiM_cluster.py#

Launches pyHiM on a cluster using slurm srun.

Usage: run_him_cluster

Plotting scripts#

figureHiMmatrix.py#

Produces and plots a HiM matrix for a given dataset.

Usage figure_him_matrix [-F ROOTFOLDER] [-O OUTPUTFOLDER] [-P PARAMETERS] 
						 [-A LABEL] [-W ACTION] [--fontsize] [--axisLabel]
						 [--axisTicks] [--barcodes] [--scalingParameter]
						 [--cScale] [--plottingFileExtension] [--shuffle]
						 [--scalogram] [--inputMatrix] [--pixelSize]
						 [--cmap] [--PWDmode]						 
	 -F ROOTFOLDER, --rootFolder ROOTFOLDER
		 Folder with datasets
	 -O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	 -P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	 -A LABEL, --label LABEL
		 Name of label
	 -W ACTION, --action ACTION
		 Selects: all, labeled or unlabeled for the datasets.
	 --fontsize
		 Size of fonts to be used in plots
	 --axisLabel
		 Select optional label in x and y axis
	 --axisTicks
		 Display axis ticks
	 --barcodes
		 Display barcode images
	 --scalingParameter
		 Scaling parameter of colormap
	 --cScale
		 Colormap absolute scale
	 --plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png
	 --shuffle
		 Provide shuffle vector: 0,1,2,3,.. of the same size or
		 smaller than the original matrix. 
	 --scalogram
		 Display scalogram image
	 --inputMatrix
		 Select plot type among one of the following: PWD, contact, iPWD.
		 Default: contact
	 --pixelSize
		 Pixel size in µm
	 --cmap
		 Select colormap. Default: coolwarm
	 --PWDmode
		 Mode used to calculate the mean distance. 
		 Options are: 'median' or 'KDE'. Default: 'median'

figure4Mmatrix.py#

Creates proximity frequency 4M profiles from a given list of anchors (similar analysis to a 4C experiment, but using HiM data). Works with up to two datasets.

Usage: figure_4_m_matrix [-F1 ROOTFOLDER1] [-F2 ROOTFOLDER2] [-O OUTPUTFOLDER]
						 [-P PARAMETERS] [-A1 LABEL1] [-A2 LABEL2] [-W1 ACTION1]
						 [-W2 ACTION2] [--fontsize] [--axisLabel] [--axisTicks]
						 [--splines] [--cAxis] [--plottingFileExtension]
						 [--legend] [--normalize]
 

	-F1 ROOTFOLDER1, --rootFolder1 ROOTFOLDER1
		 Folder with dataset 1
	-F2 ROOTFOLDER2, --rootFolder2 ROOTFOLDER2
		 Folder with dataset 2
	 -O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	 -P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	 -A1 LABEL1, --label1 LABEL1
		 Name of label for dataset 1
	 -A2 LABEL2, --label2 LABEL2
		 Name of label for dataset 2
	 -W1 ACTION1, --action1 ACTION1
		 Selects: all, labeled or unlabeled for dataset 1
	 -W2 ACTION2, --action2 ACTION2
		 Selects: all, labeled or unlabeled for dataset 2
	 --fontsize
		 Size of fonts to be used in plots
	 --axisLabel
		 Select optional label in x and y axis
	 --axisTicks
		 Display axis ticks
	 --splines 
		 Plots data using spline interpolations
	 --cAxis
		 Absolute axis value for colormap
	 --plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png
	 --legend
		 Shows legends for datasets in plot
	 --normalize
		 Matrix normalization factor: maximum, none, single value. Default: none

figureCompare2Matrices.py#

Comparison of proximity matrices. Plots either the ratio or the difference between two HiM matrices. It also plots both matrices together, with one in the upper triangle, and the other in the lower triangle.

Usage: figure_compare_2_matrices [-F1 ROOTFOLDER1] [-F2 ROOTFOLDER2] 
								 [-O OUTPUTFOLDER] [-P PARAMETERS]
							     [-A1 LABEL1] [-A2 LABEL2] [-W1 ACTION1]
							     [-W2 ACTION2] [--fontsize] [--axisLabel] 
							     [--axisTicks] [--ratio] [--cAxis] 
							     [--plottingFileExtension] [--normalize]
							     [--inputMatrix] [--pixelSize]
 

	-F1 ROOTFOLDER1, --rootFolder1 ROOTFOLDER1
		 Folder with dataset 1
	-F2 ROOTFOLDER2, --rootFolder2 ROOTFOLDER2
		 Folder with dataset 2
	 -O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	 -P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	 -A1 LABEL1, --label1 LABEL1
		 Name of label for dataset 1
	 -A2 LABEL2, --label2 LABEL2
		 Name of label for dataset 2
	 -W1 ACTION1, --action1 ACTION1
		 Selects: all, labeled or unlabeled for dataset 1
	 -W2 ACTION2, --action2 ACTION2
		 Selects: all, labeled or unlabeled for dataset 2
	 --fontsize
		 Size of fonts to be used in plots
	 --axisLabel
		 Select optional label in x and y axis
	 --axisTicks
		 Display axis ticks
	 --ratio 
		 Performs the ratio between matrices. Defaukt: difference
	 --cAxis
		 Absolute axis value for colormap
	 --plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png		
	 --normalize
		 Matrix normalization factor: maximum, none, single value, 
		 bin pair. Default: none
	 --inputMatrix
		 Source of input matrix: contact (default), PWD matrix, 
		 iPWD matrix
	 --pixelSize
		 Pixel size in microns. Default: 0.1 microns
		 

figure3wayInteractions.py#

Plots 3-way proximity probability matrices for a given anchor (or set of anchors), as defined in the folders2Load.json configuration file. Comparative analysis can be performed for two datasets simultaneously. The calculation of 3-way proximity probability matrices needs to be previously performed using the processHiMmatrix.py script.

Usage: figure_3_way_interactions [-F1 ROOTFOLDER1] [-F2 ROOTFOLDER2] 
								 [-O OUTPUTFOLDER] [-P PARAMETERS]
								 [-P2 PARAMETERS2] [-A1 LABEL1] [-A2 LABEL2]
								 [-W1 ACTION1] [-W2 ACTION2] [--fontsize]
								 [--scalingParameter] [--colorbar] 
								 [--plottingFileExtension] [--normalize]
	
	-F1 ROOTFOLDER1, --rootFolder1 ROOTFOLDER1
		 Folder with dataset 1
	-F2 ROOTFOLDER2, --rootFolder2 ROOTFOLDER2
		 Folder with dataset 2
	 -O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	 -P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	 -P2 PARAMETERS2, --parameters2 PARAMETERS2
		 Name of parameters file for dataset 2. Default: folders2Load.json
	 -A1 LABEL1, --label1 LABEL1
		 Name of label for dataset 1
	 -A2 LABEL2, --label2 LABEL2
		 Name of label for dataset 2
	 -W1 ACTION1, --action1 ACTION1
		 Selects: all, labeled or unlabeled for dataset 1
	 -W2 ACTION2, --action2 ACTION2
		 Selects: all, labeled or unlabeled for dataset 2
	 --fontsize
		 Size of fonts to be used in plots
	 --scalingParameter
		 Scaling parameter of the colormap
	 --colorbar
		 Use if a colorbar is required
	 --plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png
	 --normalize
		 Normalizes matrices by their maximum.

figureN_HiMmatrices.py#

Plots several (N) HiM matrices in the same plot, using N datasets specified in folders2Load.json.

It also plots a submatrix representing the difference in contact probability for a subset of barcodes compared to a particular dataset. The subset of barcodes and the reference dataset are defined in folders2Load.json by the options barcodes2plot and plotSegment_anchor, respectively.

Usage figure_n_him_matrices [-F ROOTFOLDER] [-O OUTPUTFOLDER] [-P PARAMETERS] 
							 [-A LABEL] [-W ACTION] [--fontsize] [--axisLabel]
							 [--axisTicks] [--barcodes] [--scalingParameter]
							 [--plottingFileExtension] [--shuffle] [--scalogram]
							 [--type] [--pixelSize] [--cAxis] [--ratio]
							 [--normalizeMatrix]
	 -F ROOTFOLDER, --rootFolder ROOTFOLDER
		 Folder with datasets
	 -O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	 -P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	 -A LABEL, --label LABEL
		 Name of label
	 -W ACTION, --action ACTION
		 Selects: all, labeled or unlabeled for the datasets.
	 --fontsize
		 Size of fonts to be used in plots
	 --axisLabel
		 Select optional label in x and y axis
	 --axisTicks
		 Display axis ticks
	 --barcodes
		 Display barcode images
	 --scalingParameter
		 Scaling parameter of colormap
	 --plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png
	 --shuffle
		 Provide shuffle vector: 0,1,2,3,.. of the same size or
		 smaller than the original matrix. 
	 --scalogram
		 Display scalogram image
	 --type
		 Select plot type among one of the following: PWD, contact, iPWD
	 --pixelSize
		 Pixel size in µm
	 --cAxis
		 Absolute axis value for colormap
	 --ratio
		 Calculates ration between matrices for submatrices plots. 
		 Default: difference
	 --normalizeMatrix
		 Normalize matrices by maximum. Default: True

figureSingleCell.py#

This scripts:

  • produces movies and trajectories from single cell PWD matrices.

  • calculates barcode detection efficiencies and number of barcodes per cell.

  • plots single cell matrices.

  • plots distance histograms and distributions of Rg.

Usage: figure_single_cell [-F ROOTFOLDER] [-O OUTPUTFOLDER] [-P PARAMETERS] 
						   [-A LABEL] [-W ACTION] [--fontsize] [--axisLabel]
						   [--axisTicks] [--barcodes] [--nRows] [--pixelSize]
						   [--maxDistance] [--plottingFileExtension] [--shuffle]
						   [--ensembleMatrix] [--video] [--videoAllcells]
						   [--plotHistogramMatrix] [--minNumberPWD] [--threshold]
	
	-F ROOTFOLDER, --rootFolder ROOTFOLDER
		 Folder with datasets
	-O OUTPUTFOLDER, --outputFolder OUTPUTFOLDER
		 Folder for outputs
	-P PARAMETERS, --parameters PARAMETERS
		 Name of parameters file. Default: folders2Load.json
	-A LABEL, --label LABEL
		 Name of label
	-W ACTION, --action ACTION
		 Selects: all, labeled or unlabeled for the datasets.
	--fontsize
		 Size of fonts to be used in plots
	--axisLabel
		 Select optional label in x and y axis
	--axisTicks
		 Display axis ticks
	--barcodes
		 Display barcode images
	--nRows
		 The number of cells is determined by nRows^2. Default: 10
	--pixelSize
		 Pixel size in microns. Default: 0.1 microns
	--maxDistance
		 Maximum distance for histograms in microns. Default: 4 microns
	--plottingFileExtension
		 Select file extension to save images. Default: svg. 
		 Other options: pdf, png
	--shuffle
		 Provide shuffle vector: 0,1,2,3,.. of the same size or
		 smaller than the original matrix. 
	--ensembleMatrix
		Use if ensemble matrix should be plot alongside single cell
		matrices
	--video
		Use if you want to output video
	--videoAllcells
		Use if you want all nRows^2 single cells to be output in video
	--plotHistogramMatrix
		Use if you want to plot the PWD histograms for all bin combinations
	--minNumberPWD
		Minimum number of PWD to calculate radius of gyration Rg. Default: 6
	--threshold
		Maximum accepted PWD (in pixels) to calculate radius of gyration Rg.
		Default: 8

Post-processing scripts#

processHiMmatrix.py#

This script performs the post-processing of one or more datasets previously analysed with pyHiM, defined in the folders2Load.json file.

It performs the following operations:

  • Merges datasets from different experiments.

  • Calculates and plots ensemble pairwise distance (PWD) matrix.

  • Calculates and plots the inverse of the PWD matrix.

  • Calculates and plots contact probability matrix for each dataset.

  • Calculates and plots ensemble contact probability matrix.

  • Calcualtes and plots tensemble 3-way contact probability matrix for the set of anchors defined in the folders2Load.json file.

  • Optional: Reads MATLAB single-cell PWD matrices and performs all previous operations.

Usage: process_him_matrix [-F ROOTFOLDER] [-P PARAMETERS] [-A LABEL] [-W ACTION]
						   [--matlab] [--saveMatrix] [--getStructure] [--pixelSize]
						   [--HiMnormalization] [--d3]
Optional arguments:

	-F ROOTFOLDER, --rootFolder ROOTFOLDER
			Folder with folders2Load.json file
	-P PARAMETERS, --parameters PARAMETERS
			File with parameters. Default: folders2Load.json
	-A LABEL, --labal LABEL
			Name of label for the dataset
	-W ACTION, --action ACTION
			Selects: all, labeled or unlabeled for the datasets. 
	--matlab
			Loads MATLAB data (e.g. .mat files)
	--saveMatrix
			Saves the combined PWD matrix from all datasets. Default: False
	--getStructure
			Multi-dimensional scaling to get coordinates from PWDs. Default: False
	--pixelSize
			Specify images pixel size. Default: 100 nm.
	--HiMnormalization
			Normalization of contact matrix: nonNANs (default) or nCells.
	--d3
			Loads data segmented in 3D. Default: False

processSNDchannel.py#

This script will:

  • allow the user to manually draw ROI based on secondary labels, such as RNA-FISH images.

  • use the ROIs defined by the user to attribute labels to traces.

Usage: process_snd_channel [-F ROOTFOLDER] [-A ADDMASK] [--cleanAllMasks]

	-F ROOTFOLDER, --rootFolder ROOTFOLDER
		Folder with images
	-A ADDMASK, --addMask ADDMASK
		Add manual segmentation
	--cleanAllMasks
		Clear all masks

trace_combinator.py#

This script combines trace tables from different experiments/ROIs into a single trace table. The folders containing the trace tables of the experiments to be combined are provided as a JSON file. It is possible to select only a subset of trace tables within the folders provided using the methods parameter. Merged trace table is outputed in the buildPWDmatrix folder.

Outputs: ChromatinTraceTable() object and output .ecsv formatted file with assembled trace tables.

Usage: trace_combinator [-F ROOTFOLDER] [-P PARAMETERS] [-A LABEL] [-W ACTION]
						   [--saveMatrix] [--ndims] [--method]

	-F ROOTFOLDER, --rootFolder ROOTFOLDER
		Folder with folders2Load.json file
	-P PARAMETERS, --parameters PARAMETERS
		File with parameters. Default: folders2Load.json
	-A LABEL, --labal LABEL
		Name of label for the dataset
	-W ACTION, --action ACTION
		Selects: all, labeled or unlabeled for the datasets. 
	--saveMatrix
		Saves the combined PWD matrix from all datasets. Default: False
	--ndims
		Dimensions of the trace (2 or 3). Default: 3
	--method
		Method or mask ID used for tracing: KDtree, mask, mask0

trace_selector.py#

This scipt loads a trace file and a number of numpy masks, and assings them the labels produced by process_snd_channel.

Usage: trace_selector [-F ROOTFOLDER] [--pixel_size]

	-F ROOTFOLDER, --rootFolder ROOTFOLDER
		Folder with fimages
	--pixel_size
		Lateral pixel size in microns. Default = 0.1

npy_to_tiff#

This script will convert Numpy array files into imageJ-readable TIFs. Images will be rescaled to (0, 2^14) range and will be histogram normalized using skimage.exposure.equalize_adapthist().

You can invoke this in two ways:

  • Use find and send the list of files as arguments:

npy_to_tiff $(find -name "*ch0*_2d_registered.npy")
  • Otherwise you can pipe the results as follows:

ls *ch0*_2d_registered.npy | npy_to_tiff