Fnet setup

Label-free prediction of three-dimensional fluorescence images from transmitted light microscopy, by the Allen Institute

Chang-Min Hsu, Wen-Wei Tseng

This repo is mounted by: PyTorch

Link to paper: http://dx.doi.org/10.1038/s41592-018-0111-2

The Nextjournal PyTorch environment

The Nextjournal runtime used here had a K80 GPU and the CUDA 9.2 runtime.

Python and PyTorch versions are listed below.

import platform, torch
print("Python version: %s.\nPyTorch version: %s." %
      (platform.python_version(),torch.__version__))
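
Because the GPU model and CUDA runtime matter later in this setup, it may help to print them next to the version numbers. The following is a minimal sketch, assuming only the standard `torch.version.cuda` and `torch.cuda` calls; the helper name `describe_runtime` is hypothetical and not part of fnet. It falls back to reporting only the Python version when PyTorch is not installed.

```python
import platform


def describe_runtime():
    """Collect Python, PyTorch, CUDA, and GPU details into a dict.

    Hypothetical helper for this notebook; not part of fnet.
    """
    info = {"python": platform.python_version(), "torch": None,
            "cuda": None, "gpu": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch missing: report the Python version only
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)  # name of the first GPU
    return info


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```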

Cloning the repository

Use the release_1 branch to reproduce the results in the Nature Methods paper.

Not required in the Nextjournal notebook.

cd ~
git clone https://github.com/AllenCellModeling/pytorch_fnet.git -b release_1

Changes I made

Newer GPU models (e.g. Titan V) do not work with older PyTorch versions or the CUDA 9 runtime; a CUDA compile error occurs. I therefore removed the version constraints on both the pytorch and torchvision packages in environment.yml. As a result, the latest PyTorch and CUDA 10 are installed.

My modified environment file:

environment.yml
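
The edit described above (dropping the version constraints on pytorch and torchvision) could also be scripted. This is a rough sketch, not the procedure actually used here: the helper `unpin` is hypothetical, it assumes each dependency sits on its own `- name=version` line, and the sample versions are made up for illustration rather than copied from the repository.

```python
import re


def unpin(yaml_text, packages=("pytorch", "torchvision")):
    """Drop version constraints from the named conda dependencies.

    Hypothetical helper; assumes one `- name=version` entry per line.
    """
    names = "|".join(re.escape(p) for p in packages)
    # Match lines such as "  - pytorch=0.4.0" or "  - torchvision >=0.2",
    # capturing the indentation and package name so they can be kept.
    pattern = re.compile(rf"^([ \t]*-[ \t]*(?:{names}))[ \t]*[=<>!~].*$",
                         re.MULTILINE)
    return pattern.sub(r"\1", yaml_text)


# Made-up sample; the version numbers are illustrative only.
sample = """dependencies:
  - python=3.6
  - pytorch=0.4.0
  - torchvision=0.2.1
"""
print(unpin(sample))
```

Other entries, such as the python line in the sample, are left untouched because only the named packages are matched.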

Creating and activating the fnet environment in Conda

Not required in the Nextjournal notebook. The environment is already set up.

conda env create -f environment.yml
conda activate fnet

Installing fnet packages in the repo

Note: Install the fnet package before downloading the data. Otherwise, pip will overflow the temp directory when it caches the contents of the repository, downloaded data included.

cd /pytorch_fnet  # The folder where the repo resides
pip install .

Fix 'cannot import name imsave'

scipy.misc.imsave() was deprecated and then removed in SciPy 1.2.0, so install the older version instead.

pip install scipy==1.1.0
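
To confirm that the downgrade took effect, the installed SciPy version can be compared against 1.2.0, the release that removed imsave(). A minimal sketch; `has_imsave` is a hypothetical helper name, and the check relies only on the version string rather than on importing imsave itself.

```python
def has_imsave(scipy_version):
    """Return True if this SciPy version still ships scipy.misc.imsave (< 1.2.0)."""
    major, minor = (int(part) for part in scipy_version.split(".")[:2])
    return (major, minor) < (1, 2)


try:
    import scipy
    print(f"SciPy {scipy.__version__}: imsave available = "
          f"{has_imsave(scipy.__version__)}")
except ImportError:
    print("SciPy is not installed in this environment")
```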
# List packages
conda list

Test run

bash /pytorch_fnet/scripts/test_run.sh

Downloading dataset

https://downloads.allencell.org/publication-data/label-free-prediction/index.html

./scripts/paper/download_all_data.sh

This script downloads all the data (do not run it in the Nextjournal notebook).

The data total over 500 GB, so sufficiently large storage is required. I put them on a NAS and mounted it via SMB.

sudo mount -t cifs -o username=sosiristseng,password=********,gid=1000,uid=1000 //<NAS IP address>/lab /home/sosiristseng/lab
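
Before starting the download, it may be worth checking that the target filesystem really has room for the data. A minimal sketch using the standard library's `shutil.disk_usage`; the helper `has_enough_space` is hypothetical, the 500 GB figure from the text is treated as a lower bound, and `"."` is only a stand-in for the actual download folder (for example the mounted NAS path).

```python
import shutil

REQUIRED_BYTES = 500 * 10**9  # "over 500 GB" from the text, used as a lower bound


def has_enough_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


# "." stands in for the download target, e.g. the mounted NAS folder.
print("Enough space:", has_enough_space("."))
```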

Train a model

For example, to train the DNA image model with the first GPU:

./scripts/train_model.sh dna 0

Run predictions with the trained model

./scripts/predict.sh dna 0
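
Training and prediction can be chained so that prediction starts only after training succeeds. The sketch below is an assumption about how one might wrap the two scripts, not something the repository provides: `fnet_commands` and `run_all` are hypothetical names, and the commands must be run from the repository root. By default it only prints the commands (a dry run).

```python
import subprocess


def fnet_commands(dataset, gpu_id=0):
    """Build the train and predict commands for one dataset, in run order."""
    return [
        ["./scripts/train_model.sh", dataset, str(gpu_id)],
        ["./scripts/predict.sh", dataset, str(gpu_id)],
    ]


def run_all(dataset, gpu_id=0, dry_run=True):
    """Run training, then prediction; stop at the first failing step."""
    for command in fnet_commands(dataset, gpu_id):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if the script fails
            subprocess.run(command, check=True)


run_all("dna", 0)  # dry run: only prints the two commands
```

Passing `dry_run=False` would actually launch the scripts, which takes many hours per dataset.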

Benchmarking

  • GTX 1080 Ti: 16 hrs per dataset

  • Titan V: 14.5 hrs (52400 secs) per dataset

Consistent with this benchmark
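
The figures above can be cross-checked with a little arithmetic: converting the Titan V time from seconds to hours and comparing it with the GTX 1080 Ti. This sketch uses only the two numbers listed in the benchmark.

```python
SECONDS_PER_HOUR = 3600

titan_v_seconds = 52400   # from the benchmark list
gtx_1080_ti_hours = 16    # from the benchmark list

titan_v_hours = titan_v_seconds / SECONDS_PER_HOUR
speedup = gtx_1080_ti_hours / titan_v_hours

print(f"Titan V: {titan_v_hours:.2f} hrs per dataset")
print(f"Titan V speedup over GTX 1080 Ti: {speedup:.2f}x")
```

This gives roughly 14.56 hours and a speedup of about 1.10x, so the newer card is around 10% faster for this workload.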

Visualizing CZI files

  1. Download and extract ImageJ

  2. Download the Bio-Formats .jar package and put it into the ImageJ folder

  3. Profit!
