
Source Code


Download source code.

Documentation

Usage on a local machine

Compile and Init
Un-tar the distribution in a local directory, cd to it, start MATLAB, and call init to set up paths and compile to build the MEX functions and compiler runtime scripts. You only need to call compile once, but you must call init every time you start MATLAB.

$ tar xzvf qsfs.v2.0.tgz
$ cd qsfs
$ matlab
>> init
>> compile


Specifying Observed Intensities/Object Mask
All functions assume that "light direction" is a unit vector, and that the intensity image has been normalized by dividing by the albedo x light strength (for real images in the absence of specular highlights, we typically compute this as the 99th percentile intensity value). This means that the functions expect all valid intensity values to be between 0 and 1.
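
For example, a minimal sketch of this normalization (here Iraw, the raw grayscale image stored as doubles, is a hypothetical variable name, and the percentile is computed by sorting rather than with any toolbox function):

>> v = sort(Iraw(:));              % sort raw intensities
>> s = v(ceil(0.99*numel(v)));     % 99th percentile ~ albedo x light strength
>> I = Iraw / s;                   % normalized image; mask and clip as described below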

Intensity values outside the [0,1] range are masked out, but values < 0 and values > 1 are treated differently. Values < 0 should be used to specify an "object mask", i.e., you should set all pixels outside the object in the input image to, say, -1. Values > 1 are treated as missing data. The distinction is that a patch containing values > 1.0 is still processed as a "partial patch" with those pixels ignored, whereas an entire patch is discarded if any of its values is < 0 (since otherwise disjoint parts of a surface could be treated as being connected).

Practically, this means that when dealing with real images, you must first determine the "valid" regions of your data. Intensities at pixels outside the surface should be set to -1.0. If you automatically detect specular highlights or cast shadows (note that we do not do this in our real-image experiments), set those pixels to +2.0. Everything else should be clipped to [0,1], for example by doing:

>> I(outside_object) = -1;
>> I(highlights_or_shadows) = 2;
>> I(valid) = max(0,min(1,I(valid)));
where valid, outside_object, and highlights_or_shadows are logical arrays specifying the different masks.


Computing Local Patch Distributions
Use the getPatches function to compute the local distributions, called as:

>> [nA,D] = getPatches(I,ell,psizes,vars,ncores);
Here, I is the intensity image in the format above, ell is a unit-norm 3-vector corresponding to the (known or assumed) light direction, psizes is a "cell array" of patch sizes (e.g., {3,5,9,17}), vars is a scalar corresponding to the noise variance (we used 1e-4 for the real images in our experiments), and ncores is the number of parallel workers and should be set to the number of cores on the local machine.
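
For example, a typical call for a frontally-lit image on a 4-core machine might look like this (the light direction and core count here are illustrative, not prescribed by the code):

>> ell = [0;0;1];                          % assumed unit-norm light direction
>> psizes = {3,5,9,17};                    % patch sizes
>> vars = 1e-4;                            % noise variance (value used for real images)
>> [nA,D] = getPatches(I,ell,psizes,vars,4);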

This returns two cell arrays nA and D, each with as many elements as there are patch sizes. Each nA{i} contains the shape proposals for all patches of size psizes{i}, and D{i} the corresponding likelihood costs. Specifically, nA{i}(1:5,:,y,x) is a 5x21 matrix containing the 21 coefficient vectors (each a 5-vector) proposed for the (x,y)th patch, and D{i}(y,x,:) holds the associated costs. Here, the (x,y)th patch comprises the set of pixels I(y:y+psizes{i}-1,x:x+psizes{i}-1). Also, some patches may have only 20 valid proposals instead of 21, or no valid proposals at all (if pixels in the patch are masked out); in that case the invalid columns of nA{i} will be all NaN and the corresponding values in D{i} will be Inf.
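
As an illustration of this indexing, the following hypothetical snippet extracts the lowest-cost proposal for the (x,y)th patch at scale i:

>> [c,k] = min(squeeze(D{i}(y,x,:)));  % cost and index of the best proposal
>> a = nA{i}(1:5,k,y,x);               % its 5-dimensional coefficient vector
>> if isinf(c), disp('no valid proposals for this patch'); end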

The coefficients in nA are expressed in a patch-centered and scaled coordinate system. Use the function nxyFromA to obtain the normals from nA{i} as:

>> [nx,ny] = nxyFromA(nA{i}(:,:,y,x),psizes{i});
Here, nx and ny will be psizes{i} x psizes{i} x 21 arrays containing the x and y partial derivatives of -Z(x,y), for each proposed shape, at each location in the (x,y)th patch at scale i.
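
For example, one way to turn these gradients into per-pixel unit normals for a single proposal is sketched below; this assumes the normal is taken proportional to (nx, ny, 1), so the overall sign may need to be flipped depending on your depth convention:

>> k = 1;                                 % pick one of the proposals
>> p = nx(:,:,k); q = ny(:,:,k);          % gradients of -Z for this proposal
>> nrm = sqrt(p.^2 + q.^2 + 1);
>> N = cat(3, p./nrm, q./nrm, 1./nrm);    % psizes{i} x psizes{i} x 3 unit normals (up to sign)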

Finally, we provide the functions loadPatches and savePatches to save and load the variables nA,D,psizes associated with patch distributions to and from files. Unlike MATLAB's save and load functions, these save the data without compression and are usually much faster, especially for large images.

>> savePatches(nA, D, psizes, 'filename.dat');
%% Later
>> [nA, D, psizes] = loadPatches('filename.dat');


Partial Overlap (optional): Note that all of the above code assumes that you want to compute distributions for all overlapping patches of a given patch size, and this is what we use for the reconstruction results in the paper. However, you can consider a smaller number of patches (for example, to reduce computation time) by "skipping" patches. To do this, specify each patch size as a 2-element row vector whose second element is the number of pixels to skip by (1 is the default and corresponds to full overlap). For example, for 50% overlap you might set psizes={[3 1],[5 2],[9 4],[17 8]};. To ensure that every pixel in the image is included in at least one patch, the selected patches may sometimes skip by slightly more or less than the specified value. The function pidx gives the mapping between patch and pixel indices:
>> [yset,xset] = pidx(size(I),psizes{i}(1),psizes{i}(2));
Then, D{i}(y,x,:) will correspond to pixels I(yset(y):yset(y)+psizes{i}(1)-1,xset(x):xset(x)+psizes{i}(1)-1).
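
For example, a small hypothetical snippet that finds all patches at scale i covering a given pixel (py,px):

>> [yset,xset] = pidx(size(I),psizes{i}(1),psizes{i}(2));
>> yp = find(yset <= py & py <= yset + psizes{i}(1) - 1);   % patch rows covering py
>> xp = find(xset <= px & px <= xset + psizes{i}(1) - 1);   % patch columns covering px
The costs of these patches are then D{i}(yp,xp,:).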

Surface Reconstruction
To reconstruct object-scale depth from local patch distributions, call dSolve as:

>> Z = dSolve(lambda0,D,nA,psizes,size(I),dummy0);
The algorithm automatically chooses the optimal parameters lambda and D0 as described in the paper, and the first and last parameters to dSolve are the scale factors for these. Suggested values are 0.25 for lambda0 and 10 for dummy0. D,nA,psizes describe the patch distributions computed above, and size(I) specifies the size of the depth map to be returned.
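
For example, with the suggested scale factors, a full reconstruction followed by a quick (purely illustrative) visualization of the recovered depth map would be:

>> Z = dSolve(0.25,D,nA,psizes,size(I),10);     % lambda0 = 0.25, dummy0 = 10
>> figure; surf(Z,'EdgeColor','none'); axis equal;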

By default, dSolve uses as many computation threads as MATLAB automatically determines to be optimal. You can change this by calling the MATLAB function maxNumCompThreads. In general, the best strategy is to use as many threads as there are physical cores. However, if these cores are spread across multiple processors or sockets, you may see a slow-down due to cache-related overheads, especially for large images and multiple patch sizes. See the section Parallelizing dSolve for a discussion of how to achieve optimal performance in these cases.
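
For example, on a machine whose single socket has 6 physical cores (a hypothetical configuration), you would set, before calling dSolve:

>> maxNumCompThreads(6);   % match the number of physical cores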


Usage on a Cluster

We provide MATLAB functions and bash scripts that allow the computation of patch distributions (i.e., getPatches) to be parallelized over a compute cluster. The scripts are currently set up to automatically interface with an SGE-based cluster created on Amazon EC2 using the Star Cluster toolkit from MIT. If you wish to use these scripts, you should add the scripts/ directory to your path (for example, by adding export PATH=$PATH:/path/to/qsfs/scripts to your ~/.bashrc file).

If you plan on using EC2, please go through the Star Cluster documentation on how to set up a cluster, and e-mail ayanc[at]eecs.harvard.edu for access to "AMIs" that come pre-installed with compiled versions of the code and the MCR libraries required to run them. As detailed in Parallelizing dSolve, these AMIs also contain compiled versions of dSolve and scripts to schedule surface reconstruction jobs through the grid scheduler. If you do use EC2, you can skip the next setup section since the included scripts are already configured for our EC2 AMIs.


Setup for a cluster environment
You can also configure the code to use any compute cluster with a standard scheduler (such as SGE, SLURM, etc.). The first step is compiling the source code for the architecture of the cluster nodes and the version of MATLAB available there. Note that the MATLAB Compiler Runtime (MCR) is sufficient for running the code, but you will need a full MATLAB installation with an active license for compilation.

Copy the source code tgz file to a machine with the same architecture and MATLAB version (for example, a cluster login node), do any setup required to load or activate MATLAB (check the cluster's documentation; you might, for example, need to use the module command), and run the following commands to create the required binaries:

$ tar xzvf qsfs.v2.0.tgz
$ cd qsfs
$ export MATLAB=/path/to/matlab
$ scripts/mkdistro.sh ~/qsfs_bin
Here, /path/to/matlab is the directory in which your local MATLAB distribution is stored; in particular, the script expects the MATLAB binaries to be in $MATLAB/bin. The above commands compile and place all the required binaries in the qsfs_bin folder in your home directory. If required, copy this directory to a location that is accessible to all nodes on the cluster (e.g., through an NFS share).

Next, in the qsfs directory on your local machine, edit the scripts/config.sh and scripts/jbase.sh files based on your cluster environment. In particular, all job script files will be created by appending the appropriate execute command to the jbase.sh file, so you should use this file to define any headers required by the cluster scheduler and to set up environment and path variables. Consider the following sample file:

#!/bin/bash

export MATLAB=/path/to/matlab
export QSFS=$HOME/qsfs_bin
export MCR_CACHE_ROOT=/tmp/$USER.mcr
The job scripts require the MATLAB and QSFS variables to point to your MATLAB distribution and to the location where you copied the binary distribution created above. You should also set MCR_CACHE_ROOT to a location that is NOT on a shared file system and is local to each node. (The MATLAB runtime uses this directory as a cache and expects to be able to "lock" it for temporary exclusive access. This works reliably when different MATLAB processes request locks on the same local machine, but not across a shared filesystem.)

Finally, edit the config.sh file to specify the command (e.g., qsub for SGE, sbatch for SLURM, along with any arguments) to be used on the cluster to submit jobs to the scheduler. See the included config.sh file for an example (that is set up for the SGE scheduler on Star Cluster/EC2).


Computing Local Patch Distributions
Here, we describe how to compute local patch distributions over a cluster (rather than using getPatches). First, create a directory on your local machine to store all data and scripts to be sent to the cluster—say ~/cluster. Now, for each image that you wish to compute local distributions for, call dstPatches as:

>> dstPatches(I,ell,psizes,vars,'~/cluster/image_name');
The first four arguments are identical to the first four arguments of getPatches; the last argument is a base name for the directories where the input data is stored and where the output data will be collected once the jobs run on the cluster. You will find that dstPatches creates multiple directories (with suffixes _pN), one for each patch size. You can call the function multiple times with different values of I or ell to send multiple images to the cluster, as long as each has a unique image_name.
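
For example, a hypothetical loop that prepares several images at once, assuming they are collected in a cell array imgs with corresponding names:

>> names = {'vase','cat','frog'};   % illustrative image names
>> for k = 1:numel(names)
     dstPatches(imgs{k},ell,psizes,vars,['~/cluster/' names{k}]);
   end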

The next step is to specify how you want to distribute these jobs on the cluster. Do this from your OS shell (assuming you have added the scripts directory to your path as described above) using the sgeJob command:

$ cd ~/cluster
$ sgeJob 64 *
This will distribute the computation for each image and each patch size across 64 "jobs" on the cluster. Note that you can instead choose to split different images or different patch sizes across different numbers of jobs:
$ sgeJob 32 *_p3 *_p5
$ sgeJob 64 *_p9 *_p17
or
$ sgeJob 32 image_name1* image_name2*
$ sgeJob 64 image_name3* image_name4*
However, make sure that you do not call sgeJob on the same directory twice! The above process will create individual job scripts in each directory, and also create a submit.sh file in ~/cluster that, when run on the cluster, will submit these scripts to the scheduler.

Next, copy these scripts to the cluster and execute submit.sh. For example,

$ scp -r ~/cluster user@cluster.host.name:
$ ssh user@cluster.host.name
user@cluster:$ cd cluster
user@cluster:$ ./submit.sh
To periodically sync the output files back from the cluster to your local machine and check progress, do the following:
$ cd ~/cluster
$ rsync -avz user@cluster.host.name:cluster/. .
$ sgestat
The sgestat command will tell you how many jobs have completed. When all are done, return to MATLAB and use the cmbPatches function to collate the results:
>> [nA,D] = cmbPatches('~/cluster/image_name',psizes);
Remember to clean out the cluster directory, both on your local machine and on the cluster, before repeating these steps for a new set of images.


Parallelizing dSolve

While dSolve automatically parallelizes its computation using multiple threads on a single multi-core machine, it can also distribute the computation across multiple MATLAB processes, running either on the same local machine or on multiple networked machines. Doing this can lead to a speedup on a local machine that has multiple physical CPUs (rather than, or in addition to, multiple cores on a single CPU). Distributing the computation across multiple machines is advantageous for large images, as long as the machines are connected by a high-speed, low-latency network.

You will need to run all but one of the MATLAB sessions as workers, and the remaining one as the master. If you are starting workers on multiple machines, copy the code distribution to each machine and make sure you have called compile on each of them. Start each worker MATLAB process, run init to set up paths, optionally specify the number of threads/cores you want that process to use, and then run the dSolveWorker command:

>> init
>> maxNumCompThreads(8) % Optional
>> dSolveWorker
Starting server on port 7459
Waiting for a job ....
As shown above, the worker will start and wait until a master process connects to it. It is also possible to start multiple worker processes on the same machine, but they must be distinguished by a port offset. For example, to start a second worker, start a new MATLAB process and perform the same steps as above, replacing the last line with dSolveWorker(1).
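
For example, the second worker on the same machine would be started in a new MATLAB process as:

>> init
>> maxNumCompThreads(8) % Optional, as above
>> dSolveWorker(1)      % port offset 1 distinguishes it from the first worker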

Then, in the master MATLAB process, call dSolve as you normally would, providing an extra argument that specifies the hostnames or IP addresses of the machines on which the worker processes are running, along with port offsets if there are multiple processes per machine. This argument is a cell array of strings of the form 'hostname' or 'hostname:port-offset'. For example,

>> Z = dSolve(lambda0,D,nA,psizes,size(I),dummy0,{'host1','host2','host2:1'});
will distribute computation across the process from which dSolve is called, the worker on host1 and the two workers on host2. For optimal performance, you should ensure that all processes (master and workers) use the same number of threads/cores each.

As mentioned above, when a local machine has more than one physical CPU, it may be advantageous to run dSolve across multiple processes in this manner, since running threads across CPUs often incurs large overheads due to cache contamination. In this scenario, call maxNumCompThreads in each MATLAB process (master and workers) with the number of cores available on a single physical CPU. This will typically ensure that all the threads of each process are scheduled on the same physical CPU. On Linux machines, you can also use taskset when launching MATLAB to explicitly bind each process to a CPU; see the taskset man page for details.