Google’s Lens Blur Feature

Earlier today Google announced that they added a new feature to their android camera application called “Lens Blur”. From their blog post we can see that they take a series of images and solve for a multiview solution for the depth image of the first frame in the sequence. That depth image is then used to refocus the image. It is probably better described that they selectively blur an already fully focused image. This is similar to how they create bokeh in video games, however their blog states that they actually invoke the thin lens equations to get a realistic blur.

I thought this was really cool that they can solve for a depth image on a cellphone in only a few seconds. I also wanted access to that depth image so I can play with it and make 3D renderings! However that will come at another time.

Accessing the Data

Everything the GoogleCamera app needs to refocus an image is contained right inside the JPEG file recorded to /sdcard/DCIM/Camera after you perform a Lens Blur capture. Go ahead and download that IMG of the picture you took. How is this possible? It seems they use Adobe’s XMP format to store a depth image as PNG inside the header of the original JPEG. Unfortunately, I couldn’t figure out how to make usually tools, GDAL, read that.

So instead let’s do a manual method. If you open the JPEG file up in Emacs, right away you’ll see the XMP header which is human readable. Scroll down till you find the GDepth:Data section and copy everything between the quotes to a new file. Now things get weird. For whatever reason, the XMP format embeds binary containing strings of extension definition and a hash periodically through this binary PNG data you just copied. This doesn’t belong in the PNG and libpng will be very upset with you! Using Emacs again you can search this binary data for the http extension definition string and then delete everything between the \377 or OxFF upfront to the 8 bytes after the hash string. Depending on the file length you’ll have to perform this multiple times.

At this point you now have the raw PNG string! Unfortunately it is still in base64, so we need to decode it.

> base64 –D < header > header.png

Viola! You now have a depth image that you can open with any viewer.

Going back to the original XMP in the header of the source JPEG, you can find some interesting details on what these pixels mean like the following:


How did they do this?

There is no way for me to determine this. However looking at GoogleCamera we can see that it refers a which seems to contain the first half of the Lens Blur feature’s native code. Doing a symbol dump gives hints about what stereo algorithm they use. D vtable for vision::optimization::belief_propagation::BinaryCost D vtable for vision::optimization::belief_propagation::GridProblem D vtable for vision::optimization::belief_propagation::LinearTruncatedCost V vtable for vision::sfm::RansacSolver<vision::sfm::EssentialMatrixProblem> V vtable for vision::sfm::RansacSolver<vision::sfm::FundamentalMatrixProblem> D vtable for vision::sfm::StdlibRandom D vtable for vision::image::FixedPointPyramid D vtable for vision::shared::EGLOffscreenContext V vtable for vision::shared::Progress V vtable for vision::shared::GLContext V vtable for vision::stereo::PlaneSweep D vtable for vision::stereo::GPUPlaneSweep D vtable for vision::stereo::PhotoConsistencySAD V vtable for vision::stereo::PhotoConsistencyBase D vtable for vision::stereo::MultithreadPlaneSweep V vtable for vision::features::FeatureDetectorInterface D vtable for vision::features::fast::FastDetector V vtable for vision::tracking::KLTPyramid V vtable for vision::tracking::KLTPyramidFactory D vtable for vision::tracking::KLTFeatureDetector D vtable for vision::tracking::GaussianPyramidFactory D vtable for vision::tracking::LaplacianPyramidFactory

So it appears they use a KLT system that is fed to an SfM model to solve for camera extrinsics and intrinsics. Possibly they solve for the fundamental between the first sequential images, decompose out the intrinsics and then solve for the essential matrix on the remainder of the sequence. After that point, they use a GPU accelerated plane sweep stereo algorithm that apparently has some belief propagation smoothing. That’s interesting stereo choice given how old plane sweeping is and the lack of popularity in the Middlebury tests. However you can’t doubt the speed. Very cool!

Noproj’n LRO Imagery

Earlier this year I found out that I got my first proposal funded. I’ve had directed funding before thanks to NASA’s cyrosphere program. I’ve also been a Co-I on numerous other funded proposals with co-workers and friends at USGS. However my LASER proposal to perform Mass DEM production from LROC-NA imagery was something I wrote as PI and was competed. Now that it is accepted, I have two years to follow through and I’d like to share the whole process here on the blog.

The first problem I’m going to write about has bugged me for a while. That problem is that each LROC-NA observation is actually 2 files and makes using ASP awkward. In the past I’ve been telling people to run perform 4 runs with stereo. Do the combinations of LE-LE, LE-RE, RE-LE, and RE-RE. However UofA had the same problem with HiRISE, which comes down in 20 different files. They had worked out an ISIS process chain that would noproj those files together among other things to make a single observation. I don’t know why such step doesn’t exist for LROC-NA but today I’ll show you what I’ve come up with.

If you try right now to noproj the RE cub file to the LE cube file you’ll find that the program errors out because an ideal camera model for LROC has not been defined. Definitions for ideal noproj cameras for all ISIS cameras are defined in a file at $ISIS3DATA/base/applications/ noprojInstruments003.pvl. Looking at that file we see that there are 5 elements that can be defined for the ideal camera model, TransY, ItransS, TransX, ItransL, and DetectorSamples. DetectorSamples is easy; it’s what the output image width should be in the final image.  The Trans* variables are measured in millimeters and define a focal plane offset from the original camera model we are using. I plan to noproj with the match file being the LE file. Thus Trans* references a shift from the original position of the LE sensors. The Itrans* variables are pixel conversion of the millimeter measurements. It’s used by ISIS to run the math the other direction. If Trans* and Itrans* variable don’t match, funny errors will occur where the CCDs noproj correctly but the triangulation in ASP is completely bonkers. Relating the two is easy, just use the pixel pitch. For LROC-NA that is 7 micrometers per pixel.

So how do we decide what to set the TransY and TransX values to be? If those values are left to zero, the LE image will be centered on the new image and the RE will be butted up beside but falling off the edge of the new image. A good initial guess would be to set TransY to be a shift half the CCD width. A better idea I thought to use was to look at the FK kernel and identify the angle differences between the two cameras and then use the instantaneous field of view measurement to convert to pixel offset between the two CCD origins. Use pixel pitch to convert to millimeters and then divide by two will be the shift that we want. Below are the final noproj settings I used for LROC.

    TransY = 16.8833
    ItransS = -2411.9
    TransX = 0.6475
    ItransL = -92.5
    DetectorSamples = 10000

At this point we can noproj the imagery and then handmos them together. A naïve version would look something like the following.

> noproj from=originalRE.cub to=RE.cub match=originalLE.cub
> noproj from=originalLE.cub to=LE.cub match=origianlLE.cub
> cp LE.cub LERE_mosaic.cub
> handmos from=RE.cub mosaic=LERE_mosaic.cub

Alas, the LE and RE imagery does not perfectly align. In the HiRISE processing world we would use hijitreg to determine a mean pixel offset. There is no generic form of hijitreg that we can use for LROC-NA. There is the coreg application, but in all my tests this program failed to find any correct match points between the images. I’ve tried two other successful methods. I’ve manually measured the offset using Qtie. Warning: Sending this images to Qtie requires twiddling with how serial numbers are made for ideal cameras. This is something I should document at some point as it would allow bundle adjusting ideal cameras like fully formed HiRISE and LROC images. My other solution was the example correlate program in Vision Workbench.  I did the following to make processing quick (5 minutes run time).

> crop from=LE.cub to=LE.centerline.cub sample=4900 nsamples=200
> crop from=RE.cub to=RE.centerline.cub sample=4900 nsamples=200
> parallel isis2std from={} to={.}.tif format=tiff ::: *centerline.cub
> correlate --h-corr-min -60 --h-corr-max 0 --v-corr-min 0 --v-corr-max 60 LE.centerline.tif RE.centerline.tif

That creates a disparity TIF file. The average of the valid pixels (third channel is 1) can then be used to solve for the mean translation. That translation can then be used during handmos by using the outsample and outline options. Ideally this would all be one program like hijitreg but that is for another time. Below is the final result.

Hijitreg actually does more than just solve for the translation between CCDs on HiRISE. It correlates a line of pixels between the CCDs in hope of determining the jitter of the spacecraft. I can do the same!

From the above plots, it doesn’t look like there is much jitter or the state of the data is not in form that I could determine. The only interesting bit here is that the offset in vertical direction changes with the line number. I think this might be disparity due to terrain. The imagery I used for my testing was M123514622 and M123521405, which happened to be focused on the wall of Slipher crater. The NA cameras are angled 0.106 degrees off from each other in the vertical direction. Ideal stereo geometry would 30 degrees or 15 degrees, but 0.106 degrees should allow some disparity given the massive elevation change into the crater. I wouldn’t recommend using this for 3D reconstruction but it would explain the vertical offset signal. The horizontal signal has less amplitude but does seem like it might be seeing spacecraft jitter. However it looks aliased, impossible to determine what the frequency is.

Anyways, making a noproj version of the LROC-NA observation is a huge boon for processing with Ames Stereo Pipeline. Using the options of affine epipolar alignment, no map projection, simple parabola subpixel, and it is possible to make a beautiful DEM/DTM in 2.5 hours. 70% of that time was just during triangulation because ISIS is single threaded. That would be faster with the application parallel_stereo (renamed from stereo_mpi in ASP 2.2.2).

Automatic Production of HiRISE DEMs 2

I’m still heating my apartment with HiRISE processing. For details how I’m creating this, please look at this previous post. Let’s look at the recent results!

ESP_011265_1560 and ESP_011331_1560, requested by timparker. This stereo pair was a nightmare. D_sub said part of the south part of the image could be correlated even though only one of the images saw that part. Then in full resolution integer correlation, the algorithms went into an aggressive search for a match that didn’t exist. The end result meant that stereo took 15 days to complete! Yikes!

ESP_011287_2165 and ESP_011564_2165, requested by mcewen. Stereo finished in 5 hours.

ESP_011290_1800 and ESP_011778_1800, requested by mcewen. Stereo finished in 12 hours. It looks like stereo mis-guessed the correlation parameters for the crater chain, thus the long processing time.

ESP_011293_1710 and ESP_019218_1710, requested by cweitz. Stereo finished in 5 days! This might be because there is a lot of disparity happening this image. This is one of those problems where local homography transforms or epipolar alignment would make a major contribution in speed.

ESP_011310_1395 and ESP_011811_1395, requested by jwray. Stereo finished in 2 days. I’m disappointed that the mesa’s edges didn’t come out.

ESP_011314_1585 and ESP_011595_1585, requested by mcewen. Interest point detection failed in preprocessing. ASP just gave up and never completed anything. It is not clear to me why this didn’t work.

ESP_011319_2100 and ESP_011464_2100, requested by ltornabene. Stereo finished in 2 hours.

ESP_011339_1665 and ESP_011972_1665, requested by ltornabene. Stereo finished in 9 hours.

ESP_011350_0945 and ESP_011351_0945, requested by cjhansen. Stereo finished in 2 hours.

ESP_011365_1365 and ESP_011642_1365, requested by alxla. Stereo finished in 11 hours.

ESP_011371_1730 and ESP_011516_1730, requested by rbeyer. Stereo finished in 15 hours.

ESP_011372_1730 and ESP_012295_1730, requested by mcewen. Stereo finished in 9 hours.

Creating a registered LRO-NAC DEM without GCPs

I’ve previously written a tutorial on performing a bundle adjustment, jigsaw in ISIS lingo, using CTX. Let’s try something a little bit more complex with LRO-NAC. If you are not previously aware, LRO-NAC is actually two separate optical assemblies that are joined by the spacecraft bus. Ideally this spacecraft hardware would be solid as a rock and wouldn’t bend. Unfortunately that’s not the case, so what would have been a bundle adjustment between 2 cameras is now 4 in the case of LRO-NAC.

LRO-NAC Data Preparation

Picking a stereo pair has always been the most difficult part of the job for me. I honestly don’t know how to find stereo pairs. So instead I’m just going to recreate something ASU did in SOCET SET. If you are not aware, ASU has processed about 100 DEMs with LRO-NAC. I’m going to recreate their Moore F Crater DEM using M125713813 and M125720601. You can download the raw data via PDS.

I picked this stereo pair blindly and it ended up being extremely difficult to register correctly. In order to help out, I added a fifth image to the bundle adjustment to give another orbit’s worth of spacecraft ephemeris into the measurements. That image was M110383422LE and I picked it because it overlapped 2 of my previous images and had the same lighting conditions as my previous images.

Once you have downloaded all the images, prep the files in ISIS with the following:

parallel "lronac2isis from={} to={.}.cub; lronaccal from={.}.cub to={.}.cal.cub; spiceinit from={.}.cal.cub; rm {.}.cub" ::: *IMG

* The ‘parallel’ command is GNU Parallel, a non-standard system utility that I love dearly.

Attaching a custom Elevation Source

From a previous article I wrote about, we saw that map projecting against the WAC Global DTM gave much better results than working against the sparse LOLA that is the default for map projecting by in ISIS. So I’m going to do that now. Once we have a successful jigsaw result, then we’ll map-project. If you forgot how to attach the WAC Global DTM, here are the commands I used. My image pair was at 37.3N and 185E. So the tile WAC_GLD100_E300N2250_100M.IMG from this link was what I needed.

pds2isis from=WAC_GLD100_E300N2250_100M.IMG to=WAC_GLD100_E300N2250_100M.cub
algebra from=WAC_GLD100_E300N2250_100M.cub to=WAC_GLD100_E300N2250_100M.lvl1.cub operator=unary A=1 C=1737400
demprep from=WAC_GLD100_E300N2250_100M.lvl1.cub to=WAC_GLD100_E300N2250_100M.lvl2.cub
rm WAC_GLD100_E300N2250_100M.cub WAC_GLD100_E300N2250_100M.lvl1.cub

parallel spiceinit from={} shape=USER model=$PATH_TO/WAC_GLD100_E300N2250_100M.lvl2.cub ::: *cal.cub

Making the control network

Making the control network is pretty straightforward and is just time consuming. I performed the commands I previously outlined here on this blog.


Group = PolygonSeederAlgorithm
      Name = Grid
      MinimumThickness = 0.01
      MinimumArea = 1
      XSpacing = 600
      YSpacing = 1000


Object = AutoRegistration
   Group = Algorithm
     Name         = MaximumCorrelation
     Tolerance    = 0.7

   Group = PatternChip
     Samples = 19
     Lines   = 19
     MinimumZScore = 1.5
     ValidPercent = 80

   Group = SearchChip
     Samples = 120
     Lines   = 100


parallel footprintinit from={} ::: *cal.cub
echo *.cub | xargs -n1 echo > cube.lis
findimageoverlaps from=cube.lis overlaplist=overlap.lis
autoseed fromlist=cube.lis overlaplist=overlap.lis deffile=autoseed.def networkid=lronac pointid=???? description=moon
pointreg fromlist=cube.lis deffile=autoRegTemplate.def
qnet (editting and fixing

I had to spend a lot of time in Qnet for this stereo pair. About half of my automatic matches didn’t work. (Possibly my search range was too small?) In order to correct this, I had to then manually edit the output control network in Qnet. I then filter down all the control points to the ones marked ‘Ignored’. Then I just manually align the measurements. You just need to get the measures close to the right spot by clicking on the same feature. Afterwards, you can then click the ‘Register’ button to get subpixel accurate alignment.

Autoseed and AutoReg also have a bad habit of only matching LE images to LE, and RE images to RE. It is a good idea to run along the edge of an LE image and try to find common points that are seen in both RE images. These are going to be the few control points that have more than 2 control measures. I also had to do this between my 5th image and the other 2 images that it had overlap with. This problem was likely because the overlap between these 3 images didn’t meet the requirement of MinimumThickness in Autoseed.def.

Normally at this point I would then recommend to you to start picking out ground control points. I tried that. Then I went a little insane. I’m better now. Instead we’re going to do crazy stuff in the next section.

The reason we’re not using GCPs, is that the only source to align against is the WAC Image mosaic. I tried this and I could easily get a bundle adjustment result that aligned well with the image mosaic. However the output DEM would always show a massive error between my height results and the results of LOLA and the WAC Global DTM. After adding the 5th image and clamping the trust of my GCP to 50 meters in radius, I got a result that matched LOLA and WAC DTM in an okay manner but didn’t really align at all to the WAC image mosaic. Things made sense once I compared the image mosaic to a hillshade of WAC DTM.

WAC’s Image mosaic doesn’t agree with WAC’s Global DTM! Ghaaaa! Madness! I was so incredibly disappointed. I didn’t know what to do at this point, so I just dropped the ground control points.

Bundle Adjusting

Alrighty then, time to start bundle adjusting with jigsaw. Likely on the first run things are going to blow up because bad measurements still exist in the control network. My solution to this was to first have jigsaw solve for only a few camera parameters and then check out the output control network for features that had the highest error. In Qnet you can filter control measures that have errors higher than a certain amount of pixels. After I fixed those measurements manually, I resave the control network as my input network and then redo the jigsaw.

Each time I redo jigsaw, I let jigsaw solve for more parameters. My pattern was:

jigsaw fromlist=cube.lis radius=no twist=no cnet= update=no
jigsaw fromlist=cube.lis radius=no twist=yes cnet= observations=no update=no
jigsaw fromlist=cube.lis radius=yes twist=yes cnet= observations=no update=no
jigsaw fromlist=cube.lis radius=yes twist=yes cnet= observations=no update=no camsolve=velocities

After iterating many times, I eventually produced a control network and jigsaw result that had a sigma0 of .34 pixels. However, if we’re to write this camera solution to file and then make a DEM, we would see that I have worse alignment with LOLA than if we hadn’t bundle adjusted at all.

So how can we get a good bundle adjustment result that aligns well with LOLA? One idea would be to pick GCPs directly against LOLA using their gridded data product. I know folks in the USGS who have done exactly this. However I don’t even see how they can determine correct alignment, so I’m going to ‘nix that idea. Another idea would be to have Jigsaw not solve for radius (radius=no), but instead use only the radius values provided by LOLA. This will keep problem in the ball park, but the resulting jigsaw solution will have a sigma0 around 15 pixels! Not solving for radius essentially ties our hands behind our back in the case of LRO-NAC when our radius source is so incredibly low res.

What we need instead is to simply constrain / rubber-band our Control Points to have radius values that are similar to LOLA. Unfortunately, how do we determine what correct Lat Long to sample the LOLA radius for each control measure? My solution was to not define that. I did two things: (1) I changed all my control measures from ‘free’ to ‘constrained’ with a sigma of 200,200,50. This can be performed with cneteditor. (2) I then wrote and ran the following bash script.

#/bin/env bash

cp $input_control

for i in {0..100}; do
    echo Iteration $i
    cnetnewradii cnet= onet= model= $radius_source getlatlon= adjusted
    jigsaw fromlist=cube.lis radius=yes twist=yes cnet=  onet= update=yes spsolve= position camsolve= velocities


Your mind should be blown at this point. I’ve essentially written an iterative closest point algorithm using the ISIS utilities jigsaw and cnetnewradii. This process is only able to work because we have a perfect control network that we vetted with the previous jigsaw runs. This also works because our LOLA data here is not flat, and has plenty of texture. Finally, LRO-NAC was close to the correct solution. Imagine images with horrible attitude data, like Apollo, would just blow up with this technique.

On each iteration, the sigma0 shouldn’t change. Or if it changes, it should be extremely small. That’s because a translation or scale change of the control points and camera location has no effect on the reprojection error. What should be reducing on each iteration is the standard deviation of the radius correction for each control point as listed in bundleout.txt. The sparse 3D model of the problem in Jigsaw is slowly fitting itself to the LOLA DEM but constrained only by the camera geometry. The standard deviation of the radius correction will never go to zero and will eventually plateau. This is because there will always be some error against LOLA since large portions of the measurements are interpolations between orbits.

Disclaimer: I’ve only tried this technique for this LRO-NAC stereo pair. I don’t know if it will work for others. However this was the only technique I could come up with that would produce results that best aligned with LOLA. I couldn’t get anything good to come out of using WAC Image Mosaic GCPs.

Map projecting for Stereo

Now that we have an updated ephemeris, we’re ready to perform a map projection to speed up stereo. Be sure to not accidentally spiceinit your files, which would wipe your updated ephemeris. Notice that I’m map projecting everything at a low resolution. This is because I’m in a hurry for this demo. In a production environment for LRO-NAC, we would likely map project at 1 MPP or 0.5 MPP. This step and ASP’s triangulation are usually the longest part of this entire process.

parallel cam2map pixres=mpp resolution=5 from={} to={/.}.map.cub ::: ../*cal.cub

Running ASP

Nothing special here, I’m just processing the LE-LE, RE-RE, and LE-RE combination. RE-LE stereo pair wouldn’t resolve anything at 5 m/px. Afterwards, I gdalbuildvrt the outputs together. The massive lines separating the individual DEMs that make up this stereo pair may concern you. However that effect will diminish if I had map projected my inputs at a higher resolution. The reason being that, correlation is much like convolution, in that we erode the input image by a half kernel size along the image border. That border gets perceivably smaller if the image is of higher resolution.

parallel -n 2 stereo M125713813{1} M125720601{2} {1}{2}/{1}{2} ::: LE LE RE RE LE RE
parallel -n2 point2dem -r moon --t_srs \"+proj=eqc +lat_ts=37 +lat_0=0 +lon_0=180 +x_0=0 +y_0=0 +a=1737400 +b=1737400 +units=m +no_defs\" --tr 5 --orthoimage {1}{2}/{1}{2}-L.tif {1}{2}/{1}{2}-PC.tif --nodata -32767 ::: LE LE RE RE LE RE
gdalbuildvrt aspba_drg.vrt */*-DRG.tif
gdalbuildvrt aspba_dem.vrt */*-DEM.tif
parallel gdal_translate -co TILED=yes -co COMPRESS=LZW -of GTiff {} {.}.tif ::: *.vrt

Conclusion and Comparison

Now the moment you’ve been waiting for. The point where I have to prove I’m not crazy. I have three versions of this stereo pair. There is the result produced by ASU using SOCET SET and then there’s ASP with and without Bundle Adjustment. The look of ASP’s result will be a little shoddy because I map projected my input at 5 m/px which prevent me from get the most detail out. It did however save me a lot processing time while I experimented this method out.

Here are the hillshades for those 3 datasets overlaid a hillshade of the LRO-WAC Global DTM which is 100 m/px. There’s not much to tell, but you can at least see that everyone seems to have decent horizontal alignment.

Here are the difference maps between those 3 datasets and the LRO-WAC Global DTM.

Here are the difference maps between those 3 datasets and the LOLA Gridded Data Product, which I resampled to 100 m/px. I think this shows straight forward the improvement bundle adjustment made to the output DEM.

Just for giggles, I also did a difference between the ASU DEM and the bundle adjusted ASP DEM. They’re pretty close to each other. There seems to just be a simple tilt between the two or that the ASP DEM is just shifted to a slightly lower latitude. Who’s more correct is anyone’s guess for now.


Ideally at this point I would then compare the ASU DEM and the BA’d ASP DEM against the raw LOLA shot points. This is where curiosity got me and where I move on to the next personal project. Maybe someone else would be interested in a more rigorous review.

Also, I’m not sure if the 5th image is required.