Goodbye and Thanks for all the Fish!

July will be my final month working at NASA Ames Research Center. It has been a great run working under the ill-defined job title of “Geospatial Software Architect”. I’ve gotten the chance to work as a Software Dev, a Principal Investigator, and most recently as the Flight Software Lead on a new robot for the ISS called Astrobee. I’m leaving this comfortable blanket I’ve had within NASA for 6 years for a chance to do computer vision at Google on Project Tango. I find this both exciting and terrifying, but I view it as an opportunity to learn ever more.

Despite this new opportunity, I feel like I’m leaving my baby, Ames Stereo Pipeline (ASP). Honestly, though, I’ve been doing less and less with ASP for a while now. Oleg has been lead developer for quite some time. Under his guidance the software has gained more of the features everyone wants, and more people have started using it for Earth science. Clearly Oleg is doing a great job! Recently another coworker, Scott, has started improving ASP as well. With those two on the job, I feel like ASP will continue to grow.

I’m extremely proud that a community has developed around ASP. I’m also grateful to APL UofW and the PGC for putting faith in the software. Their time spent using ASP, requesting changes, and offering solutions to bugs has made ASP a worthwhile product. I’m sad that I won’t get to be involved anymore, or at least won’t hear about the new applications scientists think up. My time developing ASP was wonderful and perfect for honing my skills. I hope others can do the same by using it and understanding how it works.

Thank you ASP users and the Intelligent Robotics Group. It was fun!

Man on the Moon

During my internship at NASA in 2009, I helped produce an elevation model and image mosaic from Orbit 33 of Apollo 15. This mosaic was later burned into Google Earth’s Moon mode. Earlier this week, it appears, people found an image of a man walking in the region of the Moon I stitched together. Here are links to articles about this supposed extraterrestrial at The Nation, News.com.au, AOL.com, and Examiner.com. Thank you to LROC’s Jeff Plescia for bringing this to my attention.

I quickly traced out that this section of the image mosaic comes from AS15-M-1151. This is a metric camera image from Apollo 15 that was scanned into digital form sometime in 2008 by ASU. What is shown in Google Earth is a reprojection of the image onto a DEM created by Ames Stereo Pipeline using said image. The whole strip of images was then mosaicked together using ASP’s geoblend utility. So this man could have been created by an error in ASP’s projection code. Below is the man in the Moon from the raw, unprojected form of the Apollo Metric image. Little man perfectly intact.

Unfortunately, if you look at the next image in the film reel, AS15-M-1152, the man is gone. This is also true for 1153 and 1154. After that, the Apollo command module was no longer overlooking the area. The metric camera takes a picture roughly every 30 seconds, so maybe the guy (who must be something like 100 meters tall) just high-tailed it.

These images come from film that had been in storage for 40 years. The film was lightly dusted and then scanned. Unfortunately, a lot of lint and hair still made it into the scans that we used for the mosaic. So much so that Ara Nefian at IRG developed the Bayes EM correlator for ASP to work around those artifacts. Thus, this little man in the image was very likely some hair or dust on the film. In fact, if you search around the little man in image 1151 (in the top left corner of the image, just off an extension of a ray from the big crater) you’ll find a few more pieces of lint. Those lint pieces are also visible in Google Moon. However, it is still pretty awesome to find out that others have developed a conspiracy theory around your own work. Hopefully it won’t turn into weird house calls like it did for friends of mine over the whole hidden-nuclear-base-on-Mars idea.

Update: You can find the Bad Astronomer’s own debunking of this man here. The cool bit is that he tried to find the artifact in LRO and LO imagery. He then links to a forum where someone identifies that the dust was actually in the optics of the camera or on the scanner bed. So the man and other pieces of lint can be seen at roughly the same pixel location in consecutive frames.

Hah, it was even covered on SGU.

Man, I’m late to debunking my own work. 🙁

Google’s Lens Blur Feature

Earlier today Google announced that they added a new feature to their Android camera application called “Lens Blur”. From their blog post we can see that they take a series of images and solve a multiview reconstruction for the depth image of the first frame in the sequence. That depth image is then used to refocus the image. It is probably better described as selectively blurring an already fully focused image. This is similar to how bokeh is faked in video games; however, their blog states that they actually invoke the thin-lens equations to get a realistic blur.
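Their post doesn’t give the exact formula, but the thin-lens model they allude to is easy to sketch. Below is a toy Python function, with parameter names of my own invention rather than anything from the app, that computes the circle-of-confusion diameter for an object at a given depth when the lens is focused somewhere else; that diameter is essentially the size of the blur kernel you would apply per pixel.

# Toy sketch of depth-dependent blur via the thin-lens circle of confusion.
# Not Google's code; parameter names and values are illustrative only.
import numpy as np

def coc_diameter(depth, focus_dist, focal_length, aperture):
    # Blur-circle diameter (same units as focal_length) for an object at
    # `depth` when the lens is focused at `focus_dist`.
    return aperture * np.abs(depth - focus_dist) / depth \
           * focal_length / (focus_dist - focal_length)

# Example: a 4 mm f/2 phone lens focused at 0.5 m.
f, N = 4e-3, 2.0
print(coc_diameter(np.array([0.3, 0.5, 5.0]), 0.5, f, f / N))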

I thought it was really cool that they can solve for a depth image on a cell phone in only a few seconds. I also wanted access to that depth image so I could play with it and make 3D renderings! However, that will come another time.

Accessing the Data

Everything the GoogleCamera app needs to refocus an image is contained right inside the JPEG file recorded to /sdcard/DCIM/Camera after you perform a Lens Blur capture. Go ahead and download that IMG file of the picture you took. How is this possible? It seems they use Adobe’s XMP format to store a depth image as a PNG inside the header of the original JPEG. Unfortunately, I couldn’t figure out how to make my usual tool, GDAL, read it.

So instead let’s do a manual method. If you open the JPEG file up in Emacs, right away you’ll see the XMP header, which is human-readable. Scroll down till you find the GDepth:Data section and copy everything between the quotes to a new file. Now things get weird. For whatever reason, the extended XMP format periodically embeds binary markers, containing an extension-definition string and a hash, throughout the base64 PNG data you just copied. These don’t belong in the PNG, and libpng will be very upset with you! Using Emacs again, you can search this binary data for the http extension-definition string and then delete everything from the \377 (0xFF) byte up front to the 8 bytes after the hash string. Depending on the file length, you’ll have to do this multiple times.

At this point you have the raw PNG string! Unfortunately it is still base64-encoded, so we need to decode it.

> base64 -D < header > header.png

Voilà! You now have a depth image that you can open with any viewer.
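If you’d rather not do that surgery by hand in Emacs, the same idea can be scripted. Here is a rough Python sketch that strips the extended-XMP segment headers (0xFF 0xE1, a two-byte length, the http signature string, a 32-byte hash, and 8 bytes of length/offset, which matches what I saw in my own files but should be treated as an assumption rather than a spec) and then base64-decodes the GDepth:Data payload.

# Rough sketch: pull the GDepth:Data PNG out of a Lens Blur JPEG.
# The segment layout below is what I observed in my own captures
# (extended XMP split across APP1 segments); treat it as an assumption.
import base64, re, sys

SIG = b"http://ns.adobe.com/xmp/extension/\x00"

def extract_depth_png(jpeg_path, out_path):
    data = open(jpeg_path, "rb").read()
    # Drop every APP1 marker header: 0xFF 0xE1, 2-byte length, the
    # signature string, a 32-byte hash, then 4+4 bytes of length/offset.
    marker = re.compile(b"\xff\xe1..%s.{40}" % re.escape(SIG), re.DOTALL)
    stitched = marker.sub(b"", data)
    # The depth image is a base64 PNG in the GDepth:Data attribute.
    m = re.search(b'GDepth:Data="([^"]+)"', stitched)
    if m is None:
        raise SystemExit("no GDepth:Data found")
    open(out_path, "wb").write(base64.b64decode(m.group(1)))

if __name__ == "__main__":
    extract_depth_png(sys.argv[1], sys.argv[2])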

Going back to the original XMP in the header of the source JPEG, you can find some interesting details about what these pixels mean, like the following:

GFocus:BlurAtInfinity="0.036530018"
GFocus:FocalDistance="20.225327"
GFocus:FocalPointX="0.59722227"
GFocus:FocalPointY="0.5756945"
GImage:Mime="image/jpeg"
GDepth:Format="RangeInverse"
GDepth:Near="12.516363143920898"
GDepth:Far="47.468021392822266"
GDepth:Mime="image/png"
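Given GDepth:Format="RangeInverse" and the near and far planes above, the 8-bit values in the depth PNG can be mapped back to metric depth. Here is a small Python sketch of that conversion as I understand Google’s depth-map metadata format; double-check it against their documentation before trusting the numbers.

# Map the 8-bit depth PNG back to metric depth, assuming the RangeInverse
# encoding. Near/Far values are from my capture's XMP shown above.
import numpy as np
from PIL import Image

near, far = 12.516363143920898, 47.468021392822266
dn = np.asarray(Image.open("header.png").convert("L"), dtype=np.float64) / 255.0
depth = (far * near) / (far - dn * (far - near))   # dn = 0 -> near, dn = 1 -> far
print(depth.min(), depth.max())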

How did they do this?

There is no way for me to determine this for certain. However, looking at GoogleCamera we can see that it refers to a librefocus.so, which seems to contain the first half of the Lens Blur feature’s native code. Doing a symbol dump gives hints about what stereo algorithm they use.

librefocus.so:00221b68 D vtable for vision::optimization::belief_propagation::BinaryCost
librefocus.so:00221b88 D vtable for vision::optimization::belief_propagation::GridProblem
librefocus.so:00221bc0 D vtable for vision::optimization::belief_propagation::LinearTruncatedCost
librefocus.so:00221ca0 V vtable for vision::sfm::RansacSolver<vision::sfm::EssentialMatrixProblem>
librefocus.so:00221cb8 V vtable for vision::sfm::RansacSolver<vision::sfm::FundamentalMatrixProblem>
librefocus.so:00221c78 D vtable for vision::sfm::StdlibRandom
librefocus.so:00221b58 D vtable for vision::image::FixedPointPyramid
librefocus.so:00221c00 D vtable for vision::shared::EGLOffscreenContext
librefocus.so:00221800 V vtable for vision::shared::Progress
librefocus.so:00221be0 V vtable for vision::shared::GLContext
librefocus.so:00221cd0 V vtable for vision::stereo::PlaneSweep
librefocus.so:00221ce8 D vtable for vision::stereo::GPUPlaneSweep
librefocus.so:00221d20 D vtable for vision::stereo::PhotoConsistencySAD
librefocus.so:00221d00 V vtable for vision::stereo::PhotoConsistencyBase
librefocus.so:00221d60 D vtable for vision::stereo::MultithreadPlaneSweep
librefocus.so:00221ae8 V vtable for vision::features::FeatureDetectorInterface
librefocus.so:00221b00 D vtable for vision::features::fast::FastDetector
librefocus.so:00221d78 V vtable for vision::tracking::KLTPyramid
librefocus.so:00221a90 V vtable for vision::tracking::KLTPyramidFactory
librefocus.so:00221dd0 D vtable for vision::tracking::KLTFeatureDetector
librefocus.so:00221da0 D vtable for vision::tracking::GaussianPyramidFactory
librefocus.so:00221db8 D vtable for vision::tracking::LaplacianPyramidFactory

So it appears they use a KLT tracker that is fed into an SfM solver to recover the camera extrinsics and intrinsics. Possibly they solve for the fundamental matrix between the first pair of images, decompose out the intrinsics, and then solve for the essential matrix on the remainder of the sequence. After that point, they use a GPU-accelerated plane-sweep stereo algorithm that apparently has some belief-propagation smoothing. That’s an interesting stereo choice given how old plane sweeping is and its lack of popularity in the Middlebury tests. However, you can’t doubt the speed. Very cool!
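To make the plane-sweep part concrete, here is a toy fronto-parallel plane sweep in Python with OpenCV. This is emphatically not the librefocus.so implementation: it assumes grayscale images, known intrinsics K, and a known relative pose (R, t) taking reference-frame points into the second view, and it simply keeps the per-pixel SAD winner across depth hypotheses with none of the belief-propagation smoothing hinted at above.

# Toy fronto-parallel plane-sweep stereo -- not Google's implementation.
# Assumes grayscale images, known intrinsics K, and a known relative pose
# (R, t) from the reference view to the other view.
import numpy as np
import cv2

def plane_sweep(ref, other, K, R, t, depths, window=7):
    h, w = ref.shape
    n = np.array([0.0, 0.0, 1.0])              # fronto-parallel plane normal
    Kinv = np.linalg.inv(K)
    best_cost = np.full((h, w), np.inf, dtype=np.float32)
    best_depth = np.zeros((h, w), dtype=np.float32)
    for d in depths:
        # Homography induced by the plane at depth d, mapping reference
        # pixels into the other view.
        H = K @ (R - np.outer(t, n) / d) @ Kinv
        warped = cv2.warpPerspective(
            other, H, (w, h), flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
        # Window-averaged SAD photo-consistency (cf. PhotoConsistencySAD).
        diff = np.abs(ref.astype(np.float32) - warped.astype(np.float32))
        cost = cv2.boxFilter(diff, -1, (window, window))
        better = cost < best_cost
        best_cost[better] = cost[better]
        best_depth[better] = d
    return best_depth

A real system would smooth the cost volume (hence the belief_propagation symbols) and do the warping on the GPU, but the core loop is just warp, score, and keep the best hypothesis.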

I’m not Dead

However, I’ve been really busy working with Google’s Project Tango. I encourage you to watch the video if you haven’t already.

What is NASA doing with Project Tango? Well, currently there is a very vague article available here. However, the plan is to apply Tango to the SPHERES project to perform visual navigation. Lately, I’ve been overwhelmed with trying to meet the schedule of 0-g testing and all the hoops involved in getting hardware and software onboard the ISS. This has left very little time to write, let alone sleep. In a few weeks NASA export control will have gone over our collected data and I’ll be able to share it here.

In the short term, Project Tango represents an amazing opportunity to perform local mapping. The current hardware has little application to the large-scale satellite mapping that I usually discuss. However, I think the ideas present in Project Tango will have applications in low-cost UAV mapping, something David Shean of U of W has been pursuing. In the more immediate term, I think the Tango hardware would be useful to scientists wanting to perform local surveys of a glacial wall, a cave, or anything you can walk all over. Its ability to export its observations as a 3D model makes it perfect for sharing with others and performing long-term temporal studies. Yes, the 3D sensor won’t work outside; however, stereo observations and post-processing with tools like Photoscan are still possible with the daylight imagery. Tango is then reduced to providing an amazing 6-DOF measurement of where each picture was taken. If this sounds interesting to you, I encourage you to apply for a prototype device! I’d be interested in helping you tackle a scientific objective with Project Tango.

This picture is of Mark and me dealing with our preflight jitters before going aboard the “Vomit Comet” for 0-g testing of the space-rated version of Project Tango. It captures my current state of mind. Also, there aren’t enough pictures of my ugly mug on this blog. I’m the guy on the right.