We made it as a joke on the Tonight Show! Original link.
I didn’t realize Ames had a Youtube account until Reddit was sharing a hybrid engine firing test video.
Publicity video for our Zero G testing of SPHERE + Tango.
Earlier today Google announced that they added a new feature to their android camera application called “Lens Blur”. From their blog post we can see that they take a series of images and solve for a multiview solution for the depth image of the first frame in the sequence. That depth image is then used to refocus the image. It is probably better described that they selectively blur an already fully focused image. This is similar to how they create bokeh in video games, however their blog states that they actually invoke the thin lens equations to get a realistic blur.
I thought this was really cool that they can solve for a depth image on a cellphone in only a few seconds. I also wanted access to that depth image so I can play with it and make 3D renderings! However that will come at another time.
Accessing the Data
Everything the GoogleCamera app needs to refocus an image is contained right inside the JPEG file recorded to /sdcard/DCIM/Camera after you perform a Lens Blur capture. Go ahead and download that IMG of the picture you took. How is this possible? It seems they use Adobe’s XMP format to store a depth image as PNG inside the header of the original JPEG. Unfortunately, I couldn’t figure out how to make usually tools, GDAL, read that.
So instead let’s do a manual method. If you open the JPEG file up in Emacs, right away you’ll see the XMP header which is human readable. Scroll down till you find the GDepth:Data section and copy everything between the quotes to a new file. Now things get weird. For whatever reason, the XMP format embeds binary containing strings of extension definition and a hash periodically through this binary PNG data you just copied. This doesn’t belong in the PNG and libpng will be very upset with you! Using Emacs again you can search this binary data for the http extension definition string and then delete everything between the \377 or OxFF upfront to the 8 bytes after the hash string. Depending on the file length you’ll have to perform this multiple times.
At this point you now have the raw PNG string! Unfortunately it is still in base64, so we need to decode it.
> base64 –D < header > header.png
Viola! You now have a depth image that you can open with any viewer.
Going back to the original XMP in the header of the source JPEG, you can find some interesting details on what these pixels mean like the following:
How did they do this?
There is no way for me to determine this. However looking at GoogleCamera we can see that it refers a librefocus.so which seems to contain the first half of the Lens Blur feature’s native code. Doing a symbol dump gives hints about what stereo algorithm they use.
librefocus.so:00221b68 D vtable for vision::optimization::belief_propagation::BinaryCost
librefocus.so:00221b88 D vtable for vision::optimization::belief_propagation::GridProblem
librefocus.so:00221bc0 D vtable for vision::optimization::belief_propagation::LinearTruncatedCost
librefocus.so:00221ca0 V vtable for vision::sfm::RansacSolver<vision::sfm::EssentialMatrixProblem>
librefocus.so:00221cb8 V vtable for vision::sfm::RansacSolver<vision::sfm::FundamentalMatrixProblem>
librefocus.so:00221c78 D vtable for vision::sfm::StdlibRandom
librefocus.so:00221b58 D vtable for vision::image::FixedPointPyramid
librefocus.so:00221c00 D vtable for vision::shared::EGLOffscreenContext
librefocus.so:00221800 V vtable for vision::shared::Progress
librefocus.so:00221be0 V vtable for vision::shared::GLContext
librefocus.so:00221cd0 V vtable for vision::stereo::PlaneSweep
librefocus.so:00221ce8 D vtable for vision::stereo::GPUPlaneSweep
librefocus.so:00221d20 D vtable for vision::stereo::PhotoConsistencySAD
librefocus.so:00221d00 V vtable for vision::stereo::PhotoConsistencyBase
librefocus.so:00221d60 D vtable for vision::stereo::MultithreadPlaneSweep
librefocus.so:00221ae8 V vtable for vision::features::FeatureDetectorInterface
librefocus.so:00221b00 D vtable for vision::features::fast::FastDetector
librefocus.so:00221d78 V vtable for vision::tracking::KLTPyramid
librefocus.so:00221a90 V vtable for vision::tracking::KLTPyramidFactory
librefocus.so:00221dd0 D vtable for vision::tracking::KLTFeatureDetector
librefocus.so:00221da0 D vtable for vision::tracking::GaussianPyramidFactory
librefocus.so:00221db8 D vtable for vision::tracking::LaplacianPyramidFactory
So it appears they use a KLT system that is fed to an SfM model to solve for camera extrinsics and intrinsics. Possibly they solve for the fundamental between the first sequential images, decompose out the intrinsics and then solve for the essential matrix on the remainder of the sequence. After that point, they use a GPU accelerated plane sweep stereo algorithm that apparently has some belief propagation smoothing. That’s interesting stereo choice given how old plane sweeping is and the lack of popularity in the Middlebury tests. However you can’t doubt the speed. Very cool!
However I’ve been really busy working with Google’s project Tango. I encourage you to watch the video if you haven’t already.
What is NASA doing with project Tango? Well currently there is a very vague article available here. However the plan is to apply Tango to the SPHERES project to perform visual navigation. Lately, I’ve been overwhelmed with trying to meet the schedule of 0-g testing and all the hoops there are with getting hardware and software onboard the ISS. This has left very little time to write, let alone sleep. In a few weeks NASA export control will have gone over our collected data and I’ll be able to share here.
In the short term, project Tango represent an amazing opportunity to perform local mapping. The current hardware has little application to the large-scale satellite mapping that I usually discuss. However I think the ideas present in project Tango will have application in low-cost UAV mapping. Something David Shean of U of W has been pursuing. In the more immediate term I think the Tango hardware would have application to scientists wanting to perform local surveys of a glacial wall, cave, or anything you can walk all over. It’s ability to export its observations as a 3D model makes it perfect for sharing with others and perform long-term temporal studies. Yes the 3D sensor won’t work outside, however stereo observations and post processing with things like Photoscan are still possible with the daylight imagery. Tango will then be reduced to providing an amazing 6-DOF measurement of where each picture was taken. If this sounds interesting to you, I encourage you to apply for a prototype device! I’d be interested in helping you tackle a scientific objective with project Tango.
This picture is of Mark and I dealing with our preflight jitters of being onboard the “Vomit Comet” while 0-g testing the space-rated version of Project Tango. This shares my current state of mind. Also, there aren’t enough pictures of my ugly mug on this blog. I’m the guy on the right.
I do other things beside develop Ames Stereo Pipeline. I actually have to this month because my projects’ budgets are being used to pay for other developers. This is a good thing because it gets dug in developers like me out of the way for a while so that new ideas can come in. One of the projects I occasionally get to help out on is HET Spheres.
This is a picture of that robot. The orange thingy is the SPHERE robot designed by MIT. The blue puck is an air carriage so we can do frictionless testing in 1G. There have been 3 SPHERES robots onboard the ISS for quite some time now and they’ve been hugely successful. However we wanted to have an upgrade of the processing power available on the SPHERES. We also wanted better wireless networking, cameras, additional sensors, and a display to interact with the Astronauts. While our manager, lord, and savior Mark listed off all these requirements, we attentively played angry birds. That’s when it suddenly became clear that all we ever wanted was already available in our palms. We’ll use cellphones! So, though crude, we glued a cellphone to the SPHERE and called it a day.
Actually a lot more work happened then that and you can hear about that in Mark’s Google Tech Talk. I also wasn’t involved in any of that work. I tend to do other stuff that is SPHERES development related. But I spent all last week essentially auditing the console side code and the internal SPHERE GSP code. I remembered why I don’t like Java and Eclipse. (I have to type slower so Eclipse will autocomplete. :/) This all collimated into the following video of a test of having the SPHERE fly around a stuffed robot. We ran out of CO2 and our PD gains for orientation control are still out-of-whack, but it worked!
Congrats to Cosmogia and the PhoneSat team for hitching a ride on the first Antares flight into space!