Johnny Lee, my boss a couple links up, recently showed off at Google I/O a reel of recent research improvements on Tango. Possibly not as exciting as the other stuff in the video, but at the 1:19 mark you’ll see some research by me and my coworkers to see how well Tango works on a car. As you can see it works really well, and we even drove it 8 km in downtown San Francisco through tourist-infested areas. Surprisingly or not, neither the masses of people nor the sun managed to blind our tracking.
How did we do it? Well, we took a Tango phone and quick-clamped it to my car. Seriously. Here’s a picture of a Lenovo Phab 2 Pro and an Asus ZenFone AR attached to my car and me in my driving glasses. We ran the current release of Tango and did motion tracking only and it just worked! … As long as you shut off the safeties that reset tracking once you exceed a certain velocity. Unfortunately, users outside of Google can’t access this ability, as in a way these velocity restrictions are our own COCOM limits.
Also, this is something really only achievable with the new commercial phones. The original Tango Development Kit didn’t include good IMU intrinsics calibration. For the newly produced cellphones, the factory solves for IMU scale, bias, and misalignment. At the end of each factory line, a worker places the phone in a robot for a calibration dance. Having this calibration is required for getting the low drift rates. Remember our IMUs are cheap 50 cent things and have a lot of wonkiness that the filter needs to sort out.
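To make the scale, bias, and misalignment terms a bit more concrete, here is a minimal sketch of a common linear IMU intrinsics model. This is not Tango's calibration code. The function name `correct_imu` and every number below are made up for illustration, and a real calibration would likely include more terms (temperature dependence, for example).

```python
import numpy as np

def correct_imu(raw, scale, misalignment, bias):
    """Apply a simple linear intrinsics model to one 3-axis IMU sample.

    corrected = misalignment @ diag(scale) @ (raw - bias)

    raw          : length-3 raw accelerometer or gyro reading
    scale        : length-3 per-axis scale factors
    misalignment : 3x3 matrix that maps the (slightly skewed) sensor axes
                   onto an orthogonal frame
    bias         : length-3 constant offset of the sensor
    """
    raw = np.asarray(raw, dtype=float)
    return misalignment @ np.diag(scale) @ (raw - bias)

# Hypothetical calibration values, chosen only for this example.
scale = np.array([1.02, 0.98, 1.01])
bias = np.array([0.05, -0.03, 0.10])
misalignment = np.array([[1.0, 0.002, -0.001],
                         [0.0, 1.0, 0.003],
                         [0.0, 0.0, 1.0]])

raw_sample = np.array([0.06, -0.02, 9.80])
print(correct_imu(raw_sample, scale, misalignment, bias))
```

The point is only that an uncorrected scale or bias error feeds straight into the integrated position, which is why the factory calibration matters so much for drift.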
July will be my final month working at NASA Ames Research Center. It has been a great run working under the ill-defined job title of “Geospatial Software Architect”. I’ve gotten a chance to work as a Software Dev, a Principal Investigator, and most recently as a Flight Software Lead on a new robot for the ISS called Astrobee. I’m leaving this comfortable blanket I’ve had within NASA for 6 years for a chance to do computer vision at Google for Project Tango. Something I find exciting and terrifying, but I view this as an opportunity to learn ever more.
Despite this new opportunity I feel like I’m leaving my baby, Ames Stereo Pipeline (ASP). Honestly, though, I’ve been doing less and less with ASP for a while now. Oleg has been lead developer for quite some time. Under his guidance the software has gotten more features that everyone wants and more people have started using it for Earth Science. Clearly Oleg is doing a great job! Recently another coworker, Scott, has started improving ASP as well. With those two on the job, I feel like ASP will continue to grow.
I’m extremely proud that a community has developed around ASP. I’m also grateful to APL UofW and the PGC for putting faith into the software. Their time spent using ASP, requesting changes, and offering solutions to bugs has made ASP a worthwhile product. I’m sad I won’t get to be involved anymore or at least hear about what new applications scientists have thought up. My time developing ASP was wonderful and was perfect for honing my skills. I hope others can do the same through using it and understanding how it works.
Thank you ASP users and the Intelligent Robotics Group. It was fun!
I finally got around to checking my mail at work and found some LatLon notebooks! They’re all island themed with topo maps on their covers. The one pictured is my favorite since it features Mars and the elevation data was from CTX data processed by ASP. Yay!
Thank you Aitor Garcia. I’m now the engineer with the best stationery at NASA ARC.
During my internship at NASA in 2009, I helped produce an elevation model and image mosaic from Orbit 33 of Apollo 15. This mosaic was later burned into Google Earth’s Moon mode. Earlier this week it appears people have found an image of a man walking in the region of the Moon I stitched together. Here are links to articles about this supposed extraterrestrial at The Nation, News.com.au, AOL.com, and Examiner.com. Thank you to LROC’s Jeff Plescia for bringing this to my attention.
I quickly traced out that this section of the image mosaic comes from AS15-M-1151. This is a metric camera image from Apollo 15 that was scanned into digital form sometime in 2008 by ASU. What is shown in Google Earth is a reprojection of the image onto a DEM created by Ames Stereo Pipeline using said image. The whole strip of images was then mosaicked together using ASP’s geoblend utility. So this man could have been created by an error in ASP’s projection code. Below is the man in the moon from the raw unprojected form of the Apollo Metric image. Little man perfectly intact.
Unfortunately if you look at the next image in the film reel, AS15-M-1152, the man is gone. This is true also for 1153 and 1154. After that, the Apollo command module was no longer overlooking the area. The metric camera takes a picture roughly every 30 seconds, so maybe the guy (who must be like 100 meters tall) just hightailed it.
These images come from film that had been in storage for 40 years. They were lightly dusted and then scanned. Unfortunately a lot of lint and hair still made it into the scans that we used for the mosaic. So much so, that Ara Nefian at IRG developed the Bayes EM correlator for ASP to work around those artifacts. Thus, this little Man in the image was very likely some hair or dust on the film. In fact if you search around the little man in image 1151 (in the top left corner of the image, just off an extension of a ray from the big crater) you’ll find a few more pieces of lint. Those lint pieces are also visible in Google Moon. However, it is still pretty awesome to find out others have developed a conspiracy theory on your own work. Hopefully it won’t turn into weird house calls like it did for friends of mine over the whole hidden nuclear base on Mars idea.
Update: You can find the Bad Astronomer’s own debunking of this man here. The cool bit is he tried to find the artifact in LRO and LO imagery. He then links to a forum where someone identifies that the dust was actually in the optics of the camera or in the scanner bed. So the man and other pieces of lint can be seen at roughly the same pixel location in consecutive frames.
Ames Stereo Pipeline’s (ASP) current integer correlator leaves a bit to be desired. Currently it does poorly in scenes with aggressively changing slopes. It is also a coin flip whether it finishes in an hour or several days. So I’ve been researching a new correlator by reading, implementing, and applying to satellite imagery a select few of the top performers from the Middlebury stereo competition. I started this a long time ago with PatchMatch and I never gave a good conclusion. Now I will summarize my experiences and give a short introduction to the current solution I’m pursuing.
Algorithm Shoot Out!
Semi Global Matching:  This is a world-recognized, well-performing stereo algorithm. I don’t need to sing its praises. The cons in my opinion are that it uses a lot of memory and that it is only applicable to 1-D searching. For ASP we like to have a 2-D searching solution, or optical flow, to handle flaws in the user’s input data and because some users have actually used us for the creation of velocity maps. We might have been able to get around the inaccuracies in our users’ data and the horrors of linescan cameras by calculating a local epipolar vector for each pixel after a bundle adjustment. But I believe we wouldn’t catch the vertical CCD shifts and jitter seen in HiRISE and WorldView satellites. As for the memory problem, there have been derivative SGM algorithms to fix this problem, but I didn’t evaluate them.
PatchMatch:  I really love the idea of starting with a uniform noise guess for the disparity and then propagating lowest cost scores to the neighbors. There were a couple downsides to this algorithm for satellite processing. 1. The cost metric of absolute differencing intensities and gradients performed much worse than an NCC cost metric in the arctic. 2. The run time was horrible because each pixel evaluation didn’t reuse previous comparisons made by neighboring pixels. 3. Their slanted window needed to be adapted to support slants in the vertical direction as well as the horizontal for our optical flow demands. I couldn’t find a formulation that would stop the algorithm from cheating by defining the window as 90 degrees from the image geometry. In other words, the PatchMatch algorithm kept finding out that the correlation score was minimal if you define the kernel as having no area.
Despite all of this, a plain jane implementation of PatchMatch using NCC and non-slanted windows performs the same as a brute force dense evaluation of a template window across all disparity values. This also means that in places where brute force search fails, so would PatchMatch. But for extremely large search ranges, PatchMatch might be worth its processing time. I will keep this in the back of my mind forever.
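For readers who haven't seen it written out, here is a minimal sketch of what I mean by a brute force dense evaluation with an NCC cost over a 2-D search range. It is not ASP's correlator. The function names `ncc` and `brute_force_disparity` and the window and search sizes are just placeholders for this example. One property worth noting is that NCC ignores linear gain and offset differences between the two patches, which may be part of why it held up better than absolute differences for me.

```python
import numpy as np

def ncc(a, b, eps=1e-12):
    """Normalized cross-correlation of two equal-sized patches, in [-1, 1]."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # Flat patches have no texture to correlate, so score them as 0.
    return float((a * b).sum() / denom) if denom > eps else 0.0

def brute_force_disparity(left, right, row, col, half, search):
    """Find the integer (dx, dy) that best matches one left-image window.

    The window is (2*half+1) square and centered at (row, col) in `left`.
    Every offset in [-search, search] in both x and y is scored with NCC.
    """
    tmpl = left[row - half:row + half + 1, col - half:col + half + 1]
    best_score, best = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = row + dy, col + dx
            # Skip candidates whose window would fall off the right image.
            if (r - half < 0 or c - half < 0 or
                    r + half >= right.shape[0] or c + half >= right.shape[1]):
                continue
            cand = right[r - half:r + half + 1, c - half:c + half + 1]
            score = ncc(tmpl, cand)
            if score > best_score:
                best_score, best = score, (dx, dy)
    return best, best_score

# Synthetic check: the right image is the left image shifted by
# 2 rows and 3 columns, with a made-up gain and offset applied.
rng = np.random.default_rng(0)
left = rng.random((40, 40))
right = 2.0 * np.roll(left, shift=(2, 3), axis=(0, 1)) + 5.0
print(brute_force_disparity(left, right, row=20, col=20, half=3, search=5))
```

The cost of this is proportional to the search area for every pixel, which is exactly why it stops being practical for large search ranges.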
PatchMatch with Huber Regularization:  This is a neat idea that is built on top of Steinbruecker and Thomas Pock’s “Large Displacement Optical Flow Computation without Warping”. (Seriously though, Thomas Pock hit a gold mine with “let’s apply a regularity term to everything in computer vision and show an improvement.”) I eventually learned how to implement primal dual convex optimization using Handa’s guide. I realize now that everything I need to know is in Heise’s paper, but it took me a long time to understand that. But I never implemented exactly what the paper described. They wanted a smoothness constraint applied to both the disparity and the normal vector used to define the correlation kernel. Since I couldn’t define a slanted correlation kernel that worked both in horizontal and vertical directions as seen in PatchMatch, I just dropped this feature. Meaning I only implemented a smoothness constraint with the disparity. Implementing this becomes a parameter tuning hell. I could sometimes get this algorithm to produce a reasonable looking disparity. But if I ran it for a few more iterations, it would then proceed to turn slopes into constant disparity values until it hit a color gradient in the input image. So it became a very difficult question for me: at what point in the iterations do I get a good result? How do I know if this pretty result is actually a valid measurement and not something the smoothness constraint glued together because it managed to outweigh the correlation metric?
In the image I provided above, you can see a slight clustering or stair-casing of the disparity as the smoothness constraint wants disparity values to match their neighbors. Also, random noise spikes would appear, and neither the total variation (TV) term nor the data term would remove them. They are stable minima. I wonder if a TV-L1 smoothness term would be better than a TV-Huber.
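As a reminder of how the Huber and L1 penalties differ, here is a small sketch of the standard Huber function applied to a disparity gradient magnitude. The function name `huber` and the threshold `eps` are my own labels for this example, not the notation of any particular paper.

```python
import numpy as np

def huber(x, eps):
    """Standard Huber penalty: quadratic for |x| <= eps, linear beyond.

    As eps shrinks toward zero this approaches the plain L1 penalty |x|.
    """
    ax = np.abs(np.asarray(x, dtype=float))
    return np.where(ax <= eps, ax * ax / (2.0 * eps), ax - eps / 2.0)

def l1(x):
    """Plain absolute-value (L1) penalty."""
    return np.abs(np.asarray(x, dtype=float))

# Small gradients are penalized far less by Huber than by L1,
# while large jumps cost nearly the same under both.
grads = np.array([0.01, 0.1, 1.0, 10.0])
print(huber(grads, eps=0.5))
print(l1(grads))
```

So the two only disagree for gradients smaller than `eps`. Whether swapping one for the other would change the spikes I saw is something I haven't tested.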
As Rigid As Possible Stereo under Second Order Smoothness Priors:  This paper again repeats the idea seen in PatchMatch Huber regularization of having a data term, a regularization term, and a theta that, with increasing iterations, forces the two terms to converge. What I thought was interesting here was their data term. Instead of matching templates between the images for each pixel, they break the image into quadratic surfaces and then refine the quadratic surfaces. This is incredibly fast to evaluate, even when using a derivative-free Nelder Mead simplex algorithm. Like several orders of magnitude faster. Unfortunately this algorithm has several cons again. 1. They wanted to use the cost metric seen in PatchMatch that again doesn’t work for the satellite imagery of the arctic that I have been evaluating. 2. The data term is incredibly sensitive to its initial seed. If you can’t find a surface that is close to the correct result, the Nelder Mead algorithm will walk away. 3. This algorithm with a smoothness prior is again a parameter tuning hell. I’m not sure that what I tune up for my images will work equally well for the planetary scientists as well as the polar scientists.
Fast Cost-Volume Filtering for Visual Correspondence and Beyond:  This is an improvement on the KAIST paper about Adaptive Support Weights. (Hooray KAIST! Send us more of your grad students.) They say hey, this is actually a bilateral filter that Yoon is talking about. They had also recently read a paper about performing a fast approximation of the bilateral filter by using a ‘guided’ filter. In the end this is similar to a brute force search except now there is fancy per pixel weighting for each kernel based on image color. This algorithm is easy to implement but fails to scale to larger search regions just like brute force search. Yes, this can be applied in a pyramidal fashion, but I think in the next section that I’ve hit on a better algorithm. I wouldn’t count this algorithm out altogether though. I think it has benefit as a refinement algorithm to the disparity, specifically in cases of urban environments with hard disparity transitions.
What am I pursuing now?
Our users have long known that they could get better results in ASP by first map projecting their input imagery on a prior DEM source like SRTM or MOLA. This reduces the search range. But it also warps the input imagery so that from the perspective of the correlator, the imagery doesn’t have slopes anymore. The downside is that this requires a lot of work on the behalf of the user. They must run a bunch more commands and must also find a prior elevation source. This prior elevation source may or may not correctly register with their new satellite imagery.
My coworker Oleg hit upon an idea of instead using a lower resolution disparity, smoothing it, and then using that disparity to warp the right image to the left before running the final correlation. It’s like map projecting, except without the maps, camera models, and prior existing elevation source. I’ve been playing with it and made a pyramidal version of this idea. Each layer of the pyramid takes the previous disparity, smooths it, and then warps the right image to the left. Here is an example of a disparity produced with this method up against the current ASP correlator’s result. I have a rough single-threaded prototype and an in-progress parallel variant I’m working on.
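Here is a minimal sketch of one step of that loop: take a coarser level's disparity, upsample and smooth it, then warp the right image toward the left. This is not the ASP implementation. The helper names (`box_smooth`, `upsample_disparity`, `warp_right_to_left`), the box filter, and the bilinear resampling are all assumptions made for illustration, and the real smoothing is more of a black art than a box filter.

```python
import numpy as np

def box_smooth(disp, radius):
    """Smooth a disparity map with a (2*radius+1) square box filter."""
    k = 2 * radius + 1
    h, w = disp.shape
    padded = np.pad(disp, radius, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def upsample_disparity(disp):
    """Double the resolution of a disparity map.

    Disparities are in pixels, so the values double along with the grid.
    """
    return 2.0 * np.repeat(np.repeat(disp, 2, axis=0), 2, axis=1)

def warp_right_to_left(right, disp_x, disp_y):
    """Bilinearly resample so warped[r, c] ~ right[r + disp_y, c + disp_x]."""
    h, w = right.shape
    rows, cols = np.mgrid[0:h, 0:w].astype(float)
    # Clamp sample positions to the image so edges repeat instead of wrapping.
    y = np.clip(rows + disp_y, 0, h - 1)
    x = np.clip(cols + disp_x, 0, w - 1)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    fy, fx = y - y0, x - x0
    top = right[y0, x0] * (1 - fx) + right[y0, x0 + 1] * fx
    bot = right[y0 + 1, x0] * (1 - fx) + right[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bot * fy

# Synthetic check: right is left shifted by 2 rows and 3 columns, and the
# coarse (half resolution) disparity is half of that shift.
rng = np.random.default_rng(0)
left = rng.random((32, 32))
right = np.roll(left, shift=(2, 3), axis=(0, 1))
coarse_dx = np.full((16, 16), 1.5)
coarse_dy = np.full((16, 16), 1.0)

dx = box_smooth(upsample_disparity(coarse_dx), radius=2)
dy = box_smooth(upsample_disparity(coarse_dy), radius=2)
warped = warp_right_to_left(right, dx, dy)

# After warping, the residual between the images should be tiny away from
# the border, so the next correlation only needs a small search window.
print(np.abs(warped[:-2, :-3] - left[:-2, :-3]).max())
```

After the warp, the correlator at each level only has to find the small residual disparity, which then gets added back onto the smoothed disparity that did the warping.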
Looks pretty good right? There are some blemishes still that I hope to correct. Surprisingly the parallel implementation of this iterated warping correlator is 2x faster than our current pyramid correlator. Another surprising feature is that the runtime for this mapping algorithm is mostly constant despite the image content. For consecutive pyramid levels, we’ll always be searching a fixed square region, whereas the original ASP pyramid correlator will need to continually adapt to terrain it sees. Once I finish tuning this new algorithm I’ll write another post on exactly why this is the case. There is also a bit of a black art for smoothing the disparity that is used for remapping the right image.
I’m pretty excited again about finding a better correlator for ASP. I still have concerns about how this iterative mapping algorithm will handle occlusions. I also found out that our idea is not completely new. My friend Randy Sargent has been peddling this idea for a while. He even implemented it for the Microscopic Imager (MI) on board the Mars Exploration Rovers. I didn’t even know that software existed! But they used homography matrices, while our ‘new’ idea is using a continuous function. In the end, I hope some of you find my diving into stereo research papers useful. I learned about a lot of cool ideas. Unfortunately very few of them scale to satellite imagery.
 Hirschmuller, Heiko. “Accurate and efficient stereo processing by semi-global matching and mutual information.” Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 2. IEEE, 2005.
 Bleyer, Michael, Christoph Rhemann, and Carsten Rother. “PatchMatch Stereo-Stereo Matching with Slanted Support Windows.” BMVC. Vol. 11. 2011.
 Heise, Philipp, et al. “PM-Huber: PatchMatch with Huber Regularization for Stereo Matching.” Computer Vision (ICCV), 2013 IEEE International Conference on. IEEE, 2013.
 Steinbrucker, Frank, Thomas Pock, and Daniel Cremers. “Large displacement optical flow computation without warping.” Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009.
 Handa, Ankur, et al. Applications of Legendre-Fenchel transformation to computer vision problems. Vol. 45. Tech. Rep. DTR11-7, Department of Computing at Imperial College London, 2011.
 Zhang, Chi, et al. “As-Rigid-As-Possible Stereo under Second Order Smoothness Priors.” Computer Vision–ECCV 2014. Springer International Publishing, 2014. 112-126.
 Rhemann, Christoph, et al. “Fast cost-volume filtering for visual correspondence and beyond.” Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
 Yoon, Kuk-Jin, and In So Kweon. “Adaptive support-weight approach for correspondence search.” IEEE Transactions on Pattern Analysis and Machine Intelligence 28.4 (2006): 650-656.
 Sargent, Randy, et al. “The Ames MER microscopic imager toolkit.” Aerospace Conference, 2005 IEEE. IEEE, 2005.