Mass CTX Processing

A few weeks back, Ross Beyer presented my blog posts on autonomous HiRISE DEM processing at the HiRISE team meeting in Tucson, AZ. This raised the question of whether the same could be done for CTX. Of course! ASP can do bulk processing for any of the missions it supports, on Earth or any place for which ISIS and CSPICE have a defined coordinate system. Just build some run-time safeties into the processing scripts, because ASP still occasionally goes mad and eats a whole bunch of processing time for no output. (We're working on it!)

Processing CTX stereo pairs, however, is a little more difficult than the HiRISE processing I was doing before. HiRISE lists all of its stereo pairs on its website and on Dr. Shane Byrne’s website; there’s no equivalent for CTX. Luckily for me, some folks at UofA wrote PairendipityCTX (Chris Schaller?). They provided Ross and me a detailed report of overlapping files and other statistics. I cut out everything but the filenames for my own use, and you can get a copy of the list with its 1,542 stereo pairs here.

Another difference is how these two missions store their data on the PDS. I can look at a HiRISE filename and work out its download path in the PDS with no trouble. CTX, on the other hand, seems to have arbitrary volume breaks, so the download URL is not predictable. My solution is a bad one, but a quick one: I wrote a Python script that scraped the PDS servers and recorded every CTX image they hold along with its URL. I then just ‘grep’ for the URL in the processing scripts. Here’s the resulting text file that lists all 50,708 CTX images that existed at the time of my scraping. This is a mean trick, because my script can make HTTP requests much faster than a human can; in a sense, I’m DoS’ing the PDS servers. So please copy my results rather than my methods.
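
The lookup itself is trivial. Here’s a minimal sketch of that ‘grep’ step in Python; the lookup filename and the sample image ID are placeholders, and it assumes the scraped list holds one full download URL per line, ending in the image filename:

    # Find the PDS download URL for a CTX image by searching the scraped
    # URL list. Assumes one full URL per line, ending in the filename.
    def find_ctx_url(image_name, lookup_file='ctx_url_list.txt'):
        with open(lookup_file) as urls:
            for line in urls:              # same job as 'grep image_name'
                if image_name in line:
                    return line.strip()
        return None                        # image wasn't in the scrape

    print(find_ctx_url('P01_001472_1747_XI_05S146W.IMG'))  # example ID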

Processing scripts

Previously, with my autonomous HiRISE processing efforts, I just wrote a Bash script and then spawned it multiple times on my server using GNU Parallel. That’s great, but looking around my office I saw a lot of unused computers that I’d like to do my bidding. What I wanted was a job management system like PBS, which is what they use on the supercomputer, but PBS is a little too heavyweight and sometimes costs money. So I instead found two alternative projects that seemed to fit the bill: Gearman and Celery. Gearman is the minimalist of the two; Celery requires a database backend and multiple open ports for each slave worker. I decided to use Gearman since it seemed simpler to learn.

My code for this little project is available on GitHub in my Mars3DGearman project. I hope to eventually add HiRISE support. Here’s how it plays out: all machines make a working directory that contains the CTX stereo list, the CTX URL lookup list, and the folders DEM and DRG. One machine is designated the server; in my case, my workstation at home. It starts the only instance of ctx_processor.py with the help of a backing ‘gearmand’ executable (the Gearman daemon). All of the slaves then SSH back to my server and forward port 4730, the port Gearman uses for communication. Then each slave machine can start one or more instances of ctx_worker.py.
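
In python-gearman terms, the division of labor looks roughly like the sketch below. This is not the actual Mars3DGearman code; the task name, payload format, and filenames are stand-ins for illustration:

    # On the server: ctx_processor.py, roughly. Queue one job per pair.
    import gearman

    client = gearman.GearmanClient(['localhost:4730'])
    for pair in open('ctx_stereo_pairs.lis'):   # placeholder filename
        # Fire and forget; the workers write their results to disk.
        client.submit_job('ctx_stereo', pair.strip(), background=True)

    # On each slave: ctx_worker.py, roughly. The slave first tunnels with
    # something like 'ssh -N -L 4730:localhost:4730 my-server', so that
    # localhost:4730 here reaches the gearmand on my workstation.
    def process_pair(worker, job):
        left, right = job.data.split()
        # ... fetch the .IMG files, run ASP's stereo and point2dem ...
        return 'done'

    worker = gearman.GearmanWorker(['localhost:4730'])
    worker.register_task('ctx_stereo', process_pair)
    worker.work()                               # blocks, serving jobs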

The part I haven’t worked out is how to send the completed results home to my main server. The ctx_worker script produces a DEM and an orthophoto and then just dumps them locally into the DEM and DRG folders. Gearman does allow sending binary strings back to the main server, but I’m betting a 100 MB string would break something. I chose a different route: since I already have all the slaves SSH’ing back to my main server, I decided to simply rsync all the slaves’ DEM and DRG folders back home. I don’t have to re-enter my password, as I’ve enabled SSH ControlMaster, which reuses previous connections. For now, I just put that rsync in a watch command that fires every 2 hours; alternatively, it could live inside the ctx_worker script. A better bet would be to use SSH keys.
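
The effect is the same as wrapping two rsync calls in ‘watch -n 7200’ on each slave. Here’s a minimal Python equivalent, with the server name and remote paths as placeholders:

    # Runs on each slave: push finished products home every two hours.
    # 'my-server' and the remote paths are placeholders; thanks to SSH
    # ControlMaster, rsync reuses the connection the tunnel already opened.
    import subprocess
    import time

    while True:
        subprocess.call(['rsync', '-av', 'DEM/', 'my-server:work/DEM/'])
        subprocess.call(['rsync', '-av', 'DRG/', 'my-server:work/DRG/'])
        time.sleep(2 * 60 * 60)           # same cadence as watch -n 7200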

Another worthwhile improvement over my HiRISE processing scripts is the inclusion of a timeout for each step. When it comes to CTX, if the correlation doesn’t finish in 2 hours, I’m not interested. This timeout is achieved through the run_cmd and process_timeout functions in the ctx_worker script. The Internet helped me a lot in figuring out how to make that a reality. Thanks, Internet!
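
The gist of those two functions is just “launch the step, poll it, kill it past the deadline.” A self-contained sketch of that pattern (not the exact ctx_worker code):

    # Run a command, but give up if it exceeds its time budget.
    import subprocess
    import time

    def run_cmd(cmd, timeout_sec):
        proc = subprocess.Popen(cmd)
        deadline = time.time() + timeout_sec
        while proc.poll() is None:        # child still running?
            if time.time() > deadline:
                proc.kill()               # abandon this step
                raise RuntimeError('%s timed out' % cmd[0])
            time.sleep(10)                # poll every 10 seconds
        return proc.returncode

    # e.g. walk away from a correlation that runs past 2 hours
    run_cmd(['stereo', 'left.cub', 'right.cub', 'results/out'], 2 * 60 * 60)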

Results

These last few days I’ve roped 4 machines into doing my bidding. So far they’ve produced ~260 DEMs; 5 new DEMs completed just while I was writing this article. There are still some hiccups in the process, but when the stars align, I seem to produce over 50 new DEMs every day. I’m not going to show you all of them, as that felt like a lot of work to post to the blog. Instead I’m just going to show off a couple of screenshots of some interesting places in Google Mars. The color ramp is a little funky because someday I need to learn to reference everything against the Mars areoid and not just the sphere.

Not everything looks as great as those screenshots. Here are some places that failed to correlate. I’m not sure what went wrong and I unfortunately don’t have the time to investigate.

In conclusion, this is just another proof of concept of what is possible. I hope that someday someone else will attempt this and do a better job than I did. ASP is not perfect, but it can achieve a lot of processing on its own that could be beneficial to the scientific community.

Related

Shean, D. E., et al. “MRO CTX Stereo Image Processing and Preliminary DEM Quality Assessment.” Lunar and Planetary Institute Science Conference Abstracts. Vol. 42. 2011.

David Shean, MSSS, and Larry Edwards actually attempted this once before! In the above abstract you can see that they went above and beyond my little demo and processed 1,180 stereo pairs. They also developed some initial steps for registering the output DEMs to MOLA and plotted the effect convergence angle has on the outcome of a stereo pair.