PC Align

Last month we had a new release of Ames Stereo Pipeline, version 2.3! We performed a lot of bug fixing and implemented new features. But my two prized new features in ASP are pc_align and lronac2mosaic. Today I'd like to introduce only pc_align, a utility for registering DEMs, LIDAR points, and ASP point clouds to each other. All you have to do is specify an input, a reference, and an approximate estimate of how large you think the misalignment is.

Does that sound like magic? Under the hood, pc_align is an implementation of the iterative closest point algorithm (ICP). Specifically, we use the libpointmatcher library from ETH internally. ICP iteratively attempts to match every point of the input with its nearest neighbor in the reference point cloud, then solves for a transform that globally reduces the distances between the current set of matches. It then repeats itself: a new round of matching input to reference, followed by another global solve. Repeat, repeat, repeat until we no longer see any improvement in the sum of distances between matches.
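To make the loop concrete, here is a toy rigid ICP sketch in Python (NumPy + SciPy). It is only an illustration of the match-then-solve cycle described above, not ASP's actual libpointmatcher code, and it assumes a simple rotation-plus-translation model.

# Toy rigid ICP: repeatedly match input points to their nearest reference
# neighbors and solve for the rotation/translation that reduces the residuals.
# Illustration only; libpointmatcher does considerably more than this.
import numpy as np
from scipy.spatial import cKDTree

def icp(source, reference, iterations=20, tol=1e-6):
    tree = cKDTree(reference)                 # nearest-neighbor lookups
    src = source.copy()
    prev_error = np.inf
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        dist, idx = tree.query(src)           # match every input point
        matched = reference[idx]
        # Solve for the best rigid transform of the current matches (Kabsch/SVD).
        mu_s, mu_m = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_m)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T
        t = mu_m - R @ mu_s
        src = src @ R.T + t                   # apply the incremental step
        R_total, t_total = R @ R_total, R @ t_total + t
        error = dist.mean()
        if abs(prev_error - error) < tol:     # stop once matches stop improving
            break
        prev_error = error
    return R_total, t_total, src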

This means pc_align fails when the reference set is too coarse to describe the features seen in the input. An example would be having a HiRISE DEM and only a single orbit of MOLA that intersects it. A single line of MOLA does nothing to constrain the DEM about the axis of the shot line. The 300 meter post spacing might also not be detailed enough to constrain a DEM that is looking at small features such as dunes on a mostly flat plain. What is required is a large feature that both the DEM and the LIDAR source can resolve.

CTX to HRSC example

Enough about how it works. Examples! I'm going to be uncreative and just process CTX over Gale Crater because it's fast to process and easy to find. I've processed the CTX images P21_009149_1752_XI_04S222W, P21_009294_1752_XI_04S222W, P22_009650_1772_XI_02S222W, and P22_009716_1773_XI_02S223W using the stereo options "--alignment affineepipolar --subpixel-mode 1". This means correlation happens in 15 minutes and then triangulation takes an hour because ISIS subroutines are not thread safe. I've plotted these two DEMs on top of a DLR HRSC product, H1927_0000_DT4.IMG. It is important to note that this version of the DLR DEM is referenced against the Mars ellipsoid and not the areoid. If your data is referenced against the areoid, you'll need to use dem_geoid to temporarily remove it for processing with pc_align, which only understands ellipsoidal datums.

In the above picture, the two ASP-created DEMs stick out like sore thumbs and are misplaced by some 200 meters. This is due to the pointing information for MRO, and subsequently CTX, being imperfect (how much can you ask for anyway?). You could run jigsaw, and that is the gold standard solution, but it takes a lot of manual effort. So instead let's use pc_align with the commands shown next.

> pc_align --max-displacement 200 H1927_0000_DT4.tif P22-PC.tif \
        --save-transformed-source-points -o P22_align/P22_align
> point2dem --t_srs "+proj=sinu +lon_0=138 +x_0=0 +y_0=0 \
        +a=3396000 +b=3396000 +units=m +no_defs" \
        --nodata -32767 P22_align/P22_align-trans_source.tif

There are three important observations to make from the command line above. (1) We are using the max displacement option to set the upper bound of how bad we think the misalignment is: 200 meters. (2) I'm feeding in the ASP PC file as the input source instead of ASP's DEM. This is because, per (3), with the save transformed source points option we'll be writing out another PC file. pc_align can only export PC files, so we always have to perform another round of point2dem. It is possible to run stereo -> point2dem -> pc_align -> point2dem, but that means you are unnecessarily resampling your data once. Using the PC file directly from stereo saves us from potential aliasing and removes a point2dem call.

Here's the final result with both CTX DEMs plotted on top of the HRSC DEM. Everything looks really good except for that left edge. This might be because the DEM was rendered with incorrect geometry, leaving the output DEM subtly warped from a perfect solution.

Another cool feature that pc_align author Oleg Alexandrov added was recording the beginning and ending matching errors in CSV files. They're written with the names <prefix>-{beg,end}_errors.csv. You can load those up in QGIS and plot their errors to visualize that pc_align uniformly reduced matching error across the map. (Thanks Ross for showing me how to do this!)

CTX to MOLA

Quite a few MOLA shots can be found inside the CTX footprints. Above is a plot of all MOLA PEDR data for Gale Crater. I was able to download this information in CSV format from the MOLA PEDR Query tool from Washington University in St. Louis. Conveniently, pc_align can read CSV files, just not in the format provided by this tool. pc_align expects the data in the format lon, lat, elevation against the ellipsoid. What is provided is lat, lon, elevation against the areoid, and then radius. So I had to manually edit the CSV in Excel to be in the correct order and create my elevation values by subtracting 3396190 (this number was wrong in the first draft) from the radius column. The other added bit of information needed with CSV files is that you'll need to define the datum to use. If you don't, pc_align will assume you're using WGS84.
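For reference, here is roughly what that reordering looks like in Python. The column order (lat, lon, areoid elevation, radius) follows my description above, but the exact layout of the PEDR export is an assumption you should check against your own download, and 3396190 is the corrected D_MARS radius from the update at the end of this post.

# Reorder a MOLA PEDR CSV into the lon, lat, height-above-datum format that
# pc_align expects. Column positions are assumed from the description above;
# adjust them to match the actual PEDR export.
import csv

MARS_RADIUS_M = 3396190.0   # D_MARS sphere radius (see the update below)

with open("pedr_raw.csv") as fin, open("mola.csv", "w", newline="") as fout:
    reader, writer = csv.reader(fin), csv.writer(fout)
    next(reader)                                   # skip the header row
    for lat, lon, elev_areoid, radius in reader:   # areoid elevation is unused
        height = float(radius) - MARS_RADIUS_M     # height above the sphere
        writer.writerow([lon, lat, height])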

> pc_align --max-displacement 200 P22-PC.tif mola.csv \
        -o P22_mola/P22_mola --datum D_MARS --save-inv-trans
> point2dem --t_srs "+proj=sinu +lon_0=138 +x_0=0 +y_0=0 \
        +a=3396000 +b=3396000 +units=m +no_defs" \
        --nodata -32767 P22_mola/P22_mola-trans_reference.tif

Two things to notice in these commands: the inputs are reversed from before, and I'm saving the inverse transform. You can keep things in the same order as when I was aligning to HRSC; it is just that things will run very slowly. For performance reasons, the denser source should be considered the reference, and then you must request that the reference be transformed to the source. You'll likely routinely be using this inverse form with LIDAR sources.

In the end I was able to reduce the alignment error for my CTX DEMs from over 50 meters to less than 15 meters against MOLA, and from over 100 meters to 40 meters against HRSC. That's a result I'm quite happy with for a single night of processing at home. You can see my final composited MOLA-registered CTX DEMs on the left. The ASP team will have more information about pc_align in LPSC abstract form next year. We also hope that you try out pc_align and find it worth regular use in your research.

Update:

I goofed in the MOLA example! Using D_MARS implies a datum that is a sphere with a 3396190 meter radius. I subtracted the wrong number (3396000) from MOLA's radius measurement before. That probably had some effect on the registration result shown in the pictures, but the mistake is smaller than the shot spacing of MOLA, meaning the horizontal registration is fine; my output DTMs, however, were 190 meters higher than they should have been. FYI, D_MOON implies a datum that is a sphere with a radius of 1737400 meters.

Advances in LRO-NAC processing in ASP

Since I last wrote, we've hired a new full-time employee. His name is Scott and we assigned him the task of learning ASP and LROC. The first utilities he'll be contributing back to ASP are lronacjitreg and lronac2mosaic.py. The first utility is very similar to the similarly named ISIS utility designed for HiRISE. The second utility, lronac2mosaic.py, uses the first tool and can take LRO-NAC EDR imagery and make a non-projected image mosaic. What lronac2mosaic.py does internally is 'noproj' and then 'handmos' the images together. There is an offset between the images due to model imperfections. Finding the correct offset so the combined images are seamless is the task of lronacjitreg. All of this is just a streamlined version of what I wrote in a past blog post.

Previously users and our team only had the option to run all 4 combinations of the 4 LRO-NAC input files through ASP and then glue them together afterwards. Now with the use of lronac2mosaic, we can feed whole LRO-NAC observations into ASP and receive the full DTM in one go. No messy mosaicking of 4 files.

I've used Scott's program successfully to recreate most DTMs that ASU has made via SOCET SET. Using my home server, I've been able to recreate 77 of their DTMs to date. We've been fixing bugs as we hit them. One of the biggest was in our search range guessing code. The next upcoming release of ASP will have the fruits of that labor. Previously ASP had a bad habit of ignoring elevation maxima in the image because it thought those IP measurements were noise. Now we should have a better track record of getting measurements for the entire image.

One of the major criticisms I'm expecting from the large dump of LRO-NAC DTMs we plan to deliver next year is: what is the quality of the placement of our DTMs in comparison to LOLA? Another engineer we have on staff, Oleg, has just the solution for this. He has developed an iterative closest point (ICP) program called pc_align which will be in the next release. This is built on top of the ETH Zurich Autonomous Systems Lab's libpointmatcher and has the ability to take DTMs and align them to other DTMs or LIDAR data. This enables us to create well-aligned products whose height values agree with LOLA to within tens of meters. Our rough testing shows us having a CE90 of 4 meters against LOLA after performing our corrections.

We're not ready for the big production run yet. One problem still plaguing our process is that we can see the CCD boundaries in our output DTMs. We believe most of this problem is due to the fact that the angle between the lines of sight of the left and right CCDs changes with every observation. ISIS, however, only has one number programmed into it, the number provided by the FK. Scott is actively developing an automated system to determine this angle and to make a custom FK for every LRO-NAC observation. The second problem we're tracking is that areas of high slope are missing from our DEMs. This is partially because we didn't use Bayes EM for our test runs, but it also seems like our disparity filtering is overly aggressive or just wrong. We'll get on to that. That's all for now!

Semi-Global Matching

My co-worker Oleg Alexandrov has been working on Ames Stereo Pipeline for a while now. He's just about touched all parts of the code. This is crystal clear when you look at our logs on GitHub. One of the things he ribs me most about in ASP is that he doesn't like that we advertise ASP as using up-to-date stereo correlation algorithms. "Come on, man! That's just not true!" he tells me. Depending on whom you talk to, we're using primo 90's research [7] or something re-hashed from the 70's [1]. Either way, it is clear we haven't been tracking along with the current research in our field when it comes to integer correlation. This blog post covers the first part of my own personal research to find a new correlator algorithm that would improve ASP in terms of both runtime and quality. In this post, I'll be reviewing an algorithm called Semi-Global Matching.

Semi-Global Matching, or SGM, is a method developed by Heiko Hirschmueller of the DLR. He first wrote about it in his 2005 paper [2]. He then elaborated and proposed further improvements in [3][4]. The best paper from which to learn the method is probably his second paper [3]. In my opinion his first paper gets sidetracked in using a Mutual Information (MI) cost metric when the interesting bit is just SGM. The most exciting thing about this work is that it comes from DLR and is an algorithm they have applied to aerial and satellite mapping. I believe this is the method that was used to create the wonderful HRSC DTMs that somehow managed to overcome the weird JPEG artifacts in their raw imagery.

The Algorithm

Heiko might have glossed over his SGM algorithm in his first paper because he didn't view it as being as challenging to implement as the MI cost metric. SGM shares a lot in common with scanline optimization stereo, which has had a lot of prior research but nowadays is considered a dead end. Let's review how that worked. Also, the images used for this testing are from the Middlebury stereo dataset. More information about this data and stereo algorithms applied to them can be found in [8][9][10][11].

Scanline optimization stereo is essentially Viterbi decoding in my mind. We evaluate along an epipolar line. In the case of a rectified image, this is along the horizontal scanline. For each pixel along that scanline we evaluate each possible disparity result. The costs of each pixel along the scanline can then be stacked into a matrix. A scanline is highlighted in the above picture. The cost of each pixel (x direction) versus each possible disparity value (y direction) is shown in the picture below. The solution for the disparity along this scanline is then the path through this matrix/image that has minimum cost (dark areas). We also have to include some smoothness constraint, otherwise our disparity result could follow the jagged jumps in this map that don't represent reality.

Finding the minimum path is then an application of dynamic programming. We iterate through the matrix left to right and take a rolling sum. The cost of an element in the rolling sum vector for the current pixel and disparity combination is equal to the cost for the current location plus the lowest summed cost from the set of all possible disparities at the prior pixel location. Heiko applies some additional constraints in that he penalizes the cost whenever the disparity changes. He penalizes multi-value disparity transitions more heavily than transitions of 1. The penalty for an increment of 1 in disparity is P1 and anything greater is P2. This entire paragraph can more elegantly be described in the following equation.
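For reference, the aggregation recursion as given in [3] (reproduced here in my own notation) is, for a path direction $\mathbf{r}$:

$$
L_{\mathbf{r}}(\mathbf{p}, d) = C(\mathbf{p}, d) + \min\!\Big(L_{\mathbf{r}}(\mathbf{p}-\mathbf{r}, d),\; L_{\mathbf{r}}(\mathbf{p}-\mathbf{r}, d-1) + P_1,\; L_{\mathbf{r}}(\mathbf{p}-\mathbf{r}, d+1) + P_1,\; \min_i L_{\mathbf{r}}(\mathbf{p}-\mathbf{r}, i) + P_2\Big) - \min_k L_{\mathbf{r}}(\mathbf{p}-\mathbf{r}, k)
$$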

Applying this forward and backward for each scanline, we can solve for a disparity map. Here's an example.

Notice there's a lot of tearing between the scanlines. The image looks as if we had tracking error on a VCR. We could fix this by using a larger kernel size; for the above, the kernel size was 1 x 1 px. Something more unique would ensure matches that are constrained between lines. Another approach would be to enforce some smoothness constraint across lines as opposed to just across disparity transitions. Heiko's solution to this issue is what makes SGM what it is. He opted to instead perform scanline optimization at multiple angles and then take the summed cost vector to determine the final disparity result. Note that even though we evaluate the scanline along an angle, the disparity is still defined as going along the epipolar line (perfectly horizontal in this case). Each line direction produces results like the following:

Summing their cost vectors and then taking the minimum produces a beautiful result like the following:

My Implementation and Results

All of the pictures above were created with my implementation of SGM. In my version, I only evaluate 8 line directions, so my results are noisier than what's seen in Heiko's original paper. Despite this, the end results are pretty impressive. Here's a line-up of the ASP result, my SGM result, Heiko's result, and the ground truth. ASP performs so badly because it has a large kernel size that can't handle the sudden jumps in depth; ASP then blurs the disparity discontinuities.
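If you want to play with the idea, here is a heavily simplified NumPy sketch of the aggregation, assuming a precomputed cost volume of shape (rows, cols, disparities). It only walks the two horizontal directions, so it illustrates the summing idea rather than the 8-path code I actually used.

# Heavily simplified SGM cost aggregation with NumPy. Only the two horizontal
# path directions are aggregated here; a real implementation sums 8 or 16.
# `cost` is a (rows, cols, ndisp) volume of matching costs.
import numpy as np

def aggregate_lr(cost, P1=7, P2=100, reverse=False):
    rows, cols, ndisp = cost.shape
    L = np.zeros_like(cost, dtype=np.float64)
    rng = range(cols - 1, -1, -1) if reverse else range(cols)
    prev = None
    for x in rng:
        if prev is None:
            L[:, x, :] = cost[:, x, :]
        else:
            min_prev = prev.min(axis=1, keepdims=True)           # min over disparities
            candidates = np.stack([
                prev,                                                        # same disparity
                np.pad(prev, ((0, 0), (1, 0)), mode="edge")[:, :-1] + P1,    # d - 1
                np.pad(prev, ((0, 0), (0, 1)), mode="edge")[:, 1:] + P1,     # d + 1
                np.broadcast_to(min_prev + P2, prev.shape),                  # any larger jump
            ], axis=0)
            L[:, x, :] = cost[:, x, :] + candidates.min(axis=0) - min_prev
        prev = L[:, x, :]
    return L

def sgm_disparity(cost):
    total = aggregate_lr(cost) + aggregate_lr(cost, reverse=True)
    return total.argmin(axis=2)          # winner-takes-all over the summed costs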

Unfortunately I must mention the bad side of this method. There are several cons; the first and weakest argument is the required CPU time. My implementation of this method takes about 23 seconds to evaluate this image with 8 paths. The 16 paths used in the paper would have doubled the processing time. ASP chops through this image in seconds. Heiko says he got the processing time down to 1.3 seconds in 2005, so I'm doing something horribly wrong and could improve my implementation. Speed, however, is always an issue; some ideas to address it are iSGM [5] and wSGM [6]. These are hierarchical methods of SGM and fancy maps to reduce the length required to integrate paths for cost evaluation.

A bigger issue is that SGM requires an absurd amount of memory. All costs for all pixels and all possible disparity values are evaluated up front in a big tensor that has a size of W * H * D * 1 byte. We also need a copy of this tensor for evaluating paths and another to store the summing over all paths. Those are two allocations of W * H * D * 2 bytes, since they need a higher data type to avoid integer rollover artifacts. This demo set is 450 x 375 px and I evaluated it across 64 possible disparities, so SGM required about 51 MB (450 * 375 * 64 * (1 + 2 + 2) bytes). That doesn't include the memory cost of just loading the images up and allocating space for the disparity result. Imagine tackling a satellite image where we have a disparity range of 2000 pixels.

Another pesky complaint against SGM is how to figure out what the values should be for the two penalty arguments. Heiko never mentioned what he used; likely he tuned the values for each stereo pair to get the best results. However, these penalty values ultimately determine how this algorithm responds to occlusion and angled surfaces. What works for World View 1 in glacier regions (low frequencies) might not necessarily apply to World View 1 in the city (square wave patterns). In practice, we would want tuned parameters for each instrument we work with and for each type of terrain.

The final and harshest criticism of SGM is that it can only be applied to 1D disparity searches and the range must be defined beforehand. 1D searches work for calibrated stereo rigs such as the imagery used in this post. However, it is my opinion that real data always has imperfections, and finding the true disparity still requires searching in the Y direction. Examples of this are linescan cameras with jitter that the spacecraft ephemeris isn't sampled finely enough to capture, such as MOC, HiRISE, and LROC. There's also the case where the camera isn't perfect, such as the World View cameras, where there is a subpixel misregistration of all 50 CCDs that can't easily be corrected because we can't get the raw imagery. ASP also doesn't have a perfect method for epipolar rectification of linescan cameras; we have a linear approximation with our affine method, but the problem is nonlinear.

SGM is still an amazing algorithm that is incredibly useful. There are a ton of papers out there that find it to be perfect for their applications. Besides the incredible detail it resolves, my other favorite bit about the algorithm is that its runtime is deterministic. It depends squarely on the search range, and there is no worst-case path versus best-case path to deal with as in ASP's binary search approach. Despite this, SGM seems to be a non-ideal match for ASP. ASP hopes to address the generic correlation problem where we don't always trust our camera information or our data. I want something that can still handle 2D searching. In my next post I'll show off another promising algorithm that seems to address that concern along with the runtime and memory requirements. Until then, have some music.

Update: Code is now available here.

Works Cited

[1] Barnea, D. (1972). A Class of Algorithms for Fast Digital Image Registration. IEEE Transactions on Computers.
[2] Hirschmuller, H. (2005). Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information. CVPR.
[3] Hirschmuller, H. (2008). Stereo Processing by Semiglobal Matching and Mutual Information. Pattern Analysis and Machine Intelligence.
[4] Hirschmuller, H., Buder, M., & Ernst, I. (2012). Memory Efficient Semi-Global Matching. Remote Sensing and Spatial Information Sciences.
[5] Hermann, S., & Klette, R. (2012). Iterative Semi-Global Matching for Robust Driver Assistance Systems. ACCV.
[6] Spangenberg, R., Langner, T., & Rojas, R. (2013). Weighted Semi-Global Matching and Center-Symmetric Census Transform for Robust Driver Assistance. CAIP.
[7] Sun, C. (1997). A Fast Stereo Matching Method. Digital Image Computing: Techniques and Applications.
[8] Scharstein, D., & Szeliski, R. (2002). A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. International Journal of Computer Vision.
[9] Scharstein, D., & Szeliski, R. (2003). High-Accuracy Stereo Depth Maps Using Structured Light. CVPR.
[10] Scharstein, D., & Pal, C. (2007). Learning Conditional Random Fields for Stereo. CVPR.
[11] Hirschmuller, H., & Scharstein, D. (2007). Evaluation of Cost Functions for Stereo Matching. CVPR.

Noproj’n LRO Imagery

Earlier this year I found out that I got my first proposal funded. I've had directed funding before thanks to NASA's cryosphere program. I've also been a Co-I on numerous other funded proposals with co-workers and friends at USGS. However, my LASER proposal to perform Mass DEM production from LROC-NA imagery was something I wrote as PI and was competed. Now that it is accepted, I have two years to follow through, and I'd like to share the whole process here on the blog.

The first problem I'm going to write about has bugged me for a while: each LROC-NA observation is actually 2 files, which makes using ASP awkward. In the past I've been telling people to perform 4 runs with stereo, doing the combinations LE-LE, LE-RE, RE-LE, and RE-RE. However, UofA had the same problem with HiRISE, which comes down in 20 different files. They worked out an ISIS process chain that would noproj those files together, among other things, to make a single observation. I don't know why such a step doesn't exist for LROC-NA, but today I'll show you what I've come up with.

If you try right now to noproj the RE cub file to the LE cub file, you'll find that the program errors out because an ideal camera model for LROC has not been defined. Definitions of ideal noproj cameras for all ISIS cameras live in the file $ISIS3DATA/base/applications/noprojInstruments003.pvl. Looking at that file we see that there are 5 elements that can be defined for the ideal camera model: TransY, ItransS, TransX, ItransL, and DetectorSamples. DetectorSamples is easy; it's what the output image width should be in the final image. The Trans* variables are measured in millimeters and define a focal plane offset from the original camera model we are using. I plan to noproj with the match file being the LE file, thus Trans* references a shift from the original position of the LE sensors. The Itrans* variables are the pixel conversion of the millimeter measurements; ISIS uses them to run the math in the other direction. If the Trans* and Itrans* variables don't match, funny errors will occur where the CCDs noproj correctly but the triangulation in ASP is completely bonkers. Relating the two is easy, just use the pixel pitch. For LROC-NA that is 7 micrometers per pixel.

So how do we decide what to set the TransY and TransX values to? If those values are left at zero, the LE image will be centered on the new image and the RE will be butted up beside it but falling off the edge of the new image. A good initial guess would be to set TransY to a shift of half the CCD width. A better idea, I thought, was to look at the FK kernel, identify the angle difference between the two cameras, and then use the instantaneous field of view measurement to convert that to a pixel offset between the two CCD origins. Converting to millimeters with the pixel pitch and then dividing by two gives the shift that we want. Below are the final noproj settings I used for LROC.

 Group = "LUNAR RECONNAISSANCE ORBITER/NACL"
    TransY = 16.8833
    ItransS = -2411.9
    TransX = 0.6475
    ItransL = -92.5
    DetectorSamples = 10000
  End_Group
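As a sanity check on the pixel pitch relation described above, here is a small Python sketch that reproduces the Itrans* values from the Trans* values, with the kernel-derived angle and IFOV left as inputs for the half-offset guess. The negative sign is just the convention that matches the group above.

# Relating the noproj Trans* (millimeter) and Itrans* (pixel) values through
# the LROC-NA pixel pitch, plus the half-offset guess described in the text.
# The boresight angle and IFOV are inputs to be read from the FK/IK kernels
# for the observation, not constants baked in here.
PIXEL_PITCH_MM = 0.007          # 7 micrometers per pixel

def itrans_from_trans(trans_mm):
    """Convert a focal plane offset in mm to the equivalent pixel value."""
    return -trans_mm / PIXEL_PITCH_MM   # sign matches the PVL group above

def trans_from_angle(angle_between_cameras_rad, ifov_rad_per_px):
    """Half the origin-to-origin offset between the CCDs, in millimeters."""
    offset_px = angle_between_cameras_rad / ifov_rad_per_px
    return offset_px * PIXEL_PITCH_MM / 2.0

print(itrans_from_trans(16.8833))   # -> about -2411.9, matching ItransS
print(itrans_from_trans(0.6475))    # -> about -92.5, matching ItransL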

At this point we can noproj the imagery and then handmos them together. A naïve version would look something like the following.

> noproj from=originalRE.cub to=RE.cub match=originalLE.cub
> noproj from=originalLE.cub to=LE.cub match=originalLE.cub
> cp LE.cub LERE_mosaic.cub
> handmos from=RE.cub mosaic=LERE_mosaic.cub

Alas, the LE and RE imagery does not perfectly align. In the HiRISE processing world we would use hijitreg to determine a mean pixel offset, but there is no generic form of hijitreg that we can use for LROC-NA. There is the coreg application, but in all my tests this program failed to find any correct match points between the images. I've tried two other successful methods. One was manually measuring the offset using Qtie. Warning: sending these images to Qtie requires twiddling with how serial numbers are made for ideal cameras. This is something I should document at some point, as it would allow bundle adjusting ideal cameras like fully formed HiRISE and LROC images. My other solution was the example correlate program in Vision Workbench. I did the following to make processing quick (5 minutes run time).

> crop from=LE.cub to=LE.centerline.cub sample=4900 nsamples=200
> crop from=RE.cub to=RE.centerline.cub sample=4900 nsamples=200
> parallel isis2std from={} to={.}.tif format=tiff ::: *centerline.cub
> correlate --h-corr-min -60 --h-corr-max 0 --v-corr-min 0 --v-corr-max 60 LE.centerline.tif RE.centerline.tif

That creates a disparity TIF file. The average of the valid pixels (where the third channel is 1) can then be used to solve for the mean translation. That translation can then be used during handmos via the outsample and outline options. Ideally this would all be one program like hijitreg, but that is for another time. Below is the final result.
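Roughly, that averaging step looks like the following Python using GDAL's bindings. The output filename is a placeholder, and the band order (horizontal, vertical, valid) is my assumption about the Vision Workbench disparity format, so check it against your own file.

# Average the valid pixels of the correlate output to get the mean LE->RE
# translation, which can then be passed to handmos via outsample/outline.
# Band order (horizontal, vertical, valid) is assumed; filename is a placeholder.
import numpy as np
from osgeo import gdal

ds = gdal.Open("LE-RE-disparity.tif")
dx, dy, valid = ds.ReadAsArray()          # shape: (3, rows, cols)
mask = valid > 0
print("mean sample offset:", dx[mask].mean())
print("mean line offset:  ", dy[mask].mean())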

Hijitreg actually does more than just solve for the translation between CCDs on HiRISE. It correlates a line of pixels between the CCDs in hope of determining the jitter of the spacecraft. I can do the same!

From the above plots, it doesn't look like there is much jitter, or the data is not in a form where I could determine it. The only interesting bit here is that the offset in the vertical direction changes with the line number. I think this might be disparity due to terrain. The imagery I used for my testing was M123514622 and M123521405, which happened to be focused on the wall of Slipher crater. The NA cameras are angled 0.106 degrees off from each other in the vertical direction. Ideal stereo geometry would be 30 degrees or 15 degrees, but 0.106 degrees should allow some disparity given the massive elevation change into the crater. I wouldn't recommend using this for 3D reconstruction, but it would explain the vertical offset signal. The horizontal signal has less amplitude but does seem like it might be seeing spacecraft jitter. However, it looks aliased, making it impossible to determine the frequency.

Anyways, making a noproj version of the LROC-NA observation is a huge boon for processing with Ames Stereo Pipeline. Using affine epipolar alignment, no map projection, and simple parabola subpixel, it is possible to make a beautiful DEM/DTM in 2.5 hours. 70% of that time was just triangulation, because ISIS is single threaded. That would be faster with the application parallel_stereo (renamed from stereo_mpi in ASP 2.2.2).

Satellite Epipolar Rectification

Whenever I meet anyone remotely familiar with stereo algorithms, they tend to ask me why ASP doesn’t have an epipolar rectification step.

“It does, but for pinhole camera models.”
“Why not satellites?”
"Well, they don't really have epipolar lines but instead curves! That makes rectification very difficult and not reversible."

This has been my default response for the last few years. But it has been a long time since I first tried to understand this topic. Back then I thought the solution would require some difficult-to-implement polynomial mapping or rubber-sheet transform [1]. I've been with NASA a while now; maybe this time I can better grasp what is required.

Well, I've gone back through the papers. Surprisingly, epipolar rectification is still a popular subject to publish on in the photogrammetric and remote sensing world. The new papers share a lot of my complaints in that proper resampling using the rigorous camera model is hard. However, they show that for the height ranges that stereo modeling is interested in, the epipolar lines can be considered linear. Even more interesting, the papers note that all the epipolar lines are parallel [2][3]. They then attempt to solve for an affine transform for both the left and the right images that reduces the Y disparity and minimizes image distortion. Solving for the best affine transform that doesn't warp the image too much is the primary trick to this task, and a few master's theses have been written on the topic.

Some of the students like to introduce a new camera model, like the Parallel Projection model. I'd like to connect the dots and say that this is not a novel idea at all. In the computer science community this is known as the affine camera model. My favorite book has a whole chapter on the topic [4]. Hartley even goes into a few pages about solving for the affine fundamental matrix and then gives a teaser image about its application to stereo rectification for high altitude aerial imagery.

Math

Hartley never goes into how to do affine epipolar rectification in his book. I'd like to document my attempt at this topic, as I believe it is a lot simpler than the remote sensing papers make it out to be. It is also simpler than some of the computer science papers make it out to be [5].

The general plan of affine epipolar image rectification is best described in 3 steps.

1.)  Identify the slope of the epipolar lines and rotate them so they are horizontal. I’ll be using image correspondences to solve for the affine fundamental matrix, which will get me the epipolar lines and then rotation matrices.
2.)  Align and scale the Y axis of the right image to match the left image.
3.)  Align, scale, and solve for an appropriate skew to minimize the amount of searching a stereo algorithm would have to do in the X axis.

Solving for the fundamental matrix is an algorithm I took from [4]. An affine fundamental matrix differs from a regular fundamental matrix in the sense that its epipoles are at infinity. This makes the top-left 2×2 area of the matrix zero; only the remaining 5 elements contain any information. Solving for those elements, we enforce the constraint that:
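Written out with my own labels a through e for the five free elements, for a correspondence $\mathbf{x} = (x, y, 1)^{\top}$ and $\mathbf{x}' = (x', y', 1)^{\top}$ the constraint from [4] is:

$$
\mathbf{x}'^{\top} F_A \, \mathbf{x} =
\begin{pmatrix} x' & y' & 1 \end{pmatrix}
\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ c & d & e \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= a x' + b y' + c x + d y + e = 0
$$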

When you expand that equation out, you'll see that the bottom-right corner element is not multiplied by any of the image correspondence measurements. So we'll solve for the other elements first; we could back-substitute for 'e' later, but I'm never going to use that element, so let's forget 'e' exists. Solving for the other parameters is achieved by finding the singular vector corresponding to the smallest singular value.
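A minimal NumPy sketch of that solve, assuming matched pixel coordinates in left and right arrays of shape (N, 2), follows the standard homogeneous least-squares recipe from [4] (it is not the C++ I actually wrote):

# Solve for the affine fundamental matrix parameters (a, b, c, d) up to scale.
# Rows of the design matrix are (x', y', x, y) for each correspondence after
# subtracting the centroid, which removes 'e' from the problem entirely.
import numpy as np

def affine_fundamental(left, right):
    A = np.hstack([right, left])              # columns: x', y', x, y
    A = A - A.mean(axis=0)                    # centering eliminates e
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d = Vt[-1]                       # singular vector of smallest value
    return a, b, c, d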

The last row and column of the affine fundamental matrix represent the epipoles of the left and right images.

Let's use them to solve for the rotation matrices that flatten those epipolar lines. Here is my derivation, where I attempted to avoid any trigonometric functions. In hindsight, this makes the algorithm harder to read. At least we avoided a CPU table lookup!
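In code, the trig-free rotation just normalizes the epipolar direction and packs it into a 2x2 matrix. Which pair of fundamental matrix entries feeds which image depends on sign and row/column conventions, so treat that mapping in this sketch as an assumption:

# Rotations that make the epipolar lines horizontal, built directly from the
# affine fundamental matrix entries so no trig functions are needed.
import numpy as np

def rotation_to_horizontal(direction):
    vx, vy = direction / np.linalg.norm(direction)
    return np.array([[vx, vy],
                     [-vy, vx]])          # maps `direction` onto the +x axis

def epipolar_rotations(a, b, c, d):
    # The assignment of (d, -c) to the left image and (b, -a) to the right
    # is an assumption of this sketch; swap if your convention differs.
    R_left = rotation_to_horizontal(np.array([d, -c]))
    R_right = rotation_to_horizontal(np.array([b, -a]))
    return R_left, R_right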

Now we are finally at step 2, where we need to find the scaling and offset for the Y axis. This can be performed in a single blow with a LAPACK least-squares routine if we just arrange the problem correctly.

Step 3 is solving for the scaling, skew, and offset for X. This too can be performed in a single blow using least squares.
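A rough NumPy sketch of both least-squares fits, applied to correspondences that have already been rotated, looks like the following. It mirrors the two steps above, though the exact parameterization in my C++ differs:

# Fit the right image's Y scale/offset to the left image, then the X scale,
# skew, and offset, using plain least squares on rotated correspondences.
import numpy as np

def fit_y(left_y, right_y):
    # Solve  s * y_r + t  ~=  y_l  for scale s and offset t.
    A = np.column_stack([right_y, np.ones_like(right_y)])
    (s, t), *_ = np.linalg.lstsq(A, left_y, rcond=None)
    return s, t

def fit_x(left_x, right_x, right_y):
    # Solve  s * x_r + k * y_r + t  ~=  x_l  for scale s, skew k, offset t,
    # which minimizes the horizontal range a correlator would have to search.
    A = np.column_stack([right_x, right_y, np.ones_like(right_x)])
    (s, k, t), *_ = np.linalg.lstsq(A, left_x, rcond=None)
    return s, k, t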

Again, we'll apply the X scaling and offset only to the right affine transform. However, I thought the skew should be applied equally to both images so as to minimize distortion. Splitting the skew evenly between both image transforms, and applying the X scaling, is achieved via:

Application to Mars Imagery

I wrote a C++ implementation of the above algorithm and applied it to the example stereo pairs ASP supplies for the MOC, CTX, and HiRISE missions. I collected image correspondences for each stereo pair to work out what the search range needs to be in order to create a DEM. The area of this search range is proportional to how much work the stereo correlator needs to do. (Please ignore that in ASP we use a hierarchical method that provides some optimization, making processing time not linearly related to the area of the search range.) I then compared the expected search range with alignment options of NONE, HOMOGRAPHY, and the new algorithm. Below are the results.

                     MOC                      CTX                        HiRISE
Image Size           1024×8064                5000×30720                 20000×80000
NONE's Range         [-389 -729]x[-104 604]   [-488 -1320]x[468 -1295]   [-15532 2112]x[18325 8145]
NONE's Area          379905                   23900                      204259281
HOMOGRAPHY's Range   [-7 -2]x[7 3]            [-56 -10]x[57 11]          [-16176 -342]x[18164 402]
HOMOGRAPHY's Area    70                       2373                       25548960
EPIPOLE's Range      [-7 -2]x[7 3]            [-58 -2]x[57 1]            [-16228 -53]x[18128 287]
EPIPOLE's Area       70                       345                        11681040

In the MOC case, homography alignment produced the same results as epipolar alignment. But once the images start to get bigger, it becomes clear that epipolar alignment is a winning strategy. The next test to perform is to see if the stereo correlator still works on this imagery. We might be introducing skew or scaling errors that the correlator can't recover from. I have yet to finish that part though.

Another fun note: epipolar rectification is how you produce anaglyphs! Here are a few examples of what I was able to do with GIMP. I followed the guide linked here.

So … it looks like I need to put this in ASP 2.2 now. That’s what I’m going to be doing after this blog post.

References:

[1] Oh, Jaehong. Novel Approach to Epipolar Resampling of HRSI and Satellite Stereo Imagery-based Georeferencing of Aerial Images. Diss. The Ohio State University, 2011.
[2] Morgan, Michel, et al. “Epipolar resampling of space-borne linear array scanner scenes using parallel projection.” Photogrammetric engineering and remote sensing 72.11 (2006): 1255-1263.
[3] Wang, Mi, Fen Hu, and Jonathan Li. “Epipolar resampling of linear pushbroom satellite imagery by a new epipolarity model.” ISPRS Journal of Photogrammetry and Remote Sensing 66.3 (2011): 347-355.
[4] Hartley, Richard, and Andrew Zisserman. Multiple view geometry in computer vision. Vol. 2. Cambridge, 2000.
[5] Liansheng, Sui, Zhang Jiulong, and Cui Duwu. “Image rectification using affine epipolar geometric constraint.” Computer Science and Computational Technology, 2008. ISCSCT’08. International Symposium on. Vol. 2. IEEE, 2008.

Processing Antarctica

I've been sick all last week. That hasn't stopped me from trying to process World View imagery in bulk on NASA's Pleiades supercomputer. Right now I'm just trying to characterize how big of a challenge it is to process this large satellite data on a limited-memory system for an upcoming proposal. I'm not pulling out all the tricks we have to ensure that all parts of the image correlate. Still, that hasn't stopped ASP from producing this interesting elevation model of a section of Antarctica's coastline, just off of Ross Island. Supposedly Marble Point Heliport is in this picture (QGIS told me it was the blue dot at the bottom of the coastline).

I'm using homography alignment, auto search range, parabola subpixel, and no hole filling. The output DEMs were rasterized at 5 meters per pixel. The crosses, or fiducials, in the image are posted 5 km apart. This represents a composite of 10 pairs of WV01 stereo imagery from 2009 to 2011, and no bundle adjustment or registration has been applied. The image itself is just a render in QGIS where the colorized DEM has had a hillshade render of the same DEM overlaid at 75% transparency.

I haven't investigated why more of the mountains didn't come out. When it looks like a whole elevation contour has been dropped, that's likely because the auto search range didn't guess correctly. When it looks like the side of a mountain didn't resolve, that's likely because there was shadow or highlight saturation in the image. It could also be that ASP couldn't correlate correctly on such a steep slope.