Abstract
We study the computational efficiency of Gauss-Newton (GN) non-linear least-squares optimization in the context of image alignment. We introduce Constant Jacobian Gauss-Newton (CJGN) optimization, a GN scheme with constant Jacobian and Hessian matrices, together with the equivalence and independence conditions, the necessary requirements that any function of residuals must satisfy to be optimized with this efficient approach. We prove that the Inverse Compositional (IC) image alignment algorithm is an instance of a CJGN scheme and formally derive the compositional and extended brightness constancy assumptions as the necessary requirements that any image alignment problem must satisfy so that it can be solved with an efficient compositional scheme. Moreover, in contradiction with previous results, we also prove that the forward and inverse compositional algorithms are not equivalent in general; they are equivalent, however, when the extended brightness constancy assumption is satisfied. To analyze the impact of satisfying these requirements, we introduce a new image alignment evaluation framework and the concepts of short- and wide-baseline Jacobian. In wide-baseline Jacobian problems the optimization will diverge if the requirements are not satisfied. However, with a good initialization, a short-baseline Jacobian problem may converge even if the requirements are not satisfied.
Acknowledgments
The authors are grateful to Pascal Fua for interesting discussions about this work. They also thank the anonymous reviewers for their comments. This research was funded by the Ministerio de Economía y Competitividad of Spain under contract TIN2013-47630-C2-2-R.
Additional information
Communicated by M. Hebert.
Appendices
Appendix 1: Derivative of Inverse Warps
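Let \(f({\varvec{x}},\varvec{\phi })\) be a warp function and \(f^{-1}({\varvec{x}}, \varvec{\phi })\) its inverse, such that \(f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi }) = {\varvec{x}}\), where \(\varvec{\phi }\) is a small disturbance around the identity warp, \(\varvec{\phi }_0\). Since the right-hand side does not depend on \(\varvec{\phi }\), the derivative of this expression with respect to \(\varvec{\phi }\) is
\[\frac{\partial f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi })}{\partial \varvec{\phi }} = \mathbf {0},\]
which can be expanded using the chain rule:
\[\left. \frac{\partial f({\varvec{y}}, \varvec{\phi })}{\partial {\varvec{y}}}\right| _{{\varvec{y}}=f^{-1}({\varvec{x}}, \varvec{\phi })} \frac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} + \left. \frac{\partial f({\varvec{y}}, \varvec{\phi })}{\partial \varvec{\phi }}\right| _{{\varvec{y}}=f^{-1}({\varvec{x}}, \varvec{\phi })} = \mathbf {0}.\]
As \(f({\varvec{x}}, \varvec{\phi }_0)\) is the identity warp, at \(\varvec{\phi }=\varvec{\phi }_0\) we have \(f^{-1}({\varvec{x}}, \varvec{\phi }_0)={\varvec{x}}\) and the derivative of \(f\) with respect to its first argument is the identity matrix, so
\[\left. \frac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} + \left. \frac{\partial f({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} = \mathbf {0}.\]
Finally,
\[\left. \frac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} = - \left. \frac{\partial f({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0}.\]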
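The conclusion of this appendix, that at the identity warp the derivative of the inverse warp with respect to the parameters is the negative of the derivative of the warp itself, can be checked numerically. The sketch below is only an illustration under stated assumptions: the 2D affine warp, its six-parameter layout, and the names `warp`, `inverse_warp` and `jacobian_at_identity` are hypothetical choices made here, not taken from the paper, and the derivatives are approximated by central finite differences.

```python
def warp(x, phi):
    """Hypothetical 2D affine warp; phi = (0, ..., 0) is the identity warp."""
    a, b, c, d, tx, ty = phi
    return ((1 + a) * x[0] + c * x[1] + tx,
            b * x[0] + (1 + d) * x[1] + ty)


def inverse_warp(x, phi):
    """Closed-form inverse of `warp` for the same parameters phi."""
    a, b, c, d, tx, ty = phi
    det = (1 + a) * (1 + d) - b * c
    px, py = x[0] - tx, x[1] - ty
    return (((1 + d) * px - c * py) / det,
            (-b * px + (1 + a) * py) / det)


def jacobian_at_identity(func, x, n_params=6, eps=1e-6):
    """Central finite-difference Jacobian of func(x, phi) w.r.t. phi at phi = 0.

    Returns a list with one column (a 2-tuple) per parameter.
    """
    columns = []
    for k in range(n_params):
        plus = [0.0] * n_params
        minus = [0.0] * n_params
        plus[k], minus[k] = eps, -eps
        f_plus, f_minus = func(x, plus), func(x, minus)
        columns.append(tuple((p - m) / (2 * eps)
                             for p, m in zip(f_plus, f_minus)))
    return columns


x = (0.7, -1.3)  # arbitrary test point
J_forward = jacobian_at_identity(warp, x)
J_inverse = jacobian_at_identity(inverse_warp, x)

# If the two Jacobians are negatives of each other, their sum vanishes
# up to finite-difference error.
max_abs_error = max(abs(ji + jf)
                    for col_i, col_f in zip(J_inverse, J_forward)
                    for ji, jf in zip(col_i, col_f))
print(max_abs_error)
```

The printed value should be at the level of finite-difference noise. The identity holds only at \(\varvec{\phi }_0\); evaluating the same Jacobians away from the identity would in general give a non-zero sum.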
Appendix 2: In-plane Translation
We consider the case of a plane \(\varvec{\pi }\) that moves perpendicular to its normal \(\mathbf {n}\) at a distance \(d\) from the origin. The set of points of the plane is \(\mathcal {V} = \{{\varvec{x}}\in \mathbb {R}^3 : \mathbf {n}^\top {\varvec{x}}+d=0\}\), which is a two-dimensional surface embedded in \(\mathbb {R}^3\) and, therefore, a closed set. The support set is a finite subset of \(\mathcal {V}\), \(\mathcal {X}\subset \mathcal {V}\).
The in-plane translation has two degrees of freedom. Thus, the warps \(\mathbf{f}\) and \(\mathbf{g}\) are parametrized by \({\varvec{\mu }}\in \mathbb {R}^3\) and \(\varDelta {\varvec{\phi }}\in \mathbb {R}^2\), respectively. The two warps are \(\mathbf{f}({\varvec{x}},{\varvec{\mu }})={\varvec{x}}+ {\varvec{\mu }}\) and \(\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}) = {\varvec{x}}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}\), where \({\varvec{u}},{\varvec{v}}\in \mathbb {R}^3\) are two linearly independent vectors perpendicular to \(\mathbf {n}\).
We will prove that this system satisfies both the CA and the EBCA. The CA states that, for any \({\varvec{\mu }}\) and \(\varDelta {\varvec{\phi }}\), there exists a \({\varvec{\mu }}'\) such that \(\mathbf{f}({\varvec{x}},{\varvec{\mu }}') = \mathbf{f}(\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}),{\varvec{\mu }})\). This is trivially proved by taking \({\varvec{\mu }}' = {\varvec{\mu }}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}\). The identity \(\mathbf{g}\)-warp is obtained for \(\varDelta {\varvec{\phi }}_0=[0\ 0]^\top\).
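The composition argument above can be illustrated with a small numerical example. This is a minimal sketch, assuming one particular plane and one particular pair of in-plane vectors; the numeric values and the helper names `f_warp`, `g_warp` and `basis_times` are hypothetical and chosen only for illustration.

```python
def add(a, b):
    """Component-wise sum of two vectors."""
    return tuple(ai + bi for ai, bi in zip(a, b))


def dot(a, b):
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def basis_times(u, v, dphi):
    """[u v] . dphi: map the 2D in-plane parameters to a 3D displacement."""
    return tuple(ui * dphi[0] + vi * dphi[1] for ui, vi in zip(u, v))


def f_warp(x, mu):
    """f(x, mu) = x + mu."""
    return add(x, mu)


def g_warp(x, dphi, u, v):
    """g(x, dphi) = x + [u v] . dphi."""
    return add(x, basis_times(u, v, dphi))


# Assumed example plane n.x + d = 0. The normal is left unnormalized here,
# so d is a plane offset rather than a true distance to the origin.
n, d = (1.0, 2.0, 2.0), -3.0
u, v = (2.0, -1.0, 0.0), (2.0, 0.0, -1.0)  # independent, both perpendicular to n
x = (1.0, 1.0, 0.0)                         # a point on the plane
mu = (0.5, -0.25, 0.0)
dphi = (0.3, -0.7)

# Left-hand side of the CA: apply g first, then f.
composed = f_warp(g_warp(x, dphi, u, v), mu)

# Right-hand side: a single f-warp with mu' = mu + [u v] . dphi.
mu_prime = add(mu, basis_times(u, v, dphi))
direct = f_warp(x, mu_prime)

composition_gap = max(abs(c - e) for c, e in zip(composed, direct))
plane_residual = abs(dot(n, g_warp(x, dphi, u, v)) + d)
print(composition_gap, plane_residual)
```

Both printed values should be zero up to floating-point rounding: the first confirms that the composed warp equals a single \(\mathbf{f}\)-warp with parameters \({\varvec{\mu }}'\), and the second that the \(\mathbf{g}\)-warp keeps points on the plane.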
To prove the EBCA, we write the expression for Requirement 2
The derivative of the \(\mathbf{f}\)-warp with respect to \({\varvec{x}}\) is the \(3\times 3\) identity. Also, the derivative of the \(\mathbf{g}\)-warp with respect to \(\varDelta {\varvec{\phi }}\) is \([{\varvec{u}}\ {\varvec{v}}]\). Therefore, (22) becomes
We know neither \(I\) nor \(T\) but, thanks to the brightness constancy assumption, we know a relation between them: \(I[\mathbf{f}({\varvec{x}},{\varvec{\mu }}),t] = T[{\varvec{x}}],\ \forall {\varvec{x}}\in \mathcal {V}\). The partial derivatives of two functions that are equal on a closed subset \(\mathcal {V}\) of their domain are not, in general, equal on that subset. However, the partial derivatives projected onto \(\mathcal {V}\) are equal. Thus, given a projection matrix \(\mathbf {\Pi }\) onto the plane \(\mathcal {V}\), we have that:
Since we can choose \(\mathbf {\Pi }=[{\varvec{u}}\ {\varvec{v}}]\), expression (23) is true. This proves the EBCA.
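The claim that two functions which agree on \(\mathcal {V}\) may have different gradients there, while their gradients projected onto the plane coincide, can be illustrated numerically. The sketch below rests on assumptions: `template` and `image` are made-up smooth scalar fields standing in for \(T\) and \(I\) (with the \(\mathbf{f}\)-warp taken as the identity for simplicity), the plane and the vectors `u`, `v` are example values, and gradients are approximated by central finite differences.

```python
import math


def dot(a, b):
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Assumed example plane n.x + d = 0 with two in-plane directions u and v.
n, d = (1.0, 2.0, 2.0), -3.0
u, v = (2.0, -1.0, 0.0), (2.0, 0.0, -1.0)


def template(x):
    """Hypothetical smooth scalar field standing in for T."""
    return math.sin(x[0]) + x[1] * x[2]


def image(x):
    """Equals `template` on the plane and differs from it elsewhere."""
    return template(x) + (dot(n, x) + d) * (1.0 + x[0] ** 2)


def gradient(func, x, eps=1e-6):
    """Central finite-difference gradient of a scalar field at x."""
    grad = []
    for k in range(3):
        hi, lo = list(x), list(x)
        hi[k] += eps
        lo[k] -= eps
        grad.append((func(hi) - func(lo)) / (2 * eps))
    return tuple(grad)


x = (1.0, 1.0, 0.0)  # a point on the plane
grad_T = gradient(template, x)
grad_I = gradient(image, x)

# The full 3D gradients differ (by a multiple of the normal n) ...
full_gap = max(abs(a - b) for a, b in zip(grad_I, grad_T))

# ... but the derivatives along the in-plane directions u and v agree.
proj_T = (dot(grad_T, u), dot(grad_T, v))
proj_I = (dot(grad_I, u), dot(grad_I, v))
projected_gap = max(abs(a - b) for a, b in zip(proj_I, proj_T))
print(full_gap, projected_gap)
```

In this example the two gradients differ only along \(\mathbf {n}\), which is why multiplying by \([{\varvec{u}}\ {\varvec{v}}]\) removes the discrepancy.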
Cite this article
Muñoz, E., Márquez-Neila, P. & Baumela, L. Rationalizing Efficient Compositional Image Alignment. Int J Comput Vis 112, 354–372 (2015). https://doi.org/10.1007/s11263-014-0769-6
DOI: https://doi.org/10.1007/s11263-014-0769-6