
Rationalizing Efficient Compositional Image Alignment

The Constant Jacobian Gauss-Newton Optimization Algorithm

International Journal of Computer Vision

Abstract

We study the issue of computational efficiency for Gauss-Newton (GN) non-linear least-squares optimization in the context of image alignment. We introduce the Constant Jacobian Gauss-Newton (CJGN) optimization, a GN scheme with constant Jacobian and Hessian matrices, and the equivalence and independence conditions as the necessary requirements that any function of residuals must satisfy to be optimized with this efficient approach. We prove that the Inverse Compositional (IC) image alignment algorithm is an instance of a CJGN scheme and formally derive the compositional and extended brightness constancy assumptions as the necessary requirements that must be satisfied by any image alignment problem so it can be solved with an efficient compositional scheme. Moreover, in contradiction with previous results, we also prove that the forward and inverse compositional algorithms are not equivalent. They are equivalent, however, when the extended brightness constancy assumption is satisfied. To analyze the impact of the satisfaction of these requirements we introduce a new image alignment evaluation framework and the concepts of short- and wide-baseline Jacobian. In wide-baseline Jacobian problems the optimization will diverge if the requirements are not satisfied. However, with a good initialization, a short-baseline Jacobian problem may converge even if the requirements are not satisfied.

Acknowledgments

The authors are grateful to Pascal Fua for interesting discussions about this work. They also thank the anonymous reviewers for their comments. This research was funded by the Ministerio de Economía y Competitividad of Spain under contract TIN2013-47630-C2-2-R.

Author information

Corresponding author

Correspondence to Luis Baumela.

Additional information

Communicated by M. Hebert.

Appendices

Appendix 1: Derivative of Inverse Warps

Let \(f({\varvec{x}},\varvec{\phi })\) be a warp function and \(f^{-1}({\varvec{x}}, \varvec{\phi })\) its inverse, such that \(f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi }) = {\varvec{x}}\), where \(\varvec{\phi }\) is a small perturbation around \(\varvec{\phi }_0\), the parameters of the identity warp. The derivative of this expression with respect to \(\varvec{\phi }\), evaluated at \(\varvec{\phi }_0\), is

$$\begin{aligned} \left. \dfrac{\partial f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }= \varvec{\phi }_0} = \dfrac{\partial {\varvec{x}}}{\partial \varvec{\phi }} = \mathbf 0, \end{aligned}$$

which can be expanded using the chain rule:

$$\begin{aligned}&\left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} + \left. \dfrac{\partial f({\varvec{x}}',\varvec{\phi }_0)}{\partial {\varvec{x}}'}\right| _{{\varvec{x}}'={\varvec{x}}}\cdot \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0}\\&= \mathbf 0. \end{aligned}$$

As \(f({\varvec{x}}, \varvec{\phi }_0)\) is the identity warp, its derivative with respect to \({\varvec{x}}\) is the identity matrix, so

$$\begin{aligned} \left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} + \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0} = \mathbf 0. \end{aligned}$$

Finally,

$$\begin{aligned} \left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} = - \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0}. \end{aligned}$$
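The identity above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a hypothetical 2D affine warp with six parameters (the names `warp`, `inverse_warp` and `jacobian_wrt_phi` are illustrative) and compares central finite-difference Jacobians of the warp and its inverse at the identity parameters.

```python
import numpy as np

def warp(x, phi):
    """Hypothetical 2D affine warp; phi = 0 gives the identity warp."""
    A = np.eye(2) + phi[:4].reshape(2, 2)
    return A @ x + phi[4:]

def inverse_warp(x, phi):
    """Exact inverse of `warp` for the same parameters phi."""
    A = np.eye(2) + phi[:4].reshape(2, 2)
    return np.linalg.solve(A, x - phi[4:])

def jacobian_wrt_phi(fn, x, phi0, eps=1e-6):
    """Central finite-difference Jacobian of fn(x, phi) w.r.t. phi at phi0."""
    cols = []
    for k in range(phi0.size):
        step = np.zeros_like(phi0)
        step[k] = eps
        cols.append((fn(x, phi0 + step) - fn(x, phi0 - step)) / (2 * eps))
    return np.stack(cols, axis=1)  # shape: (2, number of parameters)

x = np.array([0.7, -1.3])   # arbitrary test point
phi0 = np.zeros(6)          # parameters of the identity warp

J_f = jacobian_wrt_phi(warp, x, phi0)
J_inv = jacobian_wrt_phi(inverse_warp, x, phi0)

# The appendix result predicts J_f = -J_inv, so this gap should be
# zero up to finite-difference error.
max_gap = np.max(np.abs(J_f + J_inv))
print("max |J_f + J_inv| =", max_gap)
```

The check only holds at \(\varvec{\phi }_0\); away from the identity the two Jacobians are related through the derivative of the warp with respect to \({\varvec{x}}\), which is no longer the identity matrix.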

Appendix 2: In-plane Translation

We consider the case of a plane \(\varvec{\pi }\), at a distance \(d\) from the origin, that moves perpendicular to its normal \(\mathbf {n}\). The set of points of the plane is \(\mathcal {V} = \{{\varvec{x}}\in \mathbb {R}^3 : \mathbf {n}^\top {\varvec{x}}+d=0\}\), which is a two-dimensional surface embedded in \(\mathbb {R}^3\) and is therefore a closed set. The support set is a finite subset of \(\mathcal {V}\), \(\mathcal {X}\subset \mathcal {V}\).

The in-plane translation has two degrees of freedom. Thus, the pair of warps \(\mathbf{f}\) and \(\mathbf{g}\) are parametrized respectively by \({\varvec{\mu }}\in \mathbb {R}^3\) and \(\varDelta {\varvec{\phi }}\in \mathbb {R}^2\). The two warps are \(\mathbf{f}({\varvec{x}},{\varvec{\mu }})={\varvec{x}}+ {\varvec{\mu }}\) and \(\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}) = {\varvec{x}}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}\), where \({\varvec{u}},{\varvec{v}}\in \mathbb {R}^3\) are two independent vectors perpendicular to \(\mathbf {n}\).

We will prove that this system satisfies both the CA and the EBCA. The CA states that, for any \({\varvec{\mu }}\) and \(\varDelta {\varvec{\phi }}\), there exists a \({\varvec{\mu }}'\) such that \(\mathbf{f}({\varvec{x}},{\varvec{\mu }}') = \mathbf{f}(\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}),{\varvec{\mu }})\). This follows directly by taking \({\varvec{\mu }}' = {\varvec{\mu }}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}\). The identity \(\mathbf{g}\)-warp is obtained for \(\varDelta {\varvec{\phi }}_0=[0\ 0]^\top \).
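The composition rule can be illustrated with a short numerical sketch. The normal, the in-plane basis, the offset `d = -1` and the function name `compose` are assumptions chosen for illustration; only the two warps and the update \({\varvec{\mu }}' = {\varvec{\mu }}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}\) come from the text.

```python
import numpy as np

# Hypothetical unit normal and two independent in-plane vectors u, v.
n = np.array([1.0, 2.0, 2.0]) / 3.0
u = np.array([2.0, -1.0, 0.0]) / np.sqrt(5.0)   # perpendicular to n
v = np.cross(n, u)                               # perpendicular to n and u
UV = np.stack([u, v], axis=1)                    # 3x2 matrix [u v]

def f(x, mu):
    """f-warp: translation by mu in R^3."""
    return x + mu

def g(x, dphi):
    """g-warp: in-plane translation with two parameters."""
    return x + UV @ dphi

def compose(mu, dphi):
    """Parameter update mu' = mu + [u v] dphi required by the CA."""
    return mu + UV @ dphi

x = n + UV @ np.array([0.2, 0.5])   # a point on the plane n.x + d = 0, d = -1
mu = UV @ np.array([0.8, -0.3])     # current in-plane translation
dphi = np.array([0.1, 0.3])         # incremental parameters

lhs = f(x, compose(mu, dphi))       # f(x, mu')
rhs = f(g(x, dphi), mu)             # f(g(x, dphi), mu)
print("composition gap:", np.max(np.abs(lhs - rhs)))
print("identity g-warp leaves x unchanged:", np.allclose(g(x, np.zeros(2)), x))
```

Because both warps are pure translations, the composed parameters stay perpendicular to \(\mathbf {n}\) whenever \({\varvec{\mu }}\) is, so the warped points remain on the plane.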

To prove the EBCA, we write the expression for Requirement 2:

$$\begin{aligned}&\left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot \left. \dfrac{\partial \mathbf{f}({\varvec{x}},{\varvec{\mu }})}{\partial {\varvec{x}}}\right| _{{\varvec{x}}=\mathcal {X}} \cdot \left. \dfrac{\partial \mathbf{g}(\mathcal {X},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0}\\&= \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot \left. \dfrac{\partial \mathbf{g}(\mathcal {X},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0}. \end{aligned} \tag{22}$$

The derivative of the \(\mathbf{f}\)-warp with respect to \({\varvec{x}}\) is the \(3\times 3\) identity. Also, the derivative of the \(\mathbf{g}\)-warp with respect to \(\varDelta {\varvec{\phi }}\) is \([{\varvec{u}}\ {\varvec{v}}]\). Therefore, (22) becomes

$$\begin{aligned} \left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot [{\varvec{u}}\ {\varvec{v}}] = \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot [{\varvec{u}}\ {\varvec{v}}]. \end{aligned} \tag{23}$$

We know neither \(I\) nor \(T\) but, thanks to the brightness constancy assumption, we know a relation between them: \(I[\mathbf{f}({\varvec{x}},{\varvec{\mu }}),t] = T[{\varvec{x}}],\ \forall {\varvec{x}}\in \mathcal {V}\). The partial derivatives of two functions that are equal on a closed subset \(\mathcal {V}\) of their domain are not, in general, equal on that subset. However, the partial derivatives projected onto \(\mathcal {V}\) are equal. Thus, given a projection matrix \(\mathbf {\Pi }\) onto the plane \(\mathcal {V}\), we have that:

$$\begin{aligned} \left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot \mathbf {\Pi } = \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot \mathbf {\Pi }. \end{aligned}$$

Since we can choose \(\mathbf {\Pi }=[{\varvec{u}}\ {\varvec{v}}]\), expression (23) is true. This proves the EBCA.
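The argument about projected derivatives can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration: the template `T`, the constant `c`, the plane with `d = -1`, and the way `I` is built so that it agrees with `T` on the plane while differing from it off the plane. The full gradients of `I` and `T` then differ, but their products with \([{\varvec{u}}\ {\varvec{v}}]\) coincide, as in expression (23).

```python
import numpy as np

# Hypothetical plane: unit normal n, offset d, in-plane basis [u v].
n = np.array([1.0, 2.0, 2.0]) / 3.0
d = -1.0
u = np.array([2.0, -1.0, 0.0]) / np.sqrt(5.0)
v = np.cross(n, u)
UV = np.stack([u, v], axis=1)          # 3x2 matrix [u v]

def T(x):
    """Illustrative smooth template defined on R^3."""
    return np.sin(x[0]) + x[1] * x[2]

def grad_T(x):
    """Analytic gradient of T."""
    return np.array([np.cos(x[0]), x[2], x[1]])

c = 2.5  # strength of the off-plane term (assumption)

def I(y, mu):
    """Image equal to T(y - mu) on the plane, different off the plane."""
    return T(y - mu) + c * (n @ y + d)

def grad_I(y, mu):
    """Analytic gradient of I with respect to y."""
    return grad_T(y - mu) + c * n

mu = UV @ np.array([0.8, -0.3])        # in-plane translation
x = -d * n + UV @ np.array([0.2, 0.5]) # a point on the plane
y = x + mu                             # f(x, mu)

brightness_gap = abs(I(y, mu) - T(x))                    # brightness constancy on the plane
full_gap = np.linalg.norm(grad_I(y, mu) - grad_T(x))     # full gradients differ
projected_gap = np.linalg.norm((grad_I(y, mu) - grad_T(x)) @ UV)  # projected gradients agree

print("brightness gap:", brightness_gap)
print("full gradient gap:", full_gap)
print("projected gradient gap:", projected_gap)
```

In this construction the difference between the two gradients is a multiple of \(\mathbf {n}\), which is exactly the component that multiplication by \([{\varvec{u}}\ {\varvec{v}}]\) removes.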

About this article

Cite this article

Muñoz, E., Márquez-Neila, P. & Baumela, L. Rationalizing Efficient Compositional Image Alignment. Int J Comput Vis 112, 354–372 (2015). https://doi.org/10.1007/s11263-014-0769-6

  • DOI: https://doi.org/10.1007/s11263-014-0769-6
