Block Kaczmarz Method with Inequalities

Published in: Journal of Mathematical Imaging and Vision

Abstract

The randomized Kaczmarz method is an iterative algorithm that solves systems of linear equations. Recently, the randomized method was extended to systems of equalities and inequalities by Leventhal and Lewis. Even more recently, Needell and Tropp provided an analysis of a block version of this randomized method for systems of linear equations. This paper considers the use of a block-type method for systems of mixed equalities and inequalities, bridging these two bodies of work. We show that utilizing a matrix paving over the equalities of the system can lead to significantly improved convergence, and prove a linear convergence rate as in the standard block method. We also demonstrate that using blocks of inequalities offers similar improvement only when the system satisfies a certain geometric property. We support the theoretical analysis with several experimental results.
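For intuition, the single-row randomized step of Leventhal and Lewis for a mixed system can be sketched as follows. This is an illustrative sketch only, not the paper's block algorithm; the function name, the toy system, and all numbers are invented for illustration. An equality row \(\langle \varvec{a}, \varvec{x}\rangle = b\) is handled by projecting onto its hyperplane; an inequality row \(\langle \varvec{a}, \varvec{x}\rangle \le b\) triggers a projection only when it is violated.

```python
# Sketch of the Leventhal-Lewis randomized Kaczmarz step for a mixed system
# of equalities and inequalities (single-row blocks). Hypothetical example,
# not the block method analyzed in this paper.
import random

def kaczmarz_mixed(rows, x, iters=2000, seed=0):
    """rows: list of (a, b, kind) with kind in {"eq", "le"}; a is a list."""
    rng = random.Random(seed)
    for _ in range(iters):
        a, b, kind = rng.choice(rows)
        r = sum(ai * xi for ai, xi in zip(a, x)) - b   # residual <a, x> - b
        if kind == "le":
            r = max(r, 0.0)       # e(.): only a violated inequality moves x
        norm2 = sum(ai * ai for ai in a)
        x = [xi - (r / norm2) * ai for ai, xi in zip(a, x)]
    return x

# Toy system: x0 + x1 = 1 (equality), x0 <= 0.25 (inequality).
rows = [([1.0, 1.0], 1.0, "eq"), ([1.0, 0.0], 0.25, "le")]
x = kaczmarz_mixed(rows, [5.0, -3.0])
```

Alternating the two projections contracts the distance to the feasible set geometrically, so after a few thousand random steps the iterate satisfies both constraints to machine precision.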


Notes

  1. This assumption is made both for notational convenience and because the use of matrix pavings discussed below only holds for standardized matrices. In practice, one can employ pre-conditioning on non-standardized systems, or extend the construction of matrix pavings to non-standardized systems [37].

  2. The standard definition of a row paving also includes a constant \(\alpha \) which serves as a lower bound on the smallest singular value. We ignore that parameter here since it will not be utilized.
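The pre-conditioning mentioned in Note 1 can be sketched concretely: scaling each row \(\varvec{a}_i\) (and the corresponding entry \(b_i\)) by \(1/\Vert \varvec{a}_i\Vert _2\) yields an equivalent standardized system with unit-norm rows and an unchanged solution set. A minimal sketch, with a hypothetical example system:

```python
# Row standardization as a pre-conditioning step: divide each row and its
# right-hand side by the row's Euclidean norm. The solution set is unchanged.
import math

def standardize(A, b):
    """Return (A', b') with unit-norm rows of A'; same solution set."""
    A_std, b_std = [], []
    for row, bi in zip(A, b):
        norm = math.sqrt(sum(v * v for v in row))
        A_std.append([v / norm for v in row])
        b_std.append(bi / norm)
    return A_std, b_std

# Hypothetical system: rows (3, 4) and (0, 2) have norms 5 and 2.
A, b = [[3.0, 4.0], [0.0, 2.0]], [5.0, 2.0]
A2, b2 = standardize(A, b)
```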

References

  1. Bourgain, J., Tzafriri, L.: Invertibility of “large” submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel J. Math. 57(2), 137–224 (1987). doi:10.1007/BF02772174

  2. Bourgain, J., Tzafriri, L.: On a problem of Kadison and Singer. J. Reine Angew. Math. 420, 1–43 (1991)

  3. Byrne, C.L.: Applied Iterative Methods. A K Peters Ltd., Wellesley (2008)

  4. Censor, Y., Eggermont, P.P.B., Gordon, D.: Strong underrelaxation in Kaczmarz’s method for inconsistent systems. Numer. Math. 41(1), 83–92 (1983)

  5. Censor, Y.: Row-action methods for huge and sparse systems and their applications. SIAM Rev. 23(4), 444–466 (1981)

  6. Chen, X., Powell, A.: Almost sure convergence of the Kaczmarz algorithm with random measurements. J. Fourier Anal. Appl. 1–20 (2012). doi:10.1007/s00041-012-9237-2

  7. Chrétien, S., Darses, S.: Invertibility of random submatrices via tail decoupling and a matrix Chernoff inequality. Stat. Probab. Lett. 82(7), 1479–1487 (2012)

  8. Eggermont, P.P.B., Herman, G.T., Lent, A.: Iterative algorithms for large partitioned linear systems, with applications to image reconstruction. Linear Algebra Appl. 40, 37–67 (1981). doi:10.1016/0024-3795(81)90139-7

  9. Elfving, T.: Block-iterative methods for consistent and inconsistent linear equations. Numer. Math. 35(1), 1–12 (1980). doi:10.1007/BF01396365

  10. Feichtinger, H.G., Cenker, C., Mayer, M., Steier, H., Strohmer, T.: New variants of the POCS method using affine subspaces of finite codimension with applications to irregular sampling. In: Applications in Optical Science and Engineering, pp. 299–310. International Society for Optics and Photonics, Bellingham (1992)

  11. Feichtinger, H.G., Strohmer, T.: A Kaczmarz-based approach to nonperiodic sampling on unions of rectangular lattices. In: SampTA ’95: 1995 Workshop on Sampling Theory and Applications, pp. 32–37. Jurmala, Latvia (1995)

  12. Gordon, R., Bender, R., Herman, G.T.: Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography. J. Theor. Biol. 29, 471–481 (1970)

  13. Hamaker, C., Solmon, D.C.: The angles between the null spaces of X-rays. J. Math. Anal. Appl. 62(1), 1–23 (1978)

  14. Hanke, M., Niethammer, W.: On the acceleration of Kaczmarz’s method for inconsistent linear systems. Linear Algebra Appl. 130, 83–98 (1990)

  15. Herman, G., Meyer, L.: Algebraic reconstruction techniques can be made computationally efficient. IEEE Trans. Med. Imaging 12(3), 600–609 (1993)

  16. Herman, G.T.: Fundamentals of Computerized Tomography: Image Reconstruction from Projections. Springer, New York (2009)

  17. Hoffman, A.J.: On approximate solutions of systems of linear inequalities. J. Res. Nat. Bur. Stand. 49, 263–265 (1952)

  18. Kaczmarz, S.: Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Int. Acad. Polon. Sci. Lett. Ser. A 35, 335–357 (1937)

  19. Leventhal, D., Lewis, A.S.: Randomized methods for linear constraints: convergence rates and conditioning. Math. Oper. Res. 35(3), 641–654 (2010). doi:10.1287/moor.1100.0456

  20. Liu, J., Wright, S.J., Sridhar, S.: An asynchronous parallel randomized Kaczmarz algorithm (2014). arXiv:1401.4780

  21. Natterer, F.: The Mathematics of Computerized Tomography. Classics in Applied Mathematics, vol. 32. SIAM, Philadelphia (2001). doi:10.1137/1.9780898719284. Reprint of the 1986 original

  22. Needell, D.: Randomized Kaczmarz solver for noisy linear systems. BIT 50(2), 395–403 (2010). doi:10.1007/s10543-010-0265-5

  23. Needell, D., Srebro, N., Ward, R.: Stochastic gradient descent and the randomized Kaczmarz algorithm (2013). Submitted

  24. Needell, D., Tropp, J.A.: Paved with good intentions: analysis of a randomized block Kaczmarz method. Linear Algebra Appl. 441, 199–221 (2014)

  25. Needell, D., Zhao, R., Zouzias, A.: Randomized block Kaczmarz method with projection for solving least squares (2014). Submitted

  26. Popa, C.: Block-projections algorithms with blocks containing mutually orthogonal rows and columns. BIT 39(2), 323–338 (1999). doi:10.1023/A:1022398014630

  27. Popa, C.: A fast Kaczmarz–Kovarik algorithm for consistent least-squares problems. Korean J. Comput. Appl. Math. 8(1), 9–26 (2001)

  28. Popa, C.: A Kaczmarz–Kovarik algorithm for symmetric ill-conditioned matrices. An. Ştiinţ. Univ. Ovidius Constanţa Ser. Mat. 12(2), 135–146 (2004)

  29. Recht, B., Ré, C.: Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case studies, and consequences. In: Proceedings of the 25th Annual Conference on Learning Theory, Edinburgh (2012)

  30. Sezan, M.I., Stark, H.: Applications of convex projection theory to image recovery in tomography and related areas. In: Stark, H. (ed.) Image Recovery: Theory and Application, pp. 155–270. Elsevier, Amsterdam (1987)

  31. Strohmer, T., Vershynin, R.: A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl. 15(2), 262–278 (2009). doi:10.1007/s00041-008-9030-4

  32. Tanabe, K.: Projection method for solving a singular system of linear equations and its applications. Numer. Math. 17(3), 203–214 (1971)

  33. Tropp, J.A.: The random paving property for uniformly bounded matrices. Studia Math. 185(1), 67–82 (2008). doi:10.4064/sm185-1-4

  34. Tropp, J.A.: Column subset selection, matrix factorization, and eigenvalue optimization. In: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 978–986. SIAM, Philadelphia (2009)

  35. Tropp, J.A.: Improved analysis of the subsampled randomized Hadamard transform. Adv. Adapt. Data Anal. 3(1–2), 115–126 (2011)

  36. Tropp, J.A.: User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 12(4), 389–434 (2012)

  37. Vershynin, R.: John’s decompositions: selecting a large part. Israel J. Math. 122(1), 253–277 (2001)

  38. Vershynin, R.: Random sets of isomorphism of linear operators on Hilbert space. In: High Dimensional Probability, vol. 51, pp. 148–154. Inst. Math. Statist., Beachwood, OH (2006). doi:10.1214/074921706000000815

  39. Whitney, T.M., Meany, R.K.: Two algorithms related to the method of steepest descent. SIAM J. Numer. Anal. 4(1), 109–118 (1967)

  40. Xu, J., Zikatanov, L.: The method of alternating projections and the method of subspace corrections in Hilbert space. J. Amer. Math. Soc. 15(3), 573–597 (2002). doi:10.1090/S0894-0347-02-00398-3

  41. Zouzias, A., Freris, N.M.: Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix Anal. Appl. 34(2), 773–793 (2013)

Corresponding author: Deanna Needell.

Proof of Theorem 2

Proof

Fix an iteration \(j\) of Algorithm 3. As in the proof of Theorem 1, if a block of equalities is selected this iteration, then we again have (11). Next, we consider the case in which a block of inequalities is selected; denote this block by \(\tau '\) and its pruned subset by \(\sigma \). Set \(\varvec{s} = P_S\varvec{x}_{j-1}\), where again \(P_S\) denotes the orthogonal projection onto the solution set \(S\). If we write \(\tilde{H}_i = \{\varvec{x} : \langle \varvec{a_i}, \varvec{x}\rangle \le b_i\}\) and \({H}_i = \{\varvec{x} : \langle \varvec{a_i}, \varvec{x}\rangle = b_i\}\), then by their definitions we have

$$\begin{aligned} \varvec{s} \in \cap _{i\in \sigma }\tilde{H}_i,\quad \varvec{x}_{j-1} \in \cap _{i\in \sigma }\tilde{H}_i^c,\quad \text {and}\quad \varvec{x}_j = P_{\cap _{i\in \sigma }{H}_i}\varvec{x}_{j-1}. \end{aligned}$$

Then since \(\sigma \) is part of an obtuse paving, the angle between \(\varvec{x}_j - \varvec{x}_{j-1}\) and \(\varvec{s} -\varvec{x}_{j-1}\) must be obtuse. There thus exists a point \(\varvec{t}\) on the line segment \(L = \{\gamma \varvec{x}_{j-1} + (1-\gamma )\varvec{s} : 0 \le \gamma \le 1\}\) such that \(\varvec{x}_{j-1} - \varvec{x}_j\) and \(\varvec{t} - \varvec{x}_j\) are orthogonal (see Fig. 4).

Fig. 4: Geometry of the system

Now since \(\varvec{t} \in L\), we have \(||{\varvec{t} - \varvec{x}_{j-1}} ||_2 \le ||{\varvec{x}_{j-1} - \varvec{s}} ||_2\), and thus letting \(\theta \) denote the angle between \(\varvec{x}_j - \varvec{x}_{j-1}\) and \(\varvec{t} - \varvec{x}_{j-1}\) (see Fig. 4), we have

$$\begin{aligned} ||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2&\le ||{\varvec{x}_{j-1} - \varvec{s}} ||_2\cdot \frac{||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2 }{||{\varvec{t} - \varvec{x}_{j-1}} ||_2}\\&= ||{\varvec{x}_{j-1} - \varvec{s}} ||_2\cdot \cos \theta \\&= \frac{||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2\cdot ||{\varvec{s} - \varvec{x}_{j-1}} ||_2\cdot \cos \theta }{||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2}\\&= \frac{\langle \varvec{s} - \varvec{x}_{j-1}, \varvec{x}_j - \varvec{x}_{j-1}\rangle }{||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2}\\&= \frac{-\langle \varvec{x}_{j-1} - \varvec{s}, \varvec{x}_j - \varvec{x}_{j-1}\rangle }{||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2}. \end{aligned}$$

Thus, we have that

$$\begin{aligned} \langle \varvec{x}_{j-1} - \varvec{s}, \varvec{x}_{j} - \varvec{x}_{j-1}\rangle \le -||{\varvec{x}_j - \varvec{x}_{j-1}} ||_2^2. \end{aligned}$$

By the definition of \(\varvec{x}_j\), this means that

$$\begin{aligned} \langle \varvec{x}_{j-1} - \varvec{s}, \varvec{A}_{\sigma }^\dagger (\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})\rangle \le - ||{{\varvec{A}_{\sigma }^\dagger }(\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})} ||_2^2. \end{aligned}$$
(12)

Using this along with the paving properties we see that

$$\begin{aligned} ||{\varvec{x}_{j} - \varvec{s}} ||_2^2&= ||{\varvec{x}_{j-1} - \varvec{s} + {\varvec{A}_{\sigma }^\dagger }(\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})} ||_2^2\\&= ||{\varvec{x}_{j-1} \!-\! \varvec{s}} ||_2^2 \!+\! 2\langle \varvec{x}_{j-1} \!-\! \varvec{s}, \varvec{A}_{\sigma }^\dagger (\varvec{b}_{\sigma } \!-\! \varvec{A}_{\sigma }\varvec{x}_{j-1})\rangle \\&\quad + ||{\varvec{A}_{\sigma }^\dagger (\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})} ||_2^2\\&\le ||{\varvec{x}_{j-1} - \varvec{s}} ||_2^2 - ||{\varvec{A}_{\sigma }^\dagger (\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})} ||_2^2\\&= d(\varvec{x}_{j-1}, S)^2 - ||{\varvec{A}_{\sigma }^\dagger (\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1})} ||_2^2\\&\le d(\varvec{x}_{j-1}, S)^2 - \frac{1}{\beta '}||{\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1}} ||_2^2. \end{aligned}$$
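As a numerical sanity check of the final two inequalities (a hypothetical 2-D instance, not from the paper): for a block consisting of a single unit-norm row \(\varvec{a}\), the pseudoinverse \(\varvec{A}_{\sigma }^\dagger \) is simply \(\varvec{a}^\top \), \(\beta ' = 1\) is a valid paving constant, and the update reduces to the usual Kaczmarz projection.

```python
# Check the per-iteration bound ||x_j - s||^2 <= ||x_{j-1} - s||^2 - r^2
# for a single unit-norm inequality row (so A^+ = a^T and beta' = 1).
# All numbers below are hypothetical.
a = [3.0 / 5.0, 4.0 / 5.0]        # unit-norm row
b = 1.0
x_prev = [2.0, 2.0]               # violates <a, x> <= b (value 2.8)
s = [0.0, 0.0]                    # feasible point: <a, s> = 0 <= b

r = b - sum(ai * xi for ai, xi in zip(a, x_prev))    # residual b - <a, x>
x_next = [xi + r * ai for ai, xi in zip(a, x_prev)]  # project onto hyperplane

lhs = sum((xn - si) ** 2 for xn, si in zip(x_next, s))        # ||x_j - s||^2
rhs = sum((xp - si) ** 2 for xp, si in zip(x_prev, s)) - r * r
```

Here the bound holds strictly, since \(\varvec{s}\) lies in the interior of the half-space rather than on the boundary hyperplane.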

Thus, taking the expectation (over the choice of \(\tau '\), conditioned on the previous choices) yields

$$\begin{aligned} \mathbb {E}[d(\varvec{x}_{j}, S)^2]&\le \mathbb {E}||{\varvec{x}_{j} - \varvec{s}} ||_2^2\\&\le d(\varvec{x}_{j-1}, S)^2 - \frac{1}{\beta '}\mathbb {E}||{\varvec{b}_{\sigma } - \varvec{A}_{\sigma }\varvec{x}_{j-1}} ||_2^2\\&= d(\varvec{x}_{j-1}, S)^2 - \frac{1}{\beta '}\mathbb {E}||{e(\varvec{b}_{\tau '} - \varvec{A}_{\tau '}\varvec{x}_{j-1})} ||_2^2\\&=\! d(\varvec{x}_{j-1}, S)^2 \!-\! \frac{1}{m'\beta '}\sum _{\tau '\in T'}||{e(\varvec{b}_{\tau '} \!-\! \varvec{A}_{\tau '}\varvec{x}_{j-1})} ||_2^2\\&= d(\varvec{x}_{j-1}, S)^2 - \frac{1}{m'\beta '}||{e(\varvec{b}_{\le } - \varvec{A}_{\le }\varvec{x}_{j-1})} ||_2^2. \end{aligned}$$

Combining this with (11) and letting \(E_=\) and \(E_\le \) denote the events that a block from \(T\) and a block from \(T'\) is selected, respectively, we have

$$\begin{aligned}&\mathbb {E}\left[ d(\varvec{x}_j,S)^2\right] = p \cdot \mathbb {E}[ d(\varvec{x}_{j},S)^2 | E_=] + (1-p) \cdot \mathbb {E}[ d(\varvec{x}_{j},S)^2 | E_{\le }] \\&\le p\left[ d(\varvec{x}_{j-1}, S)^2 - \frac{1}{\beta m} \sum _{i \in I_=} e(\varvec{A}_{=} \varvec{x}_{j-1} - \varvec{b}_=)_{i}^2\right] \\&\;\;+ (1-p) \left[ d(\varvec{x}_{j-1}, S)^2 - \frac{1}{m'\beta '}||{e(\varvec{b}_{\le } - \varvec{A}_{\le }\varvec{x}_{j-1})} ||_2^2\right] \\&= d(\varvec{x}_{j-1}, S)^2 - p \cdot \frac{1}{\beta m} \sum _{i \in I_=} e(\varvec{A}_{=} \varvec{x}_{j-1} - \varvec{b}_=)_{i}^2\\&\;\;- (1-p) \cdot \frac{1}{m'\beta '}||{e(\varvec{b}_{\le } - \varvec{A}_{\le }\varvec{x}_{j-1})} ||_2^2. \end{aligned}$$

Since \(p = \frac{\beta m}{\beta ' m' + \beta m}\), we have \(\frac{p}{\beta m} = \frac{1-p}{\beta ' m'} = \frac{1}{\beta ' m'+\beta m}\), and we can simplify

$$\begin{aligned} \mathbb {E}\left[ d(\varvec{x}_j,S)^2\right]&\le d(\varvec{x}_{j-1}, S)^2\\&- \frac{1}{\beta ' m' + \beta m} \Big [\sum _{i \in I_=} e(\varvec{A}_{=} \varvec{x}_{j-1} - \varvec{b}_=)_{i}^2 \\&\;\;+ ||{e(\varvec{b}_{\le } - \varvec{A}_{\le }\varvec{x}_{j-1})} ||_2^2 \Big ] \\&= d(\varvec{x}_{j-1}, S)^2 - \frac{1}{\beta ' m' + \beta m} ||{e(\varvec{A} \varvec{x}_{j-1} - \varvec{b})} ||_2^2 \\&\le d(\varvec{x}_{j-1}, S)^2 - \frac{1}{L^2(\beta ' m' + \beta m)} \cdot d(\varvec{x}_{j-1},S)^2 \\&= \left[ 1 - \frac{1}{L^2(\beta ' m' + \beta m)}\right] d(\varvec{x}_{j-1},S)^2, \end{aligned}$$

where we have utilized the Hoffman bound (7) in the second inequality.

Iterating this relation and using the independence of the random control completes the proof. \(\square \)
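The sampling-weight identity used in the final simplification can be verified with exact arithmetic for arbitrary placeholder values of \(\beta , m, \beta ', m'\):

```python
# Exact check of the identity used above: with p = beta*m/(beta'*m' + beta*m),
# both p/(beta*m) and (1-p)/(beta'*m') equal 1/(beta'*m' + beta*m).
# The parameter values are arbitrary placeholders.
from fractions import Fraction

beta, m = Fraction(3), Fraction(7)       # equality-block paving parameters
beta_p, m_p = Fraction(5), Fraction(4)   # inequality-block paving parameters

p = beta * m / (beta_p * m_p + beta * m)
target = 1 / (beta_p * m_p + beta * m)
```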

Cite this article

Briskman, J., Needell, D. Block Kaczmarz Method with Inequalities. J Math Imaging Vis 52, 385–396 (2015). https://doi.org/10.1007/s10851-014-0539-7
