Abstract
Transportation maps between probability measures are critical objects in numerous areas of mathematics and applications such as PDEs, fluid mechanics, geometry, machine learning, computer science, and economics. Given a pair of source and target measures, one searches for a map with suitable properties that transports the source measure to the target one. Here, we study maps that possess the no-collision property; that is, particles simultaneously traveling from sources to targets in unit time with uniform velocities do not collide. These maps are particularly relevant for applications in swarm control problems. We characterize no-collision maps in terms of a half-space-preserving property and establish a direct connection between these maps and binary-space-partitioning (BSP) tree structures. Based on this characterization, we provide explicit BSP algorithms, of cost \(O(n \log n)\), to construct no-collision maps. Moreover, interpreting these maps as approximations of optimal transportation maps, we find that they succeed in computing nearly optimal maps for the \(q\)-Wasserstein metric (\(q=1,2\)). In some cases, our maps yield costs that are within a few percent of optimal.
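The \(O(n \log n)\) BSP algorithms themselves are not reproduced in this excerpt. As a rough illustration only (not the authors' algorithm), the sketch below pairs two equal-size discrete point clouds by recursively splitting both at the median along a coordinate axis and matching the corresponding halves; the function name bsp_match, the axis-cycling rule, and the use of full sorting are assumptions made for this example. Sorting at every level costs \(O(n \log^2 n)\); replacing it with a linear-time median selection would recover the \(O(n \log n)\) bound quoted above.

```python
import numpy as np

def bsp_match(X, Y, axis=0):
    """Illustrative BSP-style pairing of two equal-size point clouds.

    Recursively splits the sources X and targets Y at the median along one
    coordinate axis (cycling through the axes) and matches the corresponding
    halves.  Returns match with match[i] = index of the target assigned to
    source i."""
    n, d = X.shape

    def split(si, ti, ax):
        if len(si) == 1:                                  # one source-target pair left
            return [(si[0], ti[0])]
        m = len(si) // 2
        si = si[np.argsort(X[si, ax], kind="stable")]     # order sources along this axis
        ti = ti[np.argsort(Y[ti, ax], kind="stable")]     # order targets along this axis
        nxt = (ax + 1) % d                                # next splitting direction
        return split(si[:m], ti[:m], nxt) + split(si[m:], ti[m:], nxt)

    match = np.empty(n, dtype=int)
    for i, j in split(np.arange(n), np.arange(n), axis):
        match[i] = j
    return match

# Usage: pair 8 random planar sources with 8 random planar targets.
rng = np.random.default_rng(0)
X, Y = rng.random((8, 2)), rng.random((8, 2))
print(bsp_match(X, Y))                                    # a permutation of 0..7
```

Each cut sends the sources on one side of an axis-aligned hyperplane to the targets on the corresponding side, which is the half-space-preserving behavior the characterization above refers to.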
Additional information
L.N. was supported by Simons Foundation and the Centre de Recherches Mathématiques, through the Simons-CRM scholar-in-residence program, AFOSR MURI FA9550-18-1-0502, and ONR Grant N00014-18-1-2527. This material is based on work supported by the Air Force Office of Scientific Research under Award Number FA9550-18-1-0167.
Appendix
Proof (Theorem 3)
We assume that \(\{e_i\}_{i=1}^d\) is the standard basis in \(\mathbb {R}^d\) because the proof for a general basis is identical up to a multiplication by a suitable volume element.
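To spell out the reduction (a standard change-of-variables fact, added here for completeness): if \(B\) is the matrix whose columns form a general basis, then for every measurable \(A \subset \mathbb{R}^d\),
\[
\mathcal{L}^d\bigl(B(A)\bigr) = |\det B|\,\mathcal{L}^d(A),
\]
so the measure-theoretic estimates below change only by the constant factor \(|\det B|\), which is the volume element referred to above.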
1. Suppose that \(x \in \mathrm {int}\left( \mathrm {supp}(\mu )\right) \) and \(x^{\prime }\in \mathbb {R}^d\) with \(x\ne x^{\prime }\), but \(x\) and \(x^{\prime }\) never get separated by a hyperplane. Then, whenever a common subset containing \(x,x^{\prime }\) gets partitioned, they stay on the same side. Let \(\{A_k\}\) be the sequence of subsets to which they both belong during the cutting process. By construction, we have that
Since \(\{v_k\}_{k=1}^\infty \subset \{e_1,e_2,\ldots ,e_d\}\) we have that
where \(-\infty \le \alpha ^k_i <\beta ^k _i \le +\infty \). Denote by
Without loss of generality, assume that
Since \(\{A_k\}\) are rectangles we have that
is also a rectangle, and we denote by
and we have that
Since \(x,x^{\prime } \in R\) we have that
Furthermore, we have that
If \(\mathcal {L}^d(R)>0\) then \(\mathrm {int}(R)\ne \emptyset \) and \(\mathrm {int}(R)\cap \mathrm {supp}(\mu )=\emptyset \), which contradicts the fact that \(x \in \mathrm {int}\left( \mathrm {supp}(\mu )\right) \). Therefore, \(\mathcal {L}^d(R)=0\), which means that \(\alpha _i=\beta _i\) for some \(i>l\). Without loss of generality, assume that
We have that \(q\ge l\). Moreover, \(\alpha _i=\beta _i=x_i=x^{\prime }_i\), and \(-\infty<\alpha _i^k<\beta _i^k<\infty \) for all \(i>q\) and \(k\) large enough. Additionally, if \(-\infty<\alpha _i<\beta _i<\infty \) for some \(1\le i \le q\), then \(-\infty<\alpha _i^k<\beta _i^k<\infty \) for \(k\) large enough. In what follows, we assume that \(k\) is large enough that these statements hold.
Furthermore, assume that \(M>0\) is such that
Since \(\mu =fdx\), by construction we have that
Therefore, using that \(c\le f \le C\) \(\mu \)-a.e., we get that
Now, suppose that for some k the set \(A_k\) gets partitioned in the direction \(e_1\). There are three possibilities: (a) \(-\infty<\alpha _1<\beta _1<\infty \), (b) \(-\infty<\alpha _1<\beta _1=\infty \), (c) \(-\infty =\alpha _1<\beta _1<\infty \).
(a) \(-\infty<\alpha _1<\beta _1<\infty \). In this case, we have that \(-\infty<\alpha _1^k<\beta _1^k<\infty \) since \(k\) is large enough. Therefore, either
or
for some \(\alpha _1^k<\gamma <\beta _1^k\). Suppose that we are in the former case.
Since \(x\in \mathrm {int}(\mathrm {supp}(\mu ))\) we have that there exists a \(\sigma >0\) such that
We have that
and therefore
where we used the fact that \(\beta _1\le \gamma < \beta _1^k\) and
On the other hand, we have that
therefore
Hence, we obtain that
Similarly, if \(x,x^{\prime }\) fall in the upper (in \(e_1\) direction) half of \(A_k\) we get that
(b) \(-\infty<\alpha _1<\beta _1=\infty \). In this case we have that \(-\infty<\alpha _1^k<\beta _1^k=\infty \), and
for some \(\alpha _1^k<\gamma <\infty \). As before, we have that
Similarly, we have that
and thus
(c) \(-\infty =\alpha _1<\beta _1<\infty \). In this case, we have that \(-\infty =\alpha _1^k<\beta _1^k<\infty \), and
Summarizing, whenever \(A_k\) gets partitioned in the direction \(e_1\), we have that
We get similar estimates for partitions in any of the directions \(\{e_i\}_{i=1}^q\). Thus, if we take a subsequence \(\{A_{k_m}\}\) of sets that get partitioned in one of these directions, we get that
which contradicts (6). Thus, the first item is proven.
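The statement of Theorem 3 is not reproduced in this excerpt; inferred from the contradiction argument above (and recorded here only as a reading aid), item 1 amounts to the following separation property:
\[
x \in \mathrm{int}\bigl(\mathrm{supp}(\mu)\bigr),\quad x^{\prime} \in \mathbb{R}^d,\quad x^{\prime}\neq x
\;\Longrightarrow\;
x \text{ and } x^{\prime} \text{ are eventually separated by a partition hyperplane.}
\]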
2. First, we show that for every \(x\in \mathrm {int}(\mathrm {supp}(\mu ))\) there exists \(y \in \mathrm {supp}(\nu )\) such that
Assume that \(\{A_k\}\) is the sequence of partition sets that contain x. Again, we have that
and
for some \(-\infty \le \alpha ^k_i <\beta ^k _i \le +\infty \). Moreover, from the previous item we obtain that
because \(x\in \mathrm {int}(\mathrm {supp}(\mu ))\), and the intersection cannot contain any other point. Therefore, we have that \(-\infty< \alpha ^k_i<\beta ^k _i < +\infty \), and
where \(x=(x_1,x_2,\ldots ,x_d)\).
Denote by \(\{B_k\}\) the dual sequence of \(\{A_k\}\) that partition \(\nu \). Again, we have that
for some \(-\infty< \gamma ^k_i<\delta ^k _i < +\infty \). Thus, our first task is to show that
Since \(\alpha _i^k<x_i\), the sequence \(\{\alpha _i^k\}_k\) is not eventually constant. Therefore, \(\{\gamma ^k_i\}_k\) is also not eventually constant. Moreover, \(\{\gamma ^k_i\}_k\) and \(\{\delta ^k_i\}_k\) are, respectively, nondecreasing and nonincreasing sequences. Therefore, we have that
where
In fact, we have that
Since \(\nu (B_k)>0\) we have that \(\mathrm {cl}(B_k) \cap \mathrm {supp}(\nu ) \ne \emptyset \). Thus,
is a nested family of nonempty compact sets. Therefore, we have that
If \(W_x\cap \mathrm {int}(\mathrm {supp}(\nu )) \ne \emptyset \) then by item 1, we get that
for some \(y \in \mathrm {int}(\mathrm {supp}(\nu ))\). Hence, to complete the proof of this item, we need to show that there exists \(F_0 \in \mathcal {B}(\mathbb {R}^d)\) such that \(\mu (F_0)=0\), and
For every k denote by
and
Since
we obtain that
Furthermore, for every \(B \in \varDelta _k^{\prime \prime }\) we have that \(\nu (B)>0\), and therefore \(B \cap \mathrm {supp}(\nu ) \ne \emptyset \). On the other hand, \(B \cap \partial (\mathrm {supp}(\nu )) = \emptyset \), and B is connected. Hence, \(B \subset \mathrm {int}(\mathrm {supp}(\nu ))\), and
Note that
By item 1 we have that for every \(y \in \mathrm {int}(\mathrm {supp}(\nu ))\) there exists a partition rectangle B such that \(y\in B \subset \mathrm {int}(\mathrm {supp}(\nu ))\). Hence, \(B \in \varDelta ^{\prime \prime }_k\) for some k, and \(y \in G_k\). Therefore, we obtain that
Consequently,
Since \(\nu (\mathrm {int}(\mathrm {supp}(\nu )))=1\) we get that
Denote by \(\varOmega _k^{\prime }\) and \(\varOmega _k^{\prime \prime }\) the families dual to \(\varDelta _k^{\prime }\) and \(\varDelta _k^{\prime \prime }\). Furthermore, denote by
By construction, we have that
Moreover,
Denote by
Then, we have that \(F_0 \in \mathcal {B}(\mathbb {R}^d)\), and
Finally, note that if \(W_x \cap \mathrm {int}(\mathrm {supp}(\nu ))=\emptyset \) then \(x\in F_0\).
3. From items 1, 2 we have that the map \(\hat{t}: \mathrm {int}(\mathrm {supp}(\mu )) {\setminus } F_0 \rightarrow \mathrm {int}(\mathrm {supp}(\nu ))\) given by
is well defined. Our first task is to show that \(\hat{t}\) is Borel measurable. For that, we need to show that \(\hat{t}^{-1}(G) \in \mathcal {B}(\mathbb {R}^d)\) for any open set \(G \subset \mathbb {R}^d\). Since \(\mathrm {Im}(\hat{t}) \subset \mathrm {int}(\mathrm {supp}(\nu ))\) we have that
Therefore, we may assume that \(G \subset \mathrm {int}(\mathrm {supp}(\nu ))\). Furthermore, denote by
Next, define
Then we have that
From item 1, we have that for every \(y\in G\) there exists a partition set B such that \(y\in B \subset G\). Hence, we get that
This means that
On the other hand, from Eq. (4) in Theorem 2 we have that
where \(\varOmega _k^{\prime },\varOmega _k^{\prime \prime }\) are the dual families of \(\varDelta _k^{\prime },\varDelta _k^{\prime \prime }\). Therefore, we have that \(\hat{t}^{-1}(G) \in \mathcal {B}(\mathbb {R}^d)\). Furthermore, denote by
By construction, we have that
and hence
On the other hand, from the construction and the equalities \(\mu (F_0)=0\) and \(\mu (\mathrm {int}(\mathrm {supp}(\mu )))=1\), we have that
Therefore, we obtain that
Thus, \(\hat{t}\) is Borel measurable and \(\hat{t}\sharp \mu =\nu \). Finally, from item 1 we have that points in \(\mathrm {Im}(\hat{t})\subset \mathrm {int}(\mathrm {supp}(\nu ))\) get eventually separated. Therefore, by Theorem 2 we obtain that \(\hat{t}\) is half-space-preserving. \(\square \)
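For completeness, recall the (standard) pushforward identity that the preimage computation above verifies, stated in the notation of the proof:
\[
\hat{t}\sharp\mu(E) = \mu\bigl(\hat{t}^{-1}(E)\bigr), \qquad E \in \mathcal {B}(\mathbb {R}^d),
\]
so \(\hat{t}\sharp \mu =\nu \) means precisely that \(\mu (\hat{t}^{-1}(E))=\nu (E)\) for every Borel set \(E\).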
Proof (Theorem 4)
Denote by \(H\) the union of all the (countably many) hyperplanes that partition \(\mu \). Then we have that \(\mathcal {L}^d(H)=\mu (H)=0\). We prove that \(\hat{t}\) is continuous on \(\mathrm {Dom}(\hat{t}){\setminus } H\).
Suppose \(x\in \mathrm {Dom}(\hat{t}){\setminus } H\), and \(y=\hat{t}(x)\). Furthermore, denote by \(\{A_k\}\) and \(\{B_k\}\) the rectangles that contain x and y, respectively. We have that
Moreover, since \(x\notin H\) we have that \(x\in \mathrm {int}(A_k)\) for all k. Therefore, by construction, we have that \(y\in \mathrm {int}(B_k)\) for all k. Thus, we obtain that
which yields the continuity. \(\square \)
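The displayed estimates of this proof are not reproduced in the excerpt. A plausible sketch of the final step, under the assumption (suggested by the construction) that \(\mathrm {diam}(B_k)\rightarrow 0\), reads:
\[
z \in \mathrm{int}(A_k)\cap \mathrm{Dom}(\hat{t}) \;\Longrightarrow\; \hat{t}(z) \in \mathrm{cl}(B_k),
\qquad\text{hence}\qquad
|\hat{t}(z)-\hat{t}(x)| \le \mathrm{diam}(B_k) \longrightarrow 0
\]
for all \(z\) sufficiently close to \(x\), since such \(z\) lie in \(\mathrm {int}(A_k)\) together with \(x\).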