Keywords

1 Introduction

2D-combinatorial maps are algebraic structures which allow to describe and work with plane graphs, that is, embeddings of planar graphs, with applications in Image Processing for instance [1]. Using such structures has allowed to establish several algorithmic properties. E.g., it is possible to decide whether two drawings of planar graphs, or two maps, are isomorphic or not in quadratic time [24]. Moreover, deciding whether a pattern (i.e., a drawing made of a connected subset of faces) appears in a map is also tractable in quadratic time; this property is interesting since determining whether a connected graph is a sub-graph of a planar graph is known to be an \(\mathcal {NP}\)-complete problem [46]. So focusing on a particular drawing of a planar graph (among possibly an exponential number of possibilities) is very helpful from an algorithmic point of view. It has also been shown that searching for a disconnected-pattern, built from several disconnected patterns, is an \(\mathcal {NP}\)-hard problem [4].

A related problem consists in finding large common patterns in two maps. In order to get a common pattern, one must eliminate subsets of faces from both maps and obtain the same pattern up to an isomorphism. E.g., in Fig. 1, map (c) is the maximum common pattern of maps (a) and (b); the eliminated faces are shown with dotted lines. It has been proved in [7] that computing such large common patterns is an \(\mathcal {NP}\)-hard problem, even when patterns are connected.

In this paper, we constrain the definition of common patterns: we now require the pattern to be extendable to both maps when adding independent groups of faces. Independence means that if any new face is added in a connected way to the common pattern, then the result ceases to be a pattern of one of the two maps; moreover, if any face is added to the pattern in order to get one of the maps, then no face can be added at the same place in the pattern to get the second map. See Fig. 1(d) for an example. Every common pattern that has this property results of an overlap between both maps, where pairs of faces of both maps were merged together: such an overlap defines an overlapping pattern.

Fig. 1.
figure 1

Maps (a) and (b) have map (c) as maximal common pattern, and map (d) as maximal overlapping pattern. Dotted lines are construction features, and do not belong to the maps.

Notice that the overlapping patterns can be smaller than the maximal common patterns. On the other hand, while the latter are not tractable in polynomial time [7], we show in this paper that computing any connected overlap is tractable in linear time, and enumerating all of the connected overlap is a quadratic-time problem. It follows that finding the largest connected overlap can also be done in polynomial time and space. In contrast, we prove that finding large possibly disconnected overlaps is \(\mathcal {NP}\)-hard.

Finally, in terms of applications, every maximal overlapping pattern O yields a distance defined by: \(d(M_1,M_2)=\text{ size }(M_1)+\text{ size }(M_2)-2.\text{ size }(O).\) If one can find any maximal overlap in polynomial time, distance d is efficiently computable, and may then be used as an efficient rough approximation to tighter \(\mathcal {NP}\)-hard graph edit distances.

In Sect. 2, we recall the definitions of full and semi-open maps. The overlaps are introduced in Sect. 3, and the overlapping patterns in Sect. 4. The polynomial problems, related to the existence and the enumeration of the connected overlaps, are investigated in Sect. 5. The correctness of both algorithms is sketched with no detail, due to the lack of space. The case of disconnected overlaps is discussed in Sect. 6. Finally, Sect. 7 concludes the paper.

2 Combinatorial Maps

Definition 1

Let D be a finite set of darts. A 2D full combinatorial map is a triple \(M=(D,\alpha ,\beta )\) such that (1) \(\alpha :D\rightarrow D\) is a 1-to-1 mapping (i.e., a permutation over D), and (2) \(\beta :D\rightarrow D\) is a 1-to-1 mapping such that for all \(d\in D\), \(\beta (\beta (d))=d\) (i.e., an involution over D). Two darts d and \(d'\) such that \(d'=\alpha (d)\) or \(d'=\beta (d)\) are respectively said \(\alpha \)-sewn or \(\beta \)-sewn.

Figure 2 shows an example of a full map. Notice that, given a dart d, the face which is incident to d is obtained by iterating \(\alpha \). E.g., in Fig. 2, the face incident to dart 12 is described with set \(\{12,13,14,15\}\) and we have \(\alpha (12)=13\), \(\alpha (13)=14\), \(\alpha (14)=15\) and \(\alpha (15)=12\). Similarly, the edges and the vertices of a full map are respectively introduced as the orbits of permutations \(\alpha \) and \(\beta \circ \alpha \).

Fig. 2.
figure 2

An example of full map. The darts are represented by numbered black segments. Two \(\alpha \)-sewn darts are drawn consecutively, and two \(\beta \)-sewn darts are drawn concurrently and in reverse orientation, with little gray segment between them.

All the faces of a full map are defined, which is irrelevant if this map is expected to overlap with others; some faces must be invisible, so function \(\beta \) must be partially defined. This leads us to introduce semi-open maps, simply called maps throughout the rest of this paper [8]. The idea is to implicitly add a new element \(\varepsilon \) to the set of darts, and allow any dart to be \(\beta \)-linked with \(\varepsilon \) whenever such a dart has no adjacent face. Figure 3 shows an example.

Fig. 3.
figure 3

An example of (semi-open) map. Darts a, b, d, f and g are not \(\beta \)-sewn.

Definition 2

Let D be a finite set of darts and \(\varepsilon \not \in D\) a fresh implicit dart. A semi-open map, or simply map, is a triple \(M=(D,\alpha ,\beta )\) such that

  • \(\alpha :D\cup \{\varepsilon \}\rightarrow D\cup \{\varepsilon \}\) is a 1-to-1 mapping with \(\alpha (\varepsilon )=\varepsilon \);

  • \(\beta :D\cup \{\varepsilon \}\rightarrow D\cup \{\varepsilon \}\) is a partial involution [9, Definition 4], that is, a mapping such that (1) \(\beta (\varepsilon )=\varepsilon \) and (2) for all \(d\in D\), if \(\beta (d)\not =\varepsilon \), then \(\beta (\beta (d))=d\).

The complexity of the problems that we address on the maps is often related to connectivity. For instance, searching for a connected pattern (subset of contiguous faces) in a map is a quadratic problem, whereas searching for a disconnected pattern (subset of independent patterns) is \(\mathcal {NP}\)-complete [9].

Definition 3

Let \(M=(D,\alpha ,\beta )\) be a semi-open map. Any subset \(U\subseteq D\) is connected in map M if for all \(d,d'\in U\), there exists a sequence \(d_0,d_1,\ldots d_n\in U\) such that (1) \(d_0=d\) and (2) \(d_n=d'\) and (3) for all \(0\le k<n\), we have \(d_{k+1}=\gamma _k(d_k)\) with \(\gamma _k=\alpha \) or \(\gamma _k=\beta \). We say that map M is connected if set D itself is connected in map M.

3 The Overlaps of Two Maps

An overlap is a maximal one-to-one matching. Let \(M_1=\langle D_1,\alpha _1,\beta _1\rangle \) and \(M_2=\langle D_2,\alpha _2,\beta _2\rangle \) be two fixed semi-open maps.

Definition 4

A (one-to-one) matching is a function \(h:U_1\rightarrow U_2\) such that

  1. 1.

    \(U_1\subseteq D_1\) and \(U_2\subseteq D_2\),

  2. 2.

    h is bijective,

  3. 3.

    for all \(d_1\in U_1\) and \(d_2 = h(d_1)\),

    • \(\alpha _1(d_1)\in U_1\) if and only if \(\alpha _2(d_2)\in U_2\), and \(h(\alpha _1(d_1))=\alpha _2(d_2)\);

    • \(\beta _1(d_1)\in U_1\) if and only if \(\beta _2(d_2)\in U_2\), and \(h(\beta _1(d_1))=\beta _2(d_2)\).

Fig. 4.
figure 4

Consider the maps \(M_1\) and \(M_2\) above. Mapping \(h=\{1\mapsto 8,2\mapsto 9,6\mapsto 13\}\) is a matching, whereas mapping \(h'=\{1\mapsto 8,2\mapsto 9,6\mapsto 10\}\) is not; indeed, we have \(\alpha _2(9)=10\), but \(\alpha _1(2)\not =6\). Any matching must preserve the seams of both maps.

An example is given in Fig. 4. Notice that \(\varepsilon \not \in U_1\) and \(\varepsilon \not \in U_2\) (since \(\varepsilon \not \in D_1\) and \(\varepsilon \not \in D_2\)); nevertheless, for convenience reasons, we shall implicitly suppose that \(h(\varepsilon )=\varepsilon \).

Definition 5

Let \(h:U_1\rightarrow U_2\) be a one-to-one matching as in Definition 4. We say that h is an overlap if:

  1. 1.

    for all \(d_1\in U_1\), we have \(\alpha _1(d_1)\in U_1\), and

  2. 2.

    for all \(d_1\in U_1\) and \(d_2=h(d_1)\), if \(\beta _1(d_1)\not =\varepsilon \) and \(\beta _2(d_2)\not =\varepsilon \), then \(\beta _1(d_1)\in U_1\) and \(\beta _2(d_2)\in U_2\).

We say that h is a connected overlap if subset \(U_1\) is connected in map \(M_1\). We say that h is a disconnected overlap otherwise.

The first condition above implies that if a dart \(d_1\) of \(M_1\) matches a dart \(d_2\) of \(M_2\), then the whole face incident to \(d_1\) must match the whole face incident to \(d_2\). The second condition means that if darts \(d_1\) and \(d_2\) match together and both have adjacent faces, then opposite \(\beta \)-sewn darts, and thus adjacent faces must also match together. In consequence, each pieces of both maps \(M_1\) and \(M_2\) must be as large as possible, that is, the matching must be “maximal”. See Figs. 5 and 6 for examples.

Fig. 5.
figure 5

An example of connected overlap \(h=\{1\mapsto 1',2\mapsto 2',\ldots 8\mapsto 8'\}\). Notice that no overlap can be built with darts 5 and 4’ matched together.

Fig. 6.
figure 6

An example of disconnected overlap \(h=\{1\mapsto 1',2\mapsto 2',\ldots 8\mapsto 8'\}\).

4 Properties of the Overlapping Patterns

Given two maps \(M_1\) and \(M_2\) and an overlap \(h:U_1\rightarrow U_2\), the elimination of all the darts, but those from sets \(U_1\) and \(U_2\), in maps \(M_1\) and \(M_2\) respectively, defines two isomorphic sub-maps of \(M_1\) and \(M_2\). In other words, an overlap defines an overlapping common pattern:

Definition 6

Map \(P=\langle D,\alpha ,\beta \rangle \) is a pattern of map \(M=\langle D',\alpha ',\beta '\rangle \) if there exists a one-to-one matching \(\varphi :D\rightarrow V\) (with \(V\subseteq D'\)). Moreover, any map P is a common pattern of maps \(M_1\) and \(M_2\) if P is a pattern of both \(M_1\) and \(M_2\).

As direct consequences of Definitions 4 and 5, the overlapping patterns have the following properties:

  • Every overlapping pattern of \(M_1\) and \(M_2\) is a pattern of both \(M_1\) and \(M_2\);

  • Given every pair of darts (\(d_1,d_2\)), there exists at most one connected overlap, and thus at most one connected overlapping pattern, in which \(d_1\) and \(d_2\) are matched together;

  • An overlapping pattern is maximal in the following sense: if we add to this pattern something and the result is a semi-open map, then this map is no longer a sub-map of \(M_1\) or of \(M_2\).

Concerning computational issues, we have provided in [3] an algorithm in \(\mathcal {O}(|D_1|\times |D_2|)\) time to decide whether two (possibly non-connected) maps were isomorphic or not. An extension of this algorithm can be used to prove that a connected map is a pattern of another map. In both cases, due to the technical necessity for the matching h to commute with bijections \(\alpha \) and \(\beta \), there are no more than \(|D_1|\) possible matching functions, and the algorithms actually enumerate all of them in \(\mathcal {O}(|D_1|\times |D_2|)\) time. Concerning the second problem, the fact that the first map is connected is crucial, since this problem is \(\mathcal {NP}\)-complete for disconnected patterns [7].

With respect to finding large common connected patterns, it has unfortunately been shown in [7] that this problem is \(\mathcal {NP}\)-hard too. But looking for large overlapping patterns is simpler: whereas in general the pattern can be placed anywhere in the maps, we now insist on the pattern to somehow be placed on the border of both maps, corresponding to the part where they overlap.

In order to illustrate and discuss this point, we turn to strings. In terms of strings, common patterns would be common sub-strings of two given strings: given \(u,v\in \varSigma ^*\), a pattern is a string \(w\in \varSigma ^+\) such that \(u=lwr\) and \(v=l'wr'\) for some \(l, l', r, r'\in \varSigma ^*\).

On the other hand, an overlap defines a string w as above, such that \(u=lwr\) and \(v=l'wr'\) but with the added condition that \(l=\epsilon \) or \(l'=\epsilon \) on one hand, \(r=\epsilon \) or \(r'=\epsilon \) on the other. This means that exactly 4 cases are possible:

  1. 1.

    \(u=lw\) and \(v=wr'\), w is a suffix of u and a prefix of v,

  2. 2.

    \(u=wr'\) and \(v=l'w\), w is a prefix of u and a suffix of v,

  3. 3.

    \(u=lwr\) and \(v=w\), v is a sub-string of u,

  4. 4.

    \(u=w\) and \(v=l'wr'\), u is a sub-string of v.

So the problem is simpler.

5 Finding Connected Overlaps Efficiently

In this section, we show that it is possible to efficiently find the largest connected overlap of two maps.

5.1 A Linear Time and Space Algorithm to Check Whether a Connected Overlap Exists

Let \(M_1=\langle D,\alpha _1,\beta _1\rangle \) and \(M_2=\langle D_2,\alpha _2,\beta _2\rangle )\) be two semi-open maps. The first problem we tackle consists in determining whether two darts \(d_1\in D_1\) and \(d_2\in D_2\) can match together in an overlap. This is the purpose of Algorithm 1.

This procedure performs a parallel traversal of both maps \(M_1\) and \(M_2\), starting from the darts \(d_1\) and \(d_2\), which are grouped together into a couple \((d_1,d_2)\). The procedure uses the \(\alpha \)- and \(\beta \)-functions of both maps to discover new couples of darts from the couples that have been discovered so far. It more precisely builds a candidate overlap h such that \(h[d_1]=d_2\). So initially, \(h[d_1]\) is set to \(d_2\), whereas h[d] are set to nil for all other darts d.

Each time a couple \((a,a')\) is discovered from another couple \((d,d')\) by using the \(\alpha \)-functions, we check whether both darts a and \(a'\) have ever been met. If either a or \(a'\) have already been visited through the traversal, we carefully check basic conditions which ensure us to get a valid matching h at the end of the algorithm. Otherwise, h[a] is set to \(a'\) and the couple \((a,a')\) is further used to discover new darts. With respect to \(\beta \)-functions, the same principle holds, but more cases of failure can occur, as the darts d or \(d'\) may have no adjacent face.

figure a

Note that Algorithm 1 returns a single Boolean value. Nevertheless, one can easily modify the procedure and get the connected overlap h as a certificate, in case of success. The following theorem claims the correctness of this algorithm. We shall sketch a proof just after Theorem 3.

Theorem 1

Algorithm 1 is correct, that is:

  • If checkConnectedOverlap \((M_1,M_2,d_1,d_2)\) returns true, then the array h that is built by the procedure encodes a connected overlap \(h:U_1\rightarrow U_2\) such that \(h(d_1)=d_2\).

  • If checkConnectedOverlap \((M_1,M_2,d_1,d_2)\) returns false, then no overlap \(h:U_1\rightarrow U_2\) exists such that \(h(d_1)=d_2\).

With respect to the complexity of the algorithm, we have:

Theorem 2

Algorithm 1 runs in \(\mathcal{O}(\min (|D_1|,|D_2|))\) time and \(\mathcal{O}(\max (|D_1|,|D_2|))\) space.

Proof

Suppose that \(|D_1|\le |D_2|\) without loss of generality. Then the while loop of Algorithm 1 is iterated at most \(|D_1|\) times. Indeed, (1) at each iteration, exactly one dart \(d\in D_1\) is removed from the stack S, within a couple (dx) for some \(x\in D_2\), and (2) each dart \(d\in D_1\) enters S at most once, within a couple (dx) for any \(x\in D_2\): d enters S only if \(h[d]=nil\), and before entering S, h[d] is set to dart x. So the algorithm runs in \(\mathcal{O}(\min (|D_1|,|D_2|))\) time. As for space issues, notice that the arrays h and g respectively have \(|D_1|\) and \(|D_2|\) entries, while stack S never contains more than \(\min (|D_1|,|D_2|)\) couples of darts.    \(\square \)

Notice that Algorithm 1 exploits the same key idea as the algorithms developed in [3] to solve the map isomorphism and sub-isomorphism problems. We nevertheless improve them as the failure cases are detected during the traversal of the maps, and no further verification stage is needed after the traversal.

5.2 A Quadratic Time and Space Algorithm to Get All the Connected Overlaps

Let \(M_1=\langle D,\alpha _1,\beta _1\rangle \) and \(M_2=\langle D_2,\alpha _2,\beta _2\rangle \) be two semi-open maps. In Sect. 5.1, we have given a procedure that checks whether two darts \(d_1\in D_1\) and \(d_2\in D_2\) can match together in a connected overlap. To achieve this goal, Algorithm 1 performs a parallel traversal of both maps \(M_1\) and \(M_2\), starting from couple \((d_1,d_2)\), and using the \(\alpha \)- and \(\beta \)-functions of both maps to investigate new couples of darts from the couples that have been visited so far. Clearly, this procedure may visit any couple of \(D_1\times D_2\). So we use this set to define a new map, denoted \(M_1\otimes M_2\), and call the product map of maps \(M_1\) and \(M_2\):

Definition 7

The product of two maps \(M_1\) and \(M_2\) is \(M_1\otimes M_2=\langle D_1\times D_2,\alpha ,\beta \rangle \) where, for all \((d_1,d_2)\in D_1\times D_2\):

$$\begin{aligned} \alpha (d_1,d_2)= & {} (\alpha _1(d_1),\alpha _2(d_2)), \text{ and } \\ \beta (d_1,d_2)= & {} \left\{ \begin{array}{ll} (\beta _1(d_1),\beta _2(d_2)) &{} \text{ if } \beta _1(d_1)\not =\varepsilon \text{ and } \beta _2(d_2)\not =\varepsilon , \\ \varepsilon &{} \text{ otherwise. } \end{array} \right. \end{aligned}$$

For instance, consider the maps of Fig. 7, respectively made of 7 and 6 darts. The product map \(M_1\otimes M_2\), which has 42 darts, contains 6 connected components. Clearly, two kinds of connected components appear. The four small components will be called real, and the two large ones, imaginary:

Fig. 7.
figure 7

Given the maps \(M_1\) and \(M_2\), we build the product map \(M_1\otimes M_2\) and display its connected components. The two big components are imaginary, and the four small ones are real. Couple (1, 8) belongs to an imaginary component, so following Theorem 3, no overlap exists such that darts 1 and 8 match. Conversely, couple (1, 9) is in a real component and mapping \(\{1\mapsto 9,2\mapsto 10,3\mapsto 8\}\) is a connected overlap.

Definition 8

A connected component \(C=\langle D,\alpha ,\beta \rangle \) of product map \(M_1\otimes M_2\) is said real if for all \((d_1,d_2), (d'_1,d'_2)\in D\),

  1. 1.

    \(d_1=d'_1\) iff \(d_2=d'_2\), and

  2. 2.

    \(\beta _1(d_1)=d'_1\) iff \(\beta _2(d_2)=d'_2\).

A connected component which is not real is said imaginary.

Remark 1

The reader may wonder why no condition addressing the \(\alpha \)-functions is given in Definition 8. Actually, a consequence of Condition 1 is that for all \((d_1,d_2)\), \((d'_1,d'_2)\in D\), we have \(\alpha _1(d_1)=d'_1\) iff \(\alpha _2(d_2)=d'_2\). Indeed, let \((d_1,d_2)\in D\) and suppose that \((\alpha _1(d_1),d'_2)\in D\); as component C is connected, we have \(\alpha (d_1,d_2)=(\alpha _1(d_1),\alpha _2(d_2))\in D\); so using Condition 1, we deduce that \(d'_2=\alpha _2(d_2)\). Such a proof cannot be given for Condition 2, due to the fact that \(d_1\) or \(d_2\) can \(\beta \)-free.

Connected overlaps of maps \(M_1\) and \(M_2\) on the one hand, and real connected components of product map \(M_1\otimes M_2\) on the other hand, are strongly related. Indeed, we get the following result, which proceeds from the definitions:

Theorem 3

Let \(M_1\) and \(M_2\) be two semi-open maps.

  • For each connected overlap \(h:U_1\rightarrow U_2\), there exists a real connected component C of product map \(M_1\otimes M_2\) whose set of darts is \(\{(d,h(d)):d\in U_1\}\);

  • Conversely, for every real connected component \(C=\langle D,\alpha ,\beta \rangle \) of product map \(M_1\otimes M_2\), set D is the graph of a connected overlap of \(M_1\) and \(M_2\), that is to say, if we fix \(h(d_1)=d_2\) for all \((d_1,d_2)\in D\), then h is a connected overlap.

Note that we get Theorem 1 as a consequence of Theorem 3. Indeed, Algorithm 1 actually traverses the connected component where initial couple of darts \((d_1,d_2)\) stands; if the traversal returns false, then we get an evidence that the component is imaginary, so no connected overlap h exists such that \(h(d_1)=d_2\). Otherwise, we can show that the component is real and the array h is a connected overlap.

As a consequence of Theorem 3, we also deduce an efficient algorithm to enumerate all the connected overlaps (see Algorithm 2).

figure b

Theorem 4

Algorithm 2 is correct and runs in \(\mathcal {O}(|D_1|\cdot |D_2|)\) time and space.

Proof

The correctness follows from Theorem 3. As for the complexity, the product map \(M_1\otimes M_2\) has \(|D_1|\cdot |D_2|\) darts. Obviously, the computation of functions \(\alpha \) and \(\beta \), and the computation of the connected components, and the selection of the real connected components, are in linear time and space with respect to \(|D_1|\cdot |D_2|\).    \(\square \)

5.3 Finding the Largest Overlap of Two Maps

It is straightforward to use the results from the previous algorithm to return the largest overlap, provided this one is connected.

Furthermore a direct procedure exists to find the largest possibly disconnected overlap: consider a graph whose nodes are the connected overlaps between maps \(M_1\) and \(M_2\), each with a weight indicating how many darts they concern, and an edge indicates that two overlaps are compatible (do not contain common darts). Finding a clique with maximum sum of weights gives the largest possibly disconnected overlap. This procedure clearly runs in exponential time, and the following section shows that we cannot hope better.

6 Finding Large Disconnected Overlaps Is Intractable

We consider the following problem:

  • Name Large_Disconnected_Overlap;

  • Instance An integer N, and two semi-open maps \(M_1\) and \(M_2\);

  • Problem Does there exist a disconnected overlap \(h:U_1\rightarrow U_2\) s.t. \(|U_1|\ge N\)?

We get the following result:

Theorem 5

Problem Large_Disconnected_Overlap is \(\mathcal {NP}\)-complete.

Proof

Basically, Problem Large_Disconnected_Overlap is in class \(\mathcal {NP}\) since any certificate h can easily be verified. Now consider following problem:

  • Name Disconnected_Pattern;

  • Instance Two semi-open maps \(M_1\) and \(M_2\);

  • Problem Is map \(M_1\) a disconnected pattern of map \(M_2\)?

One can prove, by reduction from Separable_Planar_3SAT [10, Lemma 1], that this problem is \(\mathcal {NP}\)-complete. We do not give the details here: the proof is essentially the same as that provided in [7, Sect. 4]Footnote 1. We can finally reduce Disconnected_Pattern to Large_Disconnected_Overlap: we simply need to fix \(N=|D_1|\). Indeed, map \(M_1\) is a pattern of map \(M_2\) iff an overlap \(h:U_1\rightarrow U_2\) exists with \(|U_1|\ge |D_1|\) (that is, \(U_1=D_1\)).    \(\square \)

7 Conclusion and Future Works

Computing the overlaps has two advantages over computing the common patterns of two maps: on one hand, the optimisation problems are tractable (Sect. 5), and on the other, the overlaps are maximal objects (Sect. 4).

The overlaps allow us also to consider super-maps, where a super-map of \(M_1\) and \(M_2\) is a map of which both \(M_1\) and \(M_2\) are patterns. Then the smallest common super-map is obtained by adding to the largest overlap the faces which belong to \(M_1\) and \(M_2\) but are not matched.

Super-maps offer interesting possibilities as smallest common super-maps would be constructable in polynomial time whereas their duals, the largest common sub-maps, are not.

Finally, notice that the techniques introduced in this paper can, with no difficulty, be extended to open mapsFootnote 2, nD-maps and n-Gmaps [1].