Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

In this paper, we present a neuro-mathematical model for the Poggendorff illusion, a well-known geometrical optical illusion [14, 33, 34], in which the presence of the central bar induces a misalignment of an oblique transversal [10, 31]. See Fig. 1, left. Our interest is to provide a model that takes into account the neuro-physiology of the visual process, starting from the investigation of Hubel and Wiesel [19]. They discovered the hypercolumnar structure of the primary visual cortex V1, meaning that for each point of the retina a whole set of cells in V1 sensitive to all possible orientations (hypercolumn) spikes. This structure induced a great interest among scientists from many communities: first neuro-mathematical model for the functionality of V1 of mammalians have been presented in [2, 6, 18, 21, 25].

We base our model on [6], where the functional architecture and neural connectivity of V1 is modelled as a principal fiber bundle of the roto-translation space \({\text {SE(2)}}\) equipped with a sub-Riemannian (SR) metric. In particular, the latter provides a justified model of perceptual completion through surfaces, ruled in geodesics. As a consequence, an illusory contour appears as a geodesic in the metric defined in the cortex. In this work, we propose a new SR-metric in \({\text {SE(2)}}\), with a neuro-physiological basis and directly induced by data-adaptivity to a visual stimulus. The polarized SR metric obtained in this way is presumably responsible for the misperception in such type of illusions. In particular, illusory curves arise as geodesics of the considered metric.

We compute globally optimal geodesics (length-minimizers) via SR Fast-Marching(SR-FM) [28], a fast accurate numerical method for solving SR-eikonal system. The solution is a SR distance map and minimizers are recovered through back-tracking on it. We show that illusory contours are well approximated by length-minimizers of \({\text {SE(2)}}\), meaning that the perceptual phenomena is naturally explained by the geometry of V1. This hypothesis is verified qualitatively for the Poggendorff illusion in Sect. 4.1.

Fig. 1.
figure 1

Left: The Poggendorff illusion: a transversal line, obstructed by a surface, appears to be misaligned. Center: the initial stimulus with a second transversal corresponding to the perceptual completion. Right: A level set of \(\mathcal {C}(x,y,\theta )\) for \(\theta = 0\).

2 Preliminaries

2.1 Gabor Filters

The visual process is the result of several retinal and cortical mechanisms which act on the visual signal. The retina is the first part of the visual system responsible for the transmission of the signal, which passes through the Lateral Geniculate Nucleus and arrives in the visual cortex, where it is processed. The receptive field (RF) of a cortical neuron is the portion of the retina which the neuron reacts to, and the receptive profile (RP) \(\psi (\chi )\) is the function that models its activation when a stimulus is applied to a point \(\chi = (\chi _1,\chi _2)\) of the retinal plane. Let us notice that \(\chi = (\chi _1,\chi _2)\) denotes the local portion of the retina, centered at \(\eta = (x,y)\), to which the neuron reacts to, while \(\eta = (x,y)\) refers to the global coordinates system of the retina \(\mathbb {R}^2\). Simple cells of V1 are sensitive to position and orientation of the contrast gradient of an image. Their properties have been experimentally described by De Angelis in [11]. Considering a basic geometric model, the set of simple cells RPs can be obtained via translation of vector \(\eta = (x, y)\) and rotation of angle \(\theta \) from a unique mother profile \(\psi _0(\chi )\). As pointed out by Daugmann in [9] and Jones and Palmer in [20], Gabor filters are a good model of receptive profiles and they provide a good estimation of the spiking responses. In our contribution, odd part of Gabor filters are employed to detect orientation and polarity of contours and to identify the presence of surfaces. A mother Gabor filter is given by

$$\begin{aligned} \psi _0 (\chi ) = \psi _0 (\chi _1, \chi _2) = \frac{\alpha }{2 \pi \sigma ^2} e^{\frac{-(\chi _1^2 + \alpha ^2 \chi _2^2)}{2\sigma ^2}}e^{\frac{2 i \chi _2}{\lambda }}, \end{aligned}$$
(1)

where \(\lambda >0\) is the spatial wavelength of the cosine factor, \(\alpha >0\) is the spatial aspect ratio and \(\sigma >0\) is the standard deviation of the Gaussian envelope. Translations and rotations can be expressed as:

$$\begin{aligned} A_{(x,y,\theta )}(\chi ) = \left( \begin{array}{l} x \\ y \end{array}\right) + \left( \begin{array}{ll} \cos \theta &{}-\sin \theta \\ \sin \theta &{}\cos \theta \end{array}\right) \left( \begin{array}{l} \chi _1 \\ \chi _2 \end{array} \right) . \end{aligned}$$
(2)

Hence a general RP can be expressed as:

$$\begin{aligned} \psi _{(x,y,\theta )} (\chi _1, \chi _2) = \psi _0(A^{-1}_{(x,y,\theta )}(\chi _1, \chi _2)). \end{aligned}$$

2.2 Output of Receptive Profiles

The retinal plane is identified with \(\mathbb {R}^2\), see [6]. When a visual stimulus of intensity \(I(x,y): M \subset \mathbb {R}^2 \rightarrow \mathbb {R}^+\), activates the retinal layer of photoreceptors, the neurons whose RFs intersect M spike and their spike frequencies \(O(x,y,\theta )\) can be modelled (taking into account just linear contributions) as the integral of the signal I with the set of Gabor filters. Indeed we assume the treated visual stimulus I to be integrable, i.e. \(I \in L ^1(\mathbb {R}^2)\). The expression for this output is:

$$\begin{aligned} O(x,y,\theta ) = \int _M \, I(\chi _1, \chi _2)\, \psi _{(x,y,\theta )}\, (\chi _1, \chi _2) \, d\chi _1 d\chi _2 . \end{aligned}$$
(3)

In the right hand side of the equation, the integral of the signal with the real and imaginary part of the Gabor filter is expressed. These last model two families of simple cells which have different shapes, hence they detect different features.

2.3 Sub-Riemannian Structure on \({\mathbf {SE(2)}}\)

The Lie group \({\text {SE(2)}}\) of planar roto-translations is identified with the coupled space of positions and orientations \({\mathbb {R}}^2 \times S^1\), and for each \(\eta = (x,y,\theta )\), \(\eta ' = (x',y',\theta ') \in {\mathbb {R}}^2 \times S^1\) one has left multiplication

$$\begin{aligned} L_{\eta } \eta ' = (x' \cos \theta + y' \sin \theta + x, -x' \sin \theta + y' \cos \theta + y, \theta '+\theta ). \end{aligned}$$
(4)

Via the push-forward \((L_{\eta })_*\) one gets the left-invariant frame \(\{X_1, X_2, X_3\}\) from the Lie-algebra basis \(\{A_1,A_2,A_3\} = \{\left. \partial _{x}\right| _{e}, \left. \partial _{\theta }\right| _{e}, \left. \partial _{y}\right| _{e}\}\) at the unity \(e = (0, 0, 0)\):

$$\begin{aligned} X_{1}= \cos \theta \, \partial _{x} +\sin \theta \, \partial _{y}, \quad X_{2}= \partial _{\theta }, \quad X_{3}= -\sin \theta \, \partial _{x} +\cos \theta \, \partial _{y}. \end{aligned}$$
(5)

A SR manifold is given by a triple \(({\text {SE(2)}}, \varDelta , \mathcal {G}),\) where \(\varDelta \) is a subbundle of the tangent bundle, and \(\mathcal {G}\) is a metric defined on \(\varDelta \). For a general introduction to SR-geometry see [24]. If we define the horizontal distribution \(\varDelta = \text {span}(X_1, X_2)\), the length of a curve \(\gamma : [0,T] \mapsto {\text {SE(2)}}\) whose derivative belongs to the distribution \(\varDelta \) is defined as

$$\begin{aligned} \begin{array}{c} \dot{\gamma }(t) = u_1(t) \, X_{1}|_{\gamma (t)} + u_2(t) \, X_{2}|_{\gamma (t)}, \quad l(\gamma ) := \int \limits _0^{T} \sqrt{\mathcal {G}_{\gamma (t)}(\dot{\gamma }(t), \dot{\gamma }(t))}\, \mathrm{d} t \end{array} \end{aligned}$$
(6)

Here \(\mathcal {G}\) is a diagonal metric, represented in the form \(\mathcal {G}= \text {diag}(\frac{1}{\mathcal {C}}, \frac{1}{\mathcal {C}})\), with respect to the chosen basis \(\{X_1,X_2\}\). The Riemannian approximation of \(\mathcal {G}\) is \(\mathcal {G}^{\epsilon }\), formally defined over T\({\text {SE(2)}}\), with respect to the basis \(\{X_1,X_2,X_3\}\). \(\mathcal {C}\) is a strictly positive function, called external cost. In this framework, given a starting point \(\eta _0\) and a terminal point \(\eta _1\), the geodesic problem is to find a Lipschitzian curve \(\gamma \) with \(\dot{\gamma } \in \varDelta \) almost everywhere on an unknown interval [0, T] with controls \(u_1, u_2: [0,T] \rightarrow {\mathbb {R}}\) in \(L^\infty [0,T]\) that minimizes the distance between the two given points [24]:

(7)

SR-geodesics and their application to image analysis were also studied in [4, 17, 22]. For explicit formulas of SR-geodesics in \({\text {SE(2)}}\) in the particular case of uniform external cost \(\mathcal {C} = 1\), see [27]. One of the most efficient method to compute geodesics in the Euclidean setting is Fast-Marching, introduced by Sethian in [30]. The method allows to compute a distance map, viscosity solution (in the sense by [7, 8]) of the eikonal equation, which in the SR setting has the following expression:

(8)

The fast-marching method has been extended in the case of Riemannian metric by Mirebeau, [23], and in  [3] was developed in the \({\text {SE(2)}}\) equipped with a SR metric, with arbitrary external cost. The theoretical counterpart for viscosity solution of the Eikonal equation in the sub-Riemmanian case can be found in [12]. In the current study we use method [3] to compute geodesics. It is based on Riemannian approximation of SR-structure. To compute the minimizers, a Riemannian limit [15] is applied, where d is approximated by Riemannian distance \(d^{\epsilon }\) induced by a Riemannian metric \(\mathcal {G}^{\epsilon }= \text {diag}(\frac{1}{\mathcal {C}}, \frac{1}{\mathcal {C}}, \frac{1}{\mathcal {C} \epsilon ^2})\), for \(0< \epsilon \ll 1\), and then the SR gradient \(\nabla _{\mathcal {G}} W(\eta )= \mathcal {C}(\eta )\left( X_1 W(\eta ) \left. X_{1}\right| _{\eta } + X_2 W(\eta ) \left. X_{2}\right| _{\eta }\right) \) is used to find geodesics via steepest descent on \(W(\eta ):=d^{\epsilon }(e, \eta )\).

In particular, in [3] it was shown that if \(\eta _1 \ne e\) be chosen such that there exists a unique minimizing geodesic \(\gamma _{\epsilon }: [0,T] \rightarrow {\text {SE(2)}}\) of \(d^{\epsilon }(e,\eta _1)\) for \(\epsilon \ge 0\) sufficiently small, that does not contain conjugate points (i.e. the differential of the exponential map of the Hamiltonian system is non-degenerate along \(\gamma _{\epsilon }\), cf. [1]), then \(\gamma _0(t)\) is given by \(\gamma _0(t)=\gamma _b(T-t)\) and the backtracking equation writes as:

$$\begin{aligned} \left\{ \begin{array}{l} \dot{\gamma }_b(t) = - \nabla _{\mathcal {G}} W(\gamma _b(t)), \ \ t \in [0,T] \\ \gamma _b(0)=\eta _1, \end{array} \right. \end{aligned}$$
(9)

with \(W(\eta )\) the viscosity solution of the eikonal system (8).

3 Neuro-Mathematical Model for the Poggendorff Illusion

3.1 Perceptual Curves as SR-length Minimizers

The Poggendorff illusion consists in an apparent misalignment of two collinear, oblique, transversals separated by a rectangular surface (Fig. 1, left). Psychological elements contributing to this misperception have been presented in [10, 26, 31]. It is largely accepted that cortical connectivity propagates the output of the cells \(O(x,y,\theta )\), defined in (3), through the metric of the cortex, and is responsible for visual completion and appearance of illusory contours [6]. First model of boundary completion in a contact structure has been presented in [25]. As proved in [5, 29], cortical mechanisms of completion induces minimal surfaces ruled in geodesics. Hence it is possible to look for an illusory contour as minimizer of the geodesic problem in \({\text {SE(2)}}\) with a SR metric. Here we develop the idea that the metric is modulated by the output of the cells \(O(x,y,\theta )\) induced by the presence of a stimulus I(xy): cells already activated by the output are more sensitive to cortical propagation. In this way we can define a polarizing term which will be maxima is correspondence of edges of the visual stimulus:

$$\begin{aligned} \mathcal {C}(x,\,y,\,\theta ) = 1 + {\text {Im}}(O(x,\,y,\,\theta )). \end{aligned}$$
(10)
Fig. 2.
figure 2

Left: Representation of a section of \((x,y,\theta , \mathcal {C}(x,y,\theta ))\), graph of \(\mathcal {C}(x,y,\theta )\), for y fixed. \(\mathcal {C}(x,y,\theta )\) is the output positively shifted of odd receptive profiles of simple cells. \(\mathcal {C}(x,y,\theta )\) is constant along y. Right: \(\nabla \mathcal {C}(x,y,\theta )\) is visualized in correspondence of the contours of the central bar, projected onto the \((x,\theta )\) plane. We represent x, and \(\theta \) component of \(\nabla \mathcal {C}\), since the y component vanishes.

Imaginary part of the output corresponds to the response of Odd Gabor filters, responsible for detecting polarity of a surface in an initial stimulus. Polarity means that contours with the same orientation but opposite contrast are referred to opposite angles. In the Poggendorff illusion (1, left) the contribution of odd Gabor receptive profiles, responsible for the detection of surfaces, indeed plays a role. For this reason we assume that the orientation \(\theta \) takes values in \([-\pi , \pi )\). Furthermore the contribution of odd receptive profiles over a straight line is null. Then the inverse of the metric \(\mathcal {G}\) in Sect. 2.3, polarized by the output of odd simple cells and expressed in Euclidean coordinates, is given by the following:

$$\begin{aligned} \mathcal {G}^{-1}(x,y,\theta )= \mathcal {C}(x,y,\theta ) \, \left( \begin{array}{ccc} \cos ^2\theta &{} \sin \theta \cos \theta &{} 0\\ \sin \theta \cos \theta &{} \sin ^2\theta &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array}\right) , \end{aligned}$$
(11)

where \(\mathcal {C}(x,y,\theta )\) is given by (10). Let us also observe that the analysis of Fig. 1 (center) can be reduced to the processing of a stimulus in which we neglect the contribution of \(\mathcal {C}(x,y,\theta )\) over the entry trasversals. These last are employed to recover \(\theta \) boundary condition for the experiments. The resulting idea is that the metric mainly depends on the vertical bar and the perceptual curves are obtained as geodesics of the polarized metric (11).

3.2 Implementation

In order to test the model, first the convolution of the initial image with a bank of odd Gabor filters is performed. A response \({\text {Im}}(O)\) is produced, corresponding to the polarization of our SR metric, which is shifted to positive values \(\mathcal {C}(x,y,\theta )\) and used as weight for the connectivity (Fig. 2, left). The SR geodesic that solves (7) is obtained in two steps: 1) computation of the distance map by solving (8) via SR-FM, 2) computation of the geodesic by steepest descent (9). In such a way, we construct a metric in \(\mathbb {R}^2\!\times \! S ^1\), Riemannian approximation of the SR metric, weighted by external cost \(\mathcal {C}(x,y,\theta )\). When switching from image coordinates to mathematical coordinates one should take care of correctly evaluating \(\xi \), which represents the anisotropy between the two horizontal direction, \(\xi \varDelta x = \varDelta \theta \), where \(\varDelta x, \varDelta \theta \) are the discretization steps along x and \(\theta \). Then equation (11), adding the Riemannian approximation of the metric, becomes:

$$\begin{aligned} (\mathcal {G^{\epsilon }})^{-1} (x,y,\theta )= \mathcal {C}(x,y,\theta ) \left( \begin{array}{ccc} \xi ^{-2} (\cos ^2\theta + \epsilon ^{2} \sin ^2 \theta ) &{} \xi ^{-2} (1-\epsilon ^{2}) \sin \theta \cos \theta &{} 0\\ \xi ^{-2} (1-\epsilon ^{2}) \sin \theta \cos \theta &{} \xi ^{-2} (\sin ^2\theta + \epsilon ^{2} \cos ^2\theta )&{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array}\right) . \end{aligned}$$

where \(\epsilon \) is a parameter of Riemannian approximation (note \(\epsilon = 0\) in SR-case). In the experiments we set \(\epsilon = 0.1\). As was shown in [28], this value is taken sufficiently small to give an accurate approximation of the SR-case.

4 Results

In this section we discuss the results obtained through the method presented in Sect. 3.1 and implemented in Sect. 3.2.

Fig. 3.
figure 3

From left to right: (1) minimum of distance map \(W(\eta )\) from the boundary value condition (initial seed) of Eq. 8, along the direction \(\theta \), computed through SR-Fast-Marching. (2): 2D projection of the computed geodesics. The perceptual curve is blue, the actual completion of the left side transversal is the red curve. (3): 3D plot of the computed geodesics. (Color figure online)

4.1 Poggendorff Simulation and Comparison with Quantitative Psychophysical Results

Manipulating the elements of Poggendorff to understand how to magnify the illusory phenomenon has been done in many works [10, 32]. In [32], the authors performed psychophysical experiments to obtain quantitative measures of the magnitude of the illusion: the illusory effects increased linearly with increasing separation between the parallels as well as increasing the width of the obtuse angle formed by the transversal. Here we consider odd Gabor filters with the following values: \(\alpha = 1.5\), \(\theta \in [-\pi , \pi ]\) (65 values), \(\sigma \in \{2,4\}\) (pixels), \(\frac{\sigma }{\lambda } = 2\). This means that for each point of the image (of dimension \(50 \times 100\) pixels) we took \(65\times 2\) receptive profiles. The scale parameter \(\sigma \) is chosen in relationship with the mentioned image resolution. In this example we took the mean between the filter responses for \(\sigma =2\) and for \(\sigma =4\). As initial test we chose \(\theta = \pi /4\) and width = 15 pixels, see Fig. 3 (center): the SR-length of the red curve is 2.0480 (correct transversal), and the SR-length of the blue curve is 1.8094 (perceptual completion). The shortest curves implemented through this model are the perceptual ones. Then we tested the following widths for the central surfaces, 15, 25, 9 pixels (see Fig. 4) and the following angles for the transversals \(\theta = \pi /4, \pi /10, \pi /2\). Keeping fixed the width of the bar, we varied the angle of the transversal, to create an increased obtuse angle effect (\(\theta = \pi /10\)) and a non illusory effect (\(\theta = \pi /2\)). In Fig. 4 (left), a 2D projection of computed geodesics for transversal oriented at \(\theta = \pi /2\) is presented. In this case no illusion is shown and the geodesic is a straight line. In the center a 2D representation of the correct transversal (red one), called actual geodesic, and the perceptual one (blue) is shown, for \(\theta = \pi /4\) and width = 25 pixels. Right, same 2D plot for \(\theta = \pi /4\) and width = 9 pixels.

Fig. 4.
figure 4

From left to right: 2D projection of the experiment for \(\theta = \pi /2\) and central width = 15 pixels. Experiment for \(\theta = \pi /4\) and central width = 25 pixels, 2D projection. Right, \(\theta = \pi /4\) and central width = 9 pixels. (Color figure online)

Discussion. In this paragraph we show a table reporting the collected data concerning the SR lengths of the computed curve. It refers to the change of length varying the widths and angles, underlining the observed phenomena.

Type of curve

Width = 9 pixels

Width = 15 pixels

Width = 25 pixels

Percep. curve \(\theta = \pi /4\)

1.0366

1.8094

3.1113

Actual curve \(\theta = \pi /4\)

1.1369

2.0480

3.5354

Percep. curve \(\theta = \pi /10\)

2.1033

3.4719

4.9411

Actual curve \(\theta = \pi /10\)

2.8925

4.4927

7.3924

Percep. curve \(\theta = \pi /2\)

1.0320

1.4412

2.5196

4.2 Round Poggendorff Illusion

Now we consider a variant of the Poggendorff illusion, called Round Poggendorff, see Fig. 5, left. The presence of the central surface induces a misperception: the arches do not seem cocircular and the left arc seems to be projected to some point with a certain orientation on the left bar.

Fig. 5.
figure 5

Left: Round Poggendorff illusion, courtesy of Talasli and Inan see [31]. Center: A family of geodesics starting from \((x_0,y_0,\theta _0)\) with multiple endpoints. The aim is to determine \((y,\theta )\) minimizing the length of the perceptual curve. Right: A minimizer has end point \((y,\theta ) = (0.88, -0.27)\)

Computation of shortest curve with terminal manifold. The seed is fixed at the crossing point between the right arc and the right bar, \(\xi = 2.5\). In order to compute the corrected perceptual completion curve we provide a terminal set to the SR-Fast Marching (SR-FM). Possible terminal orientations belong to \([-\pi /10,0]\), where \(\theta = 0\) is the angle corresponding to the orthogonal projection over the left bar and \(\theta = -\pi /10\) is the boundary condition of the circle at crossing point with the left bar. In an analogous way we take a discretization between possible values of the y coordinate and run SR-FM, which is able to identify the minimizer connecting a certain seed with multiple end (terminal) points. The aim is to identify the correct angle \(\theta \) and coordinate y for the end point of the perceptual curve. The SR length of the minimizing geodesic is 1.32668 and the corresponding computed end point is \(\{0.3, 0.88, -0.27\}\).

5 Conclusion

In this paper a neuro-mathematical model for the perceptual Poggendorff phenomenon is presented, based on the functional architecture of V1. Perceptual curves arise as geodesics of a polarized metric in \({\text {SE(2)}}\), directly induced by the visual stimulus. The geodesics are computed through SR-FM and the perceptual curves result to be shorter (w.r.t. SR-metric) than the corresponding geometrical continuation. The model has been compared with psychophysical evidences which explain how the effect varies depending on the width of the central surface and the angle of the transversal. Our measurements are in accord with those studies. Such approach can be extended to other illusions such that Hering, Zollner and Wundt illusions ([16]). Improving the understanding of these phenomena is very important because it can lead to insights about the behaviour of the visual cortex [13], allowing new applications to be developed.