research-article

Open access

A Normal Form Algorithm for Tensor Rank Decomposition

Authors:

Simon Telen,

Nick VannieuwenhovenAuthors Info & Claims

ACM Transactions on Mathematical Software, Volume 48, Issue 4

Article No.: 38, Pages 1 - 35

https://doi.org/10.1145/3555369

Published: 19 December 2022 Publication History

All formats PDF

Abstract

We propose a new numerical algorithm for computing the tensor rank decomposition or canonical polyadic decomposition of higher-order tensors subject to a rank and genericity constraint. Reformulating this computational problem as a system of polynomial equations allows us to leverage recent numerical linear algebra tools from computational algebraic geometry. We characterize the complexity of our algorithm in terms of an algebraic property of this polynomial system—the multigraded regularity. We prove effective bounds for many tensor formats and ranks, which are of independent interest for overconstrained polynomial system solving. Moreover, we conjecture a general formula for the multigraded regularity, yielding a (parameterized) polynomial time complexity for the tensor rank decomposition problem in the considered setting. Our numerical experiments show that our algorithm can outperform state-of-the-art numerical algorithms by an order of magnitude in terms of accuracy, computation time, and memory consumption.

1 Introduction

We introduce an original direct numerical algorithm for tensor rank decomposition or canonical polyadic decomposition(CPD) in the low-rank regime. By “direct,” we mean an algorithm that does not rely on numerical optimization or other iteratively refined approximations with a data-dependent number of iterations.

Consider the vector space $\mathbb {C}^{(n_1 + 1) \times \cdots \times (n_{d} + 1)}$ whose elements represent order-$d$ tensors in coordinates relative to a basis. We say that a tensor in such a space is of rank 1 or elementary if it is of the following form:

\begin{equation*} {\alpha }^1 \otimes {\alpha }^2 \otimes \cdots \otimes {\alpha }^d := \left(\alpha ^1_{j_1} \alpha ^2_{j_2} \cdots \alpha ^d_{j_d} \right)_\begin{array}{c}0 \le j_k \le n_k \\ k=1,\ldots ,d\end{array}, \quad \text{where} \quad {\alpha }^k = \left(\alpha ^k_{j} \right)_{0 \le j \le n_k} \in \mathbb {C}^{n_k + 1}. \end{equation*}

Every tensor $\mathcal {A}$ can be expressed as a linear combination of rank-1 tensors:

\begin{align} \mathcal {A} = \sum _{i=1}^r {\alpha }_i^1 \otimes \cdots \otimes {\alpha }_i^d, \quad \text{with} \quad \alpha ^k_i = \left(\alpha ^k_{i,j} \right)_{0 \le j \le n_k} \in \mathbb {C}^{n_k + 1}. \end{align}

(CPD)

If $r$ is minimal among all such expressions of $\mathcal {A}$, then $r$ is called the rank of the tensor according to Hitchcock [1927], and Equation (CPD) is called a CPD. The problem of computing a CPD of $\mathcal {A}$, i.e., determining a set of rank-1 tensors summing to $\mathcal {A}$, has many applications in science and engineering, see for instance Kolda and Bader [2009] and Sidiropoulos et al. [2017].

The strategy we propose for computing (Equation (CPD)) relies on the fact that the problem is equivalent to solving a certain system of polynomial equations. Under suitable assumptions, these equations can be obtained from the nullspace of a flattening $\mathcal {A}_{(1)} \in \mathbb {C}^{(n_1+1) \times \prod _{k=2}^d (n_k+1)}$ of the tensor $\mathcal {A}$, as in Kuo and Lee [2018]. Once we have obtained these polynomial equations, whose solutions correspond to the rank-1 terms in Equation (CPD), we use recent numerical normal form techniques from Bender and Telen [2020] and Telen [2020b];, 2020a] to solve them. The following example, which is used as a running example throughout the article, illustrates how this works.

Contributions

We formulate our results for the case $d = 3$, as the general case can be handled using the standard reshaping trick (Section 2.2). Our main contribution is a new numerical algorithm for computing the rank-$r$ CPD of a third-order tensor $\mathcal {A}$ based on linear algebra, under the assumption that $\mathcal {A}$ is $r$-identifiable and a flattening of $\mathcal {A}$ has rank $r$, see assumption 2.1. We call this algorithm cpd_hnf. It is based on a well-known reformulation of the problem as a system of polynomial equations, which we solve using state-of-the-art methods from computational algebraic geometry. This results in 3.1. We show that this algorithm generalizes pencil-based algorithms [Beltrán et al. 2019; Leurgans et al. 1993; Lorber 1985; Sanchez and Kowalski 1990; De Lathauwer 2006] from very low ranks to much higher ranks in the unbalanced case; see theorem 3.1 3.2 3.3.

We give a new, explicit description of the complexity of the tensor rank decomposition problem in terms of an algebraic property of aforementioned polynomial system: the multigraded regularity of a non-saturated ideal in the homogeneous coordinate ring of $\mathbb {P}^m \times \mathbb {P}^n$ (proposition 3.1). We characterize the regularity in terms of the rank of a structured matrix obtained directly from the rank-1 terms in Equation (CPD), see example 4.1. These new insights allow us to formulate a conjecture regarding the regularity (conjecture 1). We prove this conjecture for many formats, see theorem 4.2. These results are of independent interest for the field of polynomial system solving. They have the following consequence related to the complexity of our algorithm.

Theorem 1.1.

Consider the tensor space $\mathbb {C}^{\ell +1}\otimes \mathbb {C}^{m+1}\otimes \mathbb {C}^{n+1}$ of dimension $M = (\ell +1)(m+1)(n+1)$ with $\ell \ge m \ge n$. If conjecture 1 holds and $\mathcal {A}$ is a generic¹ tensor of rank $r \le \phi m n$ with $\phi \in [0,1)$ a fixed constant, then cpd_hnf runs in polynomial time $\mathcal {O}(M^{\frac{5}{2} \lceil \frac{1}{1-\phi } \rceil +1})$.

Our numerical experiments in Section 6 show that the proposed algorithm is highly efficient and gives accurate results (see Figure 2). For instance, we can compute the decomposition of a $7 \times 7 \times 7 \times 7 \times 6 \times 6 \times 5 \times 5$ tensor of rank 1,000 in double-precision arithmetic with an accuracy of order $10^{-15}$ in 441 s—a feat we believe has not been matched by other tensor decomposition algorithms. Moreover, cpd_hnf seems to behave well in the presence of noise.

A Julia implementation of cpd_hnf, including driver routines to reproduce our numerical experiments, is provided at https://gitlab.kuleuven.be/u0072863/homogeneous-normal-form-cpd.

Related Work

The idea of computing tensor decompositions via polynomial root finding is central in apolarity-based approaches such as Bernardi et al. [2011];, 2013]; Bernardi and Taufer [2020], and Brachat et al. [2010]; Nie [2017] for the symmetric case. The Hankel operators play the role of normal forms in this context. These operators can be obtained partially from the tensor $\mathcal {A}$. In most cases, an additional polynomial system needs to be solved to complete the Hankel operators [Mourrain 2018, Section 4]. For instance, this is step (2) in Algorithm 5.1 of Brachat et al. [2010]. Although our method works only under certain assumptions on the rank of $\mathcal {A}$, in contrast to apolarity-based methods it requires only linear algebra computations, and it operates in polynomial rings with fewer variables. Moreover, the method from Brachat et al. [2010] uses connected-to-one bases for its normal form computations. The choice of such a basis is discussed at length in Sections 4.3 and 7.2 of Bernardi and Taufer [2020]. In this article, we exploit the flexibility of truncated normal forms [Telen et al. 2018] to achieve better numerical results. For a comparison, see Section 4.3.3 of Telen [2020b].

Kuo and Lee [2018] obtained an affine system of polynomial equations as in example 1.1, but it is solved using homotopy continuation methods. This approach is infeasible for some formats that are handled without any problems by our algorithm. For instance, for the aforementioned eighth-order tensor of rank 1,000, the method of Kuo and Lee [2018] would need to track over $40 \times 10^9$ homotopy paths. We argue that the eigenvalue-based methods proposed in this article are more natural to use in an overdetermined setting.

The state-of-the-art algorithms for tensor rank decomposition using only linear algebra computations were proposed by Domanov and De Lathauwer [2014];, 2017]. Although these methods work under slightly milder conditions, our numerical experiments suggest that Algorithm 1 of Domanov and De Lathauwer [2017] often requires the construction of larger matrices than those in our algorithm. There is no explicit connection with polynomial equations. The algorithm and its complexity depend on a parameter $l$, which is chosen incrementally by trial and error for each format. In an analogous way, the complexity of our algorithm is governed by the choice of a parameter. However, our analysis in Section 4 tells us a priori which parameter value should be used, circumventing a trial-and-error approach. In Section 6, we demonstrate that cpd_hnf improves on Domanov and De Lathauwer [2014];, 2017] in terms of computational complexity and accuracy.

Outline

In Section 2, we state our assumptions and show how computing the CPD of a tensor $\mathcal {A}$ is formulated as a system of polynomial equations. In Section 3, we make the connection with normal form methods explicit. That is, we describe how a pre-normal form can be computed directly from the tensor $\mathcal {A}$ and how this allows us to reduce the above polynomial system to an eigenvalue problem. We explain how the approach generalizes so-called pencil-based algorithms in Section 3.3. This leads to a complete, high-level description of our algorithm cpd_hnf in Section 3.4. In Section 4, we study the regularity of the ideal associated to our polynomial system. These results are the starting point for our analysis of the complexity of cpd_hnf in Section 5. In Section 6, we demonstrate the efficiency and accuracy of cpd_hnf relative to the state of the art through several numerical experiments. The article is completed in Section 7 with some final conclusions.

2 From Tensor Decomposition to Polynomial Equations

In this section, we explain how tensor rank decomposition, under some restrictions, can be reduced to solving a polynomial system of equations whose coefficients are directly obtained from the tensor $\mathcal {A}$. The next two subsections state the restrictions under which the proposed algorithm operates. In Section 2.3, the polynomial system is constructed. We show that it gives a nice algebraic description of a certain projection of the rank-1 tensors appearing in $\mathcal {A}$’s decomposition. Section 2.4 provides a pseudo-algorithm summarizing the main steps of the proposed numerical algorithm.

2.1 Identifiability, Flattenings, and the Main Assumption

For the moment, let us assume $\mathcal {A} \in \mathbb {C}^{\ell +1}\otimes \mathbb {C}^{m+1}\otimes \mathbb {C}^{n+1}$ is a third-order tensor of rank $r$. The cpd_hnf algorithm works under the following assumption.

Assumption 2.1.

The tensor $\mathcal {A}\in \mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1}$ with $\ell \ge m \ge n \gt 0$ is generic of rank

\begin{align} r \le \min \left\lbrace \ell +1, mn \right\rbrace \!. \end{align}

(R)

By generic, we mean that $\mathcal {A}$ is contained in a Zariski dense open subset of the set of all rank-$r$ tensors. We will describe this open subset more explicitly in lemma 2.2. For now, we point out that in this open set, each tensor $\mathcal {A}$ is such that

(i)

$\mathcal {A}$ is $r$-identifiable, and

(ii)

the standard 1-flattening $\mathcal {A}_{(1)}$ of the tensor $\mathcal {A}$ is of the same rank as $\mathcal {A}$.

These are necessary conditions for our algorithm to work. We now briefly recall their meaning.

Condition (i) is usually very weak and only of a technical nature. Recall that the set of all rank-1 tensors forms an algebraic variety, i.e., the solution set of a system of polynomial equations, called the Segre variety $\mathcal {S}$. A rank-$r$ CPD of a tensor $\mathcal {A}$ is a set of $r$ rank-1 tensors whose sum is $\mathcal {A}$. The set of all such rank-$r$ CPDs is denoted by $\mathcal {S}^{[r]} = \lbrace \mathcal {X} \subset \mathcal {S} \mid |\mathcal {X}|=r \rbrace$. Tensor rank decomposition consists of computing an element of the fiber of

\begin{equation*} f: \mathcal {S}^{[r]} \rightarrow \mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1},\quad \lbrace \mathcal {A}_1, \ldots , \mathcal {A}_r\rbrace \mapsto \mathcal {A}_1 + \cdots + \mathcal {A}_r \end{equation*}

at a rank-$r$ tensor $\mathcal {A}$. For brevity, we denote the image of $f$ by

\begin{equation*} \mathcal {S}_r = f(\mathcal {S}^{[r]}). \end{equation*}

Tensor $\mathcal {A}$ is called $r$-identifiable if the fiber $f^{-1}(\mathcal {A})$ contains exactly one element. That is, there is a unique set in $\mathcal {S}^{[r]}$ whose sum is $\mathcal {A}$. Generally, $r$-identifiability fails only on a strict closed subvariety of (the Zariski closure of) $\mathcal {S}_r$; see Chiantini and Ottaviani [2012]; Chiantini et al. [2014];, 2017]; Bocci et al. [2014]. This property is called the generic $r$-identifiability of $\mathcal {S}$. These results entail that $r$-identifiability fails only on a subset of $\mathcal {S}_r$ of Lebesgue measure zero if the rank $r$ and dimensions $(\ell +1,m+1,n+1)$ satisfy some very weak conditions; see Section 1 in Chiantini et al. [2014] and Section 3 in Chiantini et al. [2017] for more details and statements for higher-order tensors as well. If $\mathcal {S}_r$ is generically $r$-identifiable, then there is a Zariski-open subset of the closure of $\mathcal {S}_r$ so that $f^{-1}$ is an analytic, bijective tensor decomposition function. We aspire to solve the tensor decomposition problem only in this well-behaved setting.

Condition (ii) is more restrictive, but allows us to tackle the tensor decomposition problem using only efficient linear algebra. Recall that the standard 1-flattening of $\mathcal {A} \in \mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1}$ consists of interpreting $\mathcal {A}$ as the matrix $\mathcal {A}_{(1)} \in \mathbb {C}^{(\ell +1) \times (m+1)(n+1)}$. For a rank-1 tensor $\mathcal {A}=\alpha \otimes \beta \otimes \gamma$ this identification is defined by

\begin{equation*} \mathcal {A}_{(1)} = \alpha (\beta \otimes \gamma)^\top , \end{equation*}

and the general case follows by linearity. The tensor product² in the foregoing expression is also called the reverse-order Kronecker product

\begin{equation*} \otimes : \mathbb {C}^{m+1} \times \mathbb {C}^{n+1} \rightarrow \mathbb {C}^{(m+1)(n+1)},\; (\beta , \gamma) \mapsto [ \beta _{i} \gamma _{j} ]_{(i,j)}, \end{equation*}

where the standard bases were assumed for these Euclidean spaces and the indices $(i,j)$ are sorted by the reverse lexicographic order. Note that the 1-flattening is easy to compute when $\mathcal {A}$ is given in coordinates relative to the standard tensor product basis. In that case it suffices to reshape the coordinate array to an $(\ell +1) \times (m+1)(n+1)$ array (e.g., as in Julia’s or Matlab’s reshape function).

2.2 The Reshaping Trick

For dealing with higher-order tensors $\mathcal {A}\in \mathbb {C}^{n_1+1}\otimes \cdots \otimes \mathbb {C}^{n_d+1}$ with $d \gt 3$, we rely on the well-known reshaping trick. It consists of interpreting a higher-order tensor as a third-order tensor. While the approach described in the remainder of the article could be applied directly to higher-order tensors as well, the range of ranks $r$ to which this version would apply is (much) more restrictive than the range obtained from reshaping.

Recall that reshaping $\mathcal {A}$ to a third-order tensor $\mathcal {A}_{(I,J,K)}$ is a linear isomorphism

\begin{align} \mathbb {C}^{n_1+1} \otimes \cdots \otimes \mathbb {C}^{n_d+1} &\simeq (\otimes _{i \in I} \mathbb {C}^{n_i+1}) \otimes (\otimes _{j\in J} \mathbb {C}^{n_j+1}) \otimes (\otimes _{k\in K} \mathbb {C}^{n_k+1}) \nonumber \nonumber \\ &\simeq \mathbb {C}^{\prod _{i\in I} (n_i+1)} \otimes \mathbb {C}^{\prod _{j\in J} (n_j+1)} \otimes \mathbb {C}^{\prod _{k\in K} (n_k+1)} \end{align}

(I)

that identifies $\alpha ^1 \otimes \cdots \otimes \alpha ^d$ with $(\otimes _{i\in I} \alpha ^i) \otimes (\otimes _{j\in J} \alpha ^j) \otimes (\otimes _{k\in K} \alpha ^k)$, wherein $I \sqcup J \sqcup K$ partitions $\lbrace 1,\ldots ,d\rbrace$. The general case follows by linearity from the universal property [Greub 1978].

It was shown in Section 7 of Chiantini et al. [2017] that under some conditions, the unique CPD of $\mathcal {A}$ can be recovered from the CPD of a reshaping $\mathcal {A}_{(I,J,K)}$. Here, we exploit the following slightly more general result that requires no $r$-identifiability of $\mathcal {A}$.

Lemma 2.1.

Let $\mathcal {A} \in \mathbb {C}^{n_1+1}\otimes \cdots \otimes \mathbb {C}^{n_d+1}$ be a rank-$r$ tensor. If there exists a partition $I\sqcup J\sqcup K$ of $\lbrace 1,\ldots ,d\rbrace$ such that the third-order reshaping $\mathcal {A}_{(I,J,K)} \in \mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1}\otimes \mathbb {C}^{n+1}$ is $r$-identifiable, then $\mathcal {A}$ is $r$-identifiable and the CPD of $\mathcal {A}_{(I,J,K)}$ is the CPD of $\mathcal {A}$ under the isomorphism in Equation (I).

Proof.

Let $I\sqcup J\sqcup K$ be such a partition and assume that $\mathcal {A}_{(I,J,K)}$’s unique CPD is $\mathcal {A}_{(I,J,K)}=\sum _{q=1}^r \alpha _q\otimes \beta _q\otimes \gamma _q$. Let $\mathcal {A} = \sum _{q=1}^r \alpha _q^1 \otimes \cdots \otimes \alpha _q^d$ be an arbitrary rank-$r$ CPD. Then,

\begin{equation*} \mathcal {A}_{(I,J,K)} = \sum _{q=1}^r \left(\otimes _{i\in I} \alpha _q^i\right) \otimes \left(\otimes _{j\in J} \alpha _q^j\right) \otimes \left(\otimes _{k \in K} \alpha _q^k\right) \end{equation*}

because of Equation (I). Since $\mathcal {A}_{(I,J,K)}$’s rank-$r$ CPD is unique by assumption, it follows there is some permutation $\pi$ and scalars $\rho _q \sigma _q \tau _q = 1$ such that we have, for all $q=1,\ldots ,r$,

\begin{equation*} \rho _q \alpha _{\pi _q} = \otimes _{i\in I} \alpha _q^i,\quad \sigma _q \beta _{\pi _q} = \otimes _{j \in J} \alpha _q^j, \text{ and }\quad \tau _q \gamma _{\pi _q} = \otimes _{k \in K} \alpha _q^k. \end{equation*}

This implies that the CPD of $\mathcal {A}_{(I,J,K)}$ under the linear isomorphism Equation (I) results in a CPD of $\mathcal {A}$.

It remains to show that it is the unique one. Denote the set of $r$ rank-1 summands in any two CPDs of $\mathcal {A}$ by $\lbrace \mathcal {T}_1, \ldots , \mathcal {T}_r \rbrace$ and $\lbrace \mathcal {T}_1^{\prime }, \ldots , \mathcal {T}_r^{\prime }\rbrace$, respectively. The previous paragraph showed that any CPD of $\mathcal {A}$ implies a CPD of $\mathcal {A}_{(I,J,K)}$. Since $\mathcal {A}_{(I,J,K)}$’s CPD is unique, there exist permutations $\pi$ and $\pi ^{\prime }$ such that for all $q=1,\ldots ,r,$

\begin{equation*} \left(\otimes _{i \in I} \alpha _q^i\right) \otimes \left(\otimes _{j\in J} \beta _q^i\right) \otimes \left(\otimes _{k\in K} \gamma _q^k\right) = \left(\mathcal {T}_{\pi _q}\right)_{(I,J,K)} = \left(\mathcal {T}_{\pi _q^{\prime }}^{\prime }\right)_{(I,J,K)}. \end{equation*}

Since reshaping is injective and sends rank-1 tensors in the domain to rank-1 tensors in the codomain, we conclude that $\lbrace \mathcal {T}_1,\ldots ,\mathcal {T}_r\rbrace = \lbrace \mathcal {T}_1^{\prime }, \ldots , \mathcal {T}_r^{\prime }\rbrace$. That is, all of $\mathcal {A}$’s CPDs must consist of the same rank-1 terms, which is exactly the definition of $r$-identifiability.□

For decomposing higher-order tensors $\mathcal {A}$ of rank $r$ via the reshaping trick, we proceed as follows:

(1)

Find $I \sqcup J \sqcup K =\lbrace 1,\ldots ,d\rbrace$ such that $\mathcal {A}_{(I,J,K)}$ is $r$-identifiable and its rank satisfies (R).

(2)

Compute the CPD $\mathcal {A}_{(I,J,K)} = \sum _{q=1}^r \alpha _q \otimes \beta _q \otimes \gamma _q$, e.g., as in 2.1.

(3)

Viewing the recovered vectors as $\alpha _q \in \otimes _{i\in I} \mathbb {C}^{n_i+1}$, $\beta _q \in \otimes _{j\in J} \mathbb {C}^{n_j+1}$, and $\gamma _q \in \otimes _{k\in K} \mathbb {C}^{n_k+1}$, a CPD of $\mathcal {A}$ is $\mathcal {A}_1+\cdots +\mathcal {A}_r$ where the tensors $\mathcal {A}_q$ are isomorphic to $\alpha _q \otimes \beta _q \otimes \gamma _q$ under Equation (I).

Note that the rank-1 decompositions in the third step can be computed for example with a sequentially truncated higher-order singular value decomposition [Vannieuwenhoven et al. 2012], as in our implementation, or with a cross approximation [Oseledets et al. 2008].

Because of the reshaping trick, we will henceforth describe our approach only for $d=3$.

2.3 Polynomial Systems Defined by Flattenings

Having delineated the range (R) in which the proposed cpd_hnf algorithm will work, we continue by describing it. Our strategy to compute the tensor rank decomposition of

\begin{align} \mathcal {A}= \sum _{i=1}^r \alpha _i \otimes \beta _i \otimes \gamma _i ~ \in ~ \mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1} \end{align}

(A)

is to compute the points

\begin{equation*} \zeta _i = (\beta _i, \gamma _{i}) \in \mathbb {C}^{m+1} \times \mathbb {C}^{n+1}, \quad i = 1, \ldots , r, \end{equation*}

first as the solution of a system of (low-degree) polynomial equations. Thereafter, we compute the $\alpha _{i}$’s with efficient linear algebra by plugging $\beta _{i}$ and $\gamma _{i}$ into Equation (A) and solving the resulting linear system of equations. Indeed, since

\begin{align} \mathcal {A}_{(1)} = \begin{bmatrix} \alpha _1 & \cdots & \alpha _r \end{bmatrix} \begin{bmatrix} \beta _1 \otimes \gamma _1 & \cdots & \beta _r \otimes \gamma _r \end{bmatrix}^\top \end{align}

(2.1)

and assumption 2.1 guarantees that the second $r \times (m+1)(n+1)$ matrix is of full rank $r$ (see lemma 2.2 below), so it has a right inverse. Applying the latter on the right to $\mathcal {A}_{(1)}$ yields the corresponding $\alpha _i$’s. Note that it suffices to compute the points $\beta _{i}$ and $\gamma _{i}$ up to a nonzero scaling factor. Therefore, it is natural to consider our problem in complex projective space.

Recall that the $k$-dimensional complex projective space $\mathbb {P}^{k}$ is the space of equivalence classes $[x] = \lbrace \lambda x \mid \lambda \in \mathbb {C} \setminus \lbrace 0\rbrace \rbrace$ for $x \in \mathbb {C}^{k+1} \setminus \lbrace 0\rbrace$. The entries of the vector $x = (x_0, \ldots , x_k) \in \mathbb {C}^{k+1} \setminus \lbrace 0\rbrace$ are called homogeneous coordinates of $[x]$. With a slight abuse of notation, we will write $x = (x_0 : \cdots : x_k) \in \mathbb {P}^k$ for both the equivalence class $[x]$ and a set of homogeneous coordinates $x \in \mathbb {C}^{k+1} \setminus \lbrace 0\rbrace$.

The proposed cpd_hnf algorithm exploits the fact that the points $\zeta _i = (\beta _i, \gamma _i) \in \mathbb {P}^m \times \mathbb {P}^n$ are defined by algebraic relations on $X = \mathbb {P}^{m} \times \mathbb {P}^n$ that can be computed directly from the tensor $\mathcal {A}$. Such algebraic relations are homogeneous polynomials in a bi-graded polynomial ring. In this context, it is natural to think of degrees as 2-tuples $(d,e) \in \mathbb {N}^2$, where $d$ is the degree in the variables corresponding to $\mathbb {P}^m$ and $e$ is the degree in the variables corresponding to $\mathbb {P}^n$. Note that this differs from the more familiar setting where the degree is a natural number $d \in \mathbb {N}$. For more on multi-graded rings, see Miller and Sturmfels [2005]. The flexibility provided by this bi-graded setting reduces the complexity of our algorithm. Concretely, we work in the $\mathbb {N}^2$-graded polynomial ring

\begin{equation} S = \mathbb {C}[x_{0}, \ldots , x_{m}, y_{0}, \ldots , y_{n}] = \bigoplus _{(d,e) \in \mathbb {N}^2} S_{(d,e)},\; \text{where} \quad S_{(d,e)} = \bigoplus _{|a| = d, |b|=e} \mathbb {C} \cdot x^{a} y^{b}. \end{equation}

(2.2)

Here, we used the notation $x^{a} = x_{0}^{a_{0}} \cdots x_{m}^{a_{m}}$ and $|a| = a_{0} + \cdots + a_{m}$ for $a = (a_{0}, \ldots , a_{m}) \in \mathbb {N}^{m + 1}$, and analogously for $b$. The graded pieces $S_{(d,e)}$ are vector spaces over $\mathbb {C}$. The variables $x$ correspond to homogeneous coordinates on the factor $\mathbb {P}^{m}$ and in $S_{(d,e)}$, $d$ is the degree in the $x$-variables. Analogously, elements of $S_{(d,e)}$ have degree $e$ in the $y$-variables, which are related to $\mathbb {P}^n$.

An element $f \in S$ is called homogeneous if $f \in S_{(d,e)}$ for some $(d,e)\in \mathbb {N}^2$. The key reason that the ring $S$, with its grading (2.2), is naturally associated to $X$ is that homogeneous elements $f \in S_{(d,e)}$ have well-defined zero sets on $X$. By this, we mean that for $f \in S_{(d,e)}$ and for any $\zeta = (x, y) \in X$, $f(x,y) = 0$ is independent of the choice of homogeneous coordinates. Indeed, this follows from

\begin{equation} f(\lambda x, \mu y) = \lambda ^d \mu ^e f(x,y) \quad \text{for} \quad \lambda , \mu \in \mathbb {C} \setminus \lbrace 0\rbrace . \end{equation}

(2.3)

Therefore, whenever $f$ is homogeneous it makes sense to write $f(\zeta) = 0$ if $f$ vanishes on some set of homogeneous coordinates for $\zeta$, and to define the subvariety of $X$ defined by $f$ as $V_X(f) = \lbrace \zeta \in X ~|~ f(\zeta) = 0\rbrace$. For a homogeneous ideal $I \subset S$ (i.e., $I$ is generated by homogeneous elements with respect to the grading (2.2)), we denote the subvariety of $X$ corresponding to $I$ by

\begin{equation*} V_X(I) = \lbrace \zeta \in X ~|~ f(\zeta) = 0, \text{ for all homogeneous } f \in I \rbrace . \end{equation*}

That is, $V_X(I)$ contains the solutions of the polynomial system defined by the simultaneous vanishing of all homogeneous equations $f \in I$.

Example 2.1.

The polynomials $f_1, \ldots , f_5$ in example 1.1 are homogeneous elements of degree $(1,1)$ in $S$, i.e., $f_i \in S_{(1,1)}$. They generate the homogeneous ideal $I$, whose corresponding subvariety is $V_X(I) = \lbrace \zeta _1,\ldots ,\zeta _4\rbrace \subset X = \mathbb {P}^2 \times \mathbb {P}^2$.

With this notation in place, we turn back to the tensor $\mathcal {A}$ and show that, under suitable assumptions, $V_X(I) = \lbrace \zeta _1,\ldots ,\zeta _r\rbrace$ for some homogeneous ideal $I$ defined by $\mathcal {A}$. This will generalize the procedure in example 1.1. Consider the flattening $\mathcal {A}_{(1)}$ from Equation (2.1). This flattening has a natural interpretation as a $\mathbb {C}$-linear map

\begin{equation*} \mathcal {A}_{(1)} :~ S_{(1,1)} ~\longrightarrow ~ \mathbb {C}^{\ell +1}. \end{equation*}

Indeed, we can identify the space $\mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1}$ with the graded piece $S_{(1,1)}$ of degree $(1,1)$ by

\begin{equation} e_{k} \otimes e_{l} ~ \longmapsto ~ x_k y_l, \qquad 0 \le k \le m,\; 0 \le l \le n, \end{equation}

(2.4)

where $e_{k}$ represents the $(k+1)$-st standard basis vector of $\mathbb {C}^{m + 1}$, and analogously for $e_{l}$ and $\mathbb {C}^{n+1}$. For brevity, we also write $\mathcal {A}_{(1)}$ for a matrix representation of the map $\mathcal {A}_{(1)}$, where the standard basis of monomials (2.4) is used for $S_{(1,1)}$. The $\ell +1$ rows of $\mathcal {A}_{(1)}$ are elements of the dual space $(S_{(1,1)})^\vee$ represented in its dual basis

\begin{equation} \left\lbrace \partial _{kl} = \frac{\partial ^2}{\partial {x_k} \partial {y_l}} \mid 0 \le k \le m,~ 0 \le l \le n \right\rbrace . \end{equation}

(2.5)

From this vantage point, the rows of the second factor in Equation (2.1) are

\begin{equation*} (\beta _i \otimes \gamma _i)^\top = \sum _\begin{array}{c}0 \le k \le m \\ 0 \le l \le n\end{array} \beta _{i,k} \gamma _{i,l} \partial _{kl} ~ \in S_{(1,1)}^\vee . \end{equation*}

It is clear that for any $f \in S_{(1,1)}$, we have $(v \otimes w)^\top (f) = f(v,w)$.

Let $f_1, \ldots , f_s \in S_{(1,1)}$ be a $\mathbb {C}$-basis for the kernel $\ker \mathcal {A}_{(1)}$. The $f_i$ generate a homogeneous ideal $I = \langle {\ker \mathcal {A}_{(1)}} \rangle = \langle {f_1,\ldots ,f_s} \rangle \subset S$. The following lemma characterizes precisely what we mean by generic in assumption 2.1.

Lemma 2.2.

Consider the space $\mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1} \otimes \mathbb {C}^{n+1}$ with $\ell \ge m \ge n \gt 0$. Then, for all $r \le \min \left\lbrace \ell +1, mn \right\rbrace$ there exists a Zariski dense open subset $U \subset \mathcal {S}_r$ such that for all tensors $\mathcal {A}\in U$,

(i)

$\mathcal {A}$ is $r$-identifiable,

(ii)

the flattening $\mathcal {A}_{(1)}$ has rank $r$,

(iii)

the ideal $I = \langle {\ker \mathcal {A}_{(1)}}\rangle \subset S$ is such that $V_X(I)$ consists of the points $\lbrace \zeta _1, \ldots , \zeta _r\rbrace$, and these points have multiplicity one.

Proof.

Items (i) and (ii) hold on a dense open subset $U_1$ by Proof of Proposition 8.1 in Bocci et al. [2014]. Item (iii) holds on a dense open subset $U_2$ by the scheme-theoretic version of the Trisecant Lemma [Russo 2016, Proposition 1.4.3]. Our subset $U$ is $U_1 \cap U_2$.□

Point (iii) in lemma 2.2 is a technicality we will need in Section 3 to invoke the results from Section 5.5 in Telen [2020b]. For the reader who is familiar with algebraic geometry, we included a consequence of lemma 2.2 relating $I$ to the vanishing ideal of $V_X(I)$ in corollary A.1.

We were not able to construct an example for which (i)+(ii) are satisfied, but (iii) is not. This raises the question whether (i)+(ii) implies (iii), and if not, how our method extends to counterexamples.

2.4 The High-level Algorithm

We conclude the section by presenting a high-level pseudo-algorithm for computing the tensor rank decomposition (A) of $\mathcal {A}$. This is presented as 2.1. Its steps summarize the discussion up to this point.

Remark 2.1.

In a practical implementation of 2.1, the necessary conditions from lemma 2.2 would be handled as follows. Condition (i) that $\mathcal {A}$ is $r$-identifiable would be treated as a non-checked precondition. After obtaining the algorithm’s output, the user should verify that $\mathcal {A}\approx \sum _{i=1}^r \alpha _i\otimes \beta _i\otimes \gamma _i$. If this holds, then specific $r$-identifiability can be checked using a posteriori certifications such as those in Chiantini et al. [2017]; Kruskal [1977]; Domanov and De Lathauwer [2013];, 2017]; Sidiropoulos and Bro [2000]. In 2.1, condition (ii) is checked in line 2 and condition (iii) in line 4; see Section 3.4.

The main focus of this article is dealing with step 4 of 2.1. We will employ a state-of-the-art method for solving polynomial equations on $X$, based on homogeneous normal forms. This strategy is described in the next section.

3 From Polynomial Equations to Eigenvalues

In this section, we employ tools from computational algebraic geometry for computing points in a product of projective spaces in step 4 of 2.1. The next subsection summarizes the relevant results in the present setting. We show in Section 3.1 that the solutions can be obtained from an eigenvalue problem defined by pre-normal forms. How to compute the latter is explained in Section 3.2. Thereafter, in Section 3.3, we demonstrate that so-called pencil-based algorithms for decomposing tensors of very low rank are closely related to our cpd_hnf algorithm. The full algorithm for performing step 4 of 2.1 is presented in Section 3.4. Finally, we conclude this section with some examples.

3.1 The Eigenvalue Theorem

Our algorithm is built on a multi-homogeneous version of the eigenvalue theorem (theorem 3.1 below), which allows us to find solutions of systems of polynomial equations via eigenvalue computations. Behind this is the theory of homogeneous normal forms. In our context, these are $\mathbb {N}^2$-graded versions of truncated normal forms, introduced in Telen et al. [2018], and special cases of the more general toric homogeneous normal forms used by Telen [2020a]; Bender and Telen [2020] and formally introduced in Telen [2020b]. For our purpose, it suffices to work with slightly simpler objects, called pre-normal forms, so we use homogeneous normal forms only implicitly. For full proofs and more details, we refer to Section 5.5.4 in Telen [2020b].

Consider the space $X = \mathbb {P}^m \times \mathbb {P}^n$ and its associated ring $S$, which is the $\mathbb {N}^2$-graded polynomial ring from Equation (2.2). Let $I \subset S$ be a homogeneous ideal. The ideal $I$ inherits the grading on $S$:

\begin{equation*} I = \bigoplus _{(d,e) \in \mathbb {N}^2} I_{(d,e)}, \qquad \text{where } \qquad I_{(d,e)} = I \cap S_{(d,e)} \end{equation*}

is a $\mathbb {C}$-subspace of $S_{(d,e)}$, and the quotient ring

\begin{equation*} S/I = \bigoplus _{(d,e) \in \mathbb {N}^2} (S/I)_{(d,e)}= \bigoplus _{{(d,e)} \in \mathbb {N}^2} S_{(d,e)}/ I_{(d,e)} \end{equation*}

(these are quotients of vector spaces) similarly inherits this grading. Our objective is to compute a set of homogeneous coordinates of the points in $V_X(I)$ by using linear algebra computations. To accomplish this, it is necessary to work in graded pieces of $S$, $I$, and $S/I$. Let $M$ denote either of the latter. The (multi-graded) Hilbert function $\mbox{HF}_M : \mathbb {N}^2 \rightarrow \mathbb {N}$ is given by

\begin{equation*} \mbox{HF}_M(d,e) = \dim _{\mathbb {C}} M_{(d,e)} \end{equation*}

and keeps track of the dimension of the vector space $M_{(d,e)}$. Note that for a homogeneous ideal $I \subset S$ and $(d,e) \in \mathbb {N}^2$, we have $\mbox{HF}_{S/I}(d,e) = \mbox{HF}_S(d,e) - \mbox{HF}_I(d,e)$.

Example 3.1.

The Hilbert function of the ring $S$ is given explicitly by

\begin{equation*} \mbox{HF}_S(d,e) = \dim _{\mathbb {C}} S_{(d,e)} = \begin{pmatrix}m + d \\ d \end{pmatrix} \begin{pmatrix}n + e \\ e \end{pmatrix}, \quad \text{with} \quad (d,e) \in \mathbb {N}^2. \square \end{equation*}

Before stating the main result of this section, theorem 3.1, we present three auxiliary lemmas. Lemma 3.1 3.2 are well-known in commutative algebra. For lack of a precise reference and completeness, we included short proofs in the Appendix.

Lemma 3.1.

Let $I \subset S$ be a homogeneous ideal such that $V_X(I)$ consists of $r \lt \infty$ points $\lbrace \zeta _1, \ldots , \zeta _r\rbrace$. For each $(d^{\prime },e^{\prime }) \in \mathbb {N}^2$, there exists a homogeneous polynomial $h_0 \in S_{(d^{\prime },e^{\prime })}$ such that $V_X(h_0) \cap V_X(I) = \emptyset$. Equivalently, we can find $h_0 \in S_{(d^{\prime },e^{\prime })}$ such that $h_0(\zeta _i) \ne 0,\; i = 1, \ldots , r$.

Lemma 3.2.

Let $I \subset S$ be a homogeneous ideal such that $V_X(I)$ consists of $r \lt \infty$ points, each with multiplicity one. There exists $(d,e) \in \mathbb {N}^2$ with $(d,e) \ne (1,1)$ and $(d-1,e-1) \ge (0,0)$ (entry-wise) such that $\mbox{HF}_{S/I}(d,e) = r$.

Assumption 3.1.

Henceforth, we assume the following:

(1)

Let $I = \langle {\ker \mathcal {A}_{(1)}}\rangle = \left\langle {f_1,\ldots ,f_s} \right\rangle$ be as in Section 2.3, where $f_1, \ldots , f_s \in S_{(1,1)}$ form a basis for $\ker \mathcal {A}_{(1)}$. We assume that $\mathcal {A}$ satisfies assumption 2.1, so that $V_X(I)$ consists of the $r$ points $\zeta _1, \ldots , \zeta _r$ with multiplicity one by lemma 2.2.

(2)

The tuple $(d,e) \in \mathbb {N}^2$ is such that $(d,e) \ge (1,1)$ and $\mbox{HF}_{S/I}(d,e) = r$. Note that this is satisfied for $(d,e) = (1,1)$ by construction of $I$.³

(3)

We write $(d^{\prime },e^{\prime }) = (d - 1, e-1)$, and $h_0 \in S_{(d^{\prime },e^{\prime })}$ is such that $h_0(\zeta _i) \ne 0, i = 1, \ldots , r$. This makes sense by lemma 3.1.

Lemma 3.3.

Let $I, (d,e)$ and $h_0$ be as in assumption 3.1. If a $\mathbb {C}$-linear map $N: S_{(d,e)} \rightarrow \mathbb {C}^r$ satisfies

\begin{equation} \mbox{rank}(N) = r \quad \text{and}\quad \ker N = I_{(d,e)}, \end{equation}

(3.1)

then the induced linear map $N_{h_0} : S_{(1,1)} \rightarrow \mathbb {C}^r, f \mapsto N(h_0 f)$ has rank $r$.

Proof.

Note that $\mbox{HF}_{S/I}(1,1) = r$ by construction. By Proposition 5.5.7 in Telen [2020b], the fact that $\mbox{HF}_{S/I}(d,e) = r$ implies that $(1,1)$ and $(d^{\prime },e^{\prime })$ form a so-called regularity pair. Surjectivity of the map $N_{h_0}$ then follows directly from Lemma 5.5.3 in Telen [2020b].□

Linear maps $N: S_{(d,e)} \rightarrow \mathbb {C}^r$ satisfying Equation (3.1) will play an important role in translating our polynomial root finding problem into an eigenvalue problem. We will therefore give them a name.

Definition 3.1 (Pre-normal Forms).

A $\mathbb {C}$-linear map $N: S_{(d,e)} \rightarrow \mathbb {C}^r$ satisfying Equation (3.1) is called a pre-normal form on $S_{(d,e)}$.

If $r = \ell +1$, then assumption 2.1 implies that $\mathcal {A}_{(1)}: S_{(1,1)} \rightarrow \mathbb {C}^r$ is a pre-normal form on $S_{(1,1)}$.

We use the terminology pre-normal form, because, following Definition 5.5.6 in Telen [2020b], the term normal form is reserved for linear maps satisfying an extra technical condition. Normal forms are closely related to a classical result in computational algebraic geometry called the eigenvalue theorem [Cox et al. 2006, Section 2.4], which is used for computing isolated solutions to polynomial systems. The pre-normal forms introduced above will be useful for formulating a multi-homogeneous version of this theorem, namely, theorem 3.1 below. Before stating it, we need to fix some additional notation.

In lemma 3.3, $N_{h_0}$ is the composition of $N$ with the linear map that represents multiplication by $h_0 \in S_{(d^{\prime },e^{\prime })}$. More generally, for $g \in S_{(d^{\prime },e^{\prime })}$, we define $N_g : S_{(1,1)} \rightarrow S_{(d,e)}$ as $N_g(f) = N(gf)$. In theorem 3.1, we will restrict the map $N_{h_0}$ to an $r$-dimensional subspace $B \subset S_{(1,1)}$ such that the resulting map is invertible. We denote this restriction by $(N_{h_0})_{|B}$. In practice, it suffices to select $r$ columns of the matrix of $N_{h_0}$ so that the resulting $r \times r$ submatrix $(N_{h_0})_{|B}$ is invertible.

In what follows, we will write $w_i ^\top = (\beta _i \otimes \gamma _i)^\top _{|B} \in B^\vee$ for the linear functional representing evaluation at $\zeta _i = (\beta _i, \gamma _i)$. This is the restriction of the functional $(\beta _i \otimes \gamma _i)^\top \in S_{(1,1)}^\vee$ to the vector space $B$. Concretely, we set

\begin{equation} w_i^\top (b) = (\beta _i \otimes \gamma _i)^\top _{|B} (b) = b(\beta _i,\gamma _i), \quad \text{with} \quad b(x,y) \in B \subset S_{(1,1)}. \end{equation}

(3.2)

After fixing a basis for $B^\vee$, $w_i^\top$ can be represented as a row vector.

If $g, h \in S_{(d^{\prime },e^{\prime })} \setminus \lbrace 0\rbrace$ are two homogeneous polynomials of the same degree, then the fraction $g/h$ is a well-defined function on $X \setminus V_X(h)$. Indeed, the evaluation of this function does not depend on the choice of homogeneous coordinates. Therefore, we may write $(g/h)(\zeta)$ for the evaluation of this rational function at $\zeta \in X \setminus V_X(h)$.

Theorem 3.1 (Eigenvalue Theorem).

Let $I$, $(d,e)$ and $h_0$ be as in assumption 3.1 and let $N$ be a pre-normal form. Let $B \subset S_{(1,1)}$ be any $r$-dimensional subspace such that the restriction $(N_{h_0})_{|B} : B \rightarrow \mathbb {C}^r$ is invertible. For any $g \in S_{(d^{\prime },e^{\prime })}$, we have

\begin{equation} w_i^\top \circ M_{g/h_0} = \frac{g}{h_0}(\zeta _i) \cdot w_i^\top , \quad i = 1, \ldots , r, \end{equation}

(3.3)

where $M_{g/h_0}: B \rightarrow B$ is the composition $(N_{h_0})_{|B}^{-1} \circ (N_g)_{|B}$.

Proof.

This follows from Theorem 5.5.3, Propositions 5.5.4 and 5.5.5 in Telen [2020b] and the fact that $(1,1)$ and $(d^{\prime },e^{\prime })$ form a regularity pair (see the proof of lemma 3.3).□

After fixing a basis for $B$ and representing $w_i$ in the dual basis for $B^\vee$, Equation (3.3) is a standard matrix eigenproblem: $w_i^\top M_{g/h_0} = \lambda _i w_i^\top$. That is, $(\lambda _i,w_i)$ is a left eigenpair of the $r$ $\times$ $r$ matrix $M_{g/h_0}$. Note that theorem 3.1 implies that all maps of the form $M_{g/h_0}$ share a set of eigenvectors.

We now sketch one way of using theorem 3.1 to retrieve the coordinates of $\zeta _i = (\beta _i, \gamma _i)$ from eigenvalues, assuming a pre-normal form $N: S_{(d,e)} \rightarrow \mathbb {C}^r$ is given. The problem of computing a pre-normal form is addressed in the next subsection. We assume $d \ge 2$.⁴ Let $h \in S_{(d^{\prime }-1,e^{\prime })}$ and $h_0 \in S_{(d^{\prime },e^{\prime })}$ be homogeneous polynomials that do not vanish at any of the points $\zeta _i$. These can be chosen generically. Set $g_j = x_j h \in S_{(d^{\prime },e^{\prime })}, j = 0, \ldots , m$. Choose $B \subset S_{(1,1)}$ of dimension $r$ such that $(N_{h_0})_{|B}$ is invertible and compute the matrices $M_j = M_{g_j/h_0} = (N_{h_0})_{|B}^{-1} \circ (N_{g_j})_{|B}$. By theorem 3.1, the eigenvalues of $M_j$ are given by $\lambda _{ji} = (g_j/h_0)(\zeta _i)$. Writing $\beta _{ij}$ for the $j$th coordinate of $\beta _i$, we have

\begin{equation*} (\lambda _{0i} : \cdots : \lambda _{mi}) = \left(\frac{\beta _{i0} h(\zeta _i)}{h_0(\zeta _i)} : \cdots : \frac{\beta _{im} h(\zeta _i)}{h_0(\zeta _i)} \right) = (\beta _{i0} : \cdots : \beta _{im}) = \beta _i. \end{equation*}

Note that if $(d,e) = (2,1)$, we can take $h = 1$.

Subsequently, we compute $\gamma _i$ by solving the linear system of equations

\begin{equation} f_1(\beta _i,y) = \cdots = f_s(\beta _i,y) = 0. \end{equation}

(3.4)

The foregoing approach requires that $(d,e) \ne (1,1)$. Otherwise $S_{(d^{\prime },e^{\prime })}=S_{(0,0)} = \mathbb {C}$, and we can only evaluate constant functions using theorem 3.1.

3.2 Computing Pre-normal Forms

As illustrated in the previous subsection, once we have computed a pre-normal form, the points $\zeta _i$ can be recovered using basic linear algebra computations. A natural next issue to address is how to compute a pre-normal form.

Our starting point is a basis $f_1, \ldots , f_s \in S_{(1,1)}$ of $\ker \mathcal {A}_{(1)}$, generating our ideal $I = \langle {\ker \mathcal {A}_{(1)}}\rangle = \left\langle {f_1,\ldots ,f_s} \right\rangle$. For any tuple $(d,e) \in \mathbb {N}^2$ such that $(d,e) \ge (1,1)$, the degree-$(d,e)$ part $I_{(d,e)}$ of $I$ is the $\mathbb {C}$-vector space spanned by

\begin{equation} \lbrace x^{a^{\prime }} y^{b^{\prime }} f_i ~|~ (|a^{\prime }|, |b^{\prime }|) = (d^{\prime },e^{\prime }),~ i = 1, \ldots , s \rbrace ~ \subset ~ S_{(d,e)}, \end{equation}

(3.5)

where $(d^{\prime },e^{\prime }) = (d-1,e-1)$. If $d = 0$ or $e = 0$, then we have $I_{(d,e)} = \lbrace 0\rbrace$. In analogy with Equation (1.1), we construct a matrix whose rows are indexed by the monomial basis elements of $S_{(d,e)}$ (i.e., the monomials $\lbrace x^ay^b ~|~ |a| = d, |b|=e \rbrace$), and whose columns are the polynomials (3.5) expanded in this basis. We denote this matrix by

\begin{equation} R_I(d,e) \in \mathbb {C}^{\mbox{HF}_S(d,e)~\times ~ s\, \mbox{HF}_S(d^{\prime },e^{\prime })}. \end{equation}

(3.6)

Such matrices represent graded resultant maps in the terminology of Section 5.5.4 in Telen [2020b]. They are multihomogeneous versions of the classical Macaulay matrices [Macaulay 1916]. We present an explicit example below in example 3.2.

Observe that the Hilbert function $\mbox{HF}_I(d,e)$ is given by the rank of $R_I(d,e)$ and $\mbox{HF}_{S/I}(d,e)$ by its corank. This follows immediately from the observation that the columns of $R_I(d,e)$ span $I_{(d,e)}$. A left nullspace matrix $N$ of $R_I(d,e)$ represents a map $S_{(d,e)} \longrightarrow S_{(d,e)}/I_{(d,e)} \simeq \mathbb {C}^{\mbox{HF}_{S/I}(d,e)}$. This has the following consequence.

Proposition 3.1.

If $(d,e) \in \mathbb {N}^2$ is such that $\mbox{HF}_{S/I}(d,e) = r$, then any left nullspace matrix $N$ of $R_I(d,e)$ represents a pre-normal form.

We conclude that a pre-normal form $N$ can be computed, for instance, from a full singular value decomposition (SVD) of $R_I(d,e)$, where $\mbox{HF}_{S/I}(d,e) = r$. This solves the problem of computing a pre-normal form, assuming that we know a degree $(d,e) \in \mathbb {N}^2$ for which $\mbox{HF}_{S/I}(d,e) = r$. The problem of finding such degrees is addressed in Section 4.

3.3 Relation to Pencil-based Algorithms

To obtain the homogeneous coordinates for $\zeta _1, \ldots , \zeta _r$ as eigenvalues of the matrices $M_{g/h_0}$, we usually have to work with pre-normal forms on $S_{(d,e)}$, where $(d,e) \ne (1,1)$ and $(d^{\prime },e^{\prime }) \ge (0,0)$. An exception is the case where $r \le m+1 \le \ell + 1$. For these tensors of very low rank, a pre-normal form $N: S_{(1,1)} \rightarrow \mathbb {C}^r$ will suffice under the mild condition that

\begin{equation} [ \beta _1 ~ \cdots ~ \beta _r] ~ \in \mathbb {C}^{(m+1) \times r} \quad \text{has rank $r$.} \end{equation}

(3.7)

The underlying reason is that vanishing at $\lbrace \zeta _1,\ldots , \zeta _r\rbrace$ gives $r$ linearly independent conditions on $S_{(1,0)}$. The proof of the following theorem is another consequence of the theory of homogeneous normal forms and is deferred to the Appendix.

Theorem 3.2 (Eigenvalue Theorem for Low Ranks).

Let $I = \left\langle {\ker \mathcal {A}_{(1)}} \right\rangle$ where $\mathcal {A}$ has rank $r \le m+1 \le n+1$ and satisfies both assumption 2.1 and Equation (3.7). Let $h_0 \in S_{(0,1)}$ be such that $h_0(\zeta _i) \ne 0$ for $i = 1\ldots , r$ and let $N : S_{(1,1)} \rightarrow \mathbb {C}^r$ be a pre-normal form on $S_{(1,1)}$. For $g \in S_{(0,1)}$, we define

\[\widetilde{N_{g}} : S_{(1,0)} \rightarrow \mathbb {C}^r \quad \text{given by } \quad \widetilde{N_{g}}(f) = N (gf).\]

We have that $\widetilde{N_{h_0}}$ has rank $r$. For any $r$-dimensional subspace $B \subset S_{(1,0)}$ such that the restriction $(\widetilde{N_{h_0}})_{|B} : B \rightarrow \mathbb {C}^r$ is invertible, the eigenvalues of $M_{y_j/h_0} = (\widetilde{N_{h_0}})_{|B}^{-1} \circ (\widetilde{N_{y_j}})_{|B}$ are $\lbrace \gamma _{ij}/h_0(\zeta _i)\rbrace _{i = 1, \ldots , r}$.

This theorem is exploited to compute $\zeta _i = (\beta _i, \gamma _i)$ efficiently as follows. If $\mathcal {A}$ satisfies assumption 2.1 and Equation (3.7), then we take a basis of the $r$-dimensional row span of $\mathcal {A}_{(1)}$ in theorem 3.2 as our pre-normal form. This can be obtained from a compact SVD of $\mathcal {A}_{(1)}$. Once the $\gamma _i$ are computed from the eigenvalues of the $M_{y_j/h_0}$, the $\beta _i$ can be obtained as in Equation (3.4). Alternatively, one can use the eigenvectors of these commuting matrices $M_{y_j/h_0}$ for $j=1,\ldots ,r$ [Telen 2020b, Theorem 5.5.3].

Theorem 3.2 is intimately related to what Beltrán et al. [2019] called pencil-based algorithms for solving Equation (CPD) when the rank satisfies $r \le m+1 \le \ell +1$, such as those by Leurgans et al. [1993]; Lorber [1985]; Sanchez and Kowalski [1990]. Recall that pencil-based algorithms assume that $\mathcal {A}\in \mathbb {C}^{r \times r \times (n+1)}$ is a rank-$r$ tensor.⁵ In addition, they assume that the $\alpha _i$ form a linearly independent set, and likewise for the $\beta _i$’s. Then, we have that the tensor contraction of $\mathcal {A}$ with $h_0^\top \in (\mathbb {C}^{n+1})^\vee$, i.e.,

\begin{align} h_0^\top \cdot _3 \mathcal {A} = \sum _{i=1}^r (\alpha _i \otimes \beta _i) \cdot (h_0^\top \gamma _i) = A D_{h_0} B^\top , \end{align}

(3.8)

is an invertible $r \times r$ matrix insofar as $h_0^\top \gamma _i \ne 0$. Herein, $A \in \mathbb {C}^{r \times r}$ (respectively, $B \in \mathbb {C}^{r \times r}$) has the $\alpha _i$’s (respectively, $\beta _i$’s) as columns, and $D_{h_0} = \mbox{diag}(h_0^\top \gamma _1, \ldots , h_0^\top \gamma _r)$. Let $\widetilde{N_{h_0}} = h_0^\top \cdot _3 \mathcal {A}$ and $\widetilde{N_{g}} = g^\top \cdot _3 \mathcal {A}$ for $h_0, g \in (\mathbb {C}^{n+1})^\vee$. Then, we have

\begin{equation*} M_{g/h_0} = \widetilde{N_{h_0}}^{-1} \widetilde{N_{g}} = B^{-\top } D_{h_0}^{-1} D_{g} B^\top , \end{equation*}

so that the points $\beta _i$ can be recovered uniquely from the matrix of eigenvectors $B^\top$, provided that $h_0^\top \gamma _i \ne 0$ for all $i=1,\ldots ,r$. The $\alpha _i$’s and $\gamma _i$’s can then be recovered from the 2-flattening; see Leurgans et al. [1993]; Lorber [1985]; Sanchez and Kowalski [1990]; Beltrán et al. [2019]; De Lathauwer [2006] for more details. With the foregoing suggestive notation, it is easy to see that the matrix of $\widetilde{N_{h_0}} : f \mapsto N(fh_0)$ with respect to the standard bases is precisely Equation (3.8). Indeed, note that since we can take $N = \mathcal {A}_{(1)}$, we have $f h_0 \simeq f \otimes h_0$ and so $N(f h_0) = \mathcal {A}_{(1)}(f \otimes h_0)$.

Pencil-based algorithms may thus be interpreted as a special case of the proposed cpd_hnf algorithm based on homogeneous normal forms when $r \le m+1 \le \ell +1$. Note that because of the numerical instabilities analyzed by Beltrán et al. [2019] caused by extracting $\beta _i$ from the eigenvectors, we prefer to extract the $\zeta _i = (\beta _i,\gamma _i)$ in a different way. We compute $\beta _i$ from the eigenvalues of $M_{x_i/h_0}$ and the corresponding $\gamma _i$ from the linear system Equation (3.4).

3.4 The Algorithm

The discussion so far is distilled into 3.1. This algorithm implements step 4 of 2.1. Note that we dropped the tilde on top of the $N_\star$’s in lines 2–7 to streamline the presentation.

The first phase of the algorithm, up to line 21, constructs the pre-normal form $N$ and chooses an $N_{h_0}$. This phase depends on whether we can invoke the more efficient theorem 3.2 ($r \le m+1$) or we need the full power of theorem 3.1. In the former case, we can take $N = \mathcal {A}_{(1)}$, while in the latter case, we need to take $N$ equal to the left null space of $R_I(d,e)$. How we choose the degree $(d,e)$ in line 9 is explained in Section 4. The matrix $R_I(d,e) \in \mathbb {C}^{\mbox{HF}_S(d,e) \times s \mbox{HF}_S(d^{\prime },e^{\prime })}$ can be constructed efficiently column-by-column without polynomial multiplication. Indeed, by Equation (3.5) it suffices to copy the coefficients of $f_i$ relative to the monomial basis of $S_{(1,1)}$ into the correct rows; see also example 3.2 below. The left null space $N$ can be extracted from the last $r$ columns of the $U$-factor in the SVD $R_I(d,e)=USV^H$, where $\cdot ^H$ denotes the conjugate transpose. In our implementation, the matrix $N_{h_0} \in \mathbb {C}^{r \times \mbox{HF}_S(1,1)}$ is chosen by sampling the coefficients of $h_0 \in S_{(d^{\prime },e^{\prime })}$ identically and independently distributed from a Gaussian distribution. With probability 1, $h_0$ satisfies $h_0(\gamma _i) \ne 0$ for all $i$; hence, this is a valid choice of $h_0$.

The next phase of the algorithm, in lines 21–22, chooses a basis $B$. Although in theory theorem 3.1 enables us to choose any $B$ such that $(N_{h_0})_{|B}^{-1}$ is invertible, Telen et al. [2018] showed that for reasons of numerical stability it is crucial to choose $B$ such that $(N_{h_0})_{|B}$ is well-conditioned. In practice, such a subspace $B$ can be found using a QR decomposition with optimal column pivoting or by using the SVD [Telen 2020b, Chapter 4]. We stated the QR approach in 3.1.

The multiplication matrices are constructed straightforwardly as the formula suggests in lines 23 to 25. Note that the upper triangular matrix $(N_{h_0})_{|B}$ does not need to be inverted explicitly, rather the system can be solved by backsubstitution.

In line 26, the matrices $M_{(h x_k)/h_0}$ need to be simultaneously diagonalized, as we have that $M_{(h x_k)/h_0} = V^{-1} \operatorname{diag}(\beta _k) V$. We compute $V$ from a random linear combination of the $M_{(h x_k)/h_0}$’s, and then extract $\beta _k$ as the diagonal entries from the (approximately) diagonalized matrix $V^{-1} M_{(h x_k)/h_0} V$.

The system in line 28 is solved efficiently by noting that the coefficients of $f_j$ can be arranged in a matrix $F_j$ of size $(m+1) \times (n+1)$, so that $f_j(x,y) = x^\top F_j y$. Hence, Equation (3.4) boils down to computing the kernel of $Ay = 0$ where the rows of $A \in \mathbb {C}^{s \times (n+1)}$ are the row vectors $\beta _i^\top F_j$. Note that by assumption 2.1, $\ker A$ is spanned by $\gamma _i$. In line 29 the obtained solution $(\beta _i,\gamma _i)$ is refined using standard Newton iterations. Here the pseudo-inverse of the Jacobian matrix of $f_1, \ldots , f_s$ is used. This matrix has full rank if and only if $(\beta _i, \gamma _i)$ has multiplicity one. This can be used to check condition (iii) in lemma 2.2.

Remark 3.1.

As pointed out to us by an anonymous referee and Bernard Mourrain, it is possible to replace the left nullspace computation in line 11 by a smaller linear system of equations. This interpretation corresponds to the flat extension of the quasi-Hankel operators in Bernardi et al. [2013]. The pre-normal form $N$ can be chosen such that $N_{h_0}$ is given by $\mathcal {A}_{(1)}$. This determines $N$ on the $\mbox{HF}_S(1,1)$-dimensional subspace $h_0 \cdot S_{(1,1)}$. For $(a^{\prime },b^{\prime })$ such that $|a^{\prime }| = d^{\prime }$ and $|b^{\prime }| = e^{\prime }$, the restriction $N_{|x^{a^{\prime }}y^{b^{\prime }} \cdot S_{(1,1)}}$ should satisfy $N_{|x^{a^{\prime }}y^{b^{\prime }} \cdot S_{(1,1)}}(x^{a^{\prime }}y^{b^{\prime }} \cdot f_i) = 0$, $i = 1, \ldots , s$. These linear conditions determine the pre-normal form $N$ on the remaining $(\mbox{HF}_S(d,e) - \mbox{HF}_S(1,1))$-dimensional vector space $\mbox{HF}_S(d,e)/ (h_0 \cdot S_{(1,1)})$.

This observation allows to perform the main linear algebra computations on a matrix of size $(\mbox{HF}_S(d,e) - \mbox{HF}_S(1,1)) \times s\, \mbox{HF}_S(d^{\prime },e^{\prime })$, which is smaller than the size of $R_I(d,e)$. Note that the number of rows is reduced by a factor of $1- \mbox{HF}_S(1,1)/\mbox{HF}_S(d,e) \approx 1 - \frac{1}{m^{d-1} n^{e-1}}$. Note that the resulting pre-normal form $N$ does not represent an orthogonal projection along $I_{(d,e)}$, and we expect that this may impact the numerical accuracy. The implementation and further investigation of this procedure are beyond the scope of this article.

Table 1.

3.5 Some Examples

We now present two illustrative examples. The first one shows how to use the techniques explained above on the tensor $\mathcal {A}$ in example 1.1.

The ideal $I$ in the previous example has the property that $\mbox{HF}_{S/I}(1+d^{\prime },1+e^{\prime }) = r$ for all $(d^{\prime },e^{\prime }) \in \mathbb {N}^2$. Our next example shows that this is not the case in general.

Example 3.3 (A Format for which HF_S/I(2,1)≠ r)

In example 3.2, we could take any $(d^{\prime },e^{\prime }) \in \mathbb {N}^2$ to compute a pre-normal form. However, it may be necessary to take bigger leaps in $\mathbb {N}^2$ such that $\mbox{HF}_{S/I}((1,1) + (d^{\prime }, e^{\prime })) = r$. As a concrete example, consider a rank-12 tensor $\mathcal {A}\in \mathbb {C}^{12} \otimes \mathbb {C}^7 \otimes \mathbb {C}^{3}$ with the decomposition

\begin{equation*} \mathcal {A}= \sum _{i=1}^{12} \alpha _i \otimes \beta _i \otimes \gamma _i, \end{equation*}

where $\beta _1, \ldots , \beta _{12}$ are the columns of a generic $7 \times 12$ matrix, $\gamma _1, \ldots , \gamma _{12}$ are the columns of a generic $3 \times 12$ matrix and $\alpha _1, \ldots , \alpha _{12}$ are the columns of any invertible $12 \times 12$ matrix. The Hilbert function of $S/I$ where $I = \left\langle {\ker \mathcal {A}_{(1)}} \right\rangle$ is shown, for small degrees, in 1. By proposition 3.1, a possible choice for $(d^{\prime },e^{\prime })$ is $(2,0)$; this is underlined in the left part of 1. Other examples are $(2,1), (2,2), (1,2), (0,4)$. Some noteworthy non-examples are $(1,0), (1,1), (0,1)$. In Section 4, we investigate choices of $(d^{\prime },e^{\prime })$ of the form $(d^{\prime },0)$. Our results will explain why, in this example, $d^{\prime } = 1$ does not work, but $d^{\prime } = 2$ does.

4 Regularity

One key step of the proposed cpd_hnf algorithm has not been investigated. As explained in the previous section (see also line 9 of 3.1), we should determine a correct degree $(d,e)$. This choice has a major impact on the computational complexity of the proposed algorithm. Indeed, it determines the dimensions of the graded resultant matrix $R_I(d,e)$ from Equation (3.6) whose left nullspace is required. The goal of this section is determining which degree $(d,e)$ is needed for theorem 3.1 to apply. From this we can then deduce our algorithm’s computational complexity.

As before, let $\mathcal {A}$ be a tensor as in Equation (A) that satisfies assumption 2.1, and let the $\mathbb {N}^2$-graded ring from Equation (2.2) be denoted by $S = \mathbb {C}[x_0, \ldots , x_m, y_0, \ldots , y_n]$. We assume that $r\gt m+1$, for otherwise theorem 3.2 applies and no choice of $(d,e)$ is required.

To compute $\beta _i$ and $\gamma _i$ via theorem 3.1, we need to compute a pre-normal form on $S_{(d,e)}$ for $(d,e) \ne (1,1)$ and $(d^{\prime },e^{\prime }) \ge (0,0)$. Proposition 3.1 tells us that we must find such a tuple $(d,e)$ for which additionally $\mbox{HF}_{S/I}(d,e) = r$, where $I$ is the ideal $\langle { \ker \mathcal {A}_{(1)}}\rangle$. Motivated by this, we make the following definition.

Definition 4.1.

For a homogeneous ideal $J \subset S$ such that $J = \langle {J_{(1,1)}}\rangle$ and $V_X(J)$ consists of $r$ points with multiplicity one, we define the regularity of $J$ to be the set

\[\mbox{Reg}(J) = \lbrace (d,e) \in \mathbb {N}^2 ~|~ (d,e) \ge (1,1) \text{ and } \mbox{HF}_{S/J}(d,e) = r\rbrace .\]

Hence, our task is to find a tuple $(d,e) \in \mbox{Reg}(I) \setminus \lbrace (1,1)\rbrace$. Recall that such a tuple exists by lemma 3.2. In this section, for given $\ell , m, n$ and $r$ satisfying (R), we conjecture an explicit formula for $d$ and $e$ so that $(d,1), (1,e) \in \mbox{Reg}(I) \setminus \lbrace (1,1)\rbrace$ for generic tensors of this format and rank. We prove it in many practical cases.

Because the results in this section are of independent interest for solving structured, overdetermined systems of polynomial equations, we formulate them in a slightly more general context. The first statement of this section, proposition 4.1, is concerned with homogeneous ideals $J$ of $S$ that are generated by elements of degree $(1,1)$. After that, we specialize to a particular type of such $(1,1)$-generated ideals. More precisely, to a tuple $Z = (\zeta _1, \ldots , \zeta _r) \in X^r$, we associate an ideal $J(Z)$, which is generated by elements of degree $(1,1)$, and we investigate its Hilbert function (4.1). In our tensor setting, we will have $J(Z) = I = \left\langle {\ker \mathcal {A}_{(1)}} \right\rangle$. After pointing out in lemma 4.1 that for $r \le mn$, most configurations $Z = (\zeta _1, \ldots , \zeta _r) \in X^r$ lead to an ideal $J(Z)$ such that $V_X(J(Z)) = \lbrace \zeta _1, \ldots , \zeta _r \rbrace$, where each of the $\zeta _i$ occurs with multiplicity one, we use 4.1 to characterize $\mbox{Reg}(J(Z))$ in theorem 4.1 4.2 1.

For $a = (a_0, \ldots , a_m) \in \mathbb {N}^{m+1}$, $b = (b_0, \ldots , b_n) \in \mathbb {N}^{n+1}$, we write $\partial _{a,b} : S \rightarrow S$ for the differential operator

\begin{equation*} \partial _{a,b} = \frac{1}{a_0! \cdots a_{m}! b_0! \cdots b_{n}!} \frac{\partial ^{|a| + |b|}}{\partial x_0^{a_0} \cdots \partial x_{m}^{a_{m}} \partial y_0^{b_0} \cdots \partial y_{n}^{b_{n}}}, \end{equation*}

such that the basis of $S_{(d,e)}^\vee$ dual to $\lbrace x^a y^b ~|~ |a| = d, |b| = e \rbrace$ is given by $\lbrace \partial _{a,b} ~|~ |a| = d, |b| = e \rbrace$. We write $e_k$ for the standard basis vector $(0, \ldots , 1, \ldots , 0)$ with a 1 in the $(k+1)$st position, such that

\begin{equation*} \partial _{e_k,e_l} = \frac{\partial ^2}{\partial x_k \partial y_l}, \qquad 0 \le k \le m,~ 0 \le l \le n. \end{equation*}

Note that this differs from the shorter notation $\partial _{kl}$ used in previous sections to have a general notation for derivatives of arbitrary order. For $(d,e) \in \mathbb {N}^2$ and $J \subset S$ a homogeneous ideal, the linear space $J_{(d,e)}^\perp \subset S_{(d,e)}^\vee$ is defined as

\begin{equation*} J_{(d,e)}^\perp = \left\lbrace v \in S_{(d,e)}^\vee ~|~ v(f) = 0 \text{ for all } f \in I \right\rbrace . \end{equation*}

It follows from basic linear algebra that $J_{(d,e)}^\perp \simeq (S/J)_{(d,e)}$, such that $\mbox{HF}_{S/J}(d,e) = \dim _{\mathbb {C}} J_{(d,e)}^\perp$. As before, we denote $(d^{\prime },e^{\prime }) = (d-1,e-1)$. To simplify the notation in this section, we will use the abbreviations

\begin{equation*} \sum _{a,b}= \sum _\begin{array}{c}|a| = d \\ |b| = e\end{array}, \qquad \sum _{a^{\prime },b^{\prime }}= \sum _\begin{array}{c}|a^{\prime }| = d^{\prime } \\ |b^{\prime }| = e^{\prime }\end{array}, \qquad \sum _{k,l}= \sum _\begin{array}{c}0 \le k \le m \\ 0 \le l \le n\end{array}. \end{equation*}

We focus on ideals $J$ that are generated by elements of degree $(1,1)$. In this case, a functional belongs to $J_{(d,e)}^\perp$ if and only if it induces functionals in $J_{(1,1)}^\perp$.

Proposition 4.1.

Let $J \subset S$ be a homogeneous ideal such that $J = \langle {J_{(1,1)}}\rangle$. An element $v = \sum _{a,b}c_{a,b} \partial _{a,b} \in S_{(d,e)}^\vee$ is contained in $J_{(d,e)}^\perp$ if and only if

\begin{equation*} \sum _{k,l}c_{a^{\prime }+e_k, b^{\prime } + e_l} \partial _{e_k,e_l} \in J_{(1,1)}^\perp \quad \text{ for all } (a^{\prime },b^{\prime }) \text{ such that } |a^{\prime }| = d^{\prime }, |b^{\prime }| = e^{\prime }. \end{equation*}

Proof.

Since $J = \langle {J_{(1,1)}}\rangle$, an element $v = \sum _{a,b}c_{a,b} \partial _{a,b} \in S_{(d,e)}^\vee$ is contained in $J_{(d,e)}^\perp$ if and only if $v(hf) = 0$ for all $f \in J_{(1,1)}$ and all $h \in S_{(d^{\prime },e^{\prime })}$. Using Leibniz’ rule, we find

\begin{equation*} 0 = v(hf) = \sum _{a,b}c_{a,b} \partial _{a,b} (hf) = \sum _{a,b}c_{a,b} \sum _{k,l}\partial _{a-e_k,b-e_l}(h) ~ \partial _{e_k,e_l}(f), \end{equation*}

with the convention that $\partial _{a,b} = 0$ whenever $\min (a) \lt 0$ or $\min (b) \lt 0$. Regrouping the terms in this expression gives

\begin{equation*} 0 = \sum _{a^{\prime },b^{\prime }}\partial _{a^{\prime },b^{\prime }}(h) \sum _{k,l}c_{a^{\prime }+e_k, b^{\prime } + e_l} \partial _{e_k,e_l}(f) \quad \text{ for all $h \in S_{(d^{\prime },e^{\prime })}, f \in J_{(1,1)}$}. \end{equation*}

This proves the statement.□

In our tensor rank decomposition setting, we are mainly interested in investigating the Hilbert function for $(1,1)$-generated ideals defined by point configurations in $X = \mathbb {P}^m \times \mathbb {P}^n$. To that end, fix $r$ points $Z = (\zeta _1, \ldots , \zeta _r) \in X^r$ and let $\zeta _i = (\beta _i, \gamma _i)$. We denote $w_i = (\beta _i \otimes \gamma _i)^\top \in S_{(1,1)}^\vee$ such that $w_i(f) = f(\beta _i,\gamma _i)$ for $f \in S_{(1,1)}$.⁶ In coordinates, the $w_i$ are $w_i = \sum _{k,l}\beta _{ik} \gamma _{il} \partial _{e_k,e_l}.$ To the point configuration $Z$, we associate an ideal $J(Z) \subset S$ by setting

\begin{equation*} J(Z)_{(1,1)}^\perp = \mbox{span}_{\mathbb {C}}(w_1, \ldots , w_r) \quad \text{and} \quad J(Z) = \left\langle { J(Z)_{(1,1)} } \right\rangle . \end{equation*}

Note that the ideal $I = \left\langle {\ker {\mathcal {A}_{(1)}}} \right\rangle$ from previous sections arises in this way.⁷ We denote $W \subset X^r$ for the Zariski-open subset in which $w_1, \ldots , w_r$ are $\mathbb {C}$-linearly independent. If $r \le \mbox{HF}_S(1,1)$, then $W \subset X^r$ is non-empty and therefore dense in the Euclidean topology.⁸ We have the following consequences of proposition 4.1.

Corollary 4.1.

Let $Z = (\zeta _1, \ldots , \zeta _r) \in X^r$ and let $w_1, \ldots , w_r$ and $J(Z)$ be as above. For the maps $\iota : S_{(d,e)}^\vee \hookrightarrow S_{(1,1)}^\vee \otimes S_{(d^{\prime },e^{\prime })}^\vee$ and $\mathcal {M}: (S_{(d^{\prime },e^{\prime })}^\vee)^r \rightarrow S_{(1,1)}^\vee \otimes S_{(d^{\prime },e^{\prime })}^\vee$ given by

\begin{equation*} \iota \left(\sum _{a,b}c_{a,b} \partial _{a,b} \right) = \sum _{a^{\prime },b^{\prime }}\sum _{k,l}c_{a^{\prime }+e_k, b^{\prime } + e_l} \partial _{e_k, e_l} \otimes \partial _{a^{\prime },b^{\prime }}, \quad \mathcal {M}(v_1, \ldots , v_r) = \sum _{i=1}^r w_i \otimes v_i, \end{equation*}

we have $\iota (J(Z)_{(d,e)}^{\perp }) = \operatorname{\mbox{im}}\iota \, \cap \, \operatorname{\mbox{im}}\mathcal {M}$. Moreover, if $Z \in W$, then we have that $\mbox{HF}_{S/J(Z)}(d,e) = \dim _{\mathbb {C}} (\mathcal {M}^{-1}(\operatorname{\mbox{im}}\iota)),$ where

\begin{equation*} \mathcal {M}^{-1}(\operatorname{\mbox{im}}\iota) = \left\lbrace (v_1,\ldots ,v_r) \in (S_{(d^{\prime },e^{\prime })}^{\vee })^r ~|~ \mathcal {M}(v_1,\ldots ,v_r) \in \operatorname{\mbox{im}}\iota \right\rbrace . \end{equation*}

Proof.

By proposition 4.1, $v = \sum _{a,b}c_{a,b} \partial _{a,b}$ is an element of $J(Z)_{(d,e)}^\perp$ if and only if for all $a^{\prime },b^{\prime }$ such that $|a^{\prime }| = d^{\prime }, |b^{\prime }| = e^{\prime }$, there exist $v_{i,a^{\prime },b^{\prime }} \in \mathbb {C}$ such that

\begin{equation} \sum _{k,l}c_{a^{\prime }+e_k, b^{\prime } + e_l} \partial _{e_k,e_l} = \sum _{i=1}^r v_{i,a^{\prime },b^{\prime }} w_i. \end{equation}

(4.1)

Writing $v_i = \sum _{a^{\prime },b^{\prime }}v_{i,a^{\prime },b^{\prime }} \partial _{a^{\prime },b^{\prime }} \in S_{(d^{\prime },e^{\prime })}^\vee$ and writing Equation (4.1) in matrix format, we see that Equation (4.1) is equivalent to the following equality in $S_{(1,1)}^\vee \otimes S_{(d^{\prime },e^{\prime })}^\vee$:

\begin{equation} \sum _{a^{\prime },b^{\prime }}\sum _{k,l}c_{a^{\prime }+e_k, b^{\prime } + e_l} \partial _{e_k, e_l} \otimes \partial _{a^{\prime },b^{\prime }} = \sum _{i=1}^r w_i \otimes v_i. \end{equation}

(4.2)

Linear independence of $w_1, \ldots , w_r$ implies that $\mathcal {M}$ is injective. We have that injectivity of $\iota$ and $\mathcal {M}$ implies along with $\mbox{HF}_{S/J(Z)}(d,e) = \dim _{\mathbb {C}} J(Z)_{(d,e)}^\perp$ that

\begin{equation*} \dim _{\mathbb {C}} J(Z)_{(d,e)}^\perp = \dim _{\mathbb {C}} \iota \left(J(Z)_{(d,e)}^\perp \right) = \dim _{\mathbb {C}} (\operatorname{\mbox{im}}\iota \, \cap \, \operatorname{\mbox{im}}\mathcal {M}) = \dim _{\mathbb {C}} \left(\mathcal {M}^{-1}(\operatorname{\mbox{im}}\iota) \right). \end{equation*}

This concludes the proof.□

Corollary 4.2.

Let $Z = (\zeta _1, \ldots , \zeta _r) \in W \subset X^r$ and $(d, e) \ge (1,1)$. Then, $\mbox{HF}_{S/J(Z)}(d,e) \ge r$.

Proof.

The statement follows from Lemma 5.5.7 in Telen [2020b]. Nevertheless, we give an instructive proof. Let $w_i^{\prime } \in S_{(d^{\prime },e^{\prime })}^\vee , w_i^{\prime \prime } \in S_{(d,e)}^\vee$ be given by $w_i^{\prime }(f) = f(\beta _i,\gamma _i), w_i^{\prime \prime }(g) = g(\beta _i,\gamma _i)$ for $f \in S_{(d^{\prime },e^{\prime })}, g \in S_{(d,e)}$. Then, $\iota (w_i^{\prime \prime }) = w_i \otimes w_i^{\prime }$ and thus

\begin{equation*} \mathcal {M}\left(w_1^{\prime }, 0, \ldots , 0\right),~ \mathcal {M}\left(0, w_2^{\prime }, \ldots , 0\right), ~\ldots , ~\mathcal {M}\left(0,0,\ldots , w_r^{\prime }\right) \end{equation*}

are all contained in $\operatorname{\mbox{im}}\iota$. Therefore, $\mathcal {M}^{-1}(\operatorname{\mbox{im}}\iota)$ contains at least $r$ linearly independent elements, so by 4.1, we have $\mbox{HF}_{S/J(Z)}(d,e) \ge r$.□

The space $S_{(1,1)}^\vee \otimes S_{(d^{\prime },e^{\prime })}^\vee$ is identified with the space of matrices of size $\mbox{HF}_S(1,1) \times \mbox{HF}_S(d^{\prime },e^{\prime })$, where the rows are indexed by $\partial _{e_k,e_l}$ for $0 \le k \le m$, $0 \le l \le n$ and columns are indexed by $\partial _{a^{\prime },b^{\prime }}$ where $(a^{\prime },b^{\prime }) \in \mathbb {N}^{m+1} \times \mathbb {N}^{n+1}$ with $|a^{\prime }| = d^{\prime }, |b^{\prime }| = e^{\prime }$. For such a matrix to be contained in $\operatorname{\mbox{im}}\iota$, a collection of partial symmetry conditions needs to be satisfied. For instance, if $(a^{\prime }+e_k, b^{\prime }+e_l) = (a^{\prime \prime } + e_{k^{\prime }}, b^{\prime \prime } + e_{l^{\prime }})$, then the entry in the row indexed by $\partial _{e_k,e_l}$ and column indexed by $\partial _{a^{\prime },b^{\prime }}$ should be equal to the entry in the row indexed by $\partial _{e_{k^{\prime }},e_{l^{\prime }}}$ and column indexed by $\partial _{a^{\prime \prime }, b^{\prime \prime }}$ (see example 4.1). Matrices in $\operatorname{\mbox{im}}\iota$ are called catalecticant matrices [Iarrobino and Kanev 1999, Definition 1.3].

We can use 4.1 to compute the Hilbert function of $S/J(Z)$ via a rank computation of a matrix whose entries are monomials evaluated at the points $(\beta _i,\gamma _i)$. This is important for our proof of theorem 4.2. It is most easily explained by means of an example.

From the discussion in example 4.1, we would like to conclude that, generically, $(2,1) \in \mbox{Reg}(J(Z))$. For this to make sense, i.e., to apply definition 4.1, we need to show that for most configurations $Z$, $J(Z)$ defines $r$ points with multiplicity one. By an argument analogous to lemma 2.2, this happens for small enough ranks.

Lemma 4.1.

Suppose that $r \le mn$ (this is Equation (R) for $\ell = \infty$). There is a Zariski open, dense subset $U \subset X^r$ such that for all $Z = (\zeta _1, \ldots , \zeta _r) \in U$, $w_1, \ldots , w_r$ are $\mathbb {C}$-linearly independent and $V_X(J(Z)) = \lbrace \zeta _1,\ldots ,\zeta _r\rbrace$ consists of $r$ points with multiplicity one.

Proof.

As before, let $\zeta _i = (\beta _i, \gamma _i), i = 1, \ldots , r$ and $w_i = (\beta _i \otimes \gamma _i)^\top \in S_{(1,1)}^\vee$. By Theorem 2.5 in Chiantini and Ciliberto [2006], there is an open dense subset $U^{\prime } \subset X^r$ such that for $Z = (\zeta _1, \ldots , \zeta _r) \in U^{\prime }$, $\mbox{span}_{\mathbb {C}}(w_1,\ldots ,w_r)$ contains no points of the form $(\beta \otimes \gamma)^\top$, other than the $w_i$. We set $U = U^{\prime } \cap W$, which is open and dense in $X^r$. The rest of the proof is identical to that of lemma 2.2.□

Our next goal is to show that, to prove that $(d,e) \in \mbox{Reg}(J(Z))$ for almost all $Z \in X^r$, it suffices to find one particular instance $Z^* \in W$ for which $\mbox{HF}_{S/J(Z^*)}(d,e) = r$.

Proposition 4.2.

Suppose $r \le \mbox{HF}_S(1,1)$ such that $W \ne \emptyset$. For fixed $(d,e) \ge (1,1)$, the Hilbert function $\mbox{HF}_{S/J(Z)}(d,e)$, as a function of $Z$, is upper semicontinuous on $W$. That is, for any $r^* \in \mathbb {N}$,

\[V_{r^*} = \lbrace Z \in W ~|~ \mbox{HF}_{S/J(Z)}(d,e) \gt r^* \rbrace\]

is Zariski closed in $W$. Consequently, either $V_{r^*} = W$ or $V_{r^*} \subsetneq W$ is a strict closed subvariety.

Proof.

We have $\mbox{HF}_{S/J(Z)}(d,e) = \mbox{HF}_S(d,e) - \mbox{HF}_{J(Z)}(d,e)$, and $J(Z)_{(d,e)} = S_{(d^{\prime },e^{\prime })} \cdot J(Z)_{(1,1)}$. Therefore, the condition $\mbox{HF}_{S/J(Z)}(d,e) \gt r^*$ is equivalent to $\dim _{\mathbb {C}} (S_{(d^{\prime },e^{\prime })} \cdot J(Z)_{(1,1)}) \lt \mbox{HF}_S(d,e) - r^*$, which can be written as the vanishing of the $(\mbox{HF}_S(d,e) - r^*)$-minors of a matrix, whose entries are monomials in the coordinates of $Z$.□

Corollary 4.3.

Suppose that $r \le mn$ and $(d,e) \ge (1,1)$. Let $U \subset X^r$ be the dense open subset from lemma 4.1. If for some element $Z^* \in W$ we have $\mbox{HF}_{S/J(Z^*)}(d,e) = r$, then there is a Zariski open, dense subset $U^\circ$ of $U$ such that for all $Z \in U^\circ$, we have $(d,e) \in \mbox{Reg}(J(Z))$.

Proof.

By corollary 4.2, we know $V_{r-1} = W$. Since $Z^* \in W \setminus V_r$, proposition 4.2 implies that $V_r \subset W$ is a strict subvariety. We set $U^\circ = U \setminus V_r$. Clearly, if $Z = (\zeta _1, \ldots , \zeta _r) \in U^\circ$, then $J(Z)$ is such that $V_X(J(Z)) = \lbrace \zeta _1, \ldots , \zeta _r \rbrace$, where these points occur with multiplicity one (this uses $U^\circ \subset U$), and $\mbox{HF}_{S/J(Z)} = r$ (since $U^\circ \subset V_{r-1} \setminus V_r$).□

This result will be particularly useful for proving theorem 4.2 below. First, we investigate which combinations of $m,n,d,e,r$ are possible to have $\mbox{HF}_{S/J(Z)}(d,e) = r$ for generic points $Z$.

For $(d,e) \ge (1,1)$ and $m,n \in \mathbb {N}_0$, we define

\begin{equation*} \mathcal {R}(m,n,(d,e)) = \frac{\mbox{HF}_S(1,1) \mbox{HF}_S(d^{\prime },e^{\prime }) - \mbox{HF}_S(d,e)}{\mbox{HF}_S(d^{\prime },e^{\prime })-1}. \end{equation*}

If $(d,e) = (1,1)$, then we set $\mathcal {R}(m,n,(d,e)) = \infty$. For given $m,n$ and $(d,e) \ge (1,1)$, the following result shows that $\mathcal {R}(m,n,(d,e))$ bounds the rank $r$, for which we could possibly have $\mbox{HF}_{S/J(Z)}(d,e) = r$ for $Z \in W \subset X^r$.

Theorem 4.1.

Let $(d,e) \ge (1,1)$ and $Z \in W \subset X^r = (\mathbb {P}^m \times \mathbb {P}^n)^r$, with

\begin{equation} \mathcal {R}(m,n,(d,e)) \lt r \le mn. \end{equation}

(4.5)

We have $\mbox{HF}_{S/J(Z)}(d,e) \gt r$. In particular, for $r$ in the range (4.5) and $Z \in U$, where $U \subset W$ is the open subset from lemma 4.1, we have $(d,e) \notin \mbox{Reg}(J(Z))$.

Proof.

Since $\mbox{HF}_{S/J(Z)}(d,e) = \mbox{HF}_S(d,e) - \mbox{HF}_{J(Z)}(d,e) \ge \mbox{HF}_S(d,e) - \mbox{HF}_S(d^{\prime },e^{\prime }) \mbox{HF}_{J(Z)}(1,1)$ and $\mbox{HF}_{J(Z)}(1,1) = \mbox{HF}_S(1,1) - r$, we have

\begin{equation} \mbox{HF}_{S/J(Z)}(d,e) \ge \mbox{HF}_S(d,e) + r \mbox{HF}_S(d^{\prime },e^{\prime }) - \mbox{HF}_S(1,1) \mbox{HF}_S(d^{\prime },e^{\prime }). \end{equation}

(4.6)

Solving $\mbox{HF}_S(d,e) + r \mbox{HF}_S(d^{\prime },e^{\prime }) - \mbox{HF}_S(1,1) \mbox{HF}_S(d^{\prime },e^{\prime }) \gt r$ yields the first inequality in Equation (4.5).□

Note that if $(d,e) = (1,1)$, the range (4.5) is empty. This agrees with the fact that $(1,1) \in \mbox{Reg}(J(Z))$ for all $Z \in W$. From corollary 4.2, we know that $\mbox{HF}_{S/J(Z)}(d,e) \ge r$ for $(d,e) \ge (1,1)$ and $Z \in W$. Combining this with a dimension argument as in the proof of theorem 4.1, we see that

\begin{equation} \mbox{HF}_{S/J(Z)}(d,e) \ge \max \lbrace r~,~ \mbox{HF}_S(d,e) + r \mbox{HF}_S(d^{\prime },e^{\prime }) - \mbox{HF}_S(1,1)\mbox{HF}_S(d^{\prime },e^{\prime }) \rbrace . \end{equation}

(4.7)

For $e^{\prime } = 0$ or $d^{\prime } = 0$, we observe experimentally that equality holds for generic configurations $Z = (\zeta _1, \ldots , \zeta _r) \in X^r$. In this case, it suffices to check for which $r$ the maximum equals $r$. This leads us to the following conjecture.

Conjecture 1.

Let $d \ge 1$ and let $m,n,r$ be such that

\begin{equation} r \le \min \left\lbrace \mathcal {R}(m,n,(d,1)) ~ , ~mn \right\rbrace . \end{equation}

(4.8)

There is a Zariski open, dense subset $U^\circ \subset U \subset X^r = (\mathbb {P}^m \times \mathbb {P}^n)^r$, were $U$ is the open subset from lemma 4.1, such that for all $Z \in U^\circ$, $(d,1) \in \mbox{Reg}(J(Z))$.

Theorem 4.2.

Conjecture 1 holds in the following cases:

(1)

$m \in \mathbb {N}_0, ~n \in \mathbb {N}_0$ and $d = 1$,

(2)

$m \in \lbrace 1,2\rbrace , ~ n \in \mathbb {N}_0$ and $d = 2$,

(3)

$2 \le m+1, n+1 \le 50$ and $d = 2$,

(4)

$2 \le m+1, n+1 \le 9$ and $3 \le d \le 5$,

(5)

$2 \le m+1,n+1 \le 6$ and $6 \le d \le 10$.

Note that the role of $m$ and $n$ in conjecture 1 can be interchanged, implying the analogous statement that for $e \ge 1$ and generic $Z \in W \subset X^r = (\mathbb {P}^m \times \mathbb {P}^n)^r$ with $r \le \min \left\lbrace \mathcal {R}(m,n,(1,e)), mn \right\rbrace ,$ we have $(1,e) \in \mbox{Reg}(J(Z))$.

Let $\mathcal {A}\in \mathbb {C}^{\ell + 1} \otimes \mathbb {C}^{m + 1} \otimes \mathbb {C}^{n+1}$ be a general tensor whose rank satisfies Equation (R) and let $I = \langle { \ker \mathcal {A}_{(1)}}\rangle = J(Z)$, where $Z = ((\beta _i, \gamma _i))_{i=1, \ldots , r}$. A consequence of conjecture 1 would be that $(d,1) \in \mbox{Reg}(I)$, for any $d$ such that $\mathcal {R}(m,n,(d,1)) \ge r$ and $(1,e) \in \mbox{Reg}(I)$ for any $e$ such that $\mathcal {R}(m,n,(1,e)) \ge r$.

Example 4.2 (3.3, Continued).

In 3.3, we had $m = 6$ and $n = 2$. We have $\mathcal {R}(6,2,(2,1)) = 21/2 \lt r = 12$, which explains why $(2,1) \notin \mbox{Reg}(I)$. Also, $\mathcal {R}(6,2,(3,1)) = 112/9 \ge 12$ implies, since $\mathcal {A}$ is generic, that $(3,1) \in \mbox{Reg}(I)$. Similarly, the smallest $e \gt 1$ for which $\mathcal {R}(6,2,(1,e)) \ge 12$ is $e = 5$, so that $(1,5) \in \mbox{Reg}(I)$, but $(1,e) \notin \mbox{Reg}(I)$ for $1 \lt e \lt 5$. These computations are confirmed by 1.

One can check that

\begin{align} \mathcal {R}(m,n,(d,1)) &= \frac{\binom{m+d-1}{d-1}}{\binom{m+d-1}{d-1} - 1} \left((m+1)(n+1) - \frac{n+1}{d}(m+d) \right). \end{align}

(4.9)

The first factor is always greater than 1 and the second factor is at least $mn$ if $d \ge n+1$. If conjecture 1 holds, then for all tensors $\mathcal {A}$ of rank $r$ in the range (R), we have $(n+1,1) \in \mbox{Reg}(I)$. In that case, cpd_hnf can take $(d,e) = (n+1,1)$ to treat all identifiable rank-$r$ tensors in the unbalanced regime. That is, when $\ell \gt mn$. For such shapes, Bocci et al. [2014] proved that generic $r$-identifiability holds up to $r \le mn$.

Conjecture 1 gives a way of finding degrees of the form $(d,1) \in \mbox{Reg}(I) \setminus \lbrace (1,1)\rbrace$. There might exist other tuples of the form $(d,e) \in \mbox{Reg}(I) \setminus \lbrace (1,1)\rbrace$, which could lead to a lower computational cost. For such degrees, Equation (4.7) may be a strict inequality, which means that other tools are needed. We leave the further exploration of effective bounds for the regularity for future research.

5 Computational Complexity

One motivation for studying the regularity of the ideal $I = \langle \ker \mathcal {A}_{(1)} \rangle$ is to understand the computational complexity of 2.1.

Assume that we are given an $r$-identifiable tensor $\mathcal {A}$ in $\mathbb {C}^{\ell +1} \otimes \mathbb {C}^{m+1}\otimes \mathbb {C}^{n+1}$, where $\ell \ge m\ge n$. We additionally assume that $\ell +1 \le (m+1)(n+1)$.¹⁰ The size of the input will be denoted by $M = (\ell +1)(m+1)(n+1)$. Because of the constraint $(m+1)(n+1) \ge \ell +1 \ge m+1 \ge n+1$, we have

\begin{equation*} M^{\frac{1}{4}} \le m+1 \le M^{\frac{1}{2}} \text{ and } 2 \le n+1 \le M^{\frac{1}{3}}. \end{equation*}

Since cpd_hnf applies only if the rank $r$ satisfies Equation (R), we have $r = \mathcal {O}(M^\frac{5}{6})$.

Without going into details, it can be verified that a crude upper bound for the computational complexity of the steps in 2.1, excluding step 4, is $\mathcal {O}(M^3)$.

The key contribution to the time complexity originates from 3.1. Its complexity is ultimately determined by the choice of $(d,e)$ in line 9 of the algorithm. We analyze what happens when the degree is selected as $(d,1)$. In this case, bounding the size of the matrix $R_I(d,e)$ in 3.1 is critical. Assuming conjecture 1 holds, we have

\begin{equation} \left(\frac{3}{2} \right) ^{d+1} \le \mbox{HF}_S(d,1) = \binom{m+d}{d} (n+1) \le M^{\frac{5}{6}d} M^{\frac{1}{3}}. \end{equation}

(5.1)

The upper bound follows from $\binom{m+d}{d} \le (m+d)^d \le (2(m+1))^d \le ((m+1)(n+1))^d$. For establishing the lower bound, note that $(1 + \frac{m}{d})^d \le \binom{m+d}{d}$ and use the facts that $d \le n+1 \le m+1$ (see the discussion below Equation (4.9)), so that $d\ge 2$ implies that $\frac{m}{d} \ge \frac{1}{2}$. Computing a basis for the left null space of $R_{I}(d,1)$ requires at most $\mathcal {O}((\mbox{HF}_S(d,1))^3)$ and at least $r \cdot \mbox{HF}_S(d,1)$ operations, as we know $(d,1)\in \mbox{Reg}(I)$ so that the dimension of the left null space is $r$. Consequently, line 11 has a time complexity that is at least exponential in $d$.

Proposition 5.1.

Consider the concise tensor space $\mathbb {C}^{M^{\frac{1}{2}}}\otimes \mathbb {C}^{M^{\frac{1}{4}}}\otimes \mathbb {C}^{M^{\frac{1}{4}}}$. If conjecture 1 holds, then for a generic tensor of rank $r=M^{\frac{1}{2}} - 2M^{\frac{1}{4}} - 1$ in this space, the asymptotic time complexity of 2.1 is at least exponential in the input size $M = M^{\frac{1}{2}} M^{\frac{1}{4}} M^{\frac{1}{4}}$, if the degrees are restricted to $(d,1)$ or $(1,e)$.

Proof.

It follows from Equation (4.9) and theorem 4.1 that for sufficiently large $M$, the degree $d$ should be at least $\frac{1}{2} M^{\frac{1}{4}}$. Combining this with the above discussion about the size of $R_I(d,1)$ concludes the proof.□

One might conclude from this result that cpd_hnf is not an effective algorithm for tensor rank decomposition. However, Hillar and Lim [2013] proved that computing the rank of a tensor over $\mathbb {C}$ is an NP-hard problem. Hence, a polynomial-time algorithm that applies to all inputs for this problem is not anticipated. A typical instance of an NP-hard problem can often be solved more efficiently than the worst-case instance. Theorem 1.1 is in this spirit.

Proof of 1.1

It follows from the upper bound in Equation (5.1) that the time complexity of 2.1 is $\mathcal {O}((\mbox{HF}_S(d,1))^3) = \mathcal {O}(M^{\frac{5}{2}d + 1})$.

We determine the smallest degree $d$ under conjecture 1 such that $(d,1) \in \mbox{Reg}(I)$, where $I=\langle \ker \mathcal {A}_{(1)} \rangle$. From theorem 4.1, we know that a necessary condition for this is that $\mathcal {R}(m,n,(d,1)) \ge r = \phi mn$. The last inequality is implied by $(m+1)(n+1) - \frac{n+1}{d}(m+d) \ge \phi mn$ because of Equation (4.9). This is equivalent to

\begin{equation*} (1 - \phi) mn + m + n + 1 \ge \frac{1}{d}(m+d)(n+1) = \frac{1}{d} m (n+1) + n+1. \end{equation*}

Hence, $\mathcal {R}(m,n,(d,1)) \ge \phi mn$ is implied by

\begin{equation*} d \ge \frac{1}{1 - \phi } \gt (n+1) \frac{1}{(1 - \phi)n + 1} = \frac{m(n+1)}{(1-\phi)mn + m}. \end{equation*}

In other words, provided conjecture 1 holds, it suffices to take $d = \lceil \frac{1}{1-\phi } \rceil$ to be able to cover all ranks up $\phi mn$. As $\phi \ne 1$ is a constant, this proves the result.□

We can observe that the formula $d \ge \frac{1}{1-\phi }$ is asymptotically sharp in the sense that as $\phi \rightarrow 1$ the exponent in $\mathcal {O}(M^\star)$ needs to blow up to $\infty$ and no polynomial in the input size can control the growth. Indeed, proposition 5.1 shows that exactly when $\phi =1$ there exist cases that require at least an exponential growth. Note that by the discussion below Equation (4.9), assuming conjecture 1, we can use $d = n+1$ for any $\phi \in [0,1]$. This gives the universal, exponential complexity bound $\mathcal {O}(M^{\frac{5}{2}n + \frac{7}{2}})$.

6 Numerical Experiments

We present several numerical results demonstrating the effectiveness of cpd_hnf. All experiments were performed on KU Leuven/UHasselt’s Tier-2 Genius cluster of the Vlaams Supercomputer Centrum (VSC). Specifically, the supercomputer’s scheduling software allocated standard skylake nodes containing two Xeon Gold 6140 CPUs (18 physical cores, 2.3 GHz clock speed, 24.75 MB L3 cache) with 192 GB of main memory, as well as standard cascadelake nodes, which are equipped with two Xeon Gold 6240 CPUs (18 physical cores, 2.6 GHz clock speed, 24.75 MB L3 cache) with 192 GB of main memory for our experiments. In all experiments, we allowed all algorithms to use up to 18 physical cores.

The proposed algorithm was implemented in Julia v1.4.0, relying on the non-base packages Arpack.jl, DynamicPolynomials.jl, GenericSchur.jl, and MultivariatePolynomials.jl.¹¹ Our implementation follows the pseudocode in 2.1 and 3.1 and the detailed discussion in Section 3.4 quite closely. The Julia code of cpd_hnf, including driver routines to reproduce cpd_hnf’s experimental results can be found at https://gitlab.kuleuven.be/u0072863/homogeneous-normal-form-cpd.

In the experiments below, random rank-$r$ tensors $\mathcal {A}= \sum _{i=1}^r \alpha _i \otimes \beta _i \otimes \gamma _i$ are generated by randomly sampling the elements of $\alpha _i \in \mathbb {R}^{\ell +1}$, $\beta _i \in \mathbb {R}^{m+1}$, and $\gamma _i \in \mathbb {R}^{n+1}$ identically and independently distributed (i.i.d.) from a standard normal distribution.

6.1 Implementation Details

The algorithm is implemented for real and complex input tensors. In the former case, all computations are performed over the reals with the exception of the computation of the left null space of $R_I(d,e)$. In the real case, the algorithm continues with the real part of the output of this step.

At the start of the algorithm, we compress the $(\ell +1) \times (m+1) \times (n+1)$ input tensor, which is to be decomposed into $r$ rank-1 terms, to a $\min \lbrace \ell +1,r\rbrace \times \min \lbrace m+1,r\rbrace \times \min \lbrace n+1,r\rbrace$ tensor. For this, we apply ST-HOSVD compression [Vannieuwenhoven et al. 2012] with truncation rank $(\min \lbrace \ell +1,r\rbrace , \min \lbrace m+1,r\rbrace , \min \lbrace n+1,r\rbrace)$. Most of the experiments below are chosen so that this step performs no computations.

The kernel of $\mathcal {A}_{(1)}$ in 2.1 is computed by an SVD. The linear system at the end of 2.1 is solved by computing the Khatri–Rao product $K = [\beta _i \otimes \gamma _i]_{i=1}^r$ and then solving the overdetermined linear system $K A = \mathcal {A}_{(1)}^\top$ for $A$. The rows of $A$ then correspond to the $\alpha _i$’s.

In 3.1, the degree $(d,e)$ is determined automatically by assuming conjecture 1 is true and selecting a valid $(d,1)$ or $(1,e)$ based on a heuristic that takes into account the estimated computational cost and numerical considerations.¹²

The key computational bottleneck is the computation of the left nullspace of $R_I(d,e)$ in line 11 of 3.1. When the number of entries of $R_I(d,e)$ is smaller than 10,000, we use the standard SVD-based kernel computation. In larger cases, for efficiency, we propose to employ Arpack.jl’s eigs function to extract the left null space from the Hermitian Gram matrix $G = R_I(d,e) (R_I(d,e))^H$, where $^H$ denotes the Hermitian transpose. The eigs function (with parameters tol = 1e-6 and maxiter = 25) extracts the $r$ eigenvalues of smallest modulus and their corresponding eigenvectors. Note that Arpack.jl employs a Bunch–Kaufman factorization [Bunch and Kaufman 1977] of the positive semidefinite input matrix $G$ to perform its Lanczos iterations. Using eigs was up to $75\%$ faster than the SVD-based approach for large problems.

The final step of 3.1 is implemented as discussed at the end of Section 3.4. The kernel is computed with an SVD and $\gamma _i$ is taken as the left singular vector corresponding to the smallest singular value. After this step, we find $(\beta _i, \gamma _i)$. Observe that $f_j(\beta _i, \gamma _i)$ should vanish exactly for all $j = 1, \ldots , s$. We propose to refine the approximate roots $(\beta _i, \gamma _i)$ by applying three Newton iterations on $f_1 = \cdots = f_s = 0$. This adds a minor computational overhead of $r(m+1)(n+1)(m+n)^2$.

6.2 Impact on the Accuracy of Computational Optimizations

We compare and evaluate the impact on numerical accuracy of four variants of 3.1. Two options are considered in each combination:

(i)

refining the approximate roots $(\beta _i,\gamma _i)$ with 3 Newton steps (+newton) or not, and

(ii)

computing the left nullspace of $R_I(d,e)$ with Arpack.jl’s iterative eigs method applied to the Gram matrix or using the standard svd.

We append _eigs, _eigs+newton, _svd, and _svd+newton to cpd_hnf to distinguish the variants.

The experimental setup is as follows. Random rank-$r$ tensors are generated as explained at the start of this section, sampling one rank-$r$ tensor for each combination $(\ell +1,m+1,n+1,r)$ with $25 \ge m + 1 \ge n+1 \ge 2$ and $\ell +1=r=\min \lbrace \mathcal {R}(m,n,(2,1)),mn\rbrace$ where $\mathcal {R}$ is as in Equation (4.9). For each variant of cpd_hnf, the relative backward error$\Vert \mathcal {A}- \sum _{i=1}^r \alpha _i^{\prime } \otimes \beta _i^{\prime } \otimes \gamma _i^{\prime } \Vert _F \Vert \mathcal {A}\Vert _F^{-1},$ and execution time is recorded, where $(\alpha _i^{\prime }, \beta _i^{\prime }, \gamma _i^{\prime })$ are the estimates obtained by the algorithm.

The experiment was performed on cascadelake nodes and the results are displayed in Figure 1. It is visually evident in the right panels that the Newton refinement improves the overall relative backward error for both eigs and svd by several orders of magnitude and brings both to seemingly the same level of about $10^{-16}$–$10^{-12}$. However, there is a marked difference in accuracy between eigs and svd before Newton refinement, the svd being more accurate by about 1–3 orders of magnitude. It is interesting to compare these results with the leftmost panel of Figure 2(b), which shows the corresponding results on the same tensors for a conceptually similar state-of-the-art method from the literature. The method also requires a kernel computation that, in the displayed graph, is performed with a singular value decomposition. This means that the methods in Figure 1(c) and the left panel of Figure 2(b) can be directly compared. It can be seen that the proposed cpd_hnf_svd method, even without Newton refinement, can be up to 3 orders of magnitude more accurate (in relative backward error). The differences in accuracy are thus not exclusively attributable to the Newton refinement.

Fig. 1.

Fig. 2.

We did not include a visualization of the timings, because they visually look very similar. By adding the Newton refinement, the total execution time over all cases only modestly increased: $3.4\%$ for eigs and $3.8\%$ for svd. The relative increase was larger for the small cases while it was close to 0–5% for the large cases ($n+1 \ge 20$). Given that the accuracy can be improved by orders of magnitude, we conclude from these experiments that a few steps of Newton refinement of the approximate roots $(\beta _i,\gamma _i)$ is highly recommended.

After deciding that Newton refinement is our default choice and observing in the right panels of Figure 1 there is no significant difference in accuracy between the eigs and svd, we investigate their relative computational performance. In terms of total execution time over all cases, the eigs variant is 29.1% faster. For the largest case this increases to a 40.9% reduction in execution time.

From the above experiments, we conclude that the variant with an iterative eigensolver and Newton refinement is an appropriate default choice. In the remainder of the experiments, cpd_hnf refers to cpd_hnf_eigs+newton, the variant using Arpack’s iterative eigenvalue method applied to the Gram matrix combined with 3 Newton steps to refine the approximate roots.

6.3 Comparison with the State of the Art

In the second experiment, we compare cpd_hnf with the current state of the art in direct numerical methods for tensor rank decomposition in terms of accuracy and computational performance. The algorithm developed by Domanov and De Lathauwer [2017];, 2014] is an advanced direct numerical method that shares several high-level characteristics with cpd_hnf:

(i)

Both methods assume the target rank $r$ is supplied to the decomposition algorithm. They operate in a similar regime of ranks and are able to treat the full range of generically $r$-identifiable tensors in unbalanced tensor spaces. The method by Domanov and De Lathauwer can even deal with some ranks $r \gt \ell +1 \ge m+1 \ge n+1$, while our algorithm cannot.

(ii)

Like our method, Domanov and De Lathauwer rely on a simultaneous diagonalization procedure to extract one or two of the factor matrices.

(iii)

To deal with high ranks, both methods rely on the construction of an auxiliary matrix whose size is parameterized by an integer. By increasing this integer, the range of ranks that is covered by the algorithms is broadened at the cost of a substantially increased computational complexity. In both algorithms, the asymptotic computational complexity is determined by the cost of computing the kernel of this auxiliary matrix. Contrary to Domanov and De Lathauwer [2017];, 2014], we are able to give a precise connection between this integer and the range of ranks we can cover (subject to conjecture 1).

(iv)

When $\ell +1 \ge m+1 \ge r$, both approaches can be considered pencil-based algorithms. For this reason, our comparison will focus on the case where $\ell +1 \ge r \ge m+1 \ge n+1$.

A Matlab implementation of Domanov and De Lathauwer’s algorithm [Domanov and De Lathauwer 2017] is available as part of the Tensorlab+ repository [Hendrikx et al.] as cpd3_gevd in the domanov2017laa directory. We refer to it as cpd_ddl henceforth. cpd_ddl has an accurate and a fast option for computing the kernel of the auxiliary matrix. We found in our experiments that the dimension of this kernel is often misjudged by the fast implementation. Consequently, it usually increases the size of the auxiliary matrix and causes it to exceed the 16.2 GB memory limit, we put on the size of that matrix. As a concrete statistic, in the configuration with $d=2$ below, the fast version of cpd_ddl failed in over 75% of the cases. Therefore, we exclusively compare our implementation with the accurate version cpd_ddl.

We generate random rank-$r$ tensors $\mathcal {A}$ as described at the start of this section. We apply both cpd_hnf and cpd_ddl to these random tensors for all of the following configurations¹³:

\begin{align*} &40 \ge m+1 \ge n+1 \ge 2 \quad \text{for}\quad d=2, \text{ and}\\ &20 \ge m+1 \ge n+1 \ge 2 \quad \text{for}\quad d=3, \text{ and}\\ &15 \ge m+1 \ge n+1 \ge 2 \quad \text{for}\quad d=4, \end{align*}

and in all cases, we take $\ell$ $+$ $1 = r = \min \lbrace \mathcal {R}(m, n, (d,1)), mn\rbrace$. We do not provide our algorithm with the value of $d$. For each input tensor, cpd_hnf determines the degree $(d,1)$ or $(1,e)$ automatically as explained in the previous subsection. For each algorithm, we record the relative backward error, total execution time, and the size of the auxiliary matrix whose kernel is computed. As the latter is the dominant operation in both algorithms, its time and memory complexity gives a good estimate of the overall complexity.

For the first configuration cpd_ddl was executed on bigmem nodes of the supercomputer. These are the same as the skylake nodes, except that they are equipped with 768 GB of main memory. The reason was that our Matlab driver routine consumed more than 162 GB of memory while running through all configurations.

The accuracy of cpd_hnf and cpd_ddl in this set of experiments is shown in Figure 2. The newly proposed algorithm is consistently several orders of magnitude more accurate in terms of the relative backward error than the state of the art.

It took over 146 h to obtain the (incomplete) results in the left panel of Figure 2(b). To put this in perspective, we note that the computation for the left panel of Figure 2(a) took a little over 17 h. The missing values inside of the triangles in Figure 2(b) indicate that cpd_ddl wanted to allocate an auxiliary matrix that would require more than 16.2 GB of memory. The same memory constraint was also imposed on cpd_hnf, but here only the largest case with $d=4$ and $m=n=14$ could not be treated. The red pixel for $d = 2, m = 35, n = 13$ in Figure 2(a) indicates that cpd_hnf gave inaccurate results. This is the only case where eigs failed to find a sufficiently accurate nullspace.

The timings of our Julia implementation are shown in Figure 3. As cpd_ddl is implemented in a different programming language, we believe a direct comparison in timings is not opportune. Nevertheless, in both algorithms computing the kernel of the auxiliary matrix has the highest asymptotic time complexity. In cpd_ddl it is an ${\rm L}$ $\times$ ${\rm L}$ square matrix, and in our algorithm, depending on the choice of degree, $R_{I}(d,e)$ is an almost-square ${\rm M} \times {\rm N}$ matrix with ${\rm M} \approx {\rm N}$. Therefore, we decided to plot the ratio between the number of elements in the auxiliary matrix of cpd_ddl and cpd_hnf. Figure 4 visualizes this factor $\mu = {\rm L}^2/({\rm M}{\rm N})$ in a logarithmic scale. This number indicates the fraction of memory that cpd_ddl requires relative to cpd_hnf. Raising it to the power $\frac{3}{2}$ gives an estimate of the speedup factor in execution time of cpd_hnf relative to cpd_ddl. Based on this estimate and the execution times we logged, it is accurate to state that the newly proposed algorithm outperforms the state of the art by up to two orders of magnitude for larger tensors.

Fig. 3.

Fig. 4.

6.4 Robustness in the Noisy Case

The next experiment illustrates that cpd_hnf can successfully decompose tensors even in the presence of some noise. The setup is as follows. We generate a random rank-$r$ tensor $\mathcal {A}$ of size $150 \times 25 \times 10$ by randomly sampling the $\alpha _i,\beta _i$, and $\gamma _i$ as before. Then, we add white Gaussian noise of relative magnitude 10^e for $e=-1,\ldots ,-15$; that is, we compute $\mathcal {A}^{\prime } = \mathcal {A}+ 10^{e} \frac{\Vert \mathcal {A}\Vert _F}{\Vert \mathcal {E}\Vert _F} \mathcal {E}$. We provide $\mathcal {A}^{\prime }$ as input to our algorithm and request a rank-$r$ decomposition.

The relative backward error between $\mathcal {A}^{\prime }$ and the computed rank-$r$ decomposition is shown in a logarithmic scale in Figure 5. Because of our setup, the rank-$r$ CPD of $\mathcal {A}$ has relative backward error 10^e. A good tensor decomposition algorithm should thus return a CPD with a backward error of at most 10^e. Remarkably, for tensors with random rank-$r$ CPDs, the proposed algorithm consistently manages to reach this benchmark when $e \le -5$. For ranks up to about half the maximum range (from $r=1$ to 70), it even consistently manages to reach this benchmark for white noise of magnitude at most $10^{-2}$! Based on these results, we anticipate that cpd_hnf could be employed as a rank-$r$ approximation algorithm in the high signal-to-noise regime. We believe this observation warrants further research.

Fig. 5.

6.5 An Example of Higher-order Tensors

The last experiment illustrates that the reshaping trick combined with a decomposition algorithm that works well in the unbalanced regime (such as cpd_hnf and cpd_ddl [Domanov and De Lathauwer 2014;, 2017]) is a powerful technique for decomposing high-order, high-rank tensors, even with a balanced shape.

As an example, we generated an eighth-order real tensor of size $7 \times 7 \times 7 \times 7 \times 6 \times 6 \times 5 \times 5$ and rank $r = 1,000$ with factor matrices whose entries are sampled i.i.d. from a standard normal distribution. Here is what happens:

To our knowledge, this computation represents the first time any tensor decomposition algorithm of any type (i.e., alternating least squares, optimization-based methods, direct algebraic methods, homotopy-based methods, or heuristic methods) successfully decomposes a high-order rank-1000 tensor that cannot be reshaped to an order-3 tensor whose CPD can be computed with a pencil-based algorithm.

7 Conclusions

The cpd_hnf algorithm proposed in this article computes the CPD of tensors satisfying assumption 2.1 using numerical linear algebra techniques for solving systems of polynomial equations. Its complexity is governed (proposition 5.1 1.1) by the regularity of a homogeneous, $\mathbb {N}^2$-graded ideal obtained from a flattening. We presented a formula for degrees $(d,1)$ in the regularity for generic tensors of many formats (theorem 4.2) and proposed conjecture 1 for the general case. Our experiments show that cpd_hnf produces backward errors that are almost always of the order of the machine precision. This improves upon the previous state of the art by several orders of magnitude (see Figure 2). In the high signal-to-noise-ratio regime, it seems the algorithm can be used to approximate noisy rank-$r$ tensors.

Possible directions for future research include a further analysis of the regularity, a relaxation of the conditions in assumption 2.1, generalizations for (semi-)symmetric tensors, and a theoretical analysis of cpd_hnf in the noisy setting.

Footnotes

A property on a variety $\mathcal {V}$ is “generic” if the locus where the property does not hold is contained in a Zariski closed subset.

The tensor product is defined uniquely by the linear space into which it maps by universality [Greub 1978], so we do not make a distinction in notation.

Often, we will require $(d,e) \ne (1,1)$ and $(d-1,e-1)\ge (0,0)$. This makes sense by lemma 3.2. The problem of how to find such a tuple $(d,e)$ will be the topic of Section 4.

⁴

If $d = 1$ and $e \ge 2$, then the roles of $d$ and $e$ can be swapped so that the approach still works.

⁵

The decomposition problem for a rank-$r$ tensor in $\mathbb {C}^{(\ell +1)\times (m+1)\times (n+1)}$ with $r \le m+1 \le \ell +1$ can always be reduced to this so-called concise case [Landsberg 2012; Bürgisser et al. 1997] by computing an orthogonal Tucker decomposition [Tucker 1966] followed by a rank-$r$ decomposition of the core tensor.

⁶

The notation is similar to Section 2. Here, we omit the restriction to the subspace $B$ (or, equivalently, we take $B = S_{(1,1)}$). We also drop the transpose on $w_i$, as we will think of them as column vectors instead of row vectors in this section.

⁷

For the reader who is familiar with algebraic geometry, we note that this is our motivation for associating the ideal $J(Z)$ to $Z$, instead of the usual vanishing ideal of the points in $Z$. These are different ideals, as $J(Z)$ is usually not saturated with respect to the irrelevant ideal of $S$.

⁸

This follows from the fact that the Segre variety is not contained in a hyperplane.

⁹

The code and certificates can be obtained at https://gitlab.kuleuven.be/u0072863/homogeneous-normal-form-cpd.

¹⁰

A tensor $\mathcal {A}$ not satisfying this constraint on $\ell$ can always be represented in new bases by a coordinate array $\mathcal {B}$ in a concise [Landsberg 2012; Bürgisser et al. 1997] tensor space $\mathbb {C}^{\ell ^{\prime }+1} \otimes \mathbb {C}^{m^{\prime }+1}\otimes \mathbb {C}^{n^{\prime }+1}$ with $\ell ^{\prime } \le \ell$, $m^{\prime } \le m$, and $n^{\prime }\le n$. After permutation of the factors, a concise tensor space $\mathbb {C}^{k_1}\otimes \mathbb {C}^{k_2}\otimes \mathbb {C}^{k_3}$ always satisfies $k_2k_3 \ge k_1 \ge k_2 \ge k_3$; see, e.g., Carlini and Kleppe [2011]. In practice, $\mathcal {B}$ can be obtained as the core tensor of the (sequentially) truncated higher-order singular value decomposition [De Lathauwer et al. 2000; Vannieuwenhoven et al. 2012].

¹¹

Additional packages are used to support our experimental setup, but these are not required for the main algorithm.

¹²

For more details, see the function minimumMultiDegree in NormalFormCPD.jl.

¹³

Both algorithms were applied to the same rank-$r$ tensors. To deal with the different programming languages and to limit storage demands, we generated a buffer of $10^7$ reals sampled i.i.d. from a standard normal distribution. The $\alpha _i$, $\beta _i$, and $\gamma _i$ of the true decomposition were then generated from this buffer. This entails there is a statistical correlation between the various tensors that are decomposed. However, we judged that this does not affect the validity of our conclusions. For each line in the set of configurations, a different random buffer was generated.

A Proof of the Technical Results

For two ideals $I, K \subset S$, we write $(I: K^\infty) = \lbrace f \in S ~|~ K^k f \subset I \text{ for some } k \in \mathbb {N}\rbrace .$ The next result follows from lemma 2.2.

Corollary A.1.

Under assumption 2.1, we have the identity $(I: K^\infty) = J$, where $I = \langle {\ker \mathcal {A}_{(1)}}\rangle$, $J$ is the vanishing ideal

\begin{equation*} J = \left\langle { f \in S ~|~ f \text{ is homogeneous and } f(\zeta _i) = 0, i = 1, \ldots , r } \right\rangle , \end{equation*}

and $K = \langle {x_iy_j ~|~ 0 \le i \le m, 0 \le j \le n}\rangle$ is the irrelevant ideal of $S$.

Proof.

By lemma 2.2, the modules $S/I$ and $S/J$ define the same subscheme of $X$ consisting of $r$ simple points. It follows that $K^k (J/I) = 0$ for large enough $k$. See, for instance, Proposition 5.3.10 in Cox et al. [2011].□

Proof of 3.1

This follows from the fact that all $h \in S_{(d^{\prime },e^{\prime })}$ satisfying $h(\zeta _i) = 0$ lie on a hyperplane $H_i$ through the origin in the $\mathbb {C}$-vector space $S_{(d^{\prime },e^{\prime })}$. Any $h_0$ in the complement of $H_1 \cup \cdots \cup H_r$ satisfies $h_0(\zeta _i) \ne 0, i = 1, \ldots , r$.□

Proof of 3.2

Let $J = (I: K^\infty)$ be the saturation of $I$ with respect to the irrelevant ideal $K$ (see Corollary A.1). First, it follows from Propositions 4.4 and 6.7 in Maclagan and Smith [2004] that there exists $(\delta _1,\epsilon _1) \in \mathbb {N}^2$ such that $\mbox{HF}_{S/J}(\delta _1+\delta ^{\prime },\epsilon _1+\epsilon ^{\prime }) = r$ for all $(\delta ^{\prime },\epsilon ^{\prime }) \in \mathbb {N}^2$. Second, we show that there exists $(\delta _2,\epsilon _2) \in \mathbb {N}^2$ such that $I_{(\delta _2 + \delta ^{\prime },\epsilon _2 + \epsilon ^{\prime })} = J_{(\delta _2 + \delta ^{\prime },\epsilon _2 + \epsilon ^{\prime })}$ for all $(\delta ^{\prime },\epsilon ^{\prime }) \in \mathbb {N}^s$. To see this, note that $K^k J \subset I$ for some $k \in \mathbb {N}$ (see, e.g., Chapter 4, Section 4, Proposition 9 in Cox et al. [2013]) and $K^k = \langle {S_{(k,k)}}\rangle$. For a finite set of homogeneous generators $g_1, \ldots , g_m$ of $J$, choose $(\delta _2,\epsilon _2)$ such that entry-wise, $(\delta _2,\epsilon _2) \ge (k, k) + \deg (g_i),~ i = 1, \ldots , m$. Take $(d,e) \in (\max (\delta _1,\delta _2), \max (\epsilon _1,\epsilon _2)) + \mathbb {N}^2$ such that $(d,e) \ne (1,1)$ and $(d-1,e-1) \ge (0,0)$.□

Proof of 3.2

Under the assumptions of the theorem, $((0,1),(1,0))$ is a regularity pair for the ideal $J = (I:K^\infty)$, defining the same set of points $V_X(J) = V_X(I)$ (see corollary A.1). The statement now follows from Theorem 5.5.3, Propositions 5.5.4 and 5.5.5 in Telen [2020b].□

Proof of item (3) in 4.2

We prove the case $m = 2$ in item (3) of the theorem. It suffices to show that for some configuration $Z$, the matrix

\begin{equation*} A^{\prime } = \begin{bmatrix}-\Gamma H_0 & \\ & - \Gamma H_0 \\ \Gamma H_2 & - \Gamma H_1 \end{bmatrix} \end{equation*}

has full rank. We set $\beta _{i0} = 1, i = 1, \ldots , r$, such that $H_0$ is the identity matrix. Condition (4.8) ensures that $A^{\prime }$ has more rows than columns, so it suffices to show that $\ker A^{\prime } = \lbrace 0\rbrace$. Suppose $A^{\prime } v = 0$, then $v$ can be split into $v_1, v_2 \in \mathbb {C}^r$ such that $\Gamma v_1 = \Gamma v_2 = 0$. If $r \le n+1$, then it is clear that this implies $v_1 = v_2 = 0$ for generic $Z$. Therefore, we assume $r \gt n + 1$ and make the following choice for $\Gamma$:

\begin{equation*} \Gamma = \begin{bmatrix}1 & & & & \gamma _{n+2,0} & \cdots & \gamma _{r0} \\ & 1& & & \gamma _{n+2,1} & \cdots & \gamma _{r1} \\ & & \ddots & & \vdots & & \vdots \\ & & & 1 & \gamma _{n+2,n} & \cdots & \gamma _{rn} \end{bmatrix} = \begin{bmatrix} \mbox{id}_{n+1} & \hat{\Gamma } \end{bmatrix} \in \mathbb {C}^{(n+1) \times r}, \end{equation*}

where $\hat{\Gamma } \in \mathbb {C}^{(n+1) \times \kappa }$ is the submatrix of $\Gamma$ consisting of its last $\kappa = r - (n+1)$ columns. We have that $\Gamma v_i = 0$ implies $v_i = \Gamma ^\perp w_i$ for some $w_i \in \mathbb {C}^\kappa$ and $\Gamma ^\perp = \left[ {\begin{matrix} -\hat{\Gamma } \\ \mbox{id}_\kappa \end{matrix}} \right]$. Hence, $Av = 0$ is equivalent to $[\Gamma H_2 \Gamma ^\perp \quad -\Gamma H_1 \Gamma ^\perp ] [ {\begin{matrix} w_1 \\ w_2 \end{matrix}} ] = 0.$ Condition (4.8) implies $2 \kappa \le n+1$, so that the coefficient matrix in this equation has more rows than columns. Upon closer inspection, we see that

\begin{equation*} \Gamma H_q \Gamma ^\perp = \begin{bmatrix}(\beta _{n+2,q} - \beta _{1q})\gamma _{n+2,0} & \cdots & (\beta _{rq} - \beta _{1q}) \gamma _{r0} \\ (\beta _{n+2,q} - \beta _{2q})\gamma _{n+2,1} & \cdots & (\beta _{rq} - \beta _{2q}) \gamma _{r1} \\ \vdots & & \vdots \\ (\beta _{n+2,q} - \beta _{n+1,q})\gamma _{n+2,n} & \cdots & (\beta _{rq} - \beta _{n+1,q}) \gamma _{rn} \\ \end{bmatrix} = \begin{bmatrix} (\beta _{jq} - \beta _{i+1,q})\gamma _{ji} \end{bmatrix}_\begin{array}{c}0\le i\le n,\\ n+2\le j\le r\end{array}. \end{equation*}

To make rows $\kappa +1, \ldots , 2 \kappa$ equal to zero in $\Gamma H_2 \Gamma ^\perp$, we set $\beta _{\kappa + 1, 2} = \beta _{\kappa +2, 2} = \cdots = \beta _{2 \kappa ,2} = \beta _{n+2,2} = \beta _{n+3,2} = \cdots = \beta _{r,2} = 1$. All other $\beta$-coordinates are chosen at random, such that all entries of $[\Gamma H_2 \Gamma ^\perp \quad -\Gamma H_1 \Gamma ^\perp ]$, except those in the rows $\kappa +1, \ldots , 2 \kappa$ of $\Gamma H_2 \Gamma ^\perp$, are of the form $\star \gamma _{ji}$, with $\star$ some non-zero complex number. Then,

\begin{equation*} \begin{bmatrix}\Gamma H_2 \Gamma ^\perp & -\Gamma H_1 \Gamma ^\perp \end{bmatrix} = \begin{bmatrix}C_{11} & C_{12} \\ 0 & C_{22} \\ D_1 & D_2 \end{bmatrix}, \end{equation*}

where $C_{ij} \in \mathbb {C}^{\kappa \times \kappa }$ and $D_i \in \mathbb {C}^{(n+1 -2\kappa) \times \kappa }$. The minor corresponding to the first $2\kappa$ rows is seen to be $\det (C_{11}) \det (C_{22})$. This is a product of two nonzero polynomials in the parameters $\gamma _{ji}$, $i = 0, \ldots , n, j = n+2, \ldots , r$. For generic choices of the parameters, this minor is non-zero, hence $w_1 = w_2 = 0$, and thus $v = 0$ and $\ker A^{\prime } = \lbrace 0\rbrace$.□

Acknowledgments

We are grateful to Alessandra Bernardi, Fulvio Gesmundo, and Bernard Mourrain for fruitful discussions. We thank Ignat Domanov and Lieven de Lathauwer for kindly sharing a Matlab implementation of cpd_ddl prior to its publication in the Tensorlab+ repository.

We thank the three anonymous reviewers for their detailed feedback on the manuscript that, in particular, led to the inclusion of sections 2.2 and 6.2. We thank an anonymous referee for suggesting shorter proofs of proposition 4.2 and theorem 4.1.

References

[1]

C. Beltrán, P. Breiding, and N. Vannieuwenhoven. 2019. Pencil-based algorithms for tensor rank decomposition are not stable. SIAM J. Matrix Anal. Appl. 40, 2 (2019), 739–773.

Digital Library

Google Scholar

[2]

M. R. Bender and S. Telen. 2020. Toric eigenvalue methods for solving sparse polynomial systems. Retrieved from https://arXiv:2006.10654.

Google Scholar

[3]

A. Bernardi, J. Brachat, P. Comon, and B. Mourrain. 2011. Multihomogeneous polynomial decomposition using moment matrices. In Proceedings of the 36th International Symposium on Symbolic and Algebraic Computation. 35–42.

Digital Library

Google Scholar

[4]

A. Bernardi, J. Brachat, P. Comon, and B. Mourrain. 2013. General tensor decomposition, moment matrices and applications. J. Symbolic Comput. 52 (2013), 51–71.

Digital Library

Google Scholar

[5]

A. Bernardi and D. Taufer. 2020. Waring, tangential and cactus decompositions. J. Math. Pures Appl. 143 (2020), 1–30.

Crossref

Google Scholar

[6]

C. Bocci, L. Chiantini, and G. Ottaviani. 2014. Refined methods for the identifiability of tensors. Ann. Mat. Pura Appl. 193, 6 (2014), 1691–1702.

Crossref

Google Scholar

[7]

J. Brachat, P. Comon, B. Mourrain, and E. Tsigaridas. 2010. Symmetric tensor decomposition. Linear Alg. Appl. 433, 11–12 (2010), 1851–1872.

Crossref

Google Scholar

[8]

J. R. Bunch and L. Kaufman. 1977. Some stable methods for calculating inertia and solving symmetric linear systems. Math. Comp. 31 (1977), 163–179.

Crossref

Google Scholar

[9]

P. Bürgisser, M. Clausen, and A. Shokrollahi. 1997. Algebraic Complexity Theory. Grundlehren der mathematischen Wissenschaften, Vol. 315. Springer-Verlag, Berlin.

Crossref

Google Scholar

[10]

E. Carlini and J. Kleppe. 2011. Ranks derived from multilinear maps. J. Pure Appl. Alg. 215, 8 (2011), 1999–2004.

Crossref

Google Scholar

[11]

L. Chiantini and C. Ciliberto. 2006. On the concept of $k$-secant order of a variety. J. London Math. Soc. 73, 2 (2006), 436–454.

Crossref

Google Scholar

[12]

L. Chiantini and G. Ottaviani. 2012. On generic identifiability of 3-tensors of small rank. SIAM J. Matrix Anal. Appl. 33, 3 (2012), 1018–1037.

Digital Library

Google Scholar

[13]

L. Chiantini, G. Ottaviani, and N. Vannieuwenhoven. 2014. An algorithm for generic and low-rank specific identifiability of complex tensors. SIAM J. Matrix Anal. Appl. 35, 4 (2014), 1265–1287.

Digital Library

Google Scholar

[14]

L. Chiantini, G. Ottaviani, and N. Vannieuwenhoven. 2017. Effective criteria for specific identifiability of tensors and forms. SIAM J. Matrix Anal. Appl. 38, 2 (2017), 656–681.

Digital Library

Google Scholar

[15]

D. A. Cox, J. Little, and D. O’Shea. 2006. Using Algebraic Geometry. Graduate Texts in Mathematics, Vol. 185. Springer Science & Business Media.

Google Scholar

[16]

D. A. Cox, J. Little, and D. O’Shea. 2013. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Springer Science & Business Media.

Google Scholar

[17]

D. A. Cox, J. Little, and H. K. Schenck. 2011. Toric Varieties. Vol. 124. American Mathematical Society.

Crossref

Google Scholar

[18]

L. De Lathauwer. 2006. A link between the canonical decomposition in multilinear algebra and simultaneous matrix diagonalization. SIAM J. Matrix Anal. Appl. 28, 3 (2006), 642–666.

Digital Library

Google Scholar

[19]

L. De Lathauwer, B. De Moor, and J. Vandewalle. 2000. A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 21, 4 (2000), 1253–1278.

Digital Library

Google Scholar

[20]

I. Domanov and L. De Lathauwer. 2013. On the uniqueness of the canonical polyadic decomposition of third-order tensors—part II: Uniqueness of the overall decomposition. SIAM J. Matrix Anal. Appl. 34, 3 (2013), 876–903.

Digital Library

Google Scholar

[21]

I. Domanov and L. De Lathauwer. 2014. Canonical polyadic decomposition of third-order tensors: Reduction to generalized eigenvalue decomposition. SIAM J. Matrix Anal. Appl. 35, 2 (2014), 636–660.

Digital Library

Google Scholar

[22]

I. Domanov and L. De Lathauwer. 2017. Canonical polyadic decomposition of third-order tensors: Relaxed uniqueness conditions and algebraic algorithm. Linear Alg. Appl. 513 (2017), 342–375.

Crossref

Google Scholar

[23]

D. R. Grayson and M. E. Stillman. 2019. Macaulay2, a software system for research in algebraic geometry. software version: v1.18. Retrieved from http://www.math.uiuc.edu/Macaulay2/.

Google Scholar

[24]

W. Greub. 1978. Multilinear Algebra (2nd ed.). Springer-Verlag.

Crossref

Google Scholar

[25]

S. Hendrikx, M. Boussé, N. Vervliet, M. Vandecappelle, R. Kenis, and L. De Lathauwer. 2022. Tensorlab⁺. Retrieved from https://www.tensorlabplus.net.

Google Scholar

[26]

C. J. Hillar and L.-H. Lim. 2013. Most tensor problems are NP-Hard. J. ACM 60, 6 (2013), 45:1–45:39.

Digital Library

Google Scholar

[27]

F. L. Hitchcock. 1927. The expression of a tensor or a polyadic as a sum of products. J. Math. Phys. 6, 1 (1927), 164–189.

Crossref

Google Scholar

[28]

Anthony Iarrobino and Vassil Kanev. 1999. Power Sums, Gorenstein Algebras, and Determinantal Loci. Springer Science & Business Media.

Crossref

Google Scholar

[29]

T. G. Kolda and B. W. Bader. 2009. Tensor decompositions and applications. SIAM Rev. 51, 3 (2009), 455–500.

Digital Library

Google Scholar

[30]

J. B. Kruskal. 1977. Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Alg. Appl. 18 (1977), 95–138.

Crossref

Google Scholar

[31]

Y.-Ch. Kuo and T.-L. Lee. 2018. Computing the unique CANDECOMP/PARAFAC decomposition of unbalanced tensors by homotopy method. Linear Alg. Appl. 556 (2018), 238–264.

Crossref

Google Scholar

[32]

J. M. Landsberg. 2012. Tensors: Geometry and Applications. Graduate Studies in Mathematics, Vol. 128. AMS, Providence, RI.

Google Scholar

[33]

S. E. Leurgans, R. T. Ross, and R. B. Abel. 1993. A decomposition for three-way arrays. SIAM J. Matrix Anal. Appl. 14, 4 (1993), 1064–1083.

Digital Library

Google Scholar

[34]

A. Lorber. 1985. Features of quantifying chemical composition from two-dimensional data array by the rank annihilation factor analysis method. Anal. Chem. 57 (1985), 2395–2397.

Crossref

Google Scholar

[35]

F. S. Macaulay. 1916. The Algebraic Theory of Modular Systems. Vol. 19. Cambridge University Press.

Crossref

Google Scholar

[36]

D. Maclagan and G. G. Smith. 2004. Multigraded castelnuovo-mumford regularity. J. für die Reine und Angew. Math. 2004, 571 (2004), 179–212.

Google Scholar

[37]

E. Miller and B. Sturmfels. 2005. Combinatorial Commutative Algebra. Vol. 227. Springer Science & Business Media.

Google Scholar

[38]

B. Mourrain. 2018. Polynomial–exponential decomposition from moments. Found. Comput. Math. 18, 6 (2018), 1435–1492.

Digital Library

Google Scholar

[39]

J. Nie. 2017. Generating polynomials and symmetric tensor decompositions. Found. Comput. Math. 17, 2 (2017), 423–465.

Digital Library

Google Scholar

[40]

I. V. Oseledets, D. V. Savostianov, and E. E. Tyrtyshnikov. 2008. Tucker dimensionality reduction of three-dimensional arrays in linear time. SIAM J. Matrix Anal. Appl. 30, 3 (2008), 939–956.

Digital Library

Google Scholar

[41]

Francesco Russo. 2016. On the Geometry of Some Special Projective Varieties. Springer.

Crossref

Google Scholar

[42]

E. Sanchez and B. R. Kowalski. 1990. Tensorial resolution: A direct trilinear decomposition. J. Chemom. 4, 1 (1990), 29–45.

Crossref

Google Scholar

[43]

N. D. Sidiropoulos and R. Bro. 2000. On the uniqueness of multilinear decomposition of N-way arrays. J. Chemom. 14, 3 (2000), 229–239.

Crossref

Google Scholar

[44]

N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and Ch. Faloutsos. 2017. Tensor decomposition for signal processing and machine learning. IEEE Trans. Signal. Process. 65, 13 (2017), 3551–3582.

Digital Library

Google Scholar

[45]

S. Telen. 2020a. Numerical root finding via cox rings. J. Pure Appl. Alg. 224, 9 (2020), 106367.

Crossref

Google Scholar

[46]

S. Telen. 2020b. Solving Systems of Polynomial Equations. Ph.D. Dissertation. KU Leuven.

Google Scholar

[47]

S. Telen, B. Mourrain, and M. Van Barel. 2018. Solving polynomial systems via truncated normal forms. SIAM J. Matrix Anal. Appl. 39, 3 (2018), 1421–1447.

Digital Library

Google Scholar

[48]

L. R. Tucker. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31, 3 (1966), 279–311.

Crossref

Google Scholar

[49]

N. Vannieuwenhoven, R. Vandebril, and K. Meerbergen. 2012. A new truncation strategy for the higher-order singular value decomposition. SIAM J. Sci. Comput. 34, 2 (2012), A1027–A1052.

Digital Library

Google Scholar

Cited By

View all

Gesmundo FKayser LTelen S(2024)Hilbert Functions of Chopped IdealsJournal of Algebra10.1016/j.jalgebra.2024.11.017Online publication date: Dec-2024
https://doi.org/10.1016/j.jalgebra.2024.11.017

Index Terms

A Normal Form Algorithm for Tensor Rank Decomposition
1. Mathematics of computing
  1. Mathematical analysis
    1. Nonlinear equations
    2. Numerical analysis
      1. Computations on matrices
      2. Computations on polynomials
  2. Mathematical software
    1. Solvers

Recommendations

Pencil-Based Algorithms for Tensor Rank Decomposition are not Stable

We prove the existence of an open set of $n_1\times n_2 \times n_3$ tensors of rank $r$ for which popular and efficient algorithms for computing tensor rank decompositions based on a reduction to a linear matrix pencil, typically followed by a generalized ...
Rank Splitting for CANDECOMP/PARAFAC
LVA/ICA 2015: Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation - Volume 9237

CANDECOMP/PARAFAC CP approximates multiway data by a sum of rank-1 tensors. Our recent study has presented a method to rank-1 tensor deflation, i.e. sequential extraction of rank-1 tensor components. In this paper, we extend the method to block ...
Compressive sensing via nonlocal low-rank tensor regularization

The aim of Compressing sensing (CS) is to acquire an original signal, when it is sampled at a lower rate than Nyquist rate previously. In the framework of CS, the original signal is often assumed to be sparse and correlated in some domain. Recently, ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Mathematical Software

ACM Transactions on Mathematical Software Volume 48, Issue 4

December 2022

339 pages

ISSN:0098-3500

EISSN:1557-7295

DOI:10.1145/3572845

Editors:
Zhaojun Bai
University of California at Davis, USA
,
Wolfgang Bangerth
Colorado State University, USA

Issue’s Table of Contents

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 December 2022

Online AM: 10 August 2022

Accepted: 01 July 2022

Revised: 04 February 2022

Received: 30 April 2021

Published in TOMS Volume 48, Issue 4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Refereed

Funding Sources

Postdoctoral Fellowship

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
1,382
Total Downloads

Downloads (Last 12 months)842
Downloads (Last 6 weeks)126

Reflects downloads up to 19 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

View all

Gesmundo FKayser LTelen S(2024)Hilbert Functions of Chopped IdealsJournal of Algebra10.1016/j.jalgebra.2024.11.017Online publication date: Dec-2024
https://doi.org/10.1016/j.jalgebra.2024.11.017

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Abstract

1 Introduction

Contributions

Related Work

Outline

2 From Tensor Decomposition to Polynomial Equations

2.1 Identifiability, Flattenings, and the Main Assumption

2.2 The Reshaping Trick

2.3 Polynomial Systems Defined by Flattenings

2.4 The High-level Algorithm

3 From Polynomial Equations to Eigenvalues

3.1 The Eigenvalue Theorem

3.2 Computing Pre-normal Forms

3.3 Relation to Pencil-based Algorithms

3.4 The Algorithm

3.5 Some Examples

4 Regularity

5 Computational Complexity

6 Numerical Experiments

6.1 Implementation Details

6.2 Impact on the Accuracy of Computational Optimizations

6.3 Comparison with the State of the Art

6.4 Robustness in the Noisy Case

6.5 An Example of Higher-order Tensors

7 Conclusions

Footnotes

A Proof of the Technical Results

Acknowledgments

References

Cited By

Index Terms

Recommendations

Pencil-Based Algorithms for Tensor Rank Decomposition are not Stable

Rank Splitting for CANDECOMP/PARAFAC

Compressive sensing via nonlocal low-rank tensor regularization

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

HTML Format

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations