WO2017107337A1 - 一种基于改进的按位替换法求矩阵三角分解的模块及方法 - Google Patents
一种基于改进的按位替换法求矩阵三角分解的模块及方法 Download PDFInfo
- Publication number
- WO2017107337A1 WO2017107337A1 PCT/CN2016/078460 CN2016078460W WO2017107337A1 WO 2017107337 A1 WO2017107337 A1 WO 2017107337A1 CN 2016078460 W CN2016078460 W CN 2016078460W WO 2017107337 A1 WO2017107337 A1 WO 2017107337A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- matrix
- decomposed
- decomposition
- equation
- coefficient matrix
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
Definitions
- the invention relates to the field of science and engineering calculation, in particular to a novel module and a calculation method for matrix triangulation based on improved bitwise replacement method applied in scientific calculation or engineering calculation.
- Matrix operations are the foundation of scientific and engineering calculations. The matrix's simple and intuitive representation, computational flexibility and high numerical stability make matrix operations a key technology and core issue for many engineering projects. Matrix operations include matrix multiplication, matrix decomposition, matrix inversion, and so on. Matrix decomposition is the inverse of matrix multiplication, and it is a simplified solution to matrix inversion. Development is also the most popular. Matrix decomposition is widely used in numerical analysis and engineering because the decomposed matrix has more obvious numerical features or physical meanings. For example, in the field of large-scale data analysis such as pattern recognition, digital image processing and encryption, computational fluid dynamics, signal processing and control, matrix decomposition algorithms have been or are becoming a core support. How to effectively use resources to achieve fast and large-scale matrix decomposition operations has become the focus and difficulty of design.
- QR decomposition can be used for the decomposition of an arbitrary matrix, which essentially decomposes the arbitrary matrix A into the product of an orthogonal matrix Q and a triangular matrix R.
- the classic QR decomposition algorithms include gram-schmidt method, householder transformation method and givess rotation method.
- the recursive method is adopted, which has large computational complexity, difficult process control and poor parallelism.
- the LU decomposition is a matrix decomposition method for non-singular arrays (that is, the matrix order main sub-forms are not 0). Its essence is to decompose the matrix A into a product of a lower triangular matrix L and an upper triangular matrix U. However, the LU decomposition uses the iterative recursive method to alternately calculate the upper and lower triangular arrays. The synchronization parallelism is not strong, the computational complexity is high, and the storage space is large. Singular value decomposition is a generalization of the normal matrix ⁇ diagonalization in matrix analysis.
- singular value decomposition is to decompose a complex matrix A into the product of two ⁇ matrices U, V and a diagonal matrix S.
- singular value decomposition is difficult to break into uncorrelated sub-operations.
- Singular value decomposition has poor parallelism, high computational complexity, and poor computational efficiency and real-time performance.
- the existing matrix decomposition technology still has certain limitations in engineering applications, mainly due to the following shortcomings:
- the computational space has a high complexity, and the occupied storage space is large, and the resource utilization rate is not high in specific engineering applications.
- the present invention proposes a new module and method for matrix triangulation based on improved bitwise replacement method, in order to optimize the matrix decomposition design in engineering application, reduce computational complexity, and compress storage. Space, improving the parallelism of operations.
- the present invention provides a module for matrix triangulation based on an improved bitwise replacement method, comprising: a boundary element acquisition unit, an internal element acquisition unit, an upper triangular matrix decomposition unit, and a lower triangular matrix decomposition unit;
- the boundary element acquisition unit is configured to obtain a matrix to be decomposed
- the boundary element of the reduction coefficient matrix N is an M-order square matrix that satisfies the order of the main order sub-form is not 0;
- the internal element acquisition unit is configured to obtain an internal element of the reduced coefficient matrix N of the matrix A to be decomposed; thereby obtaining a reduced coefficient matrix N;
- the upper triangular matrix decomposition unit is configured to decompose an upper triangular matrix of the matrix A to be decomposed;
- the lower triangular matrix decomposition unit is configured to decompose the lower triangular matrix of the matrix A to be decomposed.
- the boundary element acquiring unit obtains the boundary elements n 1i ⁇ 0 and n j1 ⁇ 0 of the reduced coefficient matrix N by using Equation (1) according to the matrix A to be decomposed:
- the internal element acquisition unit obtains a diagonal element n ii ⁇ (i-1) of the reduction coefficient matrix N by using Equation (2 ) :
- the internal element acquisition unit obtains the lower triangular element n ji ⁇ (i-1) of the reduction coefficient matrix N by using Equation (3 ) :
- the internal element acquisition unit obtains the upper triangular element n ji ⁇ (j-1) of the reduction coefficient matrix N by using Equation (4 ) :
- the lower triangular matrix decomposing unit decomposes the to-be-decomposed matrix A into a lower triangular matrix by using equation (5) according to the reduced coefficient matrix N
- the upper triangular matrix decomposing unit decomposes the to-be-decomposed matrix A into an upper triangular matrix by using equation (6) according to the reduced coefficient matrix N
- the present invention provides a method for matrix triangulation in engineering calculations, characterized in that the method comprises the following steps:
- Step 1 Obtain a matrix to be decomposed
- Step 2 obtaining the internal elements of the reduced coefficient matrix N of the matrix A to be decomposed, thereby obtaining a reduced coefficient matrix N;
- Step 3 Decompose the lower triangular matrix of the matrix A to be decomposed
- Step 4 Decompose the upper triangular matrix of the matrix A to be decomposed.
- the step 1 includes obtaining the boundary elements n 1i ⁇ 0 and n j1 ⁇ 0 of the reduced coefficient matrix N by using the formula (1):
- the step 2 includes:
- Step 2.1 Using the equation (2) to obtain the diagonal element n ii ⁇ (i-1) of the reduced coefficient matrix N:
- Step 2.2 Using equation (3), obtain the lower triangular element n ji ⁇ (i-1) of the reduced coefficient matrix N:
- Step 2.3 Obtain the upper triangular element n ji ⁇ (j-1) of the reduced coefficient matrix N by using equation (4 ) :
- the step 3 includes decomposing the matrix to be decomposed into a lower triangular matrix by using equation (5) according to the reduced coefficient matrix N.
- the step 4 includes decomposing the matrix to be decomposed into an upper triangular matrix by using equation (6) according to the reduced coefficient matrix N.
- the present invention provides a digital signal processing apparatus, characterized in that the digital signal processing apparatus includes a signal receiving apparatus, a data arithmetic apparatus, and a signal output apparatus, the data arithmetic apparatus including the above-described improved bitwise based Substituting a module for matrix triangulation, and when the data operation device requires a matrix in the digital signal processing process When triangulation is performed, the data operation device calls the module to perform matrix triangulation based on the improved bitwise replacement method to perform a triangulation operation on the matrix to be decomposed.
- the signal processing device is operative to process a communication signal or an image signal.
- the digital signal processing device employs an FPGA-based hardware circuit implementation for performing information processing applications such as remaining life prediction of the spacecraft platform.
- the digital signal processing device employs a software programming implementation for processing and encrypting digital image signals, such as digital watermarking applications.
- the matrix decomposition module proposed by the present invention based on the original triangular substitution algorithm based on the original bitwise replacement method, the algorithm is modified and improved, and an improved efficient matrix decomposition algorithm is generated.
- the matrix decomposition module proposed by the invention is based on the improved bitwise replacement matrix decomposition algorithm, which not only broadens the application scope of the operation, but also greatly simplifies the operation process and makes the operation complexity lower.
- the matrix decomposition module proposed by the present invention in the whole operation process, only in the inner element acquisition unit, the lower triangular matrix decomposition unit and the upper triangular matrix decomposition unit, the diagonal elements of the reduced coefficient matrix need to be squared. Or reciprocal (division), the rest of the decomposition module only involves a simple multiplication and addition process, avoiding a large number of operations in the prior art, such as square root, square, norm, division, etc., simplifying the operation process.
- the upper and lower triangular decomposition units have the same operation form, which is convenient for software programming and hardware design, and reduces computational resources and storage resource consumption of the design platform.
- the matrix decomposition module proposed by the present invention decomposes the serial iterative process by acquiring and creating a reduced coefficient matrix in the boundary element acquiring unit and the internal element acquiring unit, so that the upper triangular triangle is obtained while acquiring the element of the reduced coefficient matrix.
- the elements of the reduced coefficient matrix obtained by the previous stage can synchronously solve the upper and lower triangular matrix elements in parallel, which overcomes the existing matrix decomposition technique and uses the iterative serial operation to alternately calculate The problem of the parallelism caused by the lower triangular matrix elements is not strong.
- the matrix decomposition module proposed by the present invention is based on a bitwise replacement method, so that the entire module does not need to occupy additional storage space except for inputting the storage space occupied by the matrix to be decomposed.
- the existing matrix decomposition technique solves this problem because it has a large amount of storage space and has certain limitations in engineering applications of hyperscale matrix decomposition.
- the matrix decomposition module proposed by the invention has low computational complexity in the internal operation process of each unit, and each unit can perform the operation process independently in parallel with the upper and lower triangles.
- the non-singular matrix decomposition of any order within M order can be extended by adding 0.
- the invention solves the problem that the decomposition module of the fixed order is not scalable in the existing software/hardware engineering design (in the prior art scheme, such as the decomposition function and the fixed decomposition order of the Matlab)
- the number of M FPGA circuit modules cannot be calculated because of the need to strictly follow the calculation conditions, so the scalability is not strong.
- the module for solving the matrix triangulation based on the improved bitwise replacement method in the embodiment includes: a boundary element acquisition unit, an internal element acquisition unit, an upper triangular matrix decomposition unit, and a lower triangular matrix decomposition unit; the decomposition idea is: 1 according to The predetermined matrix to be decomposed is obtained, and the boundary element of the matrix of the reduction coefficient is obtained; 2 according to the boundary element of the matrix to be decomposed and its reduced coefficient matrix, the internal elements of the matrix of the reduction coefficient are obtained; 3 according to the matrix of the reduction coefficient, The decomposition matrix is decomposed into an upper triangular matrix and a lower triangular matrix, thereby completing the decomposition of the entire matrix; specifically,
- the boundary element acquisition unit is used to obtain the matrix to be decomposed
- the matrix A to be decomposed is an M-order square matrix satisfying the order of the main order, and the main sub-form is not 0;
- the matrix to be decomposed A created by Matlab is a randomly generated 8th-order square matrix in which the order main sub-forms are not 0:
- the boundary element acquiring unit obtains the boundary elements n 1i ⁇ 0 and n j1 ⁇ 0 of the reduced coefficient matrix N by using the equation (1) according to the matrix A to be decomposed:
- the boundary element acquiring unit obtains the boundary element of the reduced coefficient matrix N by using the equation (1) according to the input matrix to be decomposed, as shown in the formula (1.1):
- the internal element acquisition unit is configured to obtain an internal element of the reduced coefficient matrix N of the matrix A to be decomposed; thereby obtaining a reduced coefficient matrix N;
- the internal element acquisition unit first obtains the diagonal element n ii ⁇ (i-1) of the reduction coefficient matrix N by using Equation (2 ) :
- the internal element acquisition unit obtains the diagonal element of the reduced coefficient matrix by using Equation (2), as shown in Equation (2.1):
- the internal element acquisition unit reuses the lower triangular element n ji ⁇ (i-1) of the reduction coefficient matrix N by using equation (3 ) :
- the internal element acquisition unit obtains the lower triangular element of the reduced coefficient matrix by using Equation (3), as shown in Equation (3.1):
- the inner element acquisition unit finally obtains the upper triangular element n ji ⁇ (j-1) of the reduction coefficient matrix N by using equation (4 ) :
- the internal element acquisition unit obtains the upper triangular element of the reduced coefficient matrix by using Equation (4), as shown in Equation (4.1):
- the obtained reduction coefficient matrix N is:
- the upper triangular matrix decomposition unit is configured to decompose the upper triangular matrix of the matrix A to be decomposed;
- the upper triangular matrix decomposition unit decomposes the matrix A to be decomposed into a lower triangular matrix by using equation (5) according to the reduced coefficient matrix N.
- the upper triangular matrix decomposing unit decomposes the matrix to be decomposed into the lower triangular matrix L by using equation (5) according to equation (4.1), as shown in equation (5.1):
- the lower triangular matrix decomposition unit is configured to decompose the lower triangular matrix of the matrix A to be decomposed;
- the lower triangular matrix decomposing unit decomposes the matrix A to be decomposed into an upper triangular matrix by using equation (6) according to the reduced coefficient matrix N.
- the lower triangular matrix decomposition unit decomposes the matrix to be decomposed into the upper triangular matrix R by using equation (6) according to equation (4.1), as shown in equation (6.1):
- the experiment randomly selects three scale sample matrices of 8th order, 64th order, and 1024th order.
- the matrix elements of each scale are (-1,1), (-20,20), (-1000,1000).
- Each condition is randomly selected from four different sample matrices for testing. From the maximum error result data in the table, it can be known that the triangular matrix multiplication result decomposed by the matrix decomposition module proposed in this patent is very close to the original sample matrix to be decomposed, absolutely The error and relative error are both small and have high arithmetic precision. It is effective to use the decomposition module proposed in this patent for decomposition.
- the decomposition module proposed in this patent replaces the original matrix element of the corresponding bit by the element of the reduced coefficient matrix, and then replaces the corresponding bit by the decomposed triangular matrix element. Coefficient matrix elements, the entire process is replaced in situ, no additional storage space. Moreover, the internal operation process of each unit in the module is simple, and the synchronization degree of each unit is high, and the decomposition operation time is compressed in the parallel engineering design. Therefore, the matrix decomposition module proposed in this patent has high efficiency, and the decomposition algorithm used not only has high computational precision, but also has low computational complexity, high parallelism and scalability, and saves storage space. It has very good theory and Engineering application value.
- the method comprises the following steps:
- Step 1 Obtain a matrix to be decomposed
- the boundary element of the reduction coefficient matrix N; the matrix A to be decomposed is an M-order square matrix satisfying the order of the main order, and the main sub-form is not 0;
- Step 2 Acquire internal elements of the reduced coefficient matrix N of the matrix A to be decomposed, thereby obtaining a reduced coefficient matrix N, which specifically includes:
- Step 2.1 Obtain a diagonal element n ii ⁇ (i-1) of the reduction coefficient matrix N by using the formula (2) in Embodiment 1;
- Step 2.2 Using the formula (3) in Embodiment 1, obtaining a lower triangular element n ji ⁇ (i-1) of the reduction coefficient matrix N;
- Step 2.3 using the equation (4) in the embodiment 1 to obtain the upper triangular element n ji ⁇ (j-1) of the reduction coefficient matrix N; thereby obtaining a reduction coefficient matrix N;
- Step 3 Decompose the to-be-decomposed matrix A into a lower triangular matrix by using the formula (5) in Embodiment 1;
- Step 4 Decompose the matrix to be decomposed into an upper triangular matrix by using equation (6) in Embodiment 1.
- the present invention provides a digital signal processing device, which includes a signal receiving device, a data computing device, and a signal output device, and the data computing device can adopt software programming according to a specific application platform. It can be implemented in software mode or in hardware mode such as FPGA-based circuit module.
- the data operation device includes the module for performing matrix triangulation based on the improved bitwise replacement method in Embodiment 1, and when the data operation device needs to perform triangulation on the matrix in the signal processing process, the data operation device calls the embodiment in Embodiment 1. Based on the improved bitwise replacement method, the module of the matrix triangulation is decomposed, and the matrix to be decomposed is triangulated.
- the digital signal processing device can be applied in the field of signal processing related to processing communication signals or image signals and other signals related to matrix operations. For example, it is applied to the watermark embedding link in the digital watermarking technology in information security to encrypt information. It is applied to the field of image processing for identifying nuances of digital images. Or apply it to the remaining life prediction of the spacecraft platform.
- the digital signal processing device includes the following work steps:
- Step 1 After receiving the data processing request information, the digital signal processing device parses the data processing request information, and uses the signal receiving device to externally receive the digital signal to be processed as a matrix to be decomposed.
- Step 2 The data operation device of the digital signal processing device reads and stores the matrix to be decomposed from the signal receiving device. After the data is stored, the triangular decomposition operation of the startup matrix is configured according to the parsed processing request, and the specific steps are performed. include:
- Step 2.1 Using the formula (1) in Embodiment 1, obtaining boundary elements n 1i ⁇ 0 and n j1 ⁇ 0 of the reduction coefficient matrix N of the matrix A to be decomposed;
- Step 2.2 Perform an operation using Equations (2), (3), and (4) in Embodiment 1, to obtain internal elements of the reduced coefficient matrix N;
- Step 2.3 Performing operations in parallel using Equations (5) and (6) in Embodiment 1, and decomposing the matrix A to be decomposed into a lower triangular matrix and an upper triangular matrix;
- Step 3 after the completion of the decomposition operation, the data operation device transmits the matrix decomposition result to the signal output device;
- Step 4 The signal output device processes the matrix decomposition result data according to the parsed data processing request request, and outputs the data to the outside of the digital signal processing device to complete communication between the digital signal processing device and the external device.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
- Image Processing (AREA)
Abstract
一种基于改进的按位替换法求矩阵三角分解的模块及方法,还提供了利用该模块进行信号处理的设备。该模块包括:边界元素获取单元、内部元素获取单元、上三角矩阵分解单元和下三角矩阵分解单元;边界元素获取单元用于获取待分解矩阵的约化系数矩阵的边界元素;内部元素获取单元用于获得待分解矩阵的约化系数矩阵的内部元素;从而获得约化系数矩阵;上三角矩阵分解单元用于分解待分解矩阵的上三角矩阵;下三角矩阵分解单元用于分解待分解矩阵的下三角矩阵。上述模块及方法在工程应用中能够统一矩阵上、下三角分解的运算形式,优化分解运算的并行性,压缩运算所需存储空间。
Description
相关申请
本申请主张于2015年12月22日提交的、名称为“一种基于改进的按位替换法求矩阵三角分解的模块”、申请号为201510981481.3的中国发明专利申请的优先权。
本发明涉及科学和工程计算领域,尤其涉及一种应用于科学计算或工程计算中的基于改进的按位替换法求矩阵三角分解的新型模块及计算方法。
目前,在通信、计算机、图像处理等技术领域均会涉及到科学计算或工程计算,在计算时往往需要利用计算机或其他处理器进行大量的运算操作,这些运算操作的计算速度和精确度也往往是制约相应行业发展的瓶颈。
矩阵运算是科学和工程计算的基础。矩阵简洁直观的表述方式、运算的灵活性和高数值稳定性使矩阵运算成为众多工程项目的关键技术和核心问题。矩阵运算包括矩阵乘法、矩阵分解、矩阵求逆等。其中矩阵分解是矩阵乘法的逆过程,是矩阵求逆的一种简化的求解方式,发展也最为普及。由于分解后的矩阵具有更加明显的数值特征或物理含义,矩阵分解在数值分析和工程领域获得广泛应用。例如在模式识别、数字图像处理和加密、计算流体动力学、信号处理和控制等大规模数据分析领域,矩阵分解算法已经或正在成为核心支撑。如何有效利用资源实现快速大规模矩阵分解运算成为了设计的重点和难点。
目前矩阵分解种类繁多,工程应用中,常用的有QR分解、LU分解、奇异值分解等,结合具体的应用可选择不同的分解方法。其中QR分解可用于任意矩阵的分解,其本质是将任意矩阵A分解成一个正交矩阵Q与一个三角矩阵R的乘积。QR分解经典算法有gram-schmidt法、householder变换法和givens旋转法,采用递推方法,计算复杂度大,流程控制困难,并行性差。LU分解是针对非奇异阵(即矩阵顺序主子式均不为0)的一种矩阵分解方式,它的本质是将矩阵A分解为一个下三角阵L与上三角阵U的乘积。但LU分解采用迭代递推方法交替计算上、下三角阵,同步并行性不强,计算复杂度高,占用存储空间大。奇异值分解是矩阵分析中正规矩阵酉对角化的推广,奇异值分解的本质是将一个复矩阵A分解为两个酉矩阵U、V以及一个对角阵S的乘积。但奇异值分解很难拆成不相关子运算,奇异值分解并行性较差,计算复杂度较高,计算效率、实时性较差。
综上,目前现有的矩阵分解技术在工程应用中仍有一定的局限性,主要归结为以下几点不足:
第一,采用递推迭代方法,串行性要求较高,进行并行运算难度较大,难以满足工程应用
中实时性要求。
第二,运算复杂度较高,运算量较大,计算时间较长。
第三,运算空间复杂度较高,占用的存储空间较大,在具体工程应用中资源利用率不高。
发明内容
本发明为避免现有技术的不足之处,提出一种新的基于改进的按位替换法求矩阵三角分解的模块及方法,以期优化工程应用中矩阵分解的设计,降低运算复杂度,压缩存储空间,改善运算的并行性。
本发明为解决技术问题采用如下技术方案:
首先,本发明提供了一种基于改进的按位替换法求矩阵三角分解的模块,其包括:边界元素获取单元、内部元素获取单元、上三角矩阵分解单元和下三角矩阵分解单元;
所述内部元素获取单元用于获得待分解矩阵A的约化系数矩阵N的内部元素;从而获得约化系数矩阵N;
所述上三角矩阵分解单元用于分解待分解矩阵A的上三角矩阵;
所述下三角矩阵分解单元用于分解待分解矩阵A的下三角矩阵。
本发明所述的基于改进的按位替换法求矩阵三角分解的模块的特点也在于:
所述边界元素获取单元根据待分解矩阵A,利用式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0:
所述内部元素获取单元利用式(2)获得约化系数矩阵N的对角元素nii·(i-1):
式(2)中,k=2,3,…i-1;
所述内部元素获取单元利用式(3)获得约化系数矩阵N的下三角元素nji·(i-1):
式(3)中,i=2,3,…,M-1;j=i+1,i+2,…,M;
所述内部元素获取单元利用式(4)获得约化系数矩阵N的上三角元素nji·(j-1):
式(4)中,i=j+1,j+2,…,M;j=2,3,…,M-1;
另一方面,本发明提供一种在工程计算中求矩阵三角分解的方法,其特征是,所述方法包括下述步骤:
步骤2、获取待分解矩阵A的约化系数矩阵N的内部元素,从而获得约化系数矩阵N;
步骤3、分解待分解矩阵A的下三角矩阵;
步骤4、分解待分解矩阵A的上三角矩阵。
在一种优选实现方式中,所述步骤1包括利用式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0:
所述步骤2包括:
步骤2.1、利用式(2)获得约化系数矩阵N的对角元素nii·(i-1):
式(2)中,k=2,3,…i-1;
步骤2.2、利用式(3)获得约化系数矩阵N的下三角元素nji·(i-1):
式(3)中,i=2,3,…,M-1;j=i+1,i+2,…,M;
步骤2.3、利用式(4)获得约化系数矩阵N的上三角元素nji·(j-1):
式(4)中,i=j+1,j+2,…,M;j=2,3,…,M-1;
另一方面,本发明提供一种数字信号处理设备,其特征在于,所述数字信号处理设备包括信号接收装置、数据运算装置以及信号输出装置,所述数据运算装置包括上述的基于改进的按位替换法求矩阵三角分解的模块,并且当所述数据运算装置在数字信号处理过程中需要对矩阵
进行三角分解时,所述数据运算装置调用所述基于改进的按位替换法求矩阵三角分解的模块对待分解的矩阵进行三角分解运算。
在一种优选实现方式中,所述信号处理设备用于对通信信号或图像信号进行处理。
在另一种优选实现方式中,所述数字信号处理设备采用基于FPGA的硬件电路实现方式,用于进行空间飞行器平台的剩余使用寿命预测等信息处理应用。
在另一种优选实现方式中,所述数字信号处理设备采用软件编程的实现方式,用于对数字图像信号进行处理和加密,如数字水印等应用。
与已有技术相比,本发明有益效果体现在:
1、本发明提出的矩阵分解模块,在基于原按位替换法进行矩阵三角分解算法的基础上,对其算法进行了修正和改进,产生了改进的高效矩阵分解算法。本发明提出的矩阵分解模块基于改进的按位替换矩阵分解算法,不仅拓宽了运算适用范围,且大大简化了运算过程,使运算复杂度更低。
2、本发明提出的矩阵分解模块,在整个运算过程中,只有在内部元素获取单元、下三角矩阵分解单元中和上三角矩阵分解单元中,需要对约化系数矩阵的对角元素做开方或倒数(除法),整个分解模块的其余部分只涉及简单的乘加运算过程,避免了现有技术中大量的开方、平方、求范数、除法等运算,简化了运算过程。且上、下三角分解单元运算形式相同,便于软件编程和硬件设计,降低设计平台的运算资源和存储资源消耗。
3、本发明提出的矩阵分解模块,通过在边界元素获取单元和内部元素获取单元中获取并创建约化系数矩阵,分解了串行迭代过程,使得在获取约化系数矩阵元素的同时,上三角矩阵分解单元和下三角矩阵分解单元内部可以由前级得到的约化系数矩阵元素,同步并行求解上、下三角矩阵元素,克服了现有矩阵分解技术中,由于采用迭代串行运算交替计算上、下三角矩阵元素导致的可并行性不强的问题。
4、本发明提出的矩阵分解模块,基于按位替换的方法,使得整个模块除输入待分解矩阵所占用的存储空间外,无需占用额外的存储空间。现有矩阵分解技术,由于要占用大量的存储空间,从而在超大规模矩阵分解的工程应用中具有一定的局限性,本发明正是解决了这一问题。
5、本发明提出的矩阵分解模块,各单元内部运算过程的运算复杂度较低,且各单元内部可按上、下三角并行独立执行运算过程。对于工程应用中固定阶数为M的矩阵分解模块,通过补0的方式,可扩展实现M阶以内任意阶的非奇异矩阵分解。解决了现有软/硬件工程设计中,固定阶数的分解模块可扩展性不强的问题(现有技术方案中,如Matlab自带的分解函数和固定分解阶
数M的FPGA电路模块,由于需要严格根据运算条件,补0则无法运算,因此扩展性不强)。
实施例1
本实施例中的基于改进的按位替换法求矩阵三角分解的模块包括:边界元素获取单元、内部元素获取单元、上三角矩阵分解单元和下三角矩阵分解单元;其分解思路是:1根据给定的待分解矩阵,求其约化系数矩阵的边界元素;2根据待分解矩阵和其约化系数矩阵的边界元素,求其约化系数矩阵的内部元素;3根据约化系数矩阵,将待分解矩阵分解为上三角矩阵和下三角矩阵,从而完成整个矩阵的分解;具体的说,
边界元素获取单元用于获得待分解矩阵的约化系数矩阵N的边界元素;待分解矩阵A为满足各阶顺序主子式不为0的M阶方阵;aji表示第j行第i列元素;i,j=1,2,3,…,M;在本实施例中,采用Matlab创建的待分解矩阵A为随机产生的各阶顺序主子式不为0的8阶方阵:
具体的说,边界元素获取单元是根据待分解矩阵A利用式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0:
在本实施例中,边界元素获取单元是根据输入的待分解矩阵A利用式(1)得到约化系数矩阵N的边界元素,如式(1.1)所示:
内部元素获取单元用于获得待分解矩阵A的约化系数矩阵N的内部元素;从而获得约化系数矩阵N;
具体的说,内部元素获取单元先利用式(2)获得约化系数矩阵N的对角元素nii·(i-1):
式(2)中,k=2,3,…i-1;
在本实施例中,内部元素获取单元利用式(2)获得约化系数矩阵的对角元素,如式(2.1)所示:
内部元素获取单元再利用式(3)获得约化系数矩阵N的下三角元素nji·(i-1):
式(3)中,i=2,3,…,M-1;j=i+1,i+2,…,M;
在本实施例中,内部元素获取单元利用式(3)获得约化系数矩阵的下三角元素,如式(3.1)所示:
内部元素获取单元最后利用式(4)获得约化系数矩阵N的上三角元素nji·(j-1):
式(4)中,i=j+1,j+2,…,M;j=2,3,…,M-1;
在本实施例中,内部元素获取单元利用式(4)获得约化系数矩阵的上三角元素,如式(4.1)所示:
在本实施例中,获得的约化系数矩阵N为:
上三角矩阵分解单元用于分解待分解矩阵A的上三角矩阵;
在本实施例中,上三角矩阵分解单元根据式(4.1),利用式(5)将待分解矩阵A分解为下三角矩阵L,如式(5.1)所示:
下三角矩阵分解单元用于分解待分解矩阵A的下三角矩阵;
在本实施例中,下三角矩阵分解单元,根据式(4.1),利用式(6)将待分解矩阵A分解为上三角矩阵R,如式(6.1)所示:
为了验证本专利中提出的矩阵分解模块的效果,随机选取多组阶数M不同、矩阵元素取值范围不同的矩阵,作为待分解样本矩阵输入该新型矩阵三角分解模块中进行矩阵分解实验。为
了客观的评价本专利提出的矩阵分解模块的性能,将采用本专利分解模块分解后的三角矩阵乘积结果和原待分解样本矩阵进行对比,采用式(7)来计算获得最大绝对误差ε,并对不同实验条件下的结果进行了评测,具体结果如下表1所示:
ε=Max(|A-D|),D=L·R (7)
表1 不同矩阵分解实验误差结果
表1中,实验随机选取了8阶、64阶、1024阶三种规模样本矩阵,每种规模矩阵元素范围分别为(-1,1)、(-20,20)、(-1000,1000),每种条件随机选取四组不同的样本矩阵进行测试,由表中最大误差结果数据可知,采用本专利提出的矩阵分解模块分解后的三角矩阵相乘结果和原待分解样本矩阵非常接近,绝对误差和相对误差均较小,具有较高的运算精度,采用本专利提出的分解模块进行分解有效。
此外,由实施例运算过程可知,本专利提出的分解模块,在进行分解时,先由约化系数矩阵元素替换对应位的原矩阵元素,再由分解后的三角矩阵元素替换对应位的约化系数矩阵元素,整个过程均为原位替换,不需额外占用存储空间。并且模块中各单元内部运算过程简单,且各单元同步可并行度很高,并行工程设计中压缩了分解运算用时。从而可知,本专利提出的矩阵分解模块具有较高的效率,采用的分解算法不仅运算精度较高,并且运算复杂度低、并行性和可扩展性强、节省存储空间,具有非常好的理论和工程应用价值。
实施例2
在本实施例中,提供了一种在工程计算中求矩阵三角分解的方法。需要说明的是,在本实施例中,该方法中所用的公式与实施例1相同,因此,这里基于实施例1中的公式简要介绍该方法。
该方法包括下述步骤:
步骤1、获取待分解矩阵的约化系数矩阵N的边界元素;该待分解矩阵A为满足各阶顺序主子式不为0的M阶方阵;aji表示第j行第i列元素;i,j=1,2,3,…,M,该步骤包括利用实施例1中的式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0;
步骤2、获取待分解矩阵A的约化系数矩阵N的内部元素,从而获得约化系数矩阵N,其具体包括:
步骤2.1、利用实施例1中的式(2)获得约化系数矩阵N的对角元素nii·(i-1);
步骤2.2、利用实施例1中的式(3)获得约化系数矩阵N的下三角元素nji·(i-1);
步骤2.3、利用实施例1中的式(4)获得约化系数矩阵N的上三角元素nji·(j-1);从而获得约化系数矩阵N;
步骤3、利用实施例1中的式(5)将所述待分解矩阵A分解为下三角矩阵;
步骤4、利用实施例1中的式(6)将所述待分解矩阵A分解为上三角矩阵。
实施例3
在本实施例中,本发明提供一种数字信号处理设备,该数字信号处理设备包括信号接收装置、数据运算装置以及信号输出装置,所述数据运算装置可以根据具体的应用平台,采用软件编程等软件方式实现,或采用基于FPGA的电路模块等硬件方式实现。数据运算装置包括实施例1中的基于改进的按位替换法求矩阵三角分解的模块,并且当数据运算装置在信号处理过程中需要对矩阵进行三角分解时,数据运算装置调用实施例1中的基于改进的按位替换法求矩阵三角分解的模块,对待分解的矩阵进行三角分解运算。
该数字信号处理设备可以应用在信号处理相关领域中,用于对通信信号或图像信号以及其他涉及到矩阵运算的信号进行处理。例如,将其应用于信息安全中数字水印技术中的水印嵌入环节,对信息进行加密。将其应用于图像处理领域,用于鉴别数字图像的细微差别。或将其应用于空间飞行器平台的剩余使用寿命预测。
所述数字信号处理设备包括下述工作步骤:
步骤2、所述数字信号处理设备的数据运算装置从信号接收装置中读取并存储待分解矩阵A,待数据存储完毕后,根据解析后的处理请求,配置启动矩阵三角分解运算,其具体步骤包括:
步骤2.1、利用实施例1中的式(1)获得所述待分解矩阵A的约化系数矩阵N的边界元素n1i·0和nj1·0;
步骤2.2、利用实施例1中的式(2)、式(3)和式(4)执行运算,获得约化系数矩阵N的内部元素;
步骤2.3、利用实施例1中的式(5)和式(6)并行执行运算,将待分解矩阵A分解为下三角矩阵和上三角矩阵;
步骤3、待分解运算完成后,所述数据运算装置将矩阵分解结果传输到所述信号输出装置;
步骤4、所述信号输出装置根据解析后的数据处理请求要求,对矩阵分解结果数据进行处理,并输出到所述数字信号处理设备外部,完成所述数字信号处理设备与外部的通信。
Claims (7)
- 根据权利要求1所述的基于改进的按位替换法求矩阵三角分解的模块,其特征是:所述边界元素获取单元根据待分解矩阵A,利用式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0:所述内部元素获取单元利用式(2)获得约化系数矩阵N的对角元素nii·(i-1):式(2)中,k=2,3,…i-1;所述内部元素获取单元利用式(3)获得约化系数矩阵N的下三角元素nji·(i-1):式(3)中,i=2,3,…,M-1;j=i+1,i+2,…,M;所述内部元素获取单元利用式(4)获得约化系数矩阵N的上三角元素nji·(j-1):式(4)中,i=j+1,j+2,…,M;j=2,3,…,M-1;lji=0,j=1,2,…,M-1;i=j+1,j+2,…,Mrji=0,i=1,2,…,M-1;j=i+1,i+2,…,M
- 根据权利要求3所述的在工程计算中求矩阵三角分解的方法,其特征是:所述步骤1包括利用式(1)获得约化系数矩阵N的边界元素n1i·0和nj1·0:所述步骤2包括:步骤2.1、利用式(2)获得约化系数矩阵N的对角元素nii·(i-1):式(2)中,k=2,3,…i-1;步骤2.2、利用式(3)获得约化系数矩阵N的下三角元素nji·(i-1):式(3)中,i=2,3,…,M-1;j=i+1,i+2,…,M;步骤2.3、利用式(4)获得约化系数矩阵N的上三角元素nji·(j-1):式(4)中,i=j+1,j+2,…,M;j=2,3,…,M-1;lji=0,j=1,2,…,M-1;i=j+1,j+2,…,Mrji=0,i=1,2,…,M-1;j=i+1,i+2,…,M
- 一种数字信号处理设备,其特征在于,所述数字信号处理设备包括信号接收装置、数据运算装置以及信号输出装置,所述数据运算装置包括根据权利要求1或2所述的基于改进的按位替换法求矩阵三角分解的模块,并且当所述数据运算装置在信号处理过程中需要对矩阵进行三角分解时,所述数据运算装置调用所述基于改进的按位替换法求矩阵三角分解的模块,利用权利要求3或4所述的方法对待分解的矩阵进行三角分解。
- 根据权利要求5所述的数字信号处理设备,其特征在于,所述数字信号处理设备用于对通信信号或图像信号进行处理。
- 根据权利要求5所述的数字信号处理设备,其特征在于,所述数据运算装置为基于FPGA的电路模块。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017514867A JP6388713B2 (ja) | 2015-12-22 | 2016-04-05 | デジタル信号処理装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510981481.3 | 2015-12-22 | ||
CN201510981481.3A CN105608059A (zh) | 2015-12-22 | 2015-12-22 | 一种基于改进的按位替换法求矩阵三角分解的模块 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017107337A1 true WO2017107337A1 (zh) | 2017-06-29 |
Family
ID=55988005
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/078460 WO2017107337A1 (zh) | 2015-12-22 | 2016-04-05 | 一种基于改进的按位替换法求矩阵三角分解的模块及方法 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP6388713B2 (zh) |
CN (1) | CN105608059A (zh) |
WO (1) | WO2017107337A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399152A (zh) * | 2018-02-06 | 2018-08-14 | 中国科学院信息工程研究所 | 数字查找树的压缩表示方法、系统、存储介质及规则匹配装置 |
CN108693251A (zh) * | 2018-02-19 | 2018-10-23 | 江苏新时高温材料股份有限公司 | 基于超声技术实现中空板式陶瓷膜深层缺陷的三维检测方法 |
CN109885945A (zh) * | 2019-02-26 | 2019-06-14 | 哈尔滨工程大学 | 一种半空间环境下的边界元法近场声全息变换方法 |
CN113094648A (zh) * | 2021-04-02 | 2021-07-09 | 算筹信息科技有限公司 | 外积累加求解三角矩阵与矩阵内积的方法 |
CN115993585A (zh) * | 2023-03-23 | 2023-04-21 | 北京东远润兴科技有限公司 | 雷达抗干扰信号矩阵处理方法、装置、设备及存储介质 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108897716B (zh) | 2018-07-04 | 2022-07-01 | 合肥工业大学 | 通过存储器读写操作来缩减计算量的数据处理装置及方法 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103746948A (zh) * | 2013-11-22 | 2014-04-23 | 天津理工大学 | 基于fpga的模块化盲源分离逻辑电路 |
CN103902762A (zh) * | 2014-03-11 | 2014-07-02 | 复旦大学 | 一种针对正定对称矩阵进行最小二乘方程求解的电路结构 |
US20140188444A1 (en) * | 2012-12-31 | 2014-07-03 | Futurewei Technologies, Inc. | System and Method for Pruning an Over-Defined System Model |
CN104216866A (zh) * | 2013-05-31 | 2014-12-17 | 深圳市海思半导体有限公司 | 一种数据处理装置 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007133710A (ja) * | 2005-11-11 | 2007-05-31 | Hitachi Ltd | 連立一次方程式反復解法における前処理方法および行列リオーダリング方法 |
-
2015
- 2015-12-22 CN CN201510981481.3A patent/CN105608059A/zh active Pending
-
2016
- 2016-04-05 WO PCT/CN2016/078460 patent/WO2017107337A1/zh active Application Filing
- 2016-04-05 JP JP2017514867A patent/JP6388713B2/ja active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140188444A1 (en) * | 2012-12-31 | 2014-07-03 | Futurewei Technologies, Inc. | System and Method for Pruning an Over-Defined System Model |
CN104216866A (zh) * | 2013-05-31 | 2014-12-17 | 深圳市海思半导体有限公司 | 一种数据处理装置 |
CN103746948A (zh) * | 2013-11-22 | 2014-04-23 | 天津理工大学 | 基于fpga的模块化盲源分离逻辑电路 |
CN103902762A (zh) * | 2014-03-11 | 2014-07-02 | 复旦大学 | 一种针对正定对称矩阵进行最小二乘方程求解的电路结构 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399152A (zh) * | 2018-02-06 | 2018-08-14 | 中国科学院信息工程研究所 | 数字查找树的压缩表示方法、系统、存储介质及规则匹配装置 |
CN108693251A (zh) * | 2018-02-19 | 2018-10-23 | 江苏新时高温材料股份有限公司 | 基于超声技术实现中空板式陶瓷膜深层缺陷的三维检测方法 |
CN108693251B (zh) * | 2018-02-19 | 2022-08-30 | 江苏新时膜科技有限公司 | 基于超声技术实现中空板式陶瓷膜深层缺陷的三维检测方法 |
CN109885945A (zh) * | 2019-02-26 | 2019-06-14 | 哈尔滨工程大学 | 一种半空间环境下的边界元法近场声全息变换方法 |
CN109885945B (zh) * | 2019-02-26 | 2022-08-02 | 哈尔滨工程大学 | 一种半空间环境下的边界元法近场声全息变换方法 |
CN113094648A (zh) * | 2021-04-02 | 2021-07-09 | 算筹信息科技有限公司 | 外积累加求解三角矩阵与矩阵内积的方法 |
CN115993585A (zh) * | 2023-03-23 | 2023-04-21 | 北京东远润兴科技有限公司 | 雷达抗干扰信号矩阵处理方法、装置、设备及存储介质 |
CN115993585B (zh) * | 2023-03-23 | 2023-05-30 | 北京东远润兴科技有限公司 | 雷达抗干扰信号矩阵处理方法、装置、设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
JP2018506757A (ja) | 2018-03-08 |
CN105608059A (zh) | 2016-05-25 |
JP6388713B2 (ja) | 2018-09-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017107337A1 (zh) | 一种基于改进的按位替换法求矩阵三角分解的模块及方法 | |
Andri et al. | YodaNN: An architecture for ultralow power binary-weight CNN acceleration | |
US11023801B2 (en) | Data processing method and apparatus | |
US10445638B1 (en) | Restructuring a multi-dimensional array | |
CN106991665B (zh) | 基于cuda图像融合并行计算的方法 | |
WO2017181562A1 (zh) | 一种神经网络的处理方法、系统 | |
Wang et al. | Large scale distributed sparse precision estimation | |
WO2017107338A1 (zh) | 一种改进的按位替换法求矩阵逆矩阵模块及方法 | |
Belomestny | Solving optimal stopping problems via empirical dual optimization | |
CA2931226C (en) | Implementation design for hybrid transform coding scheme | |
CN104244010B (zh) | 提高数字信号变换性能的方法及数字信号变换方法和装置 | |
Niu et al. | RT3D: Achieving real-time execution of 3d convolutional neural networks on mobile devices | |
CN106485074B (zh) | 一种基于自适应采样率的海洋温盐场采样方法 | |
CN115272250B (zh) | 确定病灶位置方法、装置、计算机设备和存储介质 | |
Gong et al. | Automatic mapping of the best-suited dnn pruning schemes for real-time mobile acceleration | |
CN113435265B (zh) | 高光谱图像分类方法、装置、电子设备及存储介质 | |
He et al. | A low-cost high-speed object tracking VLSI system based on unified textural and dynamic compressive features | |
Kulkarni et al. | Less: Big data sketching and encryption on low power platform | |
CN103985100A (zh) | 一种基于自适应观测组合优化的分块压缩感知方法 | |
Lu et al. | High-performance homomorphic matrix completion on GPUs | |
Qian et al. | Structured sparse model based feature selection and classification for hyperspectral imagery | |
Che et al. | The fractional differential enhancement of image texture features and its parallel processing optimization | |
Wu et al. | Finding tree symmetries using continuous-time quantum walk | |
Morimitsu et al. | RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation | |
Zhang et al. | Structured sparse BAYESIAN hyperspectral compressive sensing using spectral unmixing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2017514867 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16877151 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 16877151 Country of ref document: EP Kind code of ref document: A1 |