CN116560733B

CN116560733B - Space target feature on-orbit real-time parallel LU decomposition computing system and method

Info

Publication number: CN116560733B
Application number: CN202310827007.XA
Authority: CN
Inventors: 禹霁阳; 黄丹; 卢玲
Original assignee: Ordnance Science and Research Academy of China; Beijing Institute of Spacecraft System Engineering
Current assignee: Ordnance Science and Research Academy of China; Beijing Institute of Spacecraft System Engineering
Priority date: 2023-07-07
Filing date: 2023-07-07
Publication date: 2023-10-24
Anticipated expiration: 2043-07-07
Also published as: CN116560733A

Abstract

The invention discloses a space target feature on-orbit real-time parallel LU decomposition computing system and method. The computing system comprises an input register set, an output L matrix register set, an output U matrix register set, an LU parallel computing scheduling controller and a computing unit set. The computing method comprises the following steps: the LU parallel computing scheduling controller controls the input register set to receive and store the data to be decomposed; the controller generates a time sequence scheduling mechanism according to the computing unit set, reads the data to be decomposed through the time sequence scheduling mechanism, and transmits the data sequentially to the computing unit set for LU decomposition computation to obtain an L matrix result and a U matrix result; the controller transmits the L matrix result and the U matrix result to the output L matrix register set and the output U matrix register set, respectively, for storage; the controller then reads and outputs the L matrix result and the U matrix result. Real-time LU decomposition of space target features is thereby realized while the computing precision and the executable frequency are ensured.

Description

Space target feature on-orbit real-time parallel LU decomposition computing system and method

Technical Field

The invention relates to the technical field of space target feature calculation, in particular to an on-orbit real-time parallel LU decomposition calculation method for space target features.

Background

Currently, in the process of detecting a space target, the target characteristics must be analyzed to confirm whether a suspected target is an object of no interest, such as noise or debris. The calculation of the space target feature matrix has two requirements: real-time performance and precision. Conventional feature calculation is usually performed by a general-purpose floating-point processor; for large-scale suspected-target elimination, when the whole image must be processed within milliseconds, real-time performance cannot be ensured. With a general-purpose fixed-point processor, the precision of the feature comparison is difficult to ensure. A high-performance processor such as a DSP (digital signal processor) requires an additional heat dissipation design and a corresponding cold plate, which complicates the structure. Traditional statistical methods are difficult to apply to feature discrimination at the scale of dozens of pixels, and CNN-based recognition is computationally heavy and difficult to deploy effectively on orbit. LU decomposition can amplify the fine differences between a target and noise, so that false-alarm targets can be removed rapidly.

However, conventional LU decomposition is either partially block-parallel, relying on the sparsity of the input matrix, or reconstructs the pipeline using a systolic structure. Such architectures are difficult to adapt to special, complex application conditions. Moreover, because the elements computed in LU decomposition depend on earlier elements, pipelined computation is difficult to unfold effectively in a systolic structure, which affects both the precision and the real-time performance of the decomposition.

Therefore, how to improve the accuracy of the decomposition calculation on the basis of ensuring the real-time performance of the decomposition calculation is a problem to be solved by those skilled in the art.

Disclosure of Invention

In view of this, the present invention provides a system and a method for on-orbit real-time parallel LU decomposition computation of space target features. It provides a fully pipelined parallel computing architecture for digital logic devices, designs the computing flow and timing of each sub-step by sorting out the dependencies among the elements in the computing process, and realizes real-time LU decomposition of space target features while ensuring computing precision and executable frequency.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

an on-orbit real-time parallel LU decomposition computing system of spatial target features, comprising: the system comprises an input register set, an output L matrix register set, an output U matrix register set, an LU parallel computing scheduling controller and a computing unit set;

the LU parallel computing scheduling controller is connected with the input register set, the output L matrix register set, the output U matrix register set and the computing unit set, respectively;

the input register set is used for receiving and storing data to be decomposed;

the computing unit set is used for carrying out parallel pipelined computation on the input data to be decomposed to obtain an L matrix result and a U matrix result;

the output L matrix register set is used for storing the L matrix result to be output;

the output U matrix register set is used for storing the U matrix result to be output;

the LU parallel computing scheduling controller is used for controlling the input register set to receive and store data to be decomposed, generating a time sequence scheduling mechanism according to the computing unit set, reading the data to be decomposed through the time sequence scheduling mechanism, sequentially transmitting the data to be decomposed to the computing unit set for LU decomposition computation, and outputting the L matrix result and the U matrix result.

Preferably, each input register in the input register set is labeled $a_{i,j}$;

each output L matrix register in the output L matrix register set is labeled $L_{i,j}$;

each output U matrix register in the output U matrix register set is labeled $U_{i,j}$;

where i and j denote the row and column indices of the two-dimensional register array; for an N-dimensional matrix, $i, j \in [1, N]$, and i and j are positive integers.

Preferably, the computing unit set includes t computing units $PE_t$, where t is a positive integer;

the computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator;

the multiplier is used for executing multiplication operation;

the adder is used for executing addition operation;

the gate is used for outputting the received signal according to the selection signal $Sel_f$, f = 1, 2, 3, 4, 5, 6;

the reciprocal calculator is used for carrying out table look-up, multiplication, subtraction and shift operation on the input data to be decomposed.

An on-orbit real-time parallel LU decomposition calculation method for spatial target features comprises the following steps:

the LU parallel computing scheduling controller controls the input register group to receive and store data to be decomposed;

the LU parallel computing scheduling controller generates a time sequence scheduling mechanism according to a computing unit group, reads the data to be decomposed through the time sequence scheduling mechanism, and sequentially transmits the data to be decomposed to the computing unit group for LU decomposition computation to obtain an L matrix result and a U matrix result;

the LU parallel computing scheduling controller correspondingly transmits and stores the L matrix result and the U matrix result to an output L matrix register set and an output U matrix register set;

and the LU parallel computing scheduling controller reads and outputs the L matrix result and the U matrix result.

Preferably, the data to be decomposed, the L matrix result and the U matrix result are each at most a 6×6 matrix;

the computing unit set comprises t computing units $PE_t$ for parallel computing, where t is a positive integer less than or equal to 8; the computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator, f = 1, 2, 3, 4, 5, 6.

Preferably, reading the data to be decomposed by the timing schedule mechanism, and sequentially transmitting the data to be decomposed to the computing unit group to perform LU decomposition computation specifically includes:

the LU decomposition computation is completed 35 cycles after the data to be decomposed is input into the computing unit set;

step 2.1. The input registers $a_{i,j}$ are assigned directly: $U_{1,i} = a_{1,i}$, $i \in [1, 2, 3, 4, 5, 6]$; this is completed in the 1st cycle; $U_{i,j}$ denotes an output U matrix register;

step 2.2. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{1,1}$ in the 3rd cycle; at the same time, according to $L_{j,1} = a_{j,1}\,(1/U_{1,1})$, $j \in [2, 3, 4, 5, 6]$, computing units $PE_1$ to $PE_5$ compute in parallel to obtain $L_{j,1}$; $L_{i,j}$ denotes an output L matrix register;

step 2.3. According to $U_{2,i} = a_{2,i} - L_{2,1}U_{1,i}$, $i \in [1, 2, 3, 4, 5, 6]$, computing units $PE_1$ to $PE_5$ compute in parallel, and $U_{2,i}$ is obtained in the 5th cycle;

step 2.4. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{2,2}$ in the 8th cycle; at the same time, according to $L_{j,2} = (a_{j,2} - L_{j,1}U_{1,2})/U_{2,2}$, $j \in [3, 4, 5, 6]$, computing units $PE_2$ to $PE_5$ compute in parallel, and $L_{j,2}$ is obtained in the 9th cycle;

step 2.5. According to $U_{3,i} = a_{3,i} - L_{3,1}U_{1,i} - L_{3,2}U_{2,i}$, computing units $PE_1$ to $PE_4$ compute in parallel, and $U_{3,i}$ is obtained in the 12th cycle;

step 2.6. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{3,3}$ in the 15th cycle; at the same time, according to $L_{j,3} = (a_{j,3} - L_{j,1}U_{1,3} - L_{j,2}U_{2,3})/U_{3,3}$, computing units $PE_2$ to $PE_4$ compute in parallel, and $L_{j,3}$ is obtained in the 16th cycle;

step 2.7. According to $U_{4,i} = a_{4,i} - \sum_{k=1}^{3} L_{4,k}U_{k,i}$, computing units $PE_1$ to $PE_3$ compute in parallel, and $U_{4,i}$ is obtained in the 19th cycle;

step 2.8. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{4,4}$ in the 22nd cycle; at the same time, according to $L_{j,4} = (a_{j,4} - \sum_{k=1}^{3} L_{j,k}U_{k,4})/U_{4,4}$, computing units $PE_2$ to $PE_3$ compute in parallel, and $L_{j,4}$ is obtained in the 23rd cycle;

step 2.9. According to $U_{5,i} = a_{5,i} - \sum_{k=1}^{4} L_{5,k}U_{k,i}$, computing units $PE_1$ to $PE_2$ compute in parallel, and $U_{5,i}$ is obtained in the 27th cycle;

step 2.10. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{5,5}$ in the 30th cycle; at the same time, according to $L_{6,5} = (a_{6,5} - \sum_{k=1}^{4} L_{6,k}U_{k,5})/U_{5,5}$, computing unit $PE_2$ computes $L_{6,5}$, which is obtained in the 32nd cycle;

step 2.11. According to $U_{6,6} = a_{6,6} - \sum_{k=1}^{5} L_{6,k}U_{k,6}$, computing unit $PE_1$ computes $U_{6,6}$, which is obtained in the 35th cycle.
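The ordering of steps 2.1 to 2.11 (a row of U, then the reciprocal of its diagonal entry, then the matching column of L) can be illustrated with a plain sequential sketch in Python. It assumes a Doolittle-style decomposition without pivoting and with a unit diagonal in L; the function names `lu_doolittle` and `matmul` and the test matrix are illustrative and not taken from the patent, and the sketch ignores the fixed-point arithmetic and the parallel computing units.

```python
def lu_doolittle(a):
    """Sequential model of the step order: row k of U, then 1/U[k][k], then column k of L."""
    n = len(a)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Row k of U: subtract the products of already-known L and U entries.
        for i in range(k, n):
            U[k][i] = a[k][i] - sum(L[k][s] * U[s][i] for s in range(k))
        if k == n - 1:
            break  # the last step only produces the final diagonal entry of U
        if U[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # One reciprocal per pivot, reused by every entry of column k of L.
        inv = 1.0 / U[k][k]
        for j in range(k + 1, n):
            L[j][k] = (a[j][k] - sum(L[j][s] * U[s][k] for s in range(k))) * inv
    return L, U


def matmul(x, y):
    """Plain square matrix product, used only to check L * U against the input."""
    n = len(x)
    return [[sum(x[i][s] * y[s][j] for s in range(n)) for j in range(n)] for i in range(n)]


# Illustrative 6x6 input: diagonally dominant, so every pivot is nonzero.
A = [[10.0 if i == j else 1.0 / (1 + i + j) for j in range(6)] for i in range(6)]
L, U = lu_doolittle(A)
LU = matmul(L, U)
max_err = max(abs(LU[i][j] - A[i][j]) for i in range(6) for j in range(6))
```

For a 6×6 input the loop computes five reciprocals, one per pivot except the last, which matches the five reciprocal-calculator uses named in the steps.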

Preferably, a single computing unit $PE_t$ processes at most 12 input data $m_k$ simultaneously, $k \in [0, 1, \ldots, 11]$.

Preferably, the process by which the computing unit $PE_t$ processes the input data $m_k$ comprises:

$m_0$ enters the reciprocal calculator and, after three clock cycles, is multiplied by the result output by gate $MUX_5$ when $Sel_5 = 0$; the product is then added to the result output by gate $MUX_6$ when $Sel_6 = 0$; $Sel_f$ denotes a selection signal, f = 1, 2, 3, 4, 5, 6;

$m_1$ and $m_2$ are input to multiplier $M_1$; the result passes through gate $MUX_1$ and is routed to adder $A_5$ when $Sel_1 = 0$ and to adder $A_1$ when $Sel_1 = 1$;

$m_3$ and $m_4$ are input to multiplier $M_2$, and the result is input to adder $A_1$;

$m_5$ and $m_6$ are input to multiplier $M_3$, and the result is input to adder $A_2$;

$m_7$ and $m_8$ are input to multiplier $M_4$; the result passes through gate $MUX_3$ and is routed to adder $A_2$ when $Sel_2 = 0$ and to adder $A_5$ when $Sel_2 = 1$;

$m_9$ and $m_{10}$ are input to multiplier $M_5$, and the result is input to adder $A_3$;

$m_{11}$ is input to gate $MUX_6$ and is routed to adder $A_6$ when $Sel_6 = 0$ and to adder $A_3$ when $Sel_6 = 1$.

Preferably, the process by which the computing unit $PE_t$ processes the input data further comprises:

the output of adder $A_1$ passes through gate $MUX_2$ and is routed to adder $A_5$ when $Sel_3 = 0$ and to adder $A_4$ when $Sel_3 = 1$;

the output of adder $A_2$ goes directly to adder $A_4$;

the output of adder $A_3$ goes directly to adder $A_5$;

the output of adder $A_4$ passes through gate $MUX_4$ and is routed to multiplier $M_6$ when $Sel_4 = 0$ and to adder $A_5$ when $Sel_4 = 1$;

the output of adder $A_5$ passes through gate $MUX_5$; it is output as the final result $n_0$ when $Sel_5 = 0$ and is routed to multiplier $M_6$ when $Sel_5 = 1$.
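The twelve inputs of one computing unit can be read functionally as one reciprocal operand, five operand pairs for the multipliers, and one addend. The Python sketch below models only that arithmetic, under stated assumptions: it ignores the gate-level routing and the pipeline timing, it assumes that subtraction is obtained by feeding negated operands, and the function name `pe_model` and the flag `use_reciprocal` are hypothetical and not part of the patent.

```python
def pe_model(m, use_reciprocal):
    """Functional (untimed) model of one computing unit with inputs m[0..11]."""
    if len(m) != 12:
        raise ValueError("a computing unit takes 12 inputs m_0 .. m_11")
    # Five multipliers take the pairs (m1, m2), (m3, m4), ..., (m9, m10).
    products = [m[2 * i + 1] * m[2 * i + 2] for i in range(5)]
    # The adder tree accumulates the five products together with m_11.
    acc = m[11] + sum(products)
    if use_reciprocal:
        # L-type result: scale by 1/m_0 (the hardware would use its reciprocal calculator).
        return acc * (1.0 / m[0])
    return acc


# U-type use: U34 = a34 - L31*U14 - L32*U24 with illustrative values.
a34, L31, U14, L32, U24 = 7.0, 2.0, 1.5, 0.5, 4.0
u_inputs = [1.0, -L31, U14, -L32, U24, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, a34]
u34 = pe_model(u_inputs, use_reciprocal=False)

# L-type use: the same accumulation divided by a pivot of 4.0 supplied as m_0.
l_inputs = [4.0] + u_inputs[1:]
l_val = pe_model(l_inputs, use_reciprocal=True)
```

Unused multiplier pairs are simply fed zeros, so the same unit covers every row length from 1 to 5 products, which is one way to read the claim that a unified architecture serves different data.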

Preferably, the reciprocal calculator data processing procedure includes:

preprocessing the data S to be calculated by shifting the decimal point to the right of the first 1 to obtain the processed data $f_{trans}(S)$, and recording the shift count p;

representing the processed data $f_{trans}(S)$ in binary to obtain $[s_{M-1}, s_{M-2}, \ldots, s_N, s_{N-1}, \ldots, s_0]_2$, and splitting the binary representation into $x_S = [s_{M-1}, s_{M-2}, \ldots, s_N]_2$ and $D_S = [s_{N-1}, s_{N-2}, \ldots, s_0]_2$; M and N denote the bit widths of $x_S$ and $D_S$;

$x_S$ is input to the look-up table as an address to obtain the result $f_{Table}(x_S)$;

The calculation formula of the reciprocal calculator is as follows:

；

wherein $[\cdot]_2$ denotes the binary representation of a variable, $M \ge N$, p is initially 0, and $f_{Table}(\cdot)$ denotes a look-up table operation.
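The combination of normalisation, table look-up and a short arithmetic correction can be sketched in Python as follows. This is a sketch under assumptions, not the patent's formula: it assumes the correction is the first-order expansion 1/(x + d) ≈ (1/x)(1 − d/x), it uses floating-point numbers in place of fixed-point words, and the names `build_table`, `reciprocal` and `addr_bits` are illustrative.

```python
def build_table(addr_bits):
    """Table entry k holds 1/x for x = 1 + k / 2**addr_bits, covering [1, 2)."""
    size = 1 << addr_bits
    return [1.0 / (1.0 + k / size) for k in range(size)]


def reciprocal(s, table, addr_bits):
    """Estimate 1/s for s > 0 by normalisation, table look-up and a first-order correction."""
    if s <= 0:
        raise ValueError("this sketch handles positive inputs only")
    # Normalise so the leading 1 sits just left of the binary point: s = m * 2**p, 1 <= m < 2.
    m, p = s, 0
    while m >= 2.0:
        m, p = m / 2.0, p + 1
    while m < 1.0:
        m, p = m * 2.0, p - 1
    size = 1 << addr_bits
    k = int((m - 1.0) * size)   # high bits: table address (plays the role of x_S)
    x = 1.0 + k / size
    d = m - x                   # low bits: remainder (plays the role of D_S)
    y0 = table[k]               # look-up: y0 = 1/x
    y = y0 * (1.0 - d * y0)     # assumed first-order expansion of 1/(x + d)
    return y * 2.0 ** (-p)      # undo the normalisation shift


TABLE = build_table(8)
estimate = reciprocal(3.0, TABLE, 8)
```

With this reading only two multiplications and one subtraction follow the look-up, so no divider is needed; the relative error is bounded by roughly the square of the table step.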

Compared with the prior art, the invention discloses and provides the space target feature on-orbit real-time parallel LU decomposition computing system and method, which have the following beneficial effects:

1. the hardware parallel computing system designed by the invention can be suitable for programming application of digital logic devices, the design architecture considers the maximum executable frequency in the hardware digital logic design process, the occupied resources are limited, and the stored matrix data can be subjected to 1:1 real-time LU decomposition processing.

2. The invention uses the gating signals in the computing unit $PE_t$ so that different data undergo LU decomposition computation under a unified architecture, adapting to parallel computing applications for targets of different sizes.

3. The invention adopts fixed-point calculation, adopts a mode of combining table look-up and expansion approximation for division operation, saves hardware resources, improves calculation precision, and is suitable for application of large, medium and small digital logic devices.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a space target feature parallel LU decomposition computing system provided by the invention.

Fig. 2 is a flow chart of a space target feature parallel LU decomposition method provided by the present invention.

Fig. 3a is a timing diagram of the 1 st cycle to 14 th cycle of the timing schedule mechanism according to the present invention.

Fig. 3b is a timing diagram of 14 th cycle to 27 th cycle of the timing schedule mechanism according to the present invention.

Fig. 3c is a timing diagram of 27 th cycle to 35 th cycle of the timing schedule mechanism according to the present invention.

Fig. 4 is a schematic diagram of a computing unit PE structure and data processing according to the present invention.

Fig. 5 is a schematic diagram of a reciprocal calculator structure and data processing according to the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

As shown in fig. 1, an embodiment of the present invention discloses an on-orbit real-time parallel LU decomposition computing system for spatial target features, including: the system comprises an input register set, an output L matrix register set, an output U matrix register set, an LU parallel computing scheduling controller and a computing unit set;

the input register set is used for receiving and storing data to be decomposed;

the computing unit set is used for carrying out parallel pipelined computation on the input data to be decomposed to obtain an L matrix result and a U matrix result;

the output L matrix register set is used for storing an L matrix result to be output;

the LU parallel computing scheduling controller is used for controlling the input register set to receive and store data to be decomposed, generating a time sequence scheduling mechanism according to the computing unit set, reading the data to be decomposed through the time sequence scheduling mechanism, sequentially transmitting the data to be decomposed to the computing unit set for LU decomposition computation, and outputting an L matrix result and a U matrix result.

Preferably, the output L matrix register set is further configured to store temporary variables of the L matrix in the iterative process; the output U matrix register set is also used for storing temporary variables of the U matrix in the iterative process.

Preferably, the input register set is a collection of register hardware, each register being labeled $a_{i,j}$; the output L matrix register set is a collection of register hardware, each register being labeled $L_{i,j}$; the output U matrix register set is a collection of register hardware, each register being labeled $U_{i,j}$; where i and j denote the row and column indices of the two-dimensional register array; for an N-dimensional matrix, $i, j \in [1, N]$, and i and j are positive integers.

Preferably, the computing unit set comprises t computing units $PE_t$, where t is a positive integer;

each computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator;

the multiplier is used for executing multiplication operation;

the adder is used for executing addition operation;

As shown in fig. 2, the embodiment of the invention discloses an on-orbit real-time parallel LU decomposition calculation method for space target features, which comprises the following steps:

the LU parallel computing scheduling controller generates a time sequence scheduling mechanism according to the computing unit group, reads data to be decomposed through the time sequence scheduling mechanism, and sequentially transmits the data to be decomposed to the computing unit group for LU decomposition and calculation to obtain an L matrix result and a U matrix result;

the LU parallel computing scheduling controller reads and outputs an L matrix result and a U matrix result.

the computing unit set comprises t computing units $PE_t$ for parallel computing, where t is a positive integer less than or equal to 8; the computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator, f = 1, 2, 3, 4, 5, 6.

Preferably, as shown in fig. 3a-c, reading data to be decomposed by a timing scheduling mechanism, and sequentially transmitting the data to be decomposed to a computing unit group to perform LU decomposition computation specifically includes:

the data to be decomposed is input into the calculation unit group to complete LU decomposition calculation after 35 periods;

step 2.11. According to $U_{6,6} = a_{6,6} - \sum_{k=1}^{5} L_{6,k}U_{k,6}$, computing unit $PE_1$ computes $U_{6,6}$, which is obtained in the 35th cycle.

Preferably, the period represents a clock period, and the time between the rising edge of one clock and the next rising edge represents one period.
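The cycle numbers stated for steps 2.1 to 2.11 can be gathered into a small schedule table, which makes the 35-cycle total and the number of computing units in use easy to check. The Python sketch below does that; the tuple layout and the clock frequency `CLOCK_HZ` are illustrative assumptions (the patent does not state a frequency), and the result $L_{j,1}$ of step 2.2 is left out because no cycle is stated for it.

```python
# (step, result, cycle in which the result is available, computing units named in the step)
SCHEDULE = [
    ("2.1",  "U[1][i]",   1,  []),               # direct assignment, no computing unit
    ("2.2",  "1/U[1][1]", 3,  [1]),
    ("2.3",  "U[2][i]",   5,  [1, 2, 3, 4, 5]),
    ("2.4",  "1/U[2][2]", 8,  [1]),
    ("2.4",  "L[j][2]",   9,  [2, 3, 4, 5]),
    ("2.5",  "U[3][i]",   12, [1, 2, 3, 4]),
    ("2.6",  "1/U[3][3]", 15, [1]),
    ("2.6",  "L[j][3]",   16, [2, 3, 4]),
    ("2.7",  "U[4][i]",   19, [1, 2, 3]),
    ("2.8",  "1/U[4][4]", 22, [1]),
    ("2.8",  "L[j][4]",   23, [2, 3]),
    ("2.9",  "U[5][i]",   27, [1, 2]),
    ("2.10", "1/U[5][5]", 30, [1]),
    ("2.10", "L[6][5]",   32, [2]),
    ("2.11", "U[6][6]",   35, [1]),
]

cycles = [cycle for _, _, cycle, _ in SCHEDULE]
total_cycles = max(cycles)
peak_units = max(len(units) for _, _, _, units in SCHEDULE)

CLOCK_HZ = 100e6  # hypothetical clock frequency, chosen only for illustration
latency_us = total_cycles / CLOCK_HZ * 1e6
```

At the assumed 100 MHz clock, 35 cycles correspond to 0.35 microseconds per 6×6 decomposition; any other frequency scales the latency proportionally.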

Preferably, the calculation content of each clock cycle in 35 cycles is as follows:

Preferably, a single computing unit $PE_t$ processes at most 12 input data $m_k$ simultaneously, $k \in [0, 1, \ldots, 11]$.

Preferably, as shown in FIG. 4, the process by which the computing unit $PE_t$ processes the input data $m_k$ comprises:

$m_0$ enters the reciprocal calculator and, after three clock cycles, is multiplied by the result output by gate $MUX_5$ when $Sel_5 = 0$; the product is then added to the result output by gate $MUX_6$ when $Sel_6 = 0$; $Sel_f$ denotes a selection signal, f = 1, 2, 3, 4, 5, 6. $m_1$ and $m_2$ are input to multiplier $M_1$; the result passes through gate $MUX_1$ and is routed to adder $A_5$ when $Sel_1 = 0$ and to adder $A_1$ when $Sel_1 = 1$. $m_3$ and $m_4$ are input to multiplier $M_2$, and the result is input to adder $A_1$. $m_5$ and $m_6$ are input to multiplier $M_3$, and the result is input to adder $A_2$. $m_7$ and $m_8$ are input to multiplier $M_4$; the result passes through gate $MUX_3$ and is routed to adder $A_2$ when $Sel_2 = 0$ and to adder $A_5$ when $Sel_2 = 1$. $m_9$ and $m_{10}$ are input to multiplier $M_5$, and the result is input to adder $A_3$. $m_{11}$ is input to gate $MUX_6$ and is routed to adder $A_6$ when $Sel_6 = 0$ and to adder $A_3$ when $Sel_6 = 1$.

The output of adder $A_1$ passes through gate $MUX_2$ and is routed to adder $A_5$ when $Sel_3 = 0$ and to adder $A_4$ when $Sel_3 = 1$. The output of adder $A_2$ goes directly to adder $A_4$. The output of adder $A_3$ goes directly to adder $A_5$. The output of adder $A_4$ passes through gate $MUX_4$ and is routed to multiplier $M_6$ when $Sel_4 = 0$ and to adder $A_5$ when $Sel_4 = 1$. The output of adder $A_5$ passes through gate $MUX_5$; it is output as the final result $n_0$ when $Sel_5 = 0$ and is routed to multiplier $M_6$ when $Sel_5 = 1$.

Preferably, as shown in fig. 5, the reciprocal calculator data processing process includes:

preprocessing the data S to be calculated by shifting the decimal point to the right of the first 1 to obtain the processed data $f_{trans}(S)$, and recording the shift count p;

representing the processed data $f_{trans}(S)$ in binary to obtain $[s_{M-1}, s_{M-2}, \ldots, s_N, s_{N-1}, \ldots, s_0]_2$, and splitting the binary representation into $x_S = [s_{M-1}, s_{M-2}, \ldots, s_N]_2$ and $D_S = [s_{N-1}, s_{N-2}, \ldots, s_0]_2$; M and N denote the bit widths of $x_S$ and $D_S$;

$x_S$ is input to the look-up table as an address to obtain the result $f_{Table}(x_S)$; a multiplication is performed and a subtraction is carried out between its result and 1; the subtraction result is multiplied again, and that product is shifted left by p bits to obtain the final estimate of the reciprocal;

The calculation formula of the reciprocal calculator is as follows:

；

Preferably, the larger the address bit width of the look-up table, the higher the accuracy of the reciprocal estimate and the more resources the look-up table occupies; a look-up table depth of 12 address bits or fewer is adopted.
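The trade-off between table size and accuracy can be illustrated numerically. The Python sketch below measures the worst relative error of a table look-up followed by a first-order correction for several address widths; the correction formula, the sampling density and the function name `max_rel_error` are assumptions made for illustration, and floating-point numbers stand in for the fixed-point hardware.

```python
def max_rel_error(addr_bits, samples=2000):
    """Worst relative error of table look-up plus first-order correction over [1, 2)."""
    size = 1 << addr_bits
    table = [1.0 / (1.0 + k / size) for k in range(size)]
    worst = 0.0
    for n in range(samples):
        m = 1.0 + n / samples            # sample the normalised input range [1, 2)
        k = int((m - 1.0) * size)        # table address from the high bits
        d = m - (1.0 + k / size)         # remainder from the low bits
        y = table[k] * (1.0 - d * table[k])
        worst = max(worst, abs(y * m - 1.0))
    return worst


# Table depth (number of entries) and measured worst error for each address width.
depths = {bits: 1 << bits for bits in (4, 8, 12)}
errors = {bits: max_rel_error(bits) for bits in (4, 8, 12)}
```

Each extra address bit doubles the table depth, while the error of this assumed scheme shrinks by roughly a factor of four, which is consistent with the stated preference for keeping the address width at 12 bits or fewer.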

Preferably, the LU parallel computing scheduling controller transmits the LU decomposition results to the output L matrix register set and the output U matrix register set, respectively, and stores them on a first-come-first-served basis; results that arrive at the same time are stored in ascending order of their numbers.
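The storage rule amounts to ordering results by arrival time and breaking ties by number. A minimal Python sketch follows, assuming that each result carries an arrival cycle and a number (for example the number of the computing unit that produced it); the tuple layout and the function name `store_order` are hypothetical.

```python
def store_order(results):
    """results: list of (arrival_cycle, number, name); returns names in storage order."""
    # Earlier arrivals first; simultaneous arrivals in ascending order of number.
    return [name for _, _, name in sorted(results, key=lambda r: (r[0], r[1]))]


pending = [(9, 3, "L[4][2]"), (5, 2, "U[2][3]"), (9, 2, "L[3][2]")]
order = store_order(pending)
```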

Preferably, the present invention discloses and provides a space target feature on-orbit real-time parallel LU decomposition computing system and method, which has the following beneficial effects compared with the prior art:

In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An on-orbit real-time parallel LU decomposition computing system for spatial target features, comprising: the system comprises an input register set, an output L matrix register set, an output U matrix register set, an LU parallel computing scheduling controller and a computing unit set;

the input register set is used for receiving and storing data to be decomposed;

the LU parallel computing scheduling controller is used for controlling the input register set to receive and store data to be decomposed, generating a time sequence scheduling mechanism according to the computing unit set, reading the data to be decomposed through the time sequence scheduling mechanism, sequentially transmitting the data to be decomposed to the computing unit set for LU decomposition computation, and outputting the L matrix result and the U matrix result;

reading the data to be decomposed through the time schedule mechanism, and sequentially transmitting the data to be decomposed to the computing unit group to perform LU decomposition computation specifically includes:

step 2.1. The input registers $a_{i,j}$ are assigned directly: $U_{1,i} = a_{1,i}$, $i \in [1, 2, 3, 4, 5, 6]$; this is completed in the 1st cycle; $U_{i,j}$ denotes an output U matrix register;

step 2.2. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{1,1}$ in the 3rd cycle; at the same time, according to $L_{j,1} = a_{j,1}\,(1/U_{1,1})$, $j \in [2, 3, 4, 5, 6]$, computing units $PE_1$ to $PE_5$ compute in parallel to obtain $L_{j,1}$; $L_{i,j}$ denotes an output L matrix register;

step 2.3. According to $U_{2,i} = a_{2,i} - L_{2,1}U_{1,i}$, $i \in [1, 2, 3, 4, 5, 6]$, computing units $PE_1$ to $PE_5$ compute in parallel, and $U_{2,i}$ is obtained in the 5th cycle;

step 2.4. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{2,2}$ in the 8th cycle; at the same time, according to $L_{j,2} = (a_{j,2} - L_{j,1}U_{1,2})/U_{2,2}$, $j \in [3, 4, 5, 6]$, computing units $PE_2$ to $PE_5$ compute in parallel, and $L_{j,2}$ is obtained in the 9th cycle;

step 2.5. According to $U_{3,i} = a_{3,i} - L_{3,1}U_{1,i} - L_{3,2}U_{2,i}$, computing units $PE_1$ to $PE_4$ compute in parallel, and $U_{3,i}$ is obtained in the 12th cycle;

step 2.6. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{3,3}$ in the 15th cycle; at the same time, according to $L_{j,3} = (a_{j,3} - L_{j,1}U_{1,3} - L_{j,2}U_{2,3})/U_{3,3}$, computing units $PE_2$ to $PE_4$ compute in parallel, and $L_{j,3}$ is obtained in the 16th cycle;

step 2.7. According to $U_{4,i} = a_{4,i} - \sum_{k=1}^{3} L_{4,k}U_{k,i}$, computing units $PE_1$ to $PE_3$ compute in parallel, and $U_{4,i}$ is obtained in the 19th cycle;

step 2.8. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{4,4}$ in the 22nd cycle; at the same time, according to $L_{j,4} = (a_{j,4} - \sum_{k=1}^{3} L_{j,k}U_{k,4})/U_{4,4}$, computing units $PE_2$ to $PE_3$ compute in parallel, and $L_{j,4}$ is obtained in the 23rd cycle;

step 2.9. According to $U_{5,i} = a_{5,i} - \sum_{k=1}^{4} L_{5,k}U_{k,i}$, computing units $PE_1$ to $PE_2$ compute in parallel, and $U_{5,i}$ is obtained in the 27th cycle;

step 2.10. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{5,5}$ in the 30th cycle; at the same time, according to $L_{6,5} = (a_{6,5} - \sum_{k=1}^{4} L_{6,k}U_{k,5})/U_{5,5}$, computing unit $PE_2$ computes $L_{6,5}$, which is obtained in the 32nd cycle;

step 2.11. According to $U_{6,6} = a_{6,6} - \sum_{k=1}^{5} L_{6,k}U_{k,6}$, computing unit $PE_1$ computes $U_{6,6}$, which is obtained in the 35th cycle;

the process by which the computing unit $PE_t$ processes the input data $m_k$ comprises:

$m_0$ enters the reciprocal calculator and, after three clock cycles, is multiplied by the result output by gate $MUX_5$ when $Sel_5 = 0$; the product is then added to the result output by gate $MUX_6$ when $Sel_6 = 0$; $Sel_f$ denotes a selection signal, f = 1, 2, 3, 4, 5, 6;

$m_1$ and $m_2$ are input to multiplier $M_1$; the result passes through gate $MUX_1$ and is routed to adder $A_5$ when $Sel_1 = 0$ and to adder $A_1$ when $Sel_1 = 1$;

$m_3$ and $m_4$ are input to multiplier $M_2$, and the result is input to adder $A_1$;

$m_5$ and $m_6$ are input to multiplier $M_3$, and the result is input to adder $A_2$;

$m_7$ and $m_8$ are input to multiplier $M_4$; the result passes through gate $MUX_3$ and is routed to adder $A_2$ when $Sel_2 = 0$ and to adder $A_5$ when $Sel_2 = 1$;

$m_9$ and $m_{10}$ are input to multiplier $M_5$, and the result is input to adder $A_3$;

$m_{11}$ is input to gate $MUX_6$ and is routed to adder $A_6$ when $Sel_6 = 0$ and to adder $A_3$ when $Sel_6 = 1$;

the process by which the computing unit $PE_t$ processes the input data further comprises:

the output of adder $A_1$ passes through gate $MUX_2$ and is routed to adder $A_5$ when $Sel_3 = 0$ and to adder $A_4$ when $Sel_3 = 1$;

the output of adder $A_2$ goes directly to adder $A_4$;

the output of adder $A_3$ goes directly to adder $A_5$;

the output of adder $A_4$ passes through gate $MUX_4$ and is routed to multiplier $M_6$ when $Sel_4 = 0$ and to adder $A_5$ when $Sel_4 = 1$;

the output of adder $A_5$ passes through gate $MUX_5$; it is output as the final result $n_0$ when $Sel_5 = 0$ and is routed to multiplier $M_6$ when $Sel_5 = 1$;

The reciprocal calculator data processing process includes:

preprocessing the data S to be calculated by shifting the decimal point to the right of the first 1 to obtain the processed data $f_{trans}(S)$, and recording the shift count p;

representing the processed data $f_{trans}(S)$ in binary to obtain $[s_{M-1}, s_{M-2}, \ldots, s_N, s_{N-1}, \ldots, s_0]_2$, and splitting the binary representation into $x_S = [s_{M-1}, s_{M-2}, \ldots, s_N]_2$ and $D_S = [s_{N-1}, s_{N-2}, \ldots, s_0]_2$; M and N denote the bit widths of $x_S$ and $D_S$;

$x_S$ is input to the look-up table as an address to obtain the result $f_{Table}(x_S)$;

The calculation formula of the reciprocal calculator is as follows:

wherein $[\cdot]_2$ denotes the binary representation of a variable, $M \ge N$, p is initially 0, and $f_{Table}(\cdot)$ denotes a look-up table operation.

2. The space target feature on-orbit real-time parallel LU decomposition computing system according to claim 1, wherein each input register in the input register set is labeled $a_{i,j}$;

each output L matrix register in the output L matrix register set is labeled $L_{i,j}$;

each output U matrix register in the output U matrix register set is labeled $U_{i,j}$;

where i and j denote the row and column indices of the two-dimensional register array; for an N-dimensional matrix, $i, j \in [1, N]$, and i and j are positive integers.

3. The on-orbit real-time parallel LU decomposition computing system according to claim 1, wherein the computing unit set includes t computing units $PE_t$, where t is a positive integer;

the computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator;

the multiplier is used for executing multiplication operation;

the adder is used for executing addition operation;

the gate is used for outputting the received signal according to the selection signal $Sel_f$, f = 1, 2, 3, 4, 5, 6;

4. The on-orbit real-time parallel LU decomposition calculation method for the space target features is characterized by comprising the following steps of:

the LU parallel computing scheduling controller correspondingly transmits and stores the L matrix result and the U matrix result to an output L matrix register group and a U matrix register group;

the LU parallel computing scheduling controller reads and outputs the L matrix result and the output U matrix result;

step 2.6. The reciprocal calculator in computing unit $PE_1$ obtains $1/U_{3,3}$ in the 15th cycle; at the same time, according to $L_{j,3} = (a_{j,3} - L_{j,1}U_{1,3} - L_{j,2}U_{2,3})/U_{3,3}$, computing units $PE_2$ to $PE_4$ compute in parallel, and $L_{j,3}$ is obtained in the 16th cycle;

the process by which the computing unit $PE_t$ processes the input data $m_k$ comprises:

The reciprocal calculator data processing process includes:

$x_S$ is input to the look-up table as an address to obtain the result $f_{Table}(x_S)$;

The calculation formula of the reciprocal calculator is as follows:

wherein $[\cdot]_2$ denotes the binary representation of a variable, $M \ge N$, p is initially 0, and $f_{Table}(\cdot)$ denotes a look-up table operation.

5. The on-orbit real-time parallel LU decomposition calculation method of space target features according to claim 4, wherein the data to be decomposed, the L matrix result and the U matrix result are each at most a 6×6 matrix;

the computing unit set comprises t computing units $PE_t$ for parallel computing, where t is a positive integer less than or equal to 8; the computing unit $PE_t$ comprises: multipliers $M_f$, adders $A_f$, gates $MUX_f$ and a reciprocal calculator, f = 1, 2, 3, 4, 5, 6.

6. The on-orbit real-time parallel LU decomposition calculation method of space target features according to claim 4, wherein a single computing unit $PE_t$ processes at most 12 input data $m_k$ simultaneously, $k \in [0, 1, \ldots, 11]$.