[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

CN106406972B - Program compiling method and compiler - Google Patents

Program compiling method and compiler Download PDF

Info

Publication number
CN106406972B
CN106406972B CN201610974546.6A CN201610974546A CN106406972B CN 106406972 B CN106406972 B CN 106406972B CN 201610974546 A CN201610974546 A CN 201610974546A CN 106406972 B CN106406972 B CN 106406972B
Authority
CN
China
Prior art keywords
base address
address
program
instruction
instructions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610974546.6A
Other languages
Chinese (zh)
Other versions
CN106406972A (en
Inventor
黄震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Jieli Technology Co Ltd
Original Assignee
Zhuhai Jieli Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Jieli Technology Co Ltd filed Critical Zhuhai Jieli Technology Co Ltd
Priority to CN201610974546.6A priority Critical patent/CN106406972B/en
Publication of CN106406972A publication Critical patent/CN106406972A/en
Application granted granted Critical
Publication of CN106406972B publication Critical patent/CN106406972B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • G06F8/427Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • G06F8/441Register allocation; Assignment of physical memory space to logical memory space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • G06F8/443Optimisation
    • G06F8/4434Reducing the memory space required by the program code

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The present invention relates to a kind of program compiling method and compiler, this method includes base address and the offset address for reading program to be compiled and recording every internal storage access instruction;The identical internal storage access instruction in base address is divided into a major class;Using the memory address of the smallest instruction of offset address in each major class as the new base address of all instructions in the major class;Each major class is calculated according to the new base address of each major class to become smaller using the total code length whether new base address can be such that the internal storage access of the major class instructs;If so, the code of modification program to be compiled, by the base address of every instruction in these major class is modified as new base address, offset address is revised as new offset address, to generate target program.After the present invention is by being classified as a set for the identical instruction in base address, the base address of the instruction in the set is reselected, so that the offset of internal storage access instruction is reduced, to reduce the size of internal storage access instruction, and then reduces the size of total program.

Description

Program compiling method and compiler
Technical Field
The present invention relates to the field of computers, and in particular, to a program compiling method and a compiler.
Background
At present, according to statistics, the most used type of instruction in a program is the instruction for memory access. In a program to be compiled, such as C source code, an access to a structure is usually translated into a corresponding memory access instruction by a compiler. Memory access instructions of different architectural instruction sets are generally of two types: the first is to access the contents of a block of memory addresses via a register (i.e., base address register) plus a constant offset address, and the second is to access the contents of a block of memory addresses via the address of one register (i.e., base address register) plus another register (i.e., offset register). For example, in the AArch64 architecture instruction set, ldr w8, [ x1, #4], a memory access instruction belonging to the first class, indicates that the constant 4 is added to the content of the register x1, the obtained result is taken as a memory address, and the content of the address is read into the register w 8; and ldr w8, [ x1, x2], which belongs to the second class of memory access instructions, indicates that the contents of register x2 are added to the contents of register x1, the obtained result is used as a memory address, and the contents of the address are read into register w 8. In the PI32-V2 architecture instruction set, there is also a similar first type of instruction, such as r8 ═ r1+4, indicating that the register of r8 is the contents of base address r1 plus 4; the second type of instruction, r8 ═ r1+ r2, indicates that the register for r8 is the base address r1 plus the offset register r 2.
Since each instruction requires a certain length of encoding, which is limited, the offset range of the constant of the first type of memory access instruction is typically limited, for example, in AArch64, the offset range of the first type of instruction is-128 × 4 to 128 × 4. If we need to access an offset address larger than the limited range, we can only first move the offset into a register with a move constant instruction and then access it with a second instruction. This extra step makes the code size large. On the other hand, some instruction sets, such as PI32-V2, have two different encoding types, one is a short encoding type and can only encode a shorter offset range, and the other is a long encoding type and can encode a longer offset range. If more instructions of the first type are generated in a program in long code form, the resulting code size will also become larger.
In the conventional compiling technology, the above memory access instruction is usually generated for structure body access, for example, f.a in C source code is 10; f.b ═ 10; assuming that the address of f is stored in r0, the offset for domain a is 80, and the offset for domain b is 84, for the PI32-V2 instruction set, this assignment translates three assembly instructions: r1 ═ 10; [ r0+80] ═ r 1; [ r0+84] ═ r 1; the instructions generated here are long coded versions of the instructions in the first class. In an actual program, such assignment statements are often more and more centralized, so that the conventional method generates more instructions in a long coding form, so that the code size is larger.
Disclosure of Invention
In view of the above, there is a need for a program compiling method and a compiler, which can reduce the encoding size of the memory access instruction.
A method of program compilation, the method comprising:
reading a program to be compiled, and recording a base address and an offset address of each memory access instruction, wherein the memory address of each memory access instruction is equal to the base address of the instruction plus the offset address of the instruction;
dividing the memory access instructions with the same base address into a large class;
taking the memory address of the instruction with the minimum offset address in each class as a new base address of all the instructions in the class;
calculating whether the new base address used by each large class can reduce the total code length of the memory access instruction of the large class according to the new base address of each large class; and if the large classes with the smaller total code length exist, modifying the codes of the program to be compiled, and modifying the base address of each instruction in the large classes into a new base address and an offset address into a new offset address so as to generate the target program.
In one embodiment, the step of dividing the memory access instructions with the same base address into a large class further includes sorting the memory access instructions in each large class from small to large according to offset addresses.
In one embodiment, after the step of using the memory address of the instruction with the smallest offset address in each class as the new base address of all the instructions in the class, the method further includes the step of grouping the instructions in each class;
the step of further grouping said instructions in a broad class comprises:
step A, checking the instructions in the large class from small to large according to the offset address, and adding the instructions of which the encoding length is the shortest encoding length when the new base address is taken as the base address into a group taking the new base address as the base address;
step B, when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is found by using the new base address as the base address, establishing a new group by using the memory address of the instruction as the new base address;
step A, B is repeated until all instructions in the broad class have been grouped.
In one embodiment, the step B is:
when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is found by using the new base address as the base address, the memory access instruction is executed
On the premise of ensuring that the coding length of the instructions of the current group is not changed, increasing the base address of the current group to enable the offset address of the currently viewed memory access instruction to reach the shortest coding length, and taking the increased base address as a new base address of the current group;
if the attempt is unsuccessful, a new set is created with the memory address of the instruction as the new base address.
In one embodiment, the modifying the code of the program to be compiled, and modifying the base address of each instruction in the classes into a new base address and an offset address into a new offset address to generate the target program specifically include:
inserting instructions of each large class that modify a base address at a target location of the target program; wherein the target location is such that instructions of the modified base address of each large class are executed before all of the instructions in that large class;
modifying the base address of each of the instructions in these broad classes to a new base address, the offset address to a new offset address.
In one embodiment, the step of reading the program to be compiled and recording the base address and the offset address of each memory access instruction includes:
reading a program to be compiled, and dividing the program to be compiled into a plurality of functions;
and recording the base address and the offset address of each memory access instruction in each function.
A program compiler, the compiler comprising:
the program loading module is used for reading a program to be compiled and recording a base address and an offset address of each memory access instruction; the memory address of the memory access instruction is equal to the base address of the instruction plus the offset address of the instruction;
the base address conversion module is used for dividing the memory access instructions with the same base address into a large class, and taking the memory address of the instruction with the minimum offset address in each large class as a new base address of all the instructions in the large class;
the coding length calculation module is used for calculating the total code length of the memory access instruction of each large class after the large class uses the new base address according to the new base address of each large class;
the cost judgment module is used for calculating whether the new base address used by each large class can reduce the total code length of the memory access instruction of the large class according to the new base address of each large class;
and the input end of the target program generation module is connected with the output end of the cost judgment module, and the target program generation module is used for modifying the codes of the program to be compiled when large classes with smaller total code length exist, and modifying the base address of each instruction in the large classes into a new base address and an offset address into a new offset address so as to generate the target program.
In one embodiment, the method further comprises the following steps:
and the input end of the sorting module is connected with the output end of the base address conversion module, the output end of the sorting module is connected with the input end of the coding length calculation module, and the sorting module is used for sorting the memory access instructions in each large class from small to large according to offset addresses.
In one embodiment, the base address translation module includes:
the input end of the grouping control unit is connected with the output end of the sorting module, and the grouping control unit is used for judging whether all the instructions in the class are grouped completely;
the input end of the first grouping unit is connected with the output end of the grouping control unit, and the first grouping unit is used for checking the instructions in the large class from small to large according to the offset address when all the instructions in the large class are not grouped completely, and adding the instructions of which the coding length is the shortest coding length when the new base address is used as the base address into a group using the new base address as the base address;
and the input end of the second grouping unit is connected with the output end of the first grouping unit, and the second grouping unit is used for creating a new group by taking the memory address of the command as a new base address when the memory access command of which the coding length is not the shortest coding length is checked out and the new base address is taken as the base address.
In one embodiment, the second packet unit includes:
the input end of the coding length judging subunit is connected with the output end of the first grouping unit, and the coding length judging unit is used for checking whether a memory access instruction with the coding length not being the shortest coding length exists when the new base address is taken as the base address;
and the grouping subunit is used for trying to increase the base address of the current group on the premise of ensuring that the coding length of the offset address of the current group is not changed, so that the offset address of the currently looked-up memory access instruction reaches the shortest coding length, taking the increased base address as a new base address of the current group, and if the trying is unsuccessful, taking the memory address of the instruction as the new base address to create a new group.
In one embodiment, the target program generation module is further configured to insert an instruction to modify the base address for each large class at a target location of the target program; wherein the target location is such that the instructions of the modified base address of each large class are executed before all of the instructions in that large class.
In one embodiment, the program loading module includes:
the program reading unit is used for reading a program to be compiled;
the input end of the program dividing unit is connected with the output end of the program reading unit, and the program dividing unit is used for dividing the program to be compiled into a plurality of functions;
and the input end of the recording unit is connected with the output end of the program dividing unit, and the recording unit is used for recording the base address and the offset address of each memory access instruction in each function.
According to the program compiling method and the compiler, the instructions with the same base address are grouped into a set, and then the base addresses of the instructions in the set are reselected, so that the offset of the memory access instruction is reduced, the size of the total program is further reduced, and the application range is wider.
Drawings
FIG. 1 is a flow diagram illustrating the compilation of a computer program according to one embodiment;
FIG. 2 is a flowchart of a program compiling method according to an embodiment;
FIG. 3 is a block diagram of a program compiler according to an embodiment.
Wherein,
100 program loading module
110 program reading unit
120 program dividing unit
130 recording unit
200 base address conversion module
210 first grouping unit
220 second packet unit
230 packet control unit
221 coding length judgment subunit
222 grouping subunits
300 code length calculating module
400 cost judging module
500 object program generating module
600 ordering module
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Before describing in detail embodiments that are in accordance with the present invention, it should be observed that the embodiments reside primarily in combinations of steps and system components related to program compilation methods and compilers. Accordingly, the system components and method steps have been represented where appropriate by conventional symbols in the drawings, showing only those details that are pertinent to understanding the embodiments of the present invention so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
In this document, relational terms such as left and right, top and bottom, front and back, first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, fig. 1 is a flow of compiling a computer program in an embodiment, in the embodiment, a program to be compiled obtains an intermediate representation through front-end analysis, the intermediate representation obtains a final intermediate representation after a plurality of middle-end optimizations, and finally, the intermediate representation is subjected to code generation to obtain a final target program. The compilation method in the present invention may be the compilation process shown in fig. 1 as a whole, or may be simply a process in which one intermediate representation is converted into another intermediate representation. In this embodiment and the following, a section of C source code is taken as an example to illustrate the embodiment and the invention. Assume the source program is:
after the above front-end analysis and several middle-end optimizations, it is assumed that before the method of the present invention, the middle after the source program is translated into PI32-V2 instructions is expressed as:
foobar:
v1=r1
v0=r0
v2=[v0+256]
v3=[v1+256]
v4=v3+v2
v5=[v0+260]
v6=v4+v5
v7=[v1+260]
v8=v6+v7
v9=[v0+264]
v10=v8+v9
v11=[v1+264]
v12=v10+v11
v13=[v1+268]
v14=v13<<1
v15=v12+v14
r0=v15
Rts
at this time, no register allocation is performed, wherein r0 and r1 represent physical registers, r0 is a first parameter, r1 is a second parameter, and v0, v1, … … and v15 are virtual registers, so that only one foobar function exists in the program to be compiled, and therefore, only the foobar function exists in the program to be compiled. The compiling method of the invention mainly collects the memory access instruction of a section of program (for example, a function body), then selects a proper base address, and inserts a new base address calculation instruction in a proper position, so that the offset of the memory access instruction can be reduced, and finally the purpose of reducing the total code size is achieved.
Referring to fig. 2, fig. 2 is a flowchart illustrating a program compiling method according to an embodiment, where the method includes:
s202: and reading the program to be compiled, and recording the base address and the offset address of each memory access instruction. It can be seen that the memory address of the memory access instruction is equal to the base address of the instruction plus the offset address of the instruction, and in this embodiment, the instructions of the function body are checked one by one first, and then the memory access instruction and its rough condition are obtained. For example, in the above embodiment, the memory access instruction includes:
v2=[v0+256]
v3=[v1+256]
v5=[v0+260]
v7=[v1+260]
v9=[v0+264]
v11=[v1+264]
v13=[v1+268]
in one embodiment, the step of reading the program to be compiled and recording the base address and the offset address of each memory access instruction may include reading the program to be compiled and dividing the program to be compiled into a plurality of functions. And recording the base address and the offset address of each memory access instruction in each function. In this embodiment, the basic unit of the function may be, that is, only one memory access instruction in the function range is considered in the method at a time, and the method is performed before the register allocation is performed. Typically within a compiler, a function is identified as a collection of basic blocks, a basic block being represented as a series of instructions.
S204: memory access instructions with the same base address are classified into a large class. In this embodiment, the base address register is used as the basis for the classification of the main class, so that the memory access instructions with the same base address are classified into a main class. For example, the above embodiments include two broad categories:
first major class, with v0 as the original base address:
v2=[v0+256]
v5=[v0+260]
v9=[v0+264]
second, base address with v1 as original:
v3=[v1+256]
v7=[v1+260]
v11=[v1+264]
v13=[v1+268]
s206: the memory address of the instruction with the smallest offset address in each class is used as the new base address for all instructions in that class. For example, in the above embodiment, the offset address is 256 minimum, so the memory address of the instruction with the minimum offset address is the new base address of all instructions in the group, that is, in the first large class, the instruction with v2 ═ v0+256] as the new base address includes:
v2 ═ v0+256] the new offset address for this instruction is + 0.
v5 ═ v0+260] the new offset address for this instruction is + 4.
v9 ═ v0+264] the new offset address for this instruction is + 8.
In the second broad category, instructions with v3 ═ v1+256] as the new base address include:
v3 ═ v1+256] the new offset address for this instruction is 0.
v7 ═ v1+260] the new offset address for this instruction is + 4.
v11 ═ v1+264] the new offset address for this instruction is + 8.
v13 ═ v1+268] the new offset address for this instruction is + 12.
S208: and calculating whether the new base address used by each large class can reduce the total code length of the memory access instruction of the large class according to the new base address of each large class.
S210: and if the large classes with the smaller total code length exist, modifying the codes of the program to be compiled, and modifying the base address of each instruction in the large classes into a new base address and an offset address into a new offset address so as to generate the target program.
Specifically, in the above example, the instruction in the first broad class requires the addition of an instruction to modify the base address:
v 16-v 0+256 adds the add instruction to modify the base address of all instructions in the first large class, which is a long coded instruction at the cost of 4 bytes.
And in this example, the memory access instructions in the first broad class are modified accordingly to:
v2 ═ v16+0] this instruction is reduced by 2 bytes compared to the original instruction v2 ═ v0+ 256.
v5 ═ v16+4] this instruction is reduced by 2 bytes compared to the original instruction v5 ═ v0+ 260.
v9 ═ v16+8] this instruction is reduced by 2 bytes compared to the original instruction v9 ═ v0+264 ].
Thus, in summary, instructions in the first broad class are compiled to reduce the code space by 2 bytes.
I.e. modifying the base address in the above embodiment from v0 to v0+256 requires an add instruction, or an extra 4-byte penalty, but at the same time can reduce the penalty of the other three instructions in the first large group by 2 bytes each, so this is profitable.
The same may be concluded that instructions in the second broad class may reduce the code space by 4 bytes:
v 17-v 1+256 adds the add instruction to modify the base address of all instructions in the first large class, which is a long coded instruction at the cost of 4 bytes.
And in this example, instructions in the second broad class:
v3 ═ v17+0] this instruction is reduced by 2 bytes compared to the original instruction v3 ═ v1+ 256.
v7 ═ v17+4] this instruction is reduced by 2 bytes compared to the original instruction v7 ═ v1+ 260.
v11 ═ v17+8] this instruction is reduced by 2 bytes compared to the original instruction v11 ═ v1+264 ].
v13 ═ v17+12] this instruction is reduced by 2 bytes compared to the original instruction v13 ═ v1+ 268.
According to the program compiling method, the instructions with the same base address are grouped into a set, and then the base addresses of the instructions in the set are reselected, so that the offset of the memory access instruction is reduced, the size of the total program is reduced, and the application range is wide.
In one embodiment, the step of dividing the memory access instructions with the same base address into a large class, that is, the step S204, may further include sorting the memory access instructions in each large class from small to large according to the offset address. For example, the three instructions in the first major class may be ordered in the order of their offset addresses 256, 260, 264, i.e., v2 ═ v0+256, v5 ═ v0+260, and v9 ═ v0+ 264. Similarly, the instructions in the second large class may also be sorted, and the sorting manner may be that the offset addresses are arranged from small to large, or that the offset addresses are arranged from large to small, which is not described herein again.
In one embodiment, after the step of using the memory address of the instruction with the smallest offset address in each large class as the new base address of all the instructions in the large class, that is, after the step S206, a step of further grouping the instructions in each large class may be further included.
In one embodiment, the step of further grouping the instructions in each of the categories is specifically:
step A, checking the instructions in the large class according to the offset addresses from small to large, and adding the instruction of which the encoding length is the shortest encoding length when the new base address is taken as the base address into a group taking the new base address as the base address.
For example, referring to the above embodiment, for the first large class, first sorting is performed according to the offset addresses, and the address of the instruction corresponding to the smallest offset address is selected as the new base address, that is, the offset is 256, and for the first large class, the instructions include:
v2 ═ v0+256] the new offset address for this instruction is + 0.
v5 ═ v0+260] the new offset address for this instruction is + 4.
v9 ═ v0+264] the new offset address for this instruction is + 8.
It can be seen that the above instructions need only be implemented in short coded form, and therefore no further grouping is required.
And step B, when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is found by using the new base address as the base address, establishing a new group by using the memory address of the instruction as the new base address.
For example, assuming that there is a memory access instruction in the first class that needs to be encoded in long code when a new base address is used, a new set is created with the memory address of the instruction as the new base address.
Step A, B is repeated until all instructions in the broad class have been grouped.
In one embodiment, the step B may include:
when a memory access instruction with a new base address as a base address and an encoding length of the instruction not being the shortest encoding length is found, the base address of the current group is increased on the premise of ensuring that the encoding length of the instruction of the current group is not changed, so that the memory access instruction found currently reaches the shortest encoding length, and the increased base address is used as the new base address of the current group. If the attempt is unsuccessful, a new set is created with the memory address of the instruction as the new base address. For example, assuming that there is a memory access instruction a that needs to be encoded in a long code form when a new base address is used in the first large class, the new base address is first tried to be modified, and in the first large class, the new base address is v2 ═ v0+256], the new base address can be tried to be modified as v5 ═ v0+260], then the new offset address of the instruction v2 ═ v0+256 is-4 in the case of the base address being v5 ═ v0+260, the new offset address of the instruction v9 [ v0+264] is +4 in the case of the base address being v5 ═ v0+260], a new offset address of the instruction a is calculated similarly, and if the encoding length of the instruction a in the case of the base address being v5 ═ v0+260] is the shortest encoding length, then the modified base address is not present as v 6348 ═ v0+256], and if the encoding length of the instruction a is the base address is the shortest encoding length [ v0+260],256 ], a new set may be created with the memory address of the instruction as the new base address, e.g., with the memory address of instruction a as the new base address, or continue to try other base addresses, e.g., with v9 ═ v0+264, etc.
In one embodiment, the step of modifying the code of the program to be compiled, and modifying the base address of each instruction in the classes into a new base address and an offset address into a new offset address to generate the target program may specifically be: inserting an instruction of modifying the base address of each large class into a target position of a target program; wherein the target location is such that the instructions of the modified base address of each large class are executed before all instructions in that large class. Modifying the base address of each of the instructions in these broad classes to a new base address, the offset address to a new offset address. For example, as can be seen from the execution of the above embodiment, when v0 is r0 after the target position of the instruction inserted with the new base address is located, the position can ensure that the instruction for converting the base address is executed before other memory access instructions are executed, so after the above compiling, the program to be compiled finally converts to:
foobar:
v1=r1
v0=r0
v16=v0+256
v17=v1+256
v2=[v16+0]
v3=[v17+0]
v4=v3+v2
v5=[v16+4]
v6=v4+v5
v7=[v17+4]
v8=v6+v7
v9=[v16+8]
v10=v8+v9
v11=[v17+8]
v12=v10+v11
v13=[v17+12]
v14=v13<<1
v15=v12+v14
r0=v15
rts
it can be seen from the above analysis that the target program has a code size space reduced by 6 bytes compared to the original program to be compiled, and it has been tested that the code size can be reduced by about 2% by using the method of the present invention compared to the conventional compiling method, for example, by compiling the Linux 3.10 kernel to the PI32-V2 instruction set.
Please refer to fig. 3, fig. 3 is a schematic structural diagram of a program compiler in an embodiment, in which the compiler may include a program loading module 100, a base address transformation module 200, a coding length calculation module 300, a cost judgment module 400, and an object program generation module 500, an input end of the base address transformation module 200 is connected to an output end of the program loading module 100, an input end of the coding length calculation module 300 is connected to an output end of the base address transformation module 200, an input end of the cost judgment module 400 is connected to an output end of the coding length calculation module 300, and an input end of the object program generation module 500 is connected to an output end of the cost judgment module 400.
The program loading module 100 is configured to read a program to be compiled, and record a base address and an offset address of each memory access instruction; the memory address of the memory access instruction is equal to the base address of the instruction plus the offset address of the instruction.
The base address translation module 200 is configured to divide the memory access instructions with the same base address into a large class, and use the memory address of the instruction with the smallest offset address in each large class as a new base address of all the instructions in the large class.
The code length calculating module 300 is configured to calculate, according to the new base address of each major class, a total code length of the memory access instruction of each major class after the major class uses the new base address.
The cost judgment module 400 is configured to calculate, according to the new base address of each major class, whether each major class can use the new base address to reduce the total code length of the memory access instruction of the major class.
The target program generating module 500 is configured to modify the code of the program to be compiled if there is a large class with a smaller total code length, modify the base address of each instruction in the large classes into a new base address, modify the offset address into a new offset address, and generate the target program.
In one embodiment, the compiler may further include a sorting module 600, an input of the sorting module 600 is connected to the output of the base address transformation module 200, an output of the sorting module 600 is connected to the input of the encoding length calculation module 300, and the sorting module 600 is configured to sort the memory access instructions in the classes from small to large according to the offset addresses.
In one embodiment, the base address translation module 200 may include a first packet unit 210, a second packet unit 220, and a packet control unit 230, an input of the packet control unit 230 being coupled to an output of the ordering module 600, an input of the first packet unit 210 being coupled to an output of the packet control unit 230, and an input of the second packet unit 220 being coupled to an output of the first packet unit 210.
The grouping control unit 230 is used to determine whether all the instructions in the large class are grouped.
The first grouping unit 210 is configured to view the instructions in the large class from small to large according to the offset address, and add the instruction whose encoding length is the shortest encoding length when the new base address is used as the base address, to the group using the new base address as the base address.
The second packet unit 220 is configured to create a new group by using the memory address of the instruction as a new base address when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is detected.
In one embodiment, the second packet unit 220 may include an encoding length determining subunit 221 and a packet subunit 222, wherein an input of the encoding length determining subunit 221 is connected to an output of the first packet unit 210, and an input of the packet subunit 222 is connected to an output of the encoding length determining subunit 221.
The encoding length determining subunit 221 is configured to check whether there is a memory access instruction whose encoding length is not the shortest encoding length when the new base address is used as the base address.
The grouping subunit 222 is configured to, when a memory access instruction is found, where the encoding length of the instruction is not the shortest encoding length, attempt to increase the base address of the current group so that the offset address of the currently found memory access instruction reaches the shortest encoding length on the premise that the encoding length of the offset address of the current group is not changed, use the increased base address as a new base address of the current group, and if the attempt is unsuccessful, create a new group by using the memory address of the instruction as the new base address.
In one embodiment, the target program generation module 500 is further configured to insert an instruction for modifying the base address of each of the large classes into a target location of the target program; wherein the target location is such that the instructions of the modified base address of each large class are executed before all instructions in that large class.
In one embodiment, the program loading module 100 may include a program reading unit 110, a program dividing unit 120, and a recording unit 130, wherein an input terminal of the program dividing unit 120 is connected to an output terminal of the program reading unit 110, and an input terminal of the recording unit 130 is connected to an output terminal of the program dividing unit 120.
The program reading unit 110 is configured to read a program to be compiled. The program dividing unit 120 is configured to divide the program to be compiled into several functions. The register unit 130 is used for registering a base address and an offset address of each memory access instruction in each function.
The above-mentioned embodiments of the compiler are not repeated here, and the above-mentioned compiling method may be specifically referred to.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (12)

1. A method of program compilation, the method comprising:
reading a program to be compiled, and recording a base address and an offset address of each memory access instruction, wherein the memory address of each memory access instruction is equal to the base address of the instruction plus the offset address of the instruction;
dividing the memory access instructions with the same base address into a large class;
taking the memory address of the instruction with the minimum offset address in each class as a new base address of all the instructions in the class;
calculating whether the new base address used by each large class can reduce the total code length of the memory access instruction of the large class according to the new base address of each large class; and if the large classes with the smaller total code length exist, modifying the codes of the program to be compiled, and modifying the base address of each instruction in the large classes into a new base address and an offset address into a new offset address so as to generate the target program.
2. The method for program compilation according to claim 1, wherein the step of grouping the memory access instructions having the same base address into a large class further comprises sorting the memory access instructions in each large class by offset address from smaller to larger.
3. The method for program compilation according to claim 2, wherein, after the step of using the memory address of the instruction with the smallest offset address in each class as the new base address for all of the instructions in the class, the method further comprises the step of grouping the instructions in each class;
the step of further grouping said instructions in a broad class comprises:
step A, checking the instructions in the large class from small to large according to the offset address, and adding the instructions of which the encoding length is the shortest encoding length when the new base address is taken as the base address into a group taking the new base address as the base address;
step B, when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is found by using the new base address as the base address, establishing a new group by using the memory address of the instruction as the new base address;
step A, B is repeated until all instructions in the broad class have been grouped.
4. The program compiling method according to claim 3 wherein the step B is:
when the memory access instruction with the encoding length of the instruction not being the shortest encoding length is found by using the new base address as the base address, the memory access instruction is executed
On the premise of ensuring that the coding length of the instructions of the current group is not changed, increasing the base address of the current group to enable the offset address of the currently viewed memory access instruction to reach the shortest coding length, and taking the increased base address as a new base address of the current group;
if the attempt is unsuccessful, a new set is created with the memory address of the instruction as the new base address.
5. The program compiling method according to claim 1, wherein the step of modifying the code of the program to be compiled to modify the base address of each of the instructions in the classes to a new base address and an offset address to a new offset address to generate the target program specifically comprises:
inserting instructions of each large class that modify a base address at a target location of the target program; wherein the target location is such that instructions of the modified base address of each large class are executed before all of the instructions in that large class;
modifying the base address of each of the instructions in these broad classes to a new base address, the offset address to a new offset address.
6. The program compiling method according to claim 1, wherein the step of reading the program to be compiled and recording the base address and the offset address of each memory access instruction comprises:
reading a program to be compiled, and dividing the program to be compiled into a plurality of functions;
and recording the base address and the offset address of each memory access instruction in each function.
7. A program compiler apparatus, wherein the compiler comprises:
the program loading module is used for reading a program to be compiled and recording a base address and an offset address of each memory access instruction; the memory address of the memory access instruction is equal to the base address of the instruction plus the offset address of the instruction;
the base address conversion module is used for dividing the memory access instructions with the same base address into a large class, and taking the memory address of the instruction with the minimum offset address in each large class as a new base address of all the instructions in the large class;
the coding length calculation module is used for calculating the total code length of the memory access instruction of each large class after the large class uses the new base address according to the new base address of each large class;
the cost judgment module is used for calculating whether the new base address used by each large class can reduce the total code length of the memory access instruction of the large class according to the new base address of each large class;
and the input end of the target program generation module is connected with the output end of the cost judgment module, and the target program generation module is used for modifying the codes of the program to be compiled when large classes with smaller total code length exist, and modifying the base address of each instruction in the large classes into a new base address and an offset address into a new offset address so as to generate the target program.
8. The program compiler apparatus of claim 7, further comprising:
and the input end of the sorting module is connected with the output end of the base address conversion module, the output end of the sorting module is connected with the input end of the coding length calculation module, and the sorting module is used for sorting the memory access instructions in each large class from small to large according to offset addresses.
9. The program compiler apparatus of claim 8, wherein the base address translation module comprises:
the input end of the grouping control unit is connected with the output end of the sorting module, and the grouping control unit is used for judging whether all the instructions in the class are grouped completely;
the input end of the first grouping unit is connected with the output end of the grouping control unit, and the first grouping unit is used for checking the instructions in the large class from small to large according to the offset address when all the instructions in the large class are not grouped completely, and adding the instructions of which the coding length is the shortest coding length when the new base address is used as the base address into a group using the new base address as the base address;
and the input end of the second grouping unit is connected with the output end of the first grouping unit, and the second grouping unit is used for creating a new group by taking the memory address of the command as a new base address when the memory access command of which the coding length is not the shortest coding length is checked out and the new base address is taken as the base address.
10. The program compiler apparatus of claim 9, wherein the second packet unit comprises:
the input end of the coding length judging subunit is connected with the output end of the first grouping unit, and the coding length judging unit is used for checking whether a memory access instruction with the coding length not being the shortest coding length exists when the new base address is taken as the base address;
and the grouping subunit is used for trying to increase the base address of the current group on the premise of ensuring that the coding length of the offset address of the current group is not changed, so that the offset address of the currently looked-up memory access instruction reaches the shortest coding length, taking the increased base address as a new base address of the current group, and if the trying is unsuccessful, taking the memory address of the instruction as the new base address to create a new group.
11. The program compiler apparatus of claim 7, wherein the target program generation module is further configured to insert an instruction to modify a base address for each large class at a target location of the target program; wherein the target location is such that the instructions of the modified base address of each large class are executed before all of the instructions in that large class.
12. The program compiler apparatus of claim 7, wherein the program loading module comprises:
the program reading unit is used for reading a program to be compiled;
the input end of the program dividing unit is connected with the output end of the program reading unit, and the program dividing unit is used for dividing the program to be compiled into a plurality of functions;
and the input end of the recording unit is connected with the output end of the program dividing unit, and the recording unit is used for recording the base address and the offset address of each memory access instruction in each function.
CN201610974546.6A 2016-11-04 2016-11-04 Program compiling method and compiler Active CN106406972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610974546.6A CN106406972B (en) 2016-11-04 2016-11-04 Program compiling method and compiler

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610974546.6A CN106406972B (en) 2016-11-04 2016-11-04 Program compiling method and compiler

Publications (2)

Publication Number Publication Date
CN106406972A CN106406972A (en) 2017-02-15
CN106406972B true CN106406972B (en) 2019-05-24

Family

ID=58014767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610974546.6A Active CN106406972B (en) 2016-11-04 2016-11-04 Program compiling method and compiler

Country Status (1)

Country Link
CN (1) CN106406972B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368320B (en) * 2017-07-25 2020-06-26 南京林业大学 Simple early exercise data statistical system
CN112363779A (en) * 2020-11-25 2021-02-12 王志平 Safety control method for dynamic link program
CN112445729B (en) * 2020-11-30 2024-04-16 深圳开立生物医疗科技股份有限公司 Operation address determination method, PCIe system, electronic device and storage medium
CN113111012B (en) * 2021-04-14 2023-07-25 景德镇市明泰精工瓷业有限公司 Application data locator generation method and application data locating method
US12118359B2 (en) 2021-05-20 2024-10-15 Huawei Technologies Co., Ltd. Method and system for optimizing address calculations
CN113656330B (en) * 2021-10-20 2022-02-15 北京微核芯科技有限公司 Method and device for determining access address
CN118259970B (en) * 2024-05-30 2024-10-18 摩尔线程智能科技(北京)有限责任公司 Instruction processing method, device and system and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI569203B (en) * 2011-04-07 2017-02-01 威盛電子股份有限公司 Conditional load instructions in an out-of-order execution microprocessor
CN104346285B (en) * 2013-08-06 2018-05-11 华为技术有限公司 Internal storage access processing method, apparatus and system
JP2015082166A (en) * 2013-10-22 2015-04-27 ルネサスエレクトロニクス株式会社 Method for managing data storage flash memory and program therefor

Also Published As

Publication number Publication date
CN106406972A (en) 2017-02-15

Similar Documents

Publication Publication Date Title
CN106406972B (en) Program compiling method and compiler
US8601451B2 (en) System, method, and computer program product for determining whether code is unwanted based on the decompilation thereof
US9201652B2 (en) Methods and apparatus for storage and translation of entropy encoded software embedded within a memory hierarchy
KR101840905B1 (en) Counter operation in a state machine lattice
US10747946B2 (en) Non-transitory computer-readable storage medium, encoding apparatus, and encoding method
KR20160132853A (en) Parallel decision tree processor architecture
CN108710787B (en) Code obfuscation method and apparatus, computing device, computer storage medium
CN107967152B (en) Software local plagiarism evidence generation method based on minimum branch path function birthmarks
US20160062751A1 (en) Method and apparatus for optimising computer program code
CN115730313A (en) Malicious document detection method and device, storage medium and equipment
US10445095B2 (en) Information processing device, compiler method, and recording medium recording compiler program
US7924179B2 (en) Variable-length code determining device and variable-length code decoding method
US9201982B2 (en) Priority search trees
Burks et al. Higher-order Markov models for metagenomic sequence classification
CN103440122A (en) Novel static function identification method using reverse extension control flow graphs
CN102831004B (en) Method for optimizing compiling based on C*core processor and compiler
US6934941B2 (en) Compiler for generating risc object code whereby operations on bit variables written in source code are executed by processing based on bit judgement operations to thereby reduce the amount of object code
CN107436728B (en) Rule analysis result storage method, rule backtracking method and device
Ročkai et al. Techniques for memory-efficient model checking of C and C++ code
CN114282182A (en) Countermeasure software generation method and device and server
CN114428603A (en) Method and system for generating short and int type instructions based on compiler
US20240211596A1 (en) Malicious vba detection using graph representation
CN116521182A (en) Decompilation method, tool, readable storage medium and program product
JP2007249262A (en) System and method for detecting access out of range of array
CN117707613A (en) Program code isomerism method, isomerism device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 519000 No. 333, Kexing Road, Xiangzhou District, Zhuhai City, Guangdong Province

Patentee after: ZHUHAI JIELI TECHNOLOGY Co.,Ltd.

Address before: Floor 1-107, building 904, ShiJiHua Road, Zhuhai City, Guangdong Province

Patentee before: ZHUHAI JIELI TECHNOLOGY Co.,Ltd.

CP02 Change in the address of a patent holder