WO2011079796A1 - Method for compressing .net document - Google Patents
- Publication number
- WO2011079796A1 (PCT/CN2010/080459)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- type
- name
- data
- definition
- offset
- Prior art date
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Definitions
- BACKGROUND OF THE INVENTION
- 1. Field of the Invention: The present invention relates to the field of computer applications, and in particular to a method for compressing a .net file.
- .net is Microsoft's next-generation technology platform. It is a new Internet-based, cross-language software development platform that follows the software industry's trends toward distributed computing, component orientation, enterprise-level applications, and Web-centric software. .net is not a development language, but the .net development platform supports multiple development languages, such as C#, C++, Visual Basic, and JScript.
- A smart card is a plastic card similar in size to an ordinary business card. It contains a silicon chip about 1 cm in diameter and can store information and perform complex calculations.
- The smart card chip integrates a microprocessor, memory, and an input/output unit, so the smart card is regarded as the world's smallest electronic computer.
- The smart card has a strong set of security control mechanisms, and the security control program is fixed in read-only memory, which provides reliable security guarantees such as passwords that cannot be copied.
- Smart cards also have a large information storage capacity, and the microprocessor allows further card functions to be added.
- A .net card is a smart card that contains a .net card virtual machine and can run .net programs.
- A virtual machine can be thought of as a machine simulated in software. It has simulated hardware such as a processor, memory, and registers, and it emulates the instructions that programs execute. Software running on this machine has no special requirements for the runtime environment, so the virtual machine is transparent to the programs running on it. For example, an x86 virtual machine simulates the operating environment of x86 instruction programs, and a c51 virtual machine simulates the operating environment of c51 instruction programs.
- .net programs include namespaces, reference types, definition types, definition methods, reference methods, IL (Intermediate Language) code, and more.
- Current smart cards still have limited storage space.
- Programs with extensive functionality occupy a large amount of storage, so many .net programs cannot be stored and run on a smart card.
- In the related art, .net programs compress poorly and cannot be stored and run on a small-capacity storage medium (for example, a smart card), and no effective solution to this problem has been proposed.
- The present invention provides a compression method for a .net file that addresses the problem that .net programs compress poorly and cannot be stored and run on a small-capacity storage medium (for example, a smart card).
- a method for compressing a .net file comprising at least one of the following steps: obtaining a reference type in a .net file, compressing the reference type; obtaining .
- The invention effectively reduces the storage space occupied by a .net file and makes it practical to use .net files on small storage devices, which also saves resources and improves resource utilization.
- FIG. 1 is a structural block diagram of a compression device of a reference type in a .NET file provided by Embodiment 1 of the present invention
- FIG. 2 is a block diagram showing a specific structure of a reference type name obtaining module according to Embodiment 1 of the present invention
- 2 is a flowchart of a compression method of a reference type in a .NET file provided;
- FIG. 4 is a flowchart of a compression method of a reference type in a .NET file provided by Embodiment 3 of the present invention
- FIG. 5 is a schematic structural diagram of a .NET file provided by Embodiment 3 of the present invention
- FIG. 6 is a flowchart of a method for counting the method count and the field count of a reference type according to Embodiment 3 of the present invention
- FIG. 7 is a flowchart of a compression method for a defined method of a .NET file according to Embodiment 4 of the present invention
- FIG. 8 is a flowchart of a compression method for a defined method of a .NET file according to Embodiment 5 of the present invention
- FIG. 9 shows the present invention.
- FIG. 10 is a flowchart showing a method for compressing a data item specific to the big header method according to Embodiment 5 of the present invention
- FIG. 12 is a structural block diagram of a compression device of a method body for defining a .NET file according to Embodiment 6 of the present invention
- FIG. 13 is a structural block diagram of a compression method of a method body for defining a .NET file according to Embodiment 6 of the present invention
- FIG. 18 is a schematic diagram of a method for obtaining a method header according to Embodiment 10 of the present invention; FIG.
- FIG. 18 is a schematic diagram of a data format of a method for providing a header according to Embodiment 10 of the present invention
- FIG. 20 is a schematic diagram of a data format of a small head method according to Embodiment 10 of the present invention
- FIG. 21 is a flowchart of a method for compressing ILcode according to Embodiment 10 of the present invention
- FIG. 22 is a .net file provided by Embodiment 11 of the present invention.
- FIG. 23 is a flowchart of a method for compressing a namespace in a .NET file according to Embodiment 12 of the present invention
- FIG. 24 is a schematic diagram showing a structure of a .NET file according to Embodiment 12 of the present invention
- FIG. 25 is a structural diagram of a compression device of a type defined in the .NET file provided in Embodiment 13
- FIG. 26 is a flowchart showing a compression method of a type defined in the .NET file provided in Embodiment 14.
- FIG. 28 is a flowchart showing a compression method of a type defined in the .NET file provided in Embodiment 15.
- BEST MODE FOR CARRYING OUT THE INVENTION
- The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
- Embodiment 1: This embodiment provides a compression device for a reference type in a .net file. As shown in FIG. 1, the device includes a reference type name obtaining module 102, a compression module 104, a statistics module 106, and a combination module 108. The functions of the modules are as follows:
- The reference type name obtaining module 102 is configured to obtain the name of a reference type used in the .net file.
- The compression module 104 is configured to compress the name of the reference type obtained by the reference type name obtaining module 102 to obtain the name of the compressed reference type.
- The statistics module 106 is configured to count the method count and the field count of the reference type obtained by the reference type name obtaining module 102.
- The combination module 108 is configured to combine, according to a predetermined format, the name of the reference type compressed by the compression module 104 with the method count and field count calculated by the statistics module 106 to obtain a compression result of the reference type.
- The reference type name obtaining module 102 can obtain the name of the reference type in several ways. This embodiment uses FIG. 2, a block diagram of the specific structure of the reference type name obtaining module, as an example. The reference type name obtaining module 102 includes:
- a first metadata table obtaining unit 1022, configured to obtain a first metadata table in the .net file. The .net file contains a plurality of tables; in this embodiment, the first metadata table is the metadata table TypeRef (the reference type or interface table);
- an address information reading unit 1024, configured to read, from the first metadata table obtained by the first metadata table obtaining unit 1022, the address information of the name of the reference type used in the .net file; and
- a reference type name reading unit 1026, configured to read the name of the reference type according to the address information read by the address information reading unit 1024.
- The algorithm used by the compression module 104 may be a hash algorithm, specifically MD5, SHA-1, or SHA-2.
- When counting methods and fields, the statistics module 106 may use the information recorded in a second metadata table, which is the metadata table MemberRef in the .net file. Each row of this table records reference type information and a feature identifier value (signature).
- From the reference type information, it can be determined which row of the first metadata table TypeRef the current row points to, and thus which reference type the current row belongs to.
- The feature identifier value indicates whether the current row records a method or a field.
- In this way, the method count and the field count corresponding to the name of each reference type can be obtained.
- The predetermined format used by the combination module 108 is a fixed-length byte sequence with three parts: the first part is the name of the compressed reference type, the second part is the method count, and the third part is the field count. The three parts may be combined in any order.
- The method count and field count are included in the compression result of a reference type so that, when other parts of the .net file are also compressed, the corresponding methods can be found according to the method count and the corresponding fields according to the field count, and the compressed reference type can still be used normally.
- In this embodiment, the compression module 104 compresses the name of the reference type obtained by the reference type name obtaining module 102, and the combination module 108 combines the compressed name with the method count and field count calculated by the statistics module 106.
- Embodiment 2: This embodiment provides a compression method for a reference type in a .net file. The method is described using the compression device provided in Embodiment 1 as an example. The method includes:
- Step 202: Obtain the name of the reference type used in the .net file;
- Step 204: Compress the name of the reference type to obtain the name of the compressed reference type;
- Step 206: Count the method count and the field count of the reference type;
- Step 208: Combine the name of the compressed reference type, the method count, and the field count according to a predetermined format to obtain a compression result of the reference type.
- Step 202 specifically includes: obtaining a first metadata table in the .net file; reading, from the first metadata table, the address information of the name of the reference type used in the .net file; and reading the name of the reference type according to the address information.
- Step 202 may further include generating a reference type name string from the obtained reference type name; the reference type name string may be generated in either of the following two ways:
- The step of compressing the name of the reference type includes: hashing the name of the reference type (or the generated reference type name string) to obtain a hash value; and taking predetermined bytes of the hash value as the name of the compressed reference type.
- the algorithm used in hashing is:
- The step of counting the method count and the field count specifically includes: obtaining a second metadata table and performing the following operations on each row of it: reading the reference type pointed to by the current row; when the name of that reference type matches the name of the obtained reference type, determining from the feature identifier value of the current row whether the row records a method; if so, incrementing the method count of the reference type by 1; otherwise, incrementing the field count of the reference type by 1.
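The counting loop described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `count_members` and the simplified row shape (a type name plus an is-method flag) are assumptions made for the example, not structures defined in this document.

```python
def count_members(member_rows, wanted_type):
    """Return (method_count, field_count) for one reference type.

    member_rows is a hypothetical, already decoded view of the second
    metadata table: one (type_name, is_method) pair per row.
    """
    method_count = 0
    field_count = 0
    for type_name, is_method in member_rows:
        if type_name != wanted_type:
            continue  # row points to a different reference type
        if is_method:
            method_count += 1  # the row records a method
        else:
            field_count += 1   # the row records a field
    return method_count, field_count


# Made-up sample rows, for illustration only.
rows = [
    ("MarshalByRefObject", True),
    ("Object", True),
    ("Object", False),
]
print(count_members(rows, "MarshalByRefObject"))
```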
- The predetermined format mentioned in step 208 is a fixed-length byte sequence with three parts: the first part is the name of the compressed reference type, the second part is the method count, and the third part is the field count.
- The first metadata table and the second metadata table may be the metadata tables described in Embodiment 1, and details are not repeated here.
- The method of this embodiment is described using the compression device of Embodiment 1 as an example. By compressing the name of the obtained reference type and combining the compressed name with the counted method count and field count to obtain the compressed reference type, the method effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), thereby enhancing the functionality of that medium.
- Embodiment 3: This embodiment provides a compression method for a reference type in a .net file. In this embodiment, an uncompressed file compiled on the .net platform is referred to as a .net file. As shown in FIG. 4, the method includes:
- Step 302: Obtain the first metadata table in the .net file. In this embodiment, the first metadata table is the metadata table TypeRef (the reference type or interface table).
- The metadata table is part of the PE (Portable Executable) file. This example uses the .net file compiled from the following code: namespace MyCompany.MyOnCardApp { public class MyService : MarshalByRefObject
- After compilation, the helloworld.exe file is obtained and stored in binary form on the hard disk.
- This binary file is the .net file; FIG. 5 shows its structure for this embodiment.
- The metadata includes a metadata header, metadata tables, and the like. The process of obtaining the metadata table TypeRef is as follows:
- a. Locate the DOS header of the .net file; the DOS header obtained in this embodiment is 0x5a4d;
- b. Offset backward from the DOS header by a first agreed number of bytes and read the offset address of the PE signature, which is 0x00000080; in this embodiment, the first agreed number of bytes is 0x003a;
- c. Locate the PE signature according to its offset address 0x00000080; the PE signature found is 0x4550;
- d. Offset backward from the PE signature by a second agreed number of bytes and read four bytes. This embodiment takes a 32-bit machine as an example; on a 64-bit machine, the second agreed number of bytes is 0x0084. The 4 bytes read are 0x00000010, which indicates that the binary file has 0x10 directories and contains .net data; the relative virtual address of the .net data header is written in the 0x0F-th directory;
- e. Offset backward from the data 0x00000010 by a third agreed number of bytes and read eight bytes; in this embodiment, the third agreed number of bytes is preferably 112. The first four bytes are 0x00002008, the relative virtual address of the .net data header, and the last four bytes are 0x00000048;
- f. Convert the relative virtual address of the .net data header to the linear address 0x00000208 and read the .net data header. The data is stored in little-endian order; its first 4 bytes, 0x48000000, are the length of the data, which is 0x00000048 in big-endian notation;
- g. Offset backward from the start of the .net data header by a fourth agreed number of bytes and read 8 bytes. The first four bytes are 0x0000218c, the relative virtual address of the metadata header (MetaData Header), and the last four bytes are 0x000009a0, the length of the metadata;
- h. Convert the relative virtual address 0x0000218c of the metadata header to the linear address 0x0000038c, and obtain the metadata content according to the linear address and the metadata length;
- i. Offset backward from the metadata header by a fifth agreed number of bytes and read 8 bytes of data, namely 0x0000000920021c57, whose binary form is 100100100000000000100001110001010111; in this embodiment, the fifth agreed number of bytes places the read at the ninth byte of the "#~" stream;
- j. In the data obtained in step i, starting from the low-order bit, each bit represents whether the corresponding table exists in the .net file. The first bit represents the metadata table Module: 1 means that the table exists and 0 means that it does not. The second bit is 1, which indicates that the metadata table TypeRef exists;
- k. After the data 0x0000000920021c57, offset backward by 8 bytes; from there, the row counts of the metadata tables that exist in the .net file are stored in sequence, 4 bytes each. The data 0x0000001e is read for TypeRef, so the metadata table TypeRef has 30 rows. After the row counts, the contents of the metadata tables are stored in sequence; this region is the metadata table area;
- l. Read the contents of the metadata table TypeRef according to the agreed method, described here using the .net file of this embodiment. Step j determined that the metadata table Module exists, and its row count is 1. Each row of Module is 10 bytes, so, offsetting 10 bytes into the metadata table area, the content of the metadata table TypeRef starts at the 11th byte and has 30 rows.
- The first 8 rows of the metadata table TypeRef in the .net file of this embodiment are used as an example.
- The metadata table TypeRef has 30 rows of data; the remaining rows are processed in the same way and are not enumerated one by one.
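The first location steps (DOS header, PE signature) and the address conversions quoted above can be sketched in Python. This is a sketch under stated assumptions: the function names are hypothetical, the test image is synthetic, and the section values 0x2000 and 0x200 are assumed because they are consistent with both conversions given in the text (0x00002008 to 0x00000208 and 0x0000218c to 0x0000038c); the document itself does not state them.

```python
import struct


def find_pe_signature(image: bytes) -> int:
    """Return the file offset of the PE signature."""
    # Bytes 4D 5A ("MZ") read as a little-endian 16-bit value give 0x5a4d.
    if image[0:2] != b"MZ":
        raise ValueError("DOS header not found")
    # Skipping 0x3a bytes after the 2-byte DOS signature lands on offset 0x3c.
    (pe_offset,) = struct.unpack_from("<I", image, 2 + 0x3A)
    # Bytes 50 45 ("PE") read as a little-endian 16-bit value give 0x4550.
    if image[pe_offset:pe_offset + 2] != b"PE":
        raise ValueError("PE signature not found")
    return pe_offset


def rva_to_file_offset(rva: int, section_rva: int, section_raw_offset: int) -> int:
    """Convert a relative virtual address to a linear (file) address."""
    return rva - section_rva + section_raw_offset


# Synthetic 256-byte image, built only to exercise the functions.
image = bytearray(0x100)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)
image[0x80:0x84] = b"PE\x00\x00"

print(hex(find_pe_signature(bytes(image))))
# Assumed section layout: virtual address 0x2000, raw file offset 0x200.
print(hex(rva_to_file_offset(0x2008, 0x2000, 0x200)))
print(hex(rva_to_file_offset(0x218C, 0x2000, 0x200)))
```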
- In each row of the metadata table TypeRef, the first two bytes are the coded identifier of the type's resolution scope, the third and fourth bytes are the offset of the reference type's name in the "#Strings" stream, and the last two bytes are the offset, in the "#Strings" stream, of the name of the namespace to which the reference type belongs.
- Table 1
- The data in the table uses big-endian notation; for example, the first data row is 0006, 0061, 005a, and its little-endian form is 0600 6100 5a00. The compression method provided by this embodiment is then applied.
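As an illustration of the 6-byte row layout, the following Python sketch decodes the little-endian bytes 0600 6100 5a00 quoted above. The names `TypeRefRow` and `parse_typeref_row` are hypothetical, and the sketch assumes that all three columns are 2 bytes wide, as in this example file.

```python
import struct
from typing import NamedTuple


class TypeRefRow(NamedTuple):
    resolution_scope: int   # coded identifier of the resolution scope
    name_offset: int        # offset of the type name in "#Strings"
    namespace_offset: int   # offset of the namespace name in "#Strings"


def parse_typeref_row(row: bytes) -> TypeRefRow:
    """Decode one 6-byte TypeRef row stored in little-endian order."""
    if len(row) != 6:
        raise ValueError("a TypeRef row is expected to be 6 bytes here")
    return TypeRefRow(*struct.unpack("<HHH", row))


first_row = parse_typeref_row(bytes.fromhex("060061005a00"))
print([hex(value) for value in first_row])
```

The name offset decoded here, 0x0061, is the same offset that step 304 uses for the first reference type.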
- Step 304: Obtain the name of the reference type in the .net file and convert it to a reference type name string. Step 302 yields the offset of the reference type's name relative to the "#Strings" stream; the first reference type in the metadata table TypeRef, whose relative offset is 0x0061, is taken as an example.
- The name of the reference type is obtained as follows: after the address 0x0000038c of the metadata header is obtained in step h of step 302, the content is read backward from the metadata header.
- When the tag "#Strings" is found, the 8 bytes preceding "#Strings" are read to obtain the data 0x5C0300003C040000. The upper 4 bytes of this data are the offset of the "#Strings" stream relative to the metadata header, and the lower 4 bytes are the size of the "#Strings" stream.
- The reference type name string can also be obtained as follows: obtain the name of the reference type and the name of the namespace to which it belongs, and join the namespace name and the reference type name with the connector "." to obtain the reference type name string.
- For example, the first reference type belongs to the namespace System, and its reference type name string after conversion is: System.
- Step 306: Hash the reference type name string and take predetermined bytes of the result as the name of the compressed reference type. The hash algorithm may be MD5, SHA-1, SHA-2, and so on; this embodiment preferably uses the MD5 algorithm.
- The reference type name string "MarshalByRefObject" obtained in step 304 is hashed with MD5 to obtain the 128-bit MD5 value "3064AB63C4B4DC57770E9BDF25B7547D"; in this embodiment, the first two bytes of the MD5 value are preferably taken as the name of the compressed reference type.
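A minimal Python sketch of this hashing step is shown below. The function name `compress_type_name` is hypothetical, and the UTF-8 encoding of the name string is an assumption; the document does not say how the string is encoded before hashing, so the digest printed here is not claimed to match the value quoted above.

```python
import hashlib


def compress_type_name(name: str, keep_bytes: int = 2) -> bytes:
    """Hash a reference type name string and keep its leading bytes.

    Assumption: the name is encoded as UTF-8 before hashing.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()  # 16 bytes = 128 bits
    return digest[:keep_bytes]


compressed = compress_type_name("MarshalByRefObject")
print(compressed.hex())
```

Keeping only two bytes makes collisions between different names possible, so a real implementation would presumably need to check for them; the text here does not discuss this.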
- The names of the reference types may be read and compressed one at a time in the order of steps 302-306, or the names of all reference types in the .net file may be obtained first and then compressed one by one, with the compression results buffered.
- Step 308: Count the method count and the field count of the reference type obtained above; see FIG. 6.
- Step 3081: Obtain the second metadata table; in this embodiment, the second metadata table is the metadata table MemberRef. The process of obtaining the metadata table MemberRef is as follows. In the binary data 100100100000000000100001110001010111 obtained in step i of step 302, reading from the low-order bit, the 11th bit is 1, which indicates that the metadata table MemberRef exists, and five bits before the 11th bit are 1, which means that five metadata tables precede MemberRef.
- The metadata table Module has 1 row, the metadata table TypeRef has 30 rows, the metadata table TypeDef has 6 rows, the metadata table Field has 7 rows, and the metadata table Method has 6 rows.
- According to the method agreed in step l of step 302, each row of the metadata table Module has 10 bytes, each row of the metadata table TypeRef has 6 bytes, each row of the metadata table TypeDef has 14 bytes, and each row of the metadata table Field has 6 bytes.
- The first 8 rows of the metadata table MemberRef of the .net file in this embodiment are used as an example; the other rows are processed in the same way and are not enumerated one by one.
- The metadata table MemberRef stores the method and field information of the reference types; each row records a reference to a method or field together with its feature identifier value (Signature).
- In each row, the high 2 bytes represent the reference type pointed to by the field or method, the middle 2 bytes identify the name of the field or method, and the low 2 bytes are the offset of the feature identifier value of the method or field relative to the "#Blob" stream.
- The feature identifier value records whether the row represents a method or a field, as well as information such as the return value. For convenience of explanation, see Table 2, which lists the above 8 rows of data.
- Step 3082: Read the reference type pointed to by each method or field in the metadata table MemberRef. As shown in Table 2, the high 2 bytes of a row record the reference type pointed to by that row's method or field. Taking the first row of the metadata table MemberRef as an example, Class is 0x0029; converting 0x0029 to binary gives 101001, shifting right by 3 bits gives 101, which is 5 in decimal. The field or method in the first row of MemberRef therefore points to the reference type recorded in the fifth row of the metadata table TypeRef. Using the method described in steps 302 and 304, the name of the reference type recorded in the fifth row of TypeRef is obtained as AssemblyKeyNameAttribute;
- Step 3083: Obtain in turn the feature identifier value (Signature) of each method or field in the metadata table MemberRef. As shown in Table 2, the low 2 bytes of each row are the offset of the feature identifier value relative to the "#Blob" stream. The Signature offset of the first row is 0x0043, and reading at offset 0x0043 in the "#Blob" stream yields the first row's Signature, 0x2001010e;
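The shift applied to the Class value in step 3082 can be sketched in Python as follows. The function name is hypothetical. The text only uses the shifted row number; reading the low 3 bits as a table selector (with value 1 meaning TypeRef) follows the coded-index convention of the CLI metadata format and is stated here as background, not taken from this document.

```python
def decode_member_ref_class(class_value: int):
    """Split a MemberRef Class value into (table tag, row number)."""
    tag = class_value & 0b111   # low 3 bits: which table is referenced
    row = class_value >> 3      # remaining bits: 1-based row in that table
    return tag, row


tag, row = decode_member_ref_class(0x0029)
print(tag, row)  # binary 101001: tag 001, row 101
```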
- Step 3084: Determine from the Signature read whether the current row records a method or a field; if it is a method, perform step 3085; if it is a field, perform step 3086. The Signature read from the first row of the metadata table MemberRef is taken as an example.
- The Signature read from the first row is 0x2001010e. The criterion in this embodiment is that when the last 4 digits of the Signature are 0x060e, the current row records a field; otherwise, it records a method.
- The first row of the metadata table MemberRef therefore records a method, that is, the reference type AssemblyKeyNameAttribute references the method recorded in the first row of the metadata table MemberRef;
- Step 3085: Increment by 1 the method count of the reference type pointed to by the current row of the metadata table MemberRef, then return to step 3083;
- Step 3086: Increment by 1 the field count of the reference type pointed to by the current row of the metadata table MemberRef, then return to step 3083;
- With the method of steps 3081-3086, after the metadata table MemberRef has been read, the method counts and field counts of all the reference types in the .net file are obtained.
- Step 310: Combine the name of the compressed reference type, the method count of the reference type, and the field count according to a predetermined format to obtain a compression result of the reference type.
- The predetermined format may be the format shown in Table 3: a fixed-length byte sequence, whose length can be set as needed, with three parts. The first part is the name of the compressed reference type, the second part is the method count of the reference type, and the third part is the field count of the reference type.
- Table 3
- name of the compressed reference type | method count | field count
- The name of the compressed reference type obtained in step 306 is combined with its corresponding method count and field count according to the structure shown in Table 3 to obtain the compressed reference type.
- When the method count or field count of the reference type is 0, it is padded with 0x00. For example, the compression result of the reference type MarshalByRefObject is 0x30640100.
- The compression structure shown in Table 3 is only a preferred structure, and the structure may be varied accordingly.
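The combination in step 310 can be sketched in Python as below. The function name is hypothetical, and the one-byte width of each count is an assumption inferred from the example result 0x30640100 (a two-byte name followed by two single-byte counts); the text itself says the fixed length can be set as needed.

```python
def pack_reference_type(compressed_name: bytes, method_count: int, field_count: int) -> bytes:
    """Concatenate name, method count, and field count as in Table 3.

    Assumption: each count occupies exactly one byte; a count of 0 is
    therefore stored as the padding byte 0x00.
    """
    for count in (method_count, field_count):
        if not 0 <= count <= 0xFF:
            raise ValueError("count does not fit in one byte")
    return compressed_name + bytes([method_count, field_count])


# Values taken from the MarshalByRefObject example in the text:
# compressed name 0x3064, one method, no fields.
result = pack_reference_type(bytes.fromhex("3064"), 1, 0)
print(result.hex())
```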
- Embodiment 4: Referring to FIG. 7, this embodiment provides a compression method for the defined methods of a .net file. The method includes:
- Step 401: Locate the .net file, and from it locate the definition method table and the related streams;
- Step 402: According to the definition method table, construct a character string from the content of the data items of each defined method in the corresponding streams; the content of the data items includes a parameter count;
- Step 403: Hash the constructed character string to convert it to a name hash value;
- Step 404: Compress the execution identifier and the access identifier of the defined method;
- Step 405: Compress the parameter table of the defined method in the stream;
- Step 406: Organize the name hash value, the compressed execution identifier and access identifier, the parameter count, and the compressed parameter table to obtain a compressed structure.
- The name hash value, the compressed execution identifier and access identifier, the parameter count, and the compressed parameter table are organized according to preset rules.
- The .net file has a .net file header, which stores the starting position of the metadata structure table, a description of its contents, and the byte size occupied by each content.
- From the starting position and the byte sizes, the specific addresses of the metadata stream, the string stream, the blob stream, the GUID stream, and the user string stream used in this embodiment can be calculated, and the corresponding streams can then be located at the calculated addresses.
- In this way, the storage space occupied by the .net file is effectively reduced, so that the .net file can be used on a small storage device; at the same time, resources are saved and resource utilization is improved.
- Embodiment 5 provides a compression method for defining a .NET file.
- the method specifically includes the following steps: Step 501: According to the PE file structure and the .net file, locate the metadata table. In the definition method table and related stream, read the number of rows in the method table in the header, the number of rows represents the number of methods; set the method count value; the positioning process includes: by the file header of the PE file The content is located in the .net file, and the content in the .NET file header is located at the address of each stream in the .NET file (including the metadata stream, the string stream, the blob stream, the GUID stream, and the user string stream). size.
- the .net file is shown in the following table: Consists of a storage signature, a storage header, a stream header, and six data streams. Where the size of the storage signature is fixed, The size of the storage header is also fixed. The stream header stores the name, size, and offset address of each stream contained in the current .net file. With this data, you can locate the six streams of the .net file.
- the streams that need to be used in this example are a string stream, a blob stream, and a stream of metadata. Once you have located the metadata stream, you will continue to locate the metadata table in it.
- A metadata stream consists of a metadata header, a table record count, and a metadata table.
- The length of the metadata header is a fixed value. The header contains an 8-byte MaskValid field, a bit vector that identifies which tables exist.
- Each bit of the bit vector represents one table and takes the value 0 or 1: 0 means that the table the bit points to does not exist, and 1 means that it exists.
- The bits of the vector point to the tables of each type, and their values determine whether those tables exist. For example, the seventh bit of the bit vector corresponds to the definition method table; a value of 1 for that bit indicates that the definition method table exists in the metadata table. The metadata header therefore identifies how many tables the metadata table contains.
- The table record count holds 4 bytes of data for each table identified above; the 4 bytes give the number of rows in that table, and the data width of each row in each table is specified. Since the length of the metadata header is fixed, the byte length of the table record count is also determined (it equals the number of tables identified in the metadata header multiplied by 4), so the metadata table can be located within the metadata stream.
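- The bit-vector lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the example mask, and the little-endian byte order of MaskValid are assumptions made for the sketch.

```python
def present_tables(mask_valid: int) -> list[int]:
    """Return the zero-based numbers of the tables whose bit is set in MaskValid."""
    return [bit for bit in range(64) if (mask_valid >> bit) & 1]


def record_count_size(mask_valid: int) -> int:
    """Byte length of the table record count: 4 bytes per existing table."""
    return 4 * len(present_tables(mask_valid))


def row_count_position(mask_valid: int, table: int) -> int:
    """Byte position of a table's 4-byte row count inside the record count area."""
    # index() raises ValueError when the table's bit is 0 (table absent).
    return 4 * present_tables(mask_valid).index(table)


# The seventh bit (zero-based bit 6) stands for the definition method table.
METHOD_DEF = 6

# Hypothetical MaskValid in which tables 0, 1, 2 and 6 exist.
mask = int.from_bytes(bytes([0b01000111, 0, 0, 0, 0, 0, 0, 0]), "little")
tables = present_tables(mask)
```

- With this example mask, four tables exist, so the record count occupies 16 bytes and the row count of the definition method table is the fourth 4-byte entry.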
- The process of locating the definition method table within the metadata table is as follows:
- The metadata table stores a number of tables of different types in turn (the number of tables can be calculated from the MaskValid field in the metadata header above). Each table has a number of rows; the number of rows is given by the table record count, and the length of each row is predefined.
- Each row in the definition method table corresponds to one method. Each method has a set of data items, and some of these items are associated with a predefined stream, for example:
- The name item is associated with the string stream.
- The space occupied by each method in the table is fixed, so the address of a data item of a method can be located and the value of the data item read from that address.
- Setting the method count value specifically means setting the initial value of a parameter i to 1; the value represents the number of methods for which compression has been performed.
- Step 502: Read the data items of each method in the definition method table, and construct a character string from the content of the obtained data items. Each method in the method table includes the following data items: (1) RVA (4-byte unsigned integer): the relative virtual address of the method body in the module; the RVA points into a read-only section of the PE file;
- (2) ImplFlags (2-byte unsigned integer): binary flags indicating how the method is implemented;
- (3) Flags (2-byte unsigned integer): binary flags indicating the accessibility and other features of the method;
- (4) Name (offset into the #Strings stream): the name of the method, associated with the string stream.
- This item indexes a UTF-8 encoded string whose length is greater than 0 and less than 1023 bytes;
- (5) Signature (offset into the #Blob stream): the signature of the method, associated with the blob stream.
- This item indexes a blob in the blob stream whose length is greater than 0;
- (6) ParamList (RID into the parameter table): the index of the record marking the start of the parameter list belonging to the method. The start of the next method's parameter list, or the end of the Param table, determines where this parameter list ends.
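- The data items listed above can be read from one row of the definition method table as sketched below. The sketch assumes 2-byte stream offsets and a 2-byte parameter table index, which holds only for small files; the class name, helper names, and example bytes are hypothetical.

```python
import struct
from typing import NamedTuple


class MethodDefRow(NamedTuple):
    rva: int         # relative virtual address of the method body
    impl_flags: int  # how the method is implemented
    flags: int       # accessibility and other features
    name: int        # offset into the string stream
    signature: int   # offset into the blob stream
    param_list: int  # row index into the parameter table


# Assumed row layout: 4 + 2 + 2 + 2 + 2 + 2 = 14 bytes, little-endian.
ROW = struct.Struct("<IHHHHH")


def parse_method_def_row(data: bytes, offset: int = 0) -> MethodDefRow:
    """Decode one definition method row starting at `offset`."""
    return MethodDefRow(*ROW.unpack_from(data, offset))


def read_string(string_stream: bytes, offset: int) -> str:
    """Read the zero-terminated UTF-8 string stored at `offset` in the string stream."""
    end = string_stream.index(b"\x00", offset)
    return string_stream[offset:end].decode("utf-8")


# Hypothetical example data: one row and a tiny string stream.
heap = b"\x00MySampleMethod\x00"
row_bytes = ROW.pack(0x2050, 0x0000, 0x0086, 1, 10, 1)
method = parse_method_def_row(row_bytes)
method_name = read_string(heap, method.name)
```

- Because every row has the same width, the row for method i can be reached by multiplying the row width by i - 1 and passing the result as `offset`.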
- In the flowchart for constructing a string from a definition method of a .net file, the method for constructing the string includes: Step 5021: Read the value of the name item in the data items, and read the corresponding data in the string stream according to that value to obtain the name of the method. The name item corresponds to the "Name" item in the data items above; its value is the offset address of the item in the string stream, and the method name, for example MySampleMethod, is read from the string stream at that offset address.
- Step 5022: Read the value of the signature item, read the corresponding data in the blob stream according to that value, and obtain and analyze the parameter information and the return value type of the definition method, where the parameter information includes the number of parameters, the type of each parameter, and so on. The signature item corresponds to the Signature item in the table above; as can be seen from that table, the value of the signature item is the offset address of the item in the blob stream.
- The signature information is read from the blob stream at that offset address, that is, the parameter information used by the method (for example, the number of parameters and the type of each parameter) and the return value information (including the return value type), where the return value type points to a specific type in one of the other type tables used by the definition method.
- Step 5023: According to the return value type, find the type information pointed to by the return value type in the definition type table or the reference type table of the metadata table, and record the name item and the namespace item in the data items of that type.
- Information such as the type name and the namespace name is then read from the string stream;
- Step 5024: Construct the full-name string of the return value from the namespace name and the type name obtained above. In this embodiment,
- the preferred format of the return value full-name string is: namespace name-type name. The type name and namespace name of each parameter are likewise read according to the obtained parameter type, and the parameter full-name string is constructed from the obtained data. In this embodiment,
- the preferred format of the parameter full-name string is: namespace name and type name;
- Step 5026 According to the return value obtained in
- Step 503: Perform a hash operation on the string obtained in step 502, and convert the first two bits of the operation result into a value type for storage; this data is the name hash value of the method. Hashing the series of data items and keeping only part of the result achieves compression of the definition method.
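- A minimal sketch of the hash-and-truncate idea in step 503 follows. The text does not name the hash algorithm here, so MD5 is used purely as a stand-in; reading "the first two bits" as the first two bytes of the digest, the big-endian conversion, and the example string are likewise assumptions.

```python
import hashlib


def name_hash(method_string: str) -> int:
    """Hash the constructed method string and keep only the leading two bytes."""
    # Assumption: MD5 stands in for the unspecified hash operation.
    digest = hashlib.md5(method_string.encode("utf-8")).digest()
    # Assumption: the kept part is the first two bytes, read as an unsigned integer.
    return int.from_bytes(digest[:2], "big")


# Hypothetical string; the exact format built in step 502 is not reproduced here.
example = "System-String MySampleMethod"
example_hash = name_hash(example)
```

- Whatever the string length, the stored value fits in two bytes, which is where the saving comes from; the price is that two different methods can map to the same value.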
- Step 504: Acquire the execution identifier and the access identifier of the definition method, and compress them; the compression process is performed in step 502.
- Step 505: Determine the type of the method. If the method is a fat header method, set the method-type bit of the combined identifier from step 504 to 1, indicating that the method is a big header method,
- and perform step 506; if the method is a tiny header method, step 506 is skipped. The step of determining the method type is specifically: analyze the RVA value in the method data items obtained in step 502, locate the method header information through the RVA, and read the first byte of the method header information; the low two bits of that byte indicate the method header type. If the value of the low two bits is 2 (0010), the method is a small header method; if the value of the low two bits is 3 (0011), the method is a big header method. Step 506: Acquire the data items unique to the big header method, and compress their content.
- The flowchart of the method for compressing the data items specific to the big header method includes: Step 5061: Analyze the big header method type information obtained in step 505 to obtain the maximum stack size and the big header method identifier, and compress the data describing the maximum stack size: this data occupies 2 bytes (16 bits) in the original structure; the low 8 bits are kept and the upper 8 bits are discarded. Step 5062: Analyze the big header method identifier obtained in step 505, and obtain the local variable signature identifier in order to obtain the number of local variables. The structure of the big header method identifier provided by this embodiment is shown in FIG. 11.
- The identifier consists of 12 bytes.
- The first two bytes are identification information, corresponding to the Flags item in the figure; the next two bytes are the maximum stack size, corresponding to the MaxStack item in the figure; the next four bytes are the code size, corresponding to the Code Size item in the figure; the last four bytes are the local variable signature token, corresponding to the Local Variables Signature Token item in the figure. Analyze the local variable signature token in the big header method identifier.
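- The header type test from step 505 and the 12-byte layout just described can be sketched as follows. The function names and the example header bytes are hypothetical, and little-endian storage is assumed.

```python
import struct


def header_kind(first_byte: int) -> str:
    """Classify a method header by the low two bits of its first byte."""
    low_two_bits = first_byte & 0x03
    if low_two_bits == 0x02:
        return "tiny"
    if low_two_bits == 0x03:
        return "fat"
    raise ValueError("unrecognised method header type")


def parse_fat_header(data: bytes) -> dict:
    """Decode the 12-byte big header: Flags, MaxStack, Code Size, local signature token."""
    flags, max_stack, code_size, local_sig_token = struct.unpack_from("<HHII", data, 0)
    return {
        "flags": flags,
        "max_stack": max_stack,
        # Step 5061: keep the low 8 bits of MaxStack and drop the upper 8 bits.
        "max_stack_compressed": max_stack & 0xFF,
        "code_size": code_size,
        "local_sig_token": local_sig_token,
        # The fourth bit of the first Flags byte marks additional segments.
        "more_segments": bool(flags & 0x08),
    }


# Hypothetical 12-byte header (not taken from the patent).
example_header = bytes.fromhex("1b3002000b00000001000011")
kind = header_kind(example_header[0])
fields = parse_fat_header(example_header)
```

- A local signature token of 0 would mean that the method has no local variables; any other value leads to the StandAloneSig lookup described next.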
- If the data is 0, the number of local variables is 0; otherwise, the value is used to locate a row in the StandAloneSig table of the metadata table (the standalone signature descriptor table, which holds the composite signatures used as method local variables), the offset of the signature data item is read from the content of the Value item of that row, and the number of local variables is read. Suppose that the header of the big header method, after compilation,
- has the following hexadecimal code:
- Analyze the fourth bit of the first byte of the Flags item in the big header method identifier obtained in step 505; if that bit is 1, the method has multiple segments. Analyze the fifth and sixth bytes to obtain the code size. From the method header width and the code size obtained above, the structured exception handling table can be reached by offset. The byte width of the method header is specified in advance: the method header of a big header method is 12 bytes wide, and the width of the method header of a small header method is also fixed. At least one segment is stored for the method. If there are multiple segments, they are stored sequentially in one storage area; at least one exception structure is stored in each segment.
- The content of a segment is located according to the offset address of the storage area in which the segment is stored, and the first byte of the stored data is analyzed. If the 7th bit of that byte is 1,
- the exception structure type of the big header method is the FatFormat format, and step c is performed; otherwise, the exception structure type of the big header method is the TinyFormat format, and step d is performed. Specifically, the content of the segment is located according to the position of the segment in the structured exception handling table, and the first byte of the data stored in the segment is analyzed.
- Step 5064: Locate the content of the segment according to the offset address of the storage area in which the segment is stored, and analyze the 2nd to 4th bytes of the stored data; these 3 bytes indicate the storage space occupied by all the exception structures in the segment. If the 8th bit of the 2nd byte is 1, other segments (sections) follow this segment. If there are other segments, repeat the following operations until all segments have been compressed; after that, organize the compressed exception structure information, and then perform step 5064.
- Step c: The process of compressing each segment is as follows. If the exception structures of the segment are of the big header type, the storage space occupied by the exception structures in the segment is the first length, preferably n*24+4, where n is the number of exception structures. Read in sequence the Flags (4 bytes), TryOffset (4 bytes), TryLength (4 bytes), HandlerOffset (4 bytes), HandlerLength (4 bytes), and ClassToken (4 bytes) items in the method body. Discard HandlerLength. Compress the values of TryOffset, TryLength, and HandlerOffset (4 bytes each) into 2 bytes each by discarding the high-order bits and keeping the low-order bits. Compress the four bytes of ClassToken into one byte by discarding the high-order bits and keeping only the low 8 bits. Use the ClassToken obtained in the above steps to find the corresponding parameter type information in the definition type and reference type tables. See Table 4 for the organized exception structure information: Table 4
- The order of the data items in the table is arbitrary and can be adjusted at will.
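- The big-header-type exception structure compression can be sketched as below. The sketch assumes a 4-byte segment header (one type byte followed by a 3-byte size) ahead of n structures of 24 bytes each, returns the kept fields as a dictionary instead of committing to a packed layout, and passes Flags through unchanged because the text does not say how it is stored; all names and example values are hypothetical.

```python
import struct

# Flags, TryOffset, TryLength, HandlerOffset, HandlerLength, ClassToken (4 bytes each).
FAT_CLAUSE = struct.Struct("<IIIIII")
SEGMENT_HEADER_SIZE = 4


def compress_fat_segment(segment: bytes) -> list[dict]:
    """Compress every 24-byte exception structure of one big-header-type segment."""
    data_size = int.from_bytes(segment[1:4], "little")  # equals n*24+4
    count = (data_size - SEGMENT_HEADER_SIZE) // FAT_CLAUSE.size
    compressed = []
    for i in range(count):
        offset = SEGMENT_HEADER_SIZE + i * FAT_CLAUSE.size
        flags, try_off, try_len, handler_off, _handler_len, class_token = (
            FAT_CLAUSE.unpack_from(segment, offset)
        )
        compressed.append({
            "flags": flags,                         # storage width not specified in the text
            "try_offset": try_off & 0xFFFF,         # 4 bytes -> 2 bytes, low bits kept
            "try_length": try_len & 0xFFFF,
            "handler_offset": handler_off & 0xFFFF,
            "class_token": class_token & 0xFF,      # 4 bytes -> 1 byte, low 8 bits kept
        })                                          # HandlerLength is discarded
    return compressed


# Hypothetical segment holding a single exception structure (n = 1, size 28).
example_segment = (
    bytes([0x41]) + (28).to_bytes(3, "little")
    + FAT_CLAUSE.pack(0, 0x00010002, 0x10, 0x12, 0x08, 0x01000005)
)
clauses = compress_fat_segment(example_segment)
```

- Dropping the high-order bits is lossy: the scheme only round-trips when every offset and length fits in 16 bits and every type index fits in 8 bits, which is plausible for the small method bodies targeted here.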
- Step d: The process of compressing each segment is as follows. If, according to the type of the exception structures in the segment, the exception structures in the current segment are determined to be in TinyFormat (the small header format), the storage space occupied by the exception structures in the current segment is the second length.
- In this embodiment it is preferably n*12+4. Read in sequence the Flags (2 bytes), TryOffset (2 bytes), TryLength (2 bytes), HandlerOffset (2 bytes), HandlerLength (2 bytes), and ClassToken items of the method, and remove the HandlerLength item from the items obtained.
- Use the ClassToken item obtained in the above step to find the corresponding exception type information in the definition type and reference type tables.
- Step 5064: Obtain the Finally count value. The expression n*24+4 (see step c above) is used to calculate the number of exception structures in the current segment. Whether an exception structure is of the big header type or of the small header type, its identifier follows the same structure; see below for details:
- Step 5065: Obtain the garbage collection control attribute. This step specifically includes: a) Analyze the CustomAttribute metadata table in the metadata table. If a data item in a row corresponds to the method currently being analyzed (including the method header and the method parameters), analyze the value of the Type item to obtain the type information, such as the type name and its constructor; use the blob stream offset in the Value item to locate the position in the blob stream; analyze the first byte of data
- to obtain the length; and skip the two-byte Prolog to obtain the parameter values of the constructor of the custom attribute type. b) If the information obtained in step a shows that a custom attribute has the type Transaction, the attribute identifier is set to 0x40, that is, the seventh bit is set to 1. c) If the type obtained in step a is the custom attribute type GCControl, the garbage collection control identifier is set according to the constructor parameter value obtained in step a, that is, the value of the GCControlMode type:
- Step 5066: Organize the values obtained above according to the structure shown in Table 6.
- The header structure table is as follows: Table 6
- The garbage collection mark is the garbage collection control identifier.
- The data items in the header structure table of the big header method have no fixed order, and the order can be adjusted arbitrarily.
- Step 507: Compress the parameter table. Use the value of the ParamList (parameter table) data item obtained in step 502 to locate the corresponding parameter row in the parameter table of the metadata table and, according to the number of parameters obtained in step 502, read the corresponding parameter row information. This information includes the data items Flags (2 bytes), Sequence, and Name. Compress these data items by discarding the Sequence and Name items and compressing the content of the Flags item into 1 byte. The identifier of the parameter is obtained from the value of the Flags item of the parameter row read; this identifier is combined with the offset, in the type storage area of the compressed file, of the parameter type obtained in step 502.
- The parameter information has the format: parameter identifier, offset of the parameter type.
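- Step 507 can be sketched as follows. The 2-byte widths assumed for Sequence and Name, the choice of keeping the low byte of Flags, the 1-byte type offset, and the function name are assumptions made for the sketch.

```python
import struct

# Assumed parameter row layout: Flags (2 bytes), Sequence (2 bytes), Name (2-byte offset).
PARAM_ROW = struct.Struct("<HHH")


def compress_param(row: bytes, type_offset: int) -> bytes:
    """Return 'parameter identifier, offset of the parameter type' as two bytes."""
    flags, _sequence, _name = PARAM_ROW.unpack_from(row, 0)
    # Sequence and Name are discarded; Flags is squeezed into one byte.
    return bytes([flags & 0xFF, type_offset & 0xFF])


# Hypothetical row: an input parameter, sequence 1, name at string offset 0x20,
# whose type sits at offset 7 of the type storage area.
example_param = compress_param(PARAM_ROW.pack(0x0001, 1, 0x0020), 7)
```

- Each parameter therefore shrinks from a 6-byte row plus its name string to 2 bytes.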
- Step 508: If the number of parameters acquired in step 502 is greater than 1, return to step 507; otherwise, perform step 509. Step 509: Organize the data compressed in steps 503, 504, 506, and 507 according to a preset rule to obtain the compression structure of the definition method.
- Preferably, the preset rule in this embodiment is as shown in Table 7 (the parameter count is the number of parameters): Table 7
- The order of the data items in Table 7 is arbitrary and can be adjusted at will.
- The big header data block exists only when the identifier marks the method as a big header method.
- The compressed structure of the header is as shown in Table 4; the exception information is determined according to the exception structure count, and all the exception structures are arranged in turn.
- The compressed structure of the exception structure information is shown in Table 8: Table 8
- Step 510: If the method count value is smaller than the number of rows obtained in step 501, return to step 502; otherwise, all operations end.
- The number of rows obtained in step 501 represents the number of definition methods stored in the data table, so a method count value smaller than the number of rows indicates that some definition methods have not yet been compressed.
- This embodiment saves the storage space of the .net file by compressing the method part defined in the .net file, so that the .net file can be run on the small-capacity storage device; at the same time, resources are saved and the resource utilization is improved.
- Embodiment 6 This embodiment provides a device for compressing the method body of a definition method of a .net file.
- The device includes: a method obtaining module 602, configured to obtain the method header of a definition method used in a .net file.
- The method header can be obtained by first reading the information in the metadata table MethodDef, for example the RVA (relative virtual address), to obtain the location information of the method header; the data of the method header is then read according to the location information.
- The compression module 604 is configured to compress the ILcode obtained by the method obtaining module 602; compressing the ILcode compresses the method body and yields the compression result of the method body.
- Embodiment 7 This embodiment provides a device for compressing the method body of a definition method of a .net file. As shown in FIG.
- the device includes a method obtaining module 702 and a compression module 704, where the compression module 704 includes a local variable offset determining unit 7042, an instruction compressing and calculating unit 7044, and a combining unit 7046. The functions of the modules are as follows:
- The method obtaining module 702 is configured to obtain the method header of a definition method used in the .net file and to obtain the ILcode corresponding to the definition method according to the method header. The method header is obtained in the same way as in Embodiment 6, which is not described again here.
- The compression module 704 is configured to compress the ILcode obtained by the method obtaining module 702; compressing the ILcode compresses the method body and yields the compression result of the method body.
- The compression module 704 specifically includes: a local variable offset determining unit 7042, configured to acquire the local variables of the definition method according to the method header acquired by the method obtaining module 702 and to determine the offset of each local variable according to its type, where the offset of a local variable refers to its offset in the compression structure corresponding to the above .net file. When the definition method is a Tiny Headers method, the definition method has no local variables; in that case the obtained local variable is empty, and the corresponding offset is also empty. When the definition method is a Fat Headers method, the definition method has local variables;
- the local variables are obtained, and their offsets in the compression structure are determined. Whether the definition method is a small header method is judged as follows: read the first byte of the method header and determine from it whether the definition method is a big header method or a small header method. When the low two bits of the first byte read are 10, the definition method is a small header method; otherwise it is a big header method.
- The instruction compressing and calculating unit 7044 is configured to compress the ILcode acquired by the method obtaining module 702 and to calculate the length of the compressed ILcode;
- The ILcode mentioned in this embodiment includes operation instructions and operation parameters, where an operation parameter may be empty, or may be an identifier token, an offset of a local variable in the method body, or a jump offset.
- The compression mode for the ILcode can be determined according to the specific ILcode. For example, an operation instruction without an operation parameter can be recorded directly; an operation instruction with an operation parameter is handled according to the following three cases:
- When the operation parameter is a jump offset, the skipped ILcode portion is determined according to the jump offset, the jump offset of the operation instruction (that is, the jump instruction) is recalculated from the skipped ILcode portion, and the operation instruction and the recalculated jump offset are recorded. For example, suppose there are 10 operation instructions in the ILcode, of which the third is a jump instruction whose operation parameter is 2, meaning that it jumps two bytes forward. If jumping two bytes leads to the fifth operation instruction,
- the skipped ILcode portion is the fourth operation instruction and its operation parameter.
- The skipped ILcode portion must be compressed first, and the operation parameter of the third operation instruction is then modified according to the length of the compressed skipped portion:
- the original jump offset is changed to the length of the compressed skipped ILcode portion.
- In this way the meaning of the original ILcode is not changed.
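- The jump-offset recalculation can be sketched as follows for forward jumps. The `Instr` record, its field names, and the example instruction sizes are hypothetical; in particular, the compressed size of each instruction is taken as given rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    name: str
    size: int                   # length in the original ILcode, in bytes
    packed: int                 # length after compression, in bytes
    jump: Optional[int] = None  # original forward jump offset, counted from the next instruction


def recompute_jump(code: list[Instr], index: int) -> int:
    """Return the jump offset of code[index] measured in compressed bytes."""
    remaining = code[index].jump
    if remaining is None or remaining < 0:
        raise ValueError("expected a forward jump instruction")
    new_offset = 0
    pos = index + 1
    # Walk over the skipped instructions, swapping original sizes for compressed sizes.
    while remaining > 0:
        if pos >= len(code):
            raise ValueError("jump target lies outside the method body")
        remaining -= code[pos].size
        new_offset += code[pos].packed
        pos += 1
    if remaining != 0:
        raise ValueError("jump does not land on an instruction boundary")
    return new_offset


# Hypothetical method body: the jump skips one 5-byte call that compresses to 2 bytes.
example_code = [
    Instr("ldarg.0", 1, 1),
    Instr("nop", 1, 1),
    Instr("br.s", 2, 2, jump=5),
    Instr("call", 5, 2),
    Instr("ret", 1, 1),
]
new_jump = recompute_jump(example_code, 2)
```

- The jump still lands on the same instruction after compression, which is what keeps the meaning of the ILcode unchanged.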
- The combining unit 7046 is configured to combine, according to a predetermined format, the length of the compressed ILcode calculated by the instruction compressing and calculating unit 7044, the offset of the local variable determined by the local variable offset determining unit 7042, and the ILcode compressed by the instruction compressing and calculating unit 7044, to obtain the compression result of the method body.
- The predetermined format arranges the calculated length of the compressed ILcode, the offset of the local variable, and the compressed ILcode in sequence; the relative positions of the three parts may be changed arbitrarily.
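- One possible arrangement of the three parts is sketched below. The 2-byte little-endian length field, the 1-byte local variable offsets, and the function name are assumptions; the text only fixes which three parts are combined.

```python
def pack_method_body(compressed_il: bytes, local_offsets: list[int]) -> bytes:
    """Concatenate ILcode length, local variable offsets, and compressed ILcode."""
    length_field = len(compressed_il).to_bytes(2, "little")  # assumed 2-byte length
    offsets_field = bytes(local_offsets)                     # assumed 1 byte per offset
    return length_field + offsets_field + compressed_il


# Hypothetical method body: two local variables at offsets 6 and 7, one-byte ILcode.
packed_body = pack_method_body(b"\x2a", [6, 7])
```

- A reader of this layout needs to know the number of local variables in advance, for example from the compressed method header.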
- The compression structure corresponding to the .net file mentioned in this embodiment means that the data in the .net file is arranged in order, and the offset of each row of data corresponds to the identifier token of that row. For example, the offset of the data in the first row of the Namespace list is 0 and the corresponding token is 0; the offset of the data in the second row is 1 and the corresponding token is 1; and so on.
- In the compression structure, the referenced data is arranged before the definition data, and the tokens of the definition data are arranged accordingly.
- Each offset should correspond to a token.
- The offset is one byte.
- The token of the first definition type should be 6, and its offset is 6.
- The source code for a .net file is as follows: public String MySampleMethod() {
- Each row of data records the information of one namespace. Take the first row of data, (0) 0100 D93DEB00: the 0 in parentheses is the offset of the namespace (the offset is not actually stored), and 0100 D93DEB00 is the specific information of the namespace. The other rows have a similar structure.
- The above data is only part of the compression structure of the .net file and serves only to illustrate the correspondence between the identifier token in the .net file and the offset in the compression structure.
- The ILcode is compressed by the instruction compressing and calculating unit 7044, which also calculates the length of the compressed ILcode.
- The combining unit 7046 combines the compressed ILcode, the length of the compressed ILcode, and the offset of the local variable as the compression result of the method body.
- The storage space occupied by the .net file can be effectively reduced, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), thereby enhancing the functionality of small-capacity storage media (for example, smart cards).
- Embodiment 8 This embodiment provides a method for compressing the method body of a definition method of a .net file. The method is described using the compression device provided in Embodiment 6.
- The method includes: Step S20: Obtain the method header of the definition method used in the .net file, and obtain the ILcode of the definition method according to the method header; Step S30: Compress the ILcode of the above method; compressing the ILcode compresses the method body and yields the compression result of the method body.
- the storage space occupied by the .net file can be effectively reduced, and the .net file can be stored and run on a small-capacity storage medium (for example, a smart card). This enhances the functionality of small-capacity storage media such as smart cards.
- Embodiment 9 This embodiment provides a method for compressing the method body of a definition method of a .net file.
- The method includes: Step 802: Obtain the method header of a definition method used in a .net file, and obtain the ILcode of the definition method according to the method header. A .net file compiled by the .net platform includes namespaces (Namespace), reference types (TypeRef), definition types (TypeDef), definition methods (MethodDef), strings, and so on, stored in the form of tables and streams.
- The structure of the .net file can first be converted into a compression structure.
- The compression structure in this embodiment is the same as the compression structure in the first embodiment and is not described again here.
- The IL includes operation instructions and operation parameters; an operation parameter may be empty, or may be an identifier token, an offset of a local variable, or a jump offset.
- The first half of the identifier token indicates the table to which the operation instruction refers, and the second half indicates which row of data in that table is meant.
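- Splitting an identifier token into its table part and its row part can be sketched as follows. The sketch assumes the standard 4-byte CLI metadata token, in which the high byte selects the table and the low three bytes select the row; the text above only speaks of a first and a second half, so this split is an assumption, as are the names.

```python
def split_token(token: int) -> tuple[int, int]:
    """Split a 4-byte metadata token into (table number, row number)."""
    table = (token >> 24) & 0xFF  # high byte: which metadata table
    row = token & 0x00FFFFFF      # low three bytes: which row of that table
    return table, row


# 0x06 is the definition method table (MethodDef); 0x01 is the reference type table (TypeRef).
method_table, method_row = split_token(0x06000001)
type_table, type_row = split_token(0x01000005)
```

- During compression the 4-byte token can then be replaced by the 1-byte offset of the corresponding row in the compression structure.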
- Specifically, the method header may be obtained as follows: first obtain the metadata table MethodDef in the .net file; read the address information of the definition method used in the .net file from the metadata table MethodDef; and read the method header of the definition method according to the address information.
- The step of reading the ILcode according to the method header includes: determining the length of the ILcode from the information in the method header, and reading the ILcode according to the determined length.
- Step 804: Obtain the local variables of the definition method according to the method header, and determine the offsets of the local variables according to their types, where the offset of a local variable refers to its offset in the compression structure corresponding to the .net file. Because a small header method has no local variables, in that case the obtained local variable is empty, and the corresponding offset is also empty.
- Step 806 Compress the read ILcode and calculate the length of the compressed ILcode.
- The compression mode may be determined according to the specific ILcode.
- For the specific compression method, refer to the ILcode compression method in Embodiment 7; it is not described in detail here.
- Step 808: Combine the length of the compressed ILcode, the offset of the local variable, and the compressed ILcode according to a predetermined format to obtain the compression result of the IL.
- The predetermined format arranges the calculated length of the compressed ILcode, the offset of the local variable, and the compressed ILcode in sequence; the relative positions of the three parts may be changed arbitrarily.
- The compression method provided in this embodiment may further include: determining whether the method headers of all definition methods used in the above .net file have been read and their IL compressed; if not, continuing to read the method header of the next definition method and performing steps 802-808; otherwise, ending compression.
- Embodiment 10 This embodiment provides a method for compressing the method body of a definition method of a .net file. As shown in FIG.
- the method includes: Step 902: Obtain the definition method metadata table MethodDef. A .net file contains a plurality of tables, among which the method metadata table MethodDef records the location information of the method header of each definition method in the .net file. This embodiment takes the .net file obtained by compiling the following code as an example to describe how
- the metadata table MethodDef is obtained: public String MySampleMethod() {
- The above code is compiled on the .net platform into the file helloworld.exe,
- which is stored on the hard disk in binary form.
- This binary file is a .net file.
- The structure of the .net file provided in this embodiment includes a DOS header, a PE feature, and metadata (MetaData). The metadata includes a metadata header (MetaData Header), metadata tables (tables), and so on.
- The process of reading the metadata table MethodDef is as follows: a. Locate the DOS header of the .net file; the DOS header obtained in this embodiment is 0x5a4d;
- b. Skip the first agreed number of bytes from the DOS header and read the offset address of the PE feature, obtaining the offset address 0x00000080.
- In this embodiment, the first agreed number of bytes is 0x003a bytes.
- c. Locate the PE feature according to the PE feature offset address 0x00000080, obtaining the PE feature 0x4550.
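- Steps a to c can be sketched as follows. The sketch relies on the standard PE layout, in which the 4-byte offset of the PE feature is stored at file offset 0x3C (the 2-byte DOS signature plus the 0x003a skipped bytes); the function name and the synthetic image are hypothetical.

```python
import struct


def find_pe_feature(image: bytes) -> int:
    """Return the file offset of the PE feature after checking both signatures."""
    # 0x5a4d stored little-endian is the byte pair 'M', 'Z'.
    if image[:2] != b"MZ":
        raise ValueError("DOS header signature 0x5a4d not found")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    # 0x4550 stored little-endian is the byte pair 'P', 'E'.
    if image[pe_offset:pe_offset + 2] != b"PE":
        raise ValueError("PE feature 0x4550 not found")
    return pe_offset


# Synthetic image that mirrors the values quoted in this embodiment.
example_image = bytearray(0x84)
example_image[0:2] = b"MZ"
struct.pack_into("<I", example_image, 0x3C, 0x80)
example_image[0x80:0x84] = b"PE\x00\x00"
pe_offset = find_pe_feature(bytes(example_image))
```

- Every later "agreed number of bytes" in this procedure is counted from a position found this way, so a wrong signature check would invalidate all subsequent offsets.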
- d. Read four bytes at the second agreed number of bytes after the PE feature. This embodiment takes a 32-bit machine as an example: at an offset of 0x0074 bytes after the PE feature, the 4 bytes read are 0x00000010. This value indicates that the binary file contains 0x10 directories.
- The relative virtual address of the metadata header of the .net file is written in directory 0x0F above; on a 64-bit machine the second agreed number of bytes is 0x0084 bytes; e. Starting from the above data 0x00000010, skip the third agreed number of bytes and read eight bytes of data.
- In this embodiment the third agreed number of bytes is 112 bytes. Of the eight bytes of data, the first four bytes are 0x00002008, the relative virtual address of the .net data header, and the last four bytes are 0x00000048, the length of the .net data header;
- f. Convert the relative virtual address of the .net data header to the linear address 0x00000208, and read the .net data header to obtain the following data:
- The above data is stored in little-endian order.
- the first 4 bytes of the above data 0x48000000 is the length of the data, and the storage mode converted to the big end is 0x0000048; in this embodiment, the linear address is.
- the relative virtual address of the section of the .net data directory in the .net file is 0x00002000
- the file offset of the section is 0x00000200
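The address conversion used throughout this embodiment can be illustrated with a short Python sketch. The function name `rva_to_file_offset` is a hypothetical helper, and the sketch assumes every address of interest lies in the single section whose relative virtual address and file offset are quoted above.

```python
SECTION_RVA = 0x00002000          # relative virtual address of the section
SECTION_FILE_OFFSET = 0x00000200  # file offset of the same section

def rva_to_file_offset(rva: int) -> int:
    """Convert a relative virtual address into a linear (file) address."""
    if rva < SECTION_RVA:
        raise ValueError("RVA lies before the section")
    return rva - SECTION_RVA + SECTION_FILE_OFFSET

net_header_addr = rva_to_file_offset(0x00002008)     # .net data header
metadata_addr = rva_to_file_offset(0x0000218C)       # metadata header
method_header_addr = rva_to_file_offset(0x0000210C)  # first method header

print(hex(net_header_addr), hex(metadata_addr), hex(method_header_addr))
```

The three results agree with the linear addresses 0x00000208, 0x0000038c and 0x0000030C quoted in this embodiment.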
- g. Starting from the .net data header, skip the fourth agreed number of bytes (8 bytes in this embodiment) and read a total of 8 bytes. The first four bytes are 0x0000218c, the relative virtual address of the metadata header (MetaData Header); the last four bytes are 0x000009a0, the length of the metadata;
- h. Convert the relative virtual address 0x0000218c of the metadata header to the linear address 0x0000038c, and obtain the metadata content according to the linear address and the length of the metadata;
- i. Read 8 bytes of data starting from the fifth agreed byte of the "#~" stream; in this embodiment, the fifth agreed byte is the ninth byte counted from the start of the "#~" stream;
- j. Read the binary data obtained in step i bit by bit, starting from the low-order bit; each bit indicates whether the corresponding table exists in the .net file. For example, the first bit indicates whether the metadata table Module exists: 1 means that it exists, and 0 means that it does not. In this embodiment the first bit is 1, so the metadata table Module exists; the second bit is 1, so the metadata table TypeRef exists; reading on in this way from the low-order bit toward the high-order bit, the seventh bit is 1, so the metadata table MethodDef exists;
- k. Skip the sixth agreed number of bytes and read the number of data rows of the metadata table MethodDef. In this embodiment, the sixth agreed number of bytes is 24 bytes, that is, an offset of 24 bytes backward from the data obtained in step i.
- The metadata table Module includes 1 data row, the metadata table TypeRef includes 23 data rows, and the metadata table TypeDef includes 3 data rows.
- In the metadata table MethodDef, each data row represents the information of one defined method, and the first four bytes of each row are the relative virtual address (RVA) of the method header of that defined method.
- The data in Table 9 uses the big-endian representation; for example, the data of the first data row is
- Step 904: Read the method header of a defined method, and obtain the local variable information according to the information in the method header;
- The first defined method in the metadata table MethodDef is taken as an example.
- Step 904a: Obtain the address of the method header of the defined method, and read the first byte of the method header.
- The RVA of the method header of the first defined method is 0x0000210C, and the linear address obtained by converting the RVA is 0x0000030C. Data is read from the .net file at address 0x0000030C; the length of the data read is one byte, and the first byte read is 0x13;
- Step 904b: Determine whether the defined method is a big-header method or a small-header method; if it is a big-header method, perform step 904d; if it is a small-header method, perform step 904c.
- The data 0x13 obtained in step 904a is 00010011 in binary form; the last two bits are 11, so the method is a big-header method.
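The test in step 904b can be sketched as a two-bit mask in Python. The helper name `is_big_header` is hypothetical; treating every other bit pattern as a small header simply mirrors the two-way branch of step 904b and is an assumption of this sketch.

```python
def is_big_header(first_byte: int) -> bool:
    """True when the two low-order bits of the first header byte are 11."""
    return (first_byte & 0b11) == 0b11

first_byte = 0x13
bits = format(first_byte, "08b")  # binary form of the byte
kind = "big" if is_big_header(first_byte) else "small"
print(bits, kind)
```

For the byte 0x13 read in step 904a, the binary form is 00010011 and the method is classified as a big-header method.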
- Step 904c: The defined method is a small-header method and has no local variables; set the local variables to null, and obtain the complete method header information and the ILcode corresponding to the method;
- Step 904d: Obtain the complete method header information and the ILcode corresponding to the method, and obtain the local variable token (see the corresponding figure).
- The big-header method identifier consists of 12 bytes.
- The first two bytes are the identifier information, corresponding to the Flags item in the figure; the next two bytes are the maximum stack size, corresponding to the MaxStack item in the figure; the next four bytes are
- the size of the ILcode corresponding to the method, corresponding to the Code Size item in the figure; the last 4 bytes are the local variable token, corresponding to the Local Variables Signature Token item in the figure.
- Analyze the local variable token in the big-header method identifier. If the value is 0, the number of local variables is 0. Otherwise, locate the corresponding row in the metadata table StandAloneSig (the standalone signature descriptor table, which holds the composite signatures of method local variables) according to the value, read the offset of the signature from the Value item of that row, and read the number of local variables. From the format of the big header it follows that bytes 5-8 of the method header are the ILcode length, bytes 9-12 are the local variable token, and the ILcode follows the local variable token. In this embodiment the local variable token is 0x11000002, given in big-endian representation.
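The 12-byte big header described above can be unpacked as in the following Python sketch. The class `BigHeader` and function `parse_big_header` are hypothetical names. The sample bytes reuse the values quoted in this embodiment (first byte 0x13, ILcode length 0x0000002F, token 0x11000002); the second Flags byte (0x30) and the MaxStack value (2) are invented for illustration, and little-endian storage in the file is assumed.

```python
import struct
from typing import NamedTuple

class BigHeader(NamedTuple):
    flags: int            # bytes 1-2: Flags
    max_stack: int        # bytes 3-4: MaxStack
    code_size: int        # bytes 5-8: Code Size (ILcode length)
    local_var_token: int  # bytes 9-12: Local Variables Signature Token

def parse_big_header(data: bytes) -> BigHeader:
    """Unpack the 12-byte big header (little-endian storage assumed)."""
    if len(data) < 12:
        raise ValueError("big header needs 12 bytes")
    return BigHeader(*struct.unpack_from("<HHII", data, 0))

# Hypothetical sample: only 0x13, 0x2F and 0x11000002 come from the text.
sample = struct.pack("<HHII", 0x3013, 0x0002, 0x0000002F, 0x11000002)
header = parse_big_header(sample)

# The ILcode follows the header and is code_size bytes long.
il_start, il_end = 12, 12 + header.code_size
print(header, il_start, il_end)
```

A token value of 0 in `local_var_token` would mean that the method has no local variables.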
- The method header is obtained, and the ILcode corresponding to the method is obtained according to the method header as follows:
- Step 904e: Obtain the local variables according to the local variable token, and determine from the type of each local variable its offset in the compressed structure of the .net file. The local variables are obtained as follows: locate the data row of the metadata table StandAloneSig according to the local variable token;
- this table records the offset, in the Blob stream, of the local variable information of the defined method; the local variable information is read according to that offset.
- In this embodiment the local variable token is 0x11000002. The high byte 0x11 indicates the metadata table with number 0x11, that is, the metadata table StandAloneSig, and 0x000002 indicates the second data row of that table, so the local variables of the defined method correspond to the second data row of the metadata table StandAloneSig.
- The data obtained from that row is 0x0100, which is the offset of the local variable information in the Blob stream.
- Reading the Blob stream at this offset gives the local variable information 0x0607040E080E02, where 0x06 is the length of the local variable information, 0x07 is the local variable identifier, and 0x04 is the number of local variables.
- 0x0E indicates that the type of the local variable is the reference type String. From the compressed structure of the .net file, the offset of String in the reference type table TypeRef is 0x04, so the compression program replaces the stored type of this local variable with 0x04.
- 0x08 indicates that the type of the local variable is the reference type Int, whose offset in the compressed structure of the .net file is 0x07; 0x02 indicates that the type of the local variable is the reference type Boolean, whose offset in the compressed structure of the .net file is 0x05. The value 0x0E appears twice, indicating that two local variables of the defined method are of type String.
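The replacement of local variable types by compressed-structure offsets can be sketched in Python as follows. The function name `compress_local_types` is hypothetical; the mapping table contains only the three type codes and offsets quoted in this embodiment, and the sketch assumes that each local variable type is encoded in a single byte.

```python
# Type code -> offset in the compressed structure, as quoted in the text:
# String (0x0E) -> 0x04, Int (0x08) -> 0x07, Boolean (0x02) -> 0x05.
TYPE_OFFSETS = {0x0E: 0x04, 0x08: 0x07, 0x02: 0x05}

def compress_local_types(signature: bytes) -> bytes:
    """Map each local variable type code to its compressed-structure offset."""
    length, marker, count = signature[0], signature[1], signature[2]
    if length != len(signature) - 1:
        raise ValueError("length byte does not match the signature size")
    if marker != 0x07:
        raise ValueError("not a local variable signature")
    type_codes = signature[3:3 + count]
    if len(type_codes) != count:
        raise ValueError("signature is shorter than its variable count")
    return bytes(TYPE_OFFSETS[code] for code in type_codes)

signature = bytes.fromhex("0607040E080E02")
compressed = compress_local_types(signature)
print(compressed.hex())
```

For the signature of this embodiment, the four type codes 0E 08 0E 02 map to the offsets 04 07 04 05.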
- The Blob stream is located as follows: after obtaining the address 0x0000038c of the metadata header in step h of step S302, the compression program reads backward from the metadata header; when the tag "#Blob" is found, the 8 bytes preceding "#Blob" are read, giving the data 0x2006000098010000, where the upper 4 bytes are the offset of the Blob stream relative to the metadata header and the lower 4 bytes are the length of the Blob stream.
- Step 906: Read the ILcode and compress it;
- The method header and ILcode of the defined method obtained in step 904d are as follows. The ILcode follows the local variable token 0x11000002;
- 0x0000002F is the length of the ILcode, so the ILcode is determined to be:
- The ILcode in a .net file consists of one or more operation instructions and operation parameters, in two forms: "operation instruction (opcode) + operation parameter", or an operation instruction alone.
- Operation parameters are mainly of three types: the token of the operated object, a jump offset, or the offset of a local variable within the ILcode of the method.
- The operation parameter following a jump instruction is a jump offset.
- The offset of a local variable within the ILcode of the method is the sequence number of that local variable among the local variables of the defined method. The compression of the ILcode proceeds as follows:
- Step 906a: Obtain an operation instruction in the ILcode;
- Step 906b: Determine whether an operation parameter follows the operation instruction; if not, perform step 906c; if so, perform step 906d;
- Step 906c: Record the operation instruction directly, and perform step 906h;
- Step 906d: Determine the type of the following operation parameter according to the operation instruction: if the operation parameter is a jump offset, perform step 906e; if it is the offset of a local variable in the method to which it belongs, perform step 906f; if it is a token, perform step 906g;
- Step 906e: Recalculate the offset of the operation instruction, replace the original offset with the result, and perform step 906h. In this embodiment, the operation parameter following a jump instruction is a jump offset; because compressing the ILcode changes the positions of the data, the offset must be recalculated and the original offset replaced.
- Step 906f: Record the operation instruction and the operation parameter, and perform step 906h. One special case arises in this step, namely the operation instructions 0x0a, 0x0b, 0x0c and 0x0d.
- The operation instruction 0x0a points to the first local variable of the method to which the instruction belongs,
- the operation instruction 0x0b points to the second local variable,
- the operation instruction 0x0c points to the third local variable,
- and the operation instruction 0x0d points to the fourth local variable. No operation parameter follows these instructions, so they are handled according to step 906c. When a defined method has more than 4 local variables, the original operation instruction and operation parameter are recorded directly. For example, in the ILcode 0x1104 the operation instruction 0x11 points to
- the fifth local variable of the method, and the ILcode is recorded in the same way when it is compressed.
- Here the offset of a local variable means the sequence number of that local variable within the method.
- Step 906g: Replace the token according to the compressed structure of the .net file, and perform step 906h. A token is four bytes: the high byte records the metadata table pointed to, and the low three bytes give the row pointed to in that metadata table.
- In this embodiment the ILcode in big-endian representation is 0x72 0x1D000007, where 0x72 is the operation instruction and 0x1D000007 is the token. The high byte 0x1D is the metadata table number and points to the metadata table FieldRVA; 0x000007 is the row number, that is, row 7. The IL instruction therefore points to the 7th data row of the metadata table FieldRVA of the .net file. The offset of this row is determined according to the compressed structure of the .net file, for example 0x02, so the original ILcode 0x72 0x1D000007 can be replaced by 0x72 0x02;
- Step 906h: Determine whether all operation instructions and operation parameters of the defined method have been read and compressed; if so, perform step 906i; if not, return to step 906a. In the .net file, the length of the ILcode is recorded in the method header;
- for example, it is 0x0000002F in this embodiment.
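The token handling in the token branch of step 906d can be sketched in Python as follows. The names `split_token` and `compress_token_operand` are hypothetical, and the lookup table `row_offsets` stands in for the compressed structure of the .net file; its single entry is the example offset 0x02 quoted above.

```python
def split_token(token: int) -> tuple[int, int]:
    """Split a 4-byte token into (metadata table number, row number)."""
    return token >> 24, token & 0x00FFFFFF

def compress_token_operand(opcode: int, token: int, row_offsets: dict) -> bytes:
    """Replace a 4-byte token operand by its 1-byte compressed offset."""
    table, row = split_token(token)
    return bytes([opcode, row_offsets[(table, row)]])

# Hypothetical lookup: (table number, row number) -> offset in the
# compressed structure. Only the 0x02 example value comes from the text.
row_offsets = {(0x1D, 7): 0x02}

local_sig = split_token(0x11000002)  # local variable token of step 904e
compressed = compress_token_operand(0x72, 0x1D000007, row_offsets)
print(local_sig, compressed.hex())
```

The five-byte instruction 0x72 0x1D000007 shrinks to the two bytes 72 02, and the local variable token 0x11000002 splits into table 0x11 and row 2.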
- Step 906i: Calculate the length of the compressed ILcode of the defined method.
- In this embodiment, the data obtained by compressing the ILcode of the above defined method is:
- Step 908: Determine whether the method headers of all defined methods have been read and the corresponding ILcode compressed; if so, perform step 910; if not, return to step 904 and continue with the next method header. As described in step k of step 902, the .net file records the number of rows of the defined-method table, that is, the number of defined methods; in this embodiment there are four defined methods.
- Step 910: Organize the compressed ILcode, the length of the compressed ILcode, and the local variable offsets according to a predetermined format to obtain the compressed method body. In this embodiment, preferably, the predetermined format is as shown in Table 10:
- Table 10: Length of the compressed ILcode | Offsets of the local variables | Compressed ILcode
- The compression result of the method body is:
- The method includes: Step 1002: Obtain the namespace name of the current type in the .net file;
- In this embodiment the namespace name may be obtained as follows: obtain a metadata table in the .net file that contains a namespace name offset; obtain the namespace name offset of the current type from that metadata table; and read the namespace name in the "#Strings" stream according to the namespace name offset.
- The .net file contains multiple tables; the metadata tables that include a namespace name offset are the defined type or interface table TypeDef and the reference type table TypeRef.
- Step 1004: Compress the namespace name, using the following method: form the namespace name into a namespace string; perform a hash operation on the namespace string
- to obtain a hash value; and take predetermined bytes of the hash value as the compressed namespace name.
- The algorithm used in the hash operation may be MD5, SHA-1 or SHA-2.
- Forming the namespace name into a namespace string includes: using a connector to concatenate the public key token of the .net file with the namespace name to obtain the namespace string.
- Step 1006: Determine the type count corresponding to the namespace name, where the type count is the number of types included in the namespace. Preferably, the type count is determined as follows: when the namespace name is obtained for the first time (that is, it has not been obtained before), the type count corresponding to the namespace name is set to 1; each time the namespace name is obtained again, the type count is incremented by 1, until the metadata table has been traversed. Step 1008: Combine the compressed namespace name and the type count according to a predetermined format to obtain the compression result of the namespace corresponding to the namespace name.
- The predetermined format may be a fixed number of bytes
- divided into two parts: some of the bytes hold the type count, and the remaining bytes hold the compressed namespace name.
- After the step of obtaining the namespace name of the current type in the .net file (step 1002), the method further includes: determining whether the currently obtained namespace name has been obtained before; if not, performing step 1004; if so, incrementing the type count of the namespace by 1.
- The method for compressing the namespace in the .net file provided in this embodiment may further include: determining whether the namespace names to which all types belong have been read; if so, performing the step of
- combining the compressed namespace name and the type count according to the predetermined format (that is, step 1008); otherwise, reading the namespace name to which the next type belongs (that is, returning to step 1002).
- In this embodiment, the obtained namespace name is compressed and the compressed namespace name is combined with the corresponding type count to obtain a compressed namespace, which effectively reduces
- the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), thereby enhancing the function of the small-capacity storage medium.
- Embodiment 12 provides a method for compressing a namespace in a .net file.
- In this embodiment, a file that has not been compressed is called a .net file, and the process by which the compression program compresses
- the namespace includes: Step 1102: Obtain a metadata table in the .net file that includes a namespace name offset. The .net file includes a plurality of metadata tables, among which the tables that include a namespace name offset
- are TypeDef (defined type or interface table) and TypeRef (reference type table). The following takes obtaining the metadata table TypeDef in the .net file as an example to describe how a metadata table is obtained, using the following code: namespace MyCompany.MyOnCardApp
- The above code is compiled into the file helloworld.exe and stored on the hard disk in binary form.
- This binary file is a .net file, whose structure, as shown in Figure 24, includes a Dos header, a PE feature, and so on.
- The process by which the compression program obtains the metadata table is as follows:
- a. The compression program locates the Dos header of the .net file and obtains the Dos header 0x5a4d;
- b. The compression program skips the first agreed number of bytes from the Dos header and reads the offset address of the PE feature; in this embodiment, the first agreed number of bytes is 0x003a bytes;
- c. The compression program locates the PE feature according to its offset address 0x00000080 and obtains the PE feature 0x4550;
- d. Starting from the PE feature, the compression program skips the second agreed number of bytes and reads four bytes. This embodiment takes a 32-bit machine as an example: after offsetting 0x0074 bytes from the PE feature, the 4 bytes read are 0x00000010. This value indicates that the binary file has 0x10 directories and contains .net data. The address of the metadata header of the .net file is written in directory 0x0F. On a 64-bit machine, the second agreed number of bytes is 0x0084 bytes;
- e. Starting from the above data 0x00000010, the compression program skips the third agreed number of bytes and reads eight bytes of data. In this embodiment, the third agreed number of bytes is 112 bytes. Of the eight bytes, the first four are 0x00002008, the relative virtual address of the .net data header, and the last four are 0x00000048, the length of the .net data header;
- f. The compression program obtains the linear address 0x00000208 from the relative virtual address of the .net data header and reads the .net data header to get the following data:
- The above data is stored in little-endian order. For example, the first four bytes of the data are 0x48000000, which reads 0x00000048 in big-endian order.
- The linear address is the address of the .net data within the .net file, and the relative virtual address is the memory offset relative to the PE load point.
- The relative virtual address of the section that contains the .net data directory in the .net file is 0x00002000, and
- the file offset of that section is 0x00000200, so the linear address is 0x00002008 - 0x00002000 + 0x00000200 = 0x00000208;
- g. The compression program skips the fourth agreed number of bytes from the .net data header and reads 8 bytes of data. In this embodiment, the fourth agreed number of bytes is an offset of 8 bytes backward from the .net data header. Of the 8 bytes read, the first four are 0x0000218c, the relative virtual address of the metadata header (MetaData Header), and the last four are 0x000009a0, the length of the metadata;
- h. The compression program obtains the linear address from the relative virtual address 0x0000218c of the metadata header, and obtains the metadata content according to the linear address and the metadata length;
- i. The compression program reads backward from the metadata header. When the flag "#~" is read, it reads the eight bytes preceding the flag, of which the first four bytes are the address of the "#~" stream. The "#~" stream is located by this address, and 8 bytes of data are read starting from the fifth agreed byte of the stream, namely 0x0000000920021c57, whose binary form is 00000000 00000000 00000000 00001001 00100000 00000010 00011100 01010111. In this embodiment, the fifth agreed byte is the ninth byte counted from the start of the "#~" stream;
- j. The compression program reads the binary data obtained in step i starting from the low-order bit; each bit indicates whether the corresponding table exists in the .net file. The first bit indicates whether the metadata table Module exists: 1 means that it exists, and 0 means that it does not. In this embodiment the first bit is 1, so the metadata table Module exists; the second bit is 1, so the metadata table TypeRef exists; and the third bit is 1, so the metadata table TypeDef exists;
- k. The compression program skips the sixth agreed number of bytes after the data 0x0000000920021c57 and reads the number of data rows of the metadata table TypeDef. In this embodiment, offsetting 16 bytes backward and reading 4 bytes gives the data 0x00000006, so the metadata table TypeDef has 6 data rows. In the metadata, starting 8 bytes after the data 0x0000000920021c57, the row counts of the metadata tables present in the .net file are stored in sequence, 4 bytes each; after the row counts, the contents of each metadata table are stored in sequence, forming the metadata table area;
- l. The compression program reads the content of the metadata table TypeDef according to an agreed rule, which in this embodiment is as follows:
- the compression program sequentially reads the row counts stored after the data 0x0000000920021c57, namely 0x00000001 and 0x0000001d, which are the row counts of the metadata tables Module and TypeRef that precede the metadata table TypeDef; from these the position of the metadata table TypeDef can be obtained.
- Each data row of the metadata table Module is 10 bytes, and each data row of the metadata table TypeRef is 6 bytes.
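Steps j to l can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes that the 4-byte row counts are stored directly after the 8-byte presence data, in ascending bit order of the tables that are present. Under that assumption the row count of a table sits 8 bytes plus 4 bytes per preceding present table after the start of the presence data.

```python
VALID_MASK = 0x0000000920021C57  # the 8 bytes read in step i

# Bit positions named in the text, counted from the low-order bit (0-based).
MODULE, TYPE_REF, TYPE_DEF, METHOD_DEF = 0, 1, 2, 6

def table_present(mask: int, index: int) -> bool:
    """Step j: a set bit means that the corresponding table exists."""
    return (mask >> index) & 1 == 1

def row_count_offset(mask: int, index: int) -> int:
    """Step k: offset of a table's row count from the start of the mask."""
    if not table_present(mask, index):
        raise ValueError("table is not present")
    preceding = bin(mask & ((1 << index) - 1)).count("1")
    return 8 + 4 * preceding

def table_start(preceding_tables: list) -> int:
    """Step l: offset of a table inside the metadata table area."""
    return sum(rows * row_size for rows, row_size in preceding_tables)

typedef_count_offset = row_count_offset(VALID_MASK, TYPE_DEF)
methoddef_count_offset = row_count_offset(VALID_MASK, METHOD_DEF)
# Module: 0x01 row of 10 bytes; TypeRef: 0x1D rows of 6 bytes.
typedef_start = table_start([(0x00000001, 10), (0x0000001D, 6)])
print(typedef_count_offset, methoddef_count_offset, typedef_start)
```

The computed offsets, 16 bytes for TypeDef and 24 bytes for MethodDef, agree with the sixth agreed number of bytes quoted for the two tables in this document.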
- Each row of the metadata table TypeDef records the name and attributes of one type. Reading each row from the high-order position,
- the first 4 bytes are Flags (the defined-type identifier), bytes 5 and 6 are the offset of the defined type's name relative to the "#Strings" stream, and bytes 7 and 8 are the offset, relative to
- the "#Strings" stream, of the name of the namespace to which the defined type belongs; see Table 11 for details: Table 11
- The metadata tables TypeDef and TypeRef are type tables. A type table includes namespace information, and the compression program needs to obtain the namespace information from the metadata tables TypeDef and TypeRef.
- The type count of a namespace is the number of types existing in the namespace. The type counts of all namespaces are set to 0 before the compression program starts the compression operation.
- The type count of each namespace is independent, that is, each namespace has its own type count.
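The TypeDef row layout described in step 1102 can be sketched in Python as follows. The function name `parse_typedef_row` is hypothetical. The 14-byte row size and the namespace offset 0x0023 are taken from elsewhere in this document; the Flags value, the name offset and the zeroed trailing bytes of the sample row are invented for illustration, and little-endian storage is assumed.

```python
import struct

TYPEDEF_ROW_SIZE = 14  # bytes per TypeDef row

def parse_typedef_row(row: bytes) -> dict:
    """Split one TypeDef row into the fields named in the text."""
    if len(row) != TYPEDEF_ROW_SIZE:
        raise ValueError("a TypeDef row is 14 bytes long")
    flags, name_offset, namespace_offset = struct.unpack_from("<IHH", row, 0)
    return {
        "flags": flags,                        # bytes 1-4
        "name_offset": name_offset,            # bytes 5-6, into "#Strings"
        "namespace_offset": namespace_offset,  # bytes 7-8, into "#Strings"
        "rest": row[8:],                       # remaining 6 bytes, not decoded
    }

# Hypothetical row: only the namespace offset 0x0023 comes from the text.
sample_row = struct.pack("<IHH6s", 0x00100001, 0x0019, 0x0023, bytes(6))
fields = parse_typedef_row(sample_row)
print(hex(fields["namespace_offset"]))
```

The `namespace_offset` field is the value that step 1104 uses to look up the namespace name in the "#Strings" stream.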
- Step 1104: The compression program obtains the namespace name of a type in the metadata table.
- Reading the metadata table TypeDef is taken as an example.
- The compression program reads the second type in the metadata table TypeDef and,
- according to the structure of the metadata table described above, reads the offset of the namespace to which the type belongs, namely 0x0023. This offset is relative to the "#Strings" stream in the metadata.
- The compression program obtains the namespace name from the offset as follows:
- The compression program locates the "#Strings" stream in the metadata and reads the namespace information at the namespace offset 0x0023, reading from offset 0x0023 until the first 0x00 is encountered. The namespace name read is:
- The ASCII text corresponding to the above namespace data is MyCompany.MyOnCardApp. The compression program reads the namespace name of one type at a time. It locates the "#Strings" stream in the metadata as follows: after obtaining the address 0x0000038c of the metadata header in step h of step 1102, it reads backward from the metadata header; when the tag "#Strings" is found, it reads the 8 bytes preceding "#Strings" and obtains the data 0x5C0300003C040000, where the upper 4 bytes are the offset of the "#Strings" stream relative to the metadata header and the lower 4 bytes are the length of the "#Strings" stream. Converted to big-endian order, the upper 4 bytes give the offset 0x0000035C.
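Step 1104 can be sketched in Python as follows. The helper names are hypothetical, and the heap used here is a synthetic stand-in for the real "#Strings" stream: it is zero-filled up to offset 0x0023 and carries the namespace name of this embodiment followed by an invented second entry.

```python
import struct

def decode_stream_header(raw: bytes) -> tuple[int, int]:
    """Split the 8 bytes before a stream tag into (offset, length)."""
    return struct.unpack("<II", raw)

def read_heap_string(strings_heap: bytes, offset: int) -> str:
    """Read from the offset up to, but not including, the first 0x00."""
    end = strings_heap.index(b"\x00", offset)
    return strings_heap[offset:end].decode("ascii")

stream_offset, stream_length = decode_stream_header(
    bytes.fromhex("5C0300003C040000"))

# Synthetic "#Strings" content; only the name at 0x0023 comes from the text.
heap = bytes(0x23) + b"MyCompany.MyOnCardApp\x00" + b"Other\x00"
namespace_name = read_heap_string(heap, 0x0023)
print(hex(stream_offset), hex(stream_length), namespace_name)
```

The 8 header bytes decode to the stream offset 0x35C and the stream length 0x43C, and the string read at offset 0x0023 is MyCompany.MyOnCardApp.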
- Step 1106: Determine whether the namespace name read duplicates the namespace of a type that has already been read. If so, perform step 1108; if not, perform step 1110. What the compression program read in step 1104 is the namespace name of the second type in the metadata table; in this embodiment no namespace name was obtained for the first type in the metadata table TypeDef, so the name is not a duplicate and step 1110 is performed.
- Step 1110: Form the namespace name into a namespace string according to the agreed format. To distinguish the file to which the namespace belongs and to reduce the collision rate of the data, the namespace name obtained in step 1104 is formed into a namespace string according to the agreed format.
- In this embodiment, preferably, the agreed namespace string format is as shown in Table 12:
- Table 12: PublicKeyToken | connector (.) | Namespace
- Here PublicKeyToken is the public key token. When the .net compiler strongly signs the HelloWorld program, it generates a HelloWorld.snk file that contains the public and private keys. After the HelloWorld code is compiled into a PE file, a hash value is calculated for the PE file; the compiler signs the hash value with the private key and embeds the public key in the PE file. The embedded public key is the PublicKey, and the PublicKeyToken is obtained by hashing the PublicKey and taking the last eight bytes. In this embodiment, the PublicKeyToken 38 05 F8 26 9D 52 A5 B2 is taken as an example.
- The connector is represented by ".", and
- the resulting namespace string is 3805F8269D52A5B2.MyCompany.MyOnCardApp.
- The connector can also be "-", "_", a space, and so on, and is not limited to ".";
- In addition, the type count of the namespace name
- is set to 1 to indicate that there is at least one type in the namespace;
- Step 1112: Hash the namespace string and take the agreed number of bits of the result as the compressed namespace name. The hash operation may use an algorithm such as MD5, SHA-1 or SHA-2.
- In this embodiment the MD5 algorithm is preferably used: hashing the namespace string "3805F8269D52A5B2.MyCompany.MyOnCardApp"
- gives a 128-bit result. Preferably, the first three bytes of this result are taken as the compression result; to keep the data byte-aligned, "00" may be appended, giving the compressed namespace name "ACE6EB00". This compressed namespace name is arranged in little-endian order; in big-endian order it is 00EBE6AC. Little-endian order is generally used for storage in the computer.
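Steps 1110 and 1112 can be sketched with Python's standard `hashlib` module as follows. The function names are hypothetical. The sketch assumes that the namespace string is hashed as ASCII text with the PublicKeyToken written in upper-case hexadecimal; the byte encoding actually fed to the hash is not stated in this embodiment, so the digest printed here need not reproduce the value quoted above.

```python
import hashlib

def build_namespace_string(public_key_token: bytes, namespace: str,
                           connector: str = ".") -> str:
    """Step 1110: PublicKeyToken + connector + namespace name."""
    return public_key_token.hex().upper() + connector + namespace

def compress_namespace_name(namespace_string: str) -> bytes:
    """Step 1112: first three MD5 bytes plus one 0x00 alignment byte."""
    digest = hashlib.md5(namespace_string.encode("ascii")).digest()
    return digest[:3] + b"\x00"

token = bytes.fromhex("3805F8269D52A5B2")
ns_string = build_namespace_string(token, "MyCompany.MyOnCardApp")
compressed_name = compress_namespace_name(ns_string)
print(ns_string, compressed_name.hex().upper())
```

Swapping `hashlib.md5` for `hashlib.sha1` or `hashlib.sha256` would give the SHA-1 and SHA-2 variants mentioned in the text.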
- Step 1114: Determine whether the namespace names to which all types belong have been read. If so, perform step 1116; if not, read the namespace name to which the next type belongs, that is, return to step 1104. Whether the namespace names of all types have been read is determined by checking whether all rows in the metadata table have been read; if all rows have been read, the namespace names of all types have been read.
- Step 1116: The compression program organizes the compressed namespace name and the type count of the namespace according to the agreed format, and obtains the compression result of the namespace.
- The agreed format, that is,
- the structure of the namespace after compression by the compression program, is as shown in Table 13:
- Table 13: Type count | Namespace name. The type count in Table 13 is the number of types contained in the namespace. After step 1114 has read and compressed the namespace names of all types included in the metadata table, the compression results of all namespace names and the type counts of all namespaces are available.
- Taking the compression result obtained in step 1112 as an example, the namespace MyCompany.MyOnCardApp contains 3 types, and the compressed namespace is:
- The above compressed namespace is given in little-endian representation.
- The compression structure of the namespace in the above result is only a preferred structure and can be transformed accordingly; for example, the type count can be placed after the compressed namespace name, or the value of the type count can be transformed by an equivalent encoding. Such variants are not described in detail here.
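The counting loop of steps 1104 to 1116 can be sketched in Python as follows. The function name `compress_namespaces` is hypothetical. The sketch assumes a one-byte type count placed before the 4-byte compressed name, which is one possible reading of Table 13 since the field widths are not stated here, and it reuses MD5 over the ASCII namespace string as an assumed hash input.

```python
import hashlib

def compress_namespaces(type_namespaces: list, token_hex: str) -> list:
    """Count the types per namespace, then emit count + compressed name."""
    counts: dict = {}
    for name in type_namespaces:
        # First occurrence sets the count to 1; repeats increment it.
        counts[name] = counts.get(name, 0) + 1
    entries = []
    for name, count in counts.items():
        digest = hashlib.md5(f"{token_hex}.{name}".encode("ascii")).digest()
        # Assumed layout: 1-byte type count, 3 hash bytes, 1 padding byte.
        entries.append(bytes([count]) + digest[:3] + b"\x00")
    return entries

# The namespace of this embodiment contains 3 types.
namespaces_of_types = ["MyCompany.MyOnCardApp"] * 3
entries = compress_namespaces(namespaces_of_types, "3805F8269D52A5B2")
print([entry.hex().upper() for entry in entries])
```

Placing the count after the name, as the text allows, would only change the order of concatenation in the last line of the loop.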
- The namespace compression process in this embodiment is described only for reading the namespace of one type in the metadata table.
- A .net file may contain one or more namespaces, and each namespace may include multiple types. In practice, the namespace of every type in the metadata table should be read one by one, the namespace names taken out, each distinct namespace name compressed,
- and the types of each namespace counted, to obtain the compressed namespaces as described above.
- The method provided in this embodiment may be used when reading the namespace corresponding to each type in the metadata tables of the .net file.
- The following gives the result of compressing the namespaces of a .net file:
- With the namespace compression method of this embodiment, the namespace name is obtained and compressed according to the agreed format,
- so the namespace is compressed well and the space required to store the .net file is saved.
- The compression method makes it possible to run the .net file on a smart card and enhances the performance of the smart card.
- this embodiment provides a compression device of a type defined in a .NET file
- the device includes: a definition type information obtaining module 1202, configured to acquire information included in a definition type used in a .NET file; And obtaining the specified information and the counting information of the definition type according to the information included in the definition type; the compression module 1204 is configured to compress the specified information acquired by the definition type information acquiring module 1202; and the compression result storage module 1206 is configured to: compress the module 1204
- the compressed designation information and the count information acquired by the definition type information acquisition module 1202 are stored as a compression result of the definition type.
- the specific implementation of the definition type information obtaining module 1202 may be: first reading the metadata table where the definition type in the .NET file is located, that is, the metadata table TypeDef; and obtaining the .net file from the metadata table TypeDef.
- the information included in the definition type in this embodiment includes: an identifier of the definition type, an offset of the name of the definition type, and an offset of the method in the definition type; when a field is used in the definition type, The information contained in the definition type may also include an offset of the field in the definition type; the information may be obtained by reading the data in the metadata table TypeDef.
- Each row of data in the metadata table TypeDef represents one definition type.
- Each row of data has a total of 14 bytes.
- Of these 14 bytes, the first 4 bytes are Flags (the definition type identifier), and bytes 5 and 6 hold the offset of the definition type name.
- The specified information includes: the identifier of the definition type, the name of the definition type, and the information corresponding to the fields of the definition type.
- The name of the definition type can be found in the data stream of the .net file through the name offset, and the information corresponding to a field of the definition type can likewise be found in the Field metadata table through the field offset described above. The count information includes: the method overload information of the definition type, the method count and the field count of the definition type, and the like.
- When the identifier of the definition type is compressed, the identifier may first be divided into a type identifier, an access identifier, and a descriptive identifier; the three are then ORed together, and the resulting data is used as the compression result of the identifier. When the name of the definition type is compressed, a hash operation may be performed on the name, and the agreed bytes are extracted from the operation result as the compression result of the name. When the information corresponding to a field in the definition type (the name of the field, the identifier of the field, and the type of the field) is compressed, the processing is divided into the following three cases according to the content of that information:
- The compressed definition type includes: the compression result of the definition type name, the compression result of the type identifier, the field count of the type, the method count, the method overload information, the information corresponding to the fields of the type, and the like.
- the information may be arranged in a predetermined format or may be arranged arbitrarily.
- This embodiment provides a method for compressing the definition types in a .NET file. The method is described using the compression device of Embodiment 13, and includes:
- Step 1302: The definition type information obtaining module 1202 obtains the information contained in a definition type used in the .NET file.
- Step 1304: The definition type information obtaining module 1202 obtains the specified information and the count information of the definition type from the information contained in the definition type.
- Step 1306: The compression module 1204 compresses the specified information.
- Step 1308: The compression result storage module 1206 stores the compressed specified information and the count information as the compression result of the definition type.
- The definition type information obtaining module 1202 obtains this information as follows: it first reads the metadata table of definition types in the .net file, that is, the metadata table TypeDef, and then obtains from the metadata table TypeDef the information contained in the definition type used in the .net file.
- the information included in the definition type in this embodiment includes: an identifier of the definition type, an offset of the name of the definition type, and an offset of the method in the definition type; when a field is used in the definition type, The information contained in the definition type may also include an offset of the field in the definition type; the information may be obtained by reading the data in the metadata table TypeDef.
- each row of data represents a definition type. Each row of data has a total of 14 bytes.
- The 14 bytes hold the following information: bytes 1 to 4 are Flags (the definition type identifier); bytes 5 and 6 are the offset of the definition type name in the "#Strings" stream; bytes 7 and 8 are the offset, in the "#Strings" stream of the .net file, of the name of the namespace to which the definition type belongs; bytes 9 and 10 identify the parent class inherited by the definition type; bytes 11 and 12 are the offset of the fields contained in the definition type; and bytes 13 and 14 are the offset of the methods contained in the definition type.
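The 14-byte row layout described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the names `TypeDefRow` and `parse_typedef_row` are hypothetical, little-endian storage is assumed (as stated elsewhere in this embodiment), and the sample row is assembled from values quoted for ClassC in this embodiment, with a made-up namespace offset of 0.

```python
import struct
from typing import NamedTuple


class TypeDefRow(NamedTuple):
    """One 14-byte TypeDef row (hypothetical helper type)."""
    flags: int             # bytes 1-4: definition type identifier
    name_offset: int       # bytes 5-6: offset of the type name in "#Strings"
    namespace_offset: int  # bytes 7-8: offset of the namespace name in "#Strings"
    extends: int           # bytes 9-10: parent class information
    field_list: int        # bytes 11-12: first field of the type
    method_list: int       # bytes 13-14: first method of the type


def parse_typedef_row(row: bytes) -> TypeDefRow:
    """Split a 14-byte little-endian TypeDef row into its six columns."""
    if len(row) != 14:
        raise ValueError("a TypeDef row in this layout has exactly 14 bytes")
    return TypeDefRow(*struct.unpack("<IHHHHH", row))


# Illustrative row only: the namespace offset (0x0000) is an assumption.
sample = bytes.fromhex("02001000" "5600" "0000" "1400" "0700" "0800")
row = parse_typedef_row(sample)
print(hex(row.flags), hex(row.extends), row.field_list, row.method_list)
```

A fixed `struct` format keeps the column widths in one place, which matches the fixed row length that the text relies on.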
- The identifier of the definition type is divided into a type identifier, an access identifier, and a descriptive identifier.
- Accordingly, the step in which the compression module 1204 compresses the specified information in step 1306 includes: ORing the type identifier, the access identifier, and the descriptive identifier, and using the resulting data as the compression result of the identifier of the definition type; and hashing the name of the definition type and extracting the agreed bytes from the operation result as the compression result of the name of the definition type.
- When the information contained in the definition type includes an offset of the fields in the definition type, the specified information correspondingly also includes the information corresponding to the fields in the definition type, and the count information also includes the field count of the definition type.
- The information corresponding to a field in this embodiment includes: the name of the field, the identifier of the field, and the type of the field; the identifier of the field is divided into an access identifier and a descriptive identifier. Correspondingly, the compression module 1204 compresses the specified information in step 1306 by: hashing the name of the field and extracting the agreed bytes from the operation result as the compression result of the field name; ORing the access identifier and the descriptive identifier of the field and using the result as the compression result of the field identifier; and using the offset of the field's type among the compressed types as the compression result of the field type.
- When the information contained in the definition type includes an offset of the parent class inherited by the definition type, the method further includes: determining whether the inherited parent class has already been compressed; if so, obtaining the offset of the compressed parent class; otherwise, compressing the inherited parent class and assigning an offset to it. Correspondingly, the compression result of the definition type further includes the offset of the compressed parent class.
- The method further includes: determining whether the definition type inherits any interface; if so, obtaining the offsets of the inherited interfaces and the number of inherited interfaces. Correspondingly, the compression result of the definition type also includes the offsets of the inherited interfaces and the number of inherited interfaces.
- The method further includes: determining whether the definition type is a nested type; if so, obtaining the compressed offset of the type in which the definition type is nested. Correspondingly, the compression result of the definition type further includes that offset.
- The compressed data of each definition type is organized in the following format: the hash value of the definition type name (the compression result of the name), the compression result of the type identifier, the count of interfaces inherited by the type, the parent class of the type, and so on.
- The offsets of the interfaces inherited by the type and the information corresponding to the fields in the type may each occur more than once. When there are several, the interface offsets and the field information of the current definition type are arranged in sequence.
- The compressed offset of the type in which the definition type is nested, the compressed offsets of the inherited interfaces, and the information corresponding to the fields in the type may or may not be present.
- Embodiment 15: This embodiment provides a method for compressing the definition types in a .NET file, described using a specific application example. This application example also involves the compression and storage of the reference types in the .NET file; the compression result of the reference types is treated as known content in this example and can be used directly.
- This embodiment takes the file compiled from the following code as an example to illustrate the method for compressing the definition types in a .net file. Part of the code is as follows: namespace MyCompany.MyOnCardApp public class MyServer : MarshalByRefObject
- ClassC ClassB, IA, IB
- strField strl:
- The above code is compiled on the .net platform to obtain the file helloworld.exe, which is stored on the hard disk in binary form.
- This binary file is a .net file.
- The .net file can run in a Windows environment and conforms to the PE (Portable Executable) file format, which is the format of Windows executable files.
- The .exe and .dll files in Windows are all in PE format. See Figure 27, which is a schematic diagram of the structure of a .net file.
- The file includes a DOS header, a PE feature, and metadata.
- the metadata includes a metadata header (MetaData Header), a metadata table (MetaData Tables), and the like. Referring to FIG.
- The method for compressing the definition types in the .NET file includes: Step 1401: Locate the starting address of the metadata tables (MetaData Tables) in the .net file, and obtain the existing-table bit vector.
- The metadata tables are part of the .net file, and are located as follows:
- The first agreed number of bytes is skipped, and the offset address of the PE feature is read, giving 0x00000080. In this embodiment the first agreed number of bytes is 0x003c.
- The PE feature is located at that offset address; the PE feature is 0x00004550.
- Starting from the PE feature, the second agreed number of bytes is skipped and four bytes are read. This embodiment takes a 32-bit machine as an example, for which the second agreed number of bytes is 0x0074 counted from the PE feature; the data read is 0x00000010. This value indicates that the binary file has 0x10 directories and contains .net data; the address of the metadata header of the .net file is recorded in directory 0x0F. On a 64-bit machine the second agreed number of bytes is 0x0084.
- The third agreed number of bytes, 0x0070, is then skipped and eight bytes of data are read. Of these eight bytes, the first four are 0x00002008, the relative virtual address of the .net data header in the .net file, and the last four are 0x00000048, the length of the .net data header.
- The relative virtual address 0x00002008 of the .net data header is converted to a linear address, and the .net data header is read.
- The data is stored in little-endian order: the first 4 bytes of the header, 0x48000000, represent 0x00000048 in big-endian notation and give the length of the data, so 72 bytes of data are read according to the header length 0x00000048.
- The linear address is the address of the .net data within the .net file; the relative virtual address is the memory offset relative to the PE load point.
- The two are related by: linear address = relative virtual address - section relative virtual address + section file offset. In this embodiment, the relative virtual address of the section holding the .net data directory is 0x00002000 and the file offset of that section is 0x00000200, from which the linear address follows.
- From the .net data header, the fourth agreed number of bytes, 8 bytes, is skipped, and the data that follows is read.
- The existing-table bit vector is read at the fifth agreed byte, which is the ninth byte counted from the start of the "#~" stream. The bit vector is read from the low bit upward, and each bit represents one metadata table: if the table exists, the corresponding bit is 1, otherwise 0. For example, counting from the low bit, the first bit indicates whether the metadata table Module exists: 1 means that it exists, and 0 means that it does not.
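The address conversion used in Step 1401 can be illustrated with a short Python sketch. The function names `rva_to_file_offset` and `read_u32_le` are hypothetical, and the numbers are the ones quoted in this embodiment; the sketch only restates the formula and the little-endian reading, it does not parse a real PE file.

```python
import struct


def rva_to_file_offset(rva: int, section_rva: int, section_file_offset: int) -> int:
    """linear address = RVA - section RVA + section file offset."""
    return rva - section_rva + section_file_offset


def read_u32_le(data: bytes, pos: int = 0) -> int:
    """Read a 4-byte little-endian unsigned integer at position pos."""
    return struct.unpack_from("<I", data, pos)[0]


# Values quoted in this embodiment.
header_offset = rva_to_file_offset(0x00002008, 0x00002000, 0x00000200)
header_length = read_u32_le(bytes.fromhex("48000000"))

print(hex(header_offset))  # 0x208
print(header_length)       # 72
```

Reading every multi-byte value through one little-endian helper avoids the manual byte swapping that the text describes as a conversion to big-endian notation.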
- Step 1402: Locate the metadata table TypeDef (definition type). The bits of the existing-table bit vector read in step 1401 record, from the low bit to the high bit, whether the corresponding metadata table exists in the .net file; the 3rd bit indicates whether the metadata table TypeDef exists. If the 3rd bit is 1, the metadata table TypeDef exists; if it is 0, it does not. In this embodiment, the metadata table TypeDef exists.
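The bit-vector test can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, bits are numbered from 0 in the code (so the "3rd bit" of the text is bit 2), and the bit vector value is taken from later in this embodiment, read here as 0x0000020920021F57 on the assumption that the printed value contains an OCR-damaged digit.

```python
# Bit positions (0-based) of the tables named in this embodiment.
TABLE_BIT = {"Module": 0, "TypeRef": 1, "TypeDef": 2, "Field": 4, "Method": 6}


def table_present(valid: int, bit: int) -> bool:
    """True if the metadata table at this bit position exists."""
    return (valid >> bit) & 1 == 1


def tables_before(valid: int, bit: int) -> int:
    """Number of existing tables that precede the table at this bit position."""
    lower_bits = valid & ((1 << bit) - 1)
    return bin(lower_bits).count("1")


VALID = 0x0000020920021F57  # assumed reading of the value quoted in the text

print(table_present(VALID, TABLE_BIT["TypeDef"]))  # True
print(tables_before(VALID, TABLE_BIT["TypeDef"]))  # 2
print(tables_before(VALID, TABLE_BIT["Method"]))   # 4
```

The two counts agree with the text: Module and TypeRef precede TypeDef, and four tables precede Method.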
- The contents of the metadata table TypeDef are read as follows: from the existing-table bit vector read in step 1401 it is known that, in this embodiment, the metadata tables Module and TypeRef precede the metadata table TypeDef; the metadata table Module has 1 data row of 10 bytes, and the metadata table TypeRef has 31 data rows.
- Each data row of the metadata table TypeDef has 14 bytes.
- the specific data is as follows:
- Each data row represents one definition type: the first 4 bytes of the row are Flags (the definition type identifier); bytes 5 and 6 are the relative offset of the definition type name in the "#Strings" stream of the .net file; bytes 7 and 8 are the relative offset, in the "#Strings" stream, of the name of the namespace to which the definition type belongs; bytes 9 and 10 hold the information on the parent class inherited by the definition type; bytes 11 and 12 are the data row number, in the metadata table Field, of the first field contained in the definition type; and bytes 13 and 14 are the data row number, in the metadata table Method, of the first method contained in the definition type.
- Because the length of a data row in each metadata table is fixed, the offset address of any other metadata table in the .net file can be calculated in the same way, from the existing-table bit vector and the row counts of the metadata tables.
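Because row lengths are fixed, a table's position is just the sum of the sizes of the tables before it. The sketch below uses the sizes quoted in this embodiment (Module 10 bytes, TypeRef 186 bytes for 31 rows, TypeDef 140 bytes, Field 54 bytes, Method 126 bytes for 9 rows); the per-row sizes and the TypeDef and Field row counts are derived from those totals and are therefore assumptions, as are the names `ROW_SIZE`, `ROW_COUNT` and `table_offset`.

```python
# Tables present in this embodiment, in order, up to Method.
ORDER = ["Module", "TypeRef", "TypeDef", "Field", "Method"]
ROW_SIZE = {"Module": 10, "TypeRef": 6, "TypeDef": 14, "Field": 6, "Method": 14}
ROW_COUNT = {"Module": 1, "TypeRef": 31, "TypeDef": 10, "Field": 9, "Method": 9}


def table_offset(name: str) -> int:
    """Byte offset of a table, relative to the start of the table data."""
    offset = 0
    for table in ORDER:
        if table == name:
            return offset
        offset += ROW_SIZE[table] * ROW_COUNT[table]
    raise KeyError(name)


def row_offset(name: str, row_number: int) -> int:
    """Byte offset of a 1-based data row inside the table data."""
    return table_offset(name) + (row_number - 1) * ROW_SIZE[name]


print(table_offset("TypeDef"))  # 196
print(table_offset("Method"))   # 390
```

The offsets are relative to the first byte of the first table, not to the start of the file.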
- Step 1403: Read the identifier and the name of the definition type from the data in the metadata table TypeDef, and compress each of them. First, the identifier of the definition type is read and compressed: the four-byte identifier of the definition type is read from the metadata table TypeDef, and the identifier attributes of the definition type are determined from it. The identifier is divided into three parts, the type identifier, the access identifier, and the descriptive identifier, and the value of each identifier attribute is redefined.
- The type identifier includes: predefined type 0x00, value type 0x01, enumeration type 0x02, array type 0x03, class type 0x04, interface type 0x05, unmanaged pointer type 0x06, etc. The access identifier includes: non-public access type NotPublic (0x00) and public access type Public (0x10); for a nested type, the access identifier is one of: public nested type NestedPublic (0x20), private nested type NestedPrivate (0x30), family nested type NestedFamily (0x40), assembly nested type NestedAssembly (0x50), assembly-and-family nested type NestedFamANDAssem (0x60), or assembly-or-family nested type NestedFamORAssem (0x70). The descriptive identifier describes certain properties of the fields in the current type: it is 0x08 if the type has a non-serializable field, and 0x00 otherwise.
- In this embodiment the identifier of the definition type is compressed by ORing the type identifier, the access identifier, and the descriptive identifier; the resulting 1 byte of data is the compression result of the identifier of the definition type.
- For example, for the definition type ClassC in the 8th data row of the metadata table TypeDef read in step 1402, the first 4 bytes are 0x00100002; accordingly its type identifier is 0x04, its access identifier is NestedPublic (0x20), and its descriptive identifier is 0x00, and the OR operation is applied to these values: 0x04 | 0x20 | 0x00.
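The OR-based identifier compression can be written out directly. A minimal sketch, using the redefined constant values listed in this step; the constant and function names are hypothetical.

```python
# Redefined identifier values listed in this step (names are illustrative).
TYPE_CLASS = 0x04
ACCESS_NESTED_PUBLIC = 0x20
DESC_NONE = 0x00
DESC_NON_SERIALIZABLE_FIELD = 0x08


def compress_type_identifier(type_id: int, access_id: int, desc_id: int) -> int:
    """OR the three parts together into a single byte."""
    return (type_id | access_id | desc_id) & 0xFF


# ClassC: class type, NestedPublic access, no descriptive flag.
classc_identifier = compress_type_identifier(TYPE_CLASS, ACCESS_NESTED_PUBLIC, DESC_NONE)
print(hex(classc_identifier))  # 0x24
```

The three parts occupy disjoint bits (low three bits, bit 3, and the high nibble), so the OR loses no information.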
- Next, the name of the definition type is read and compressed; the name is read according to the relative offset of the definition type name in the "#Strings" stream. For example, in the second data row of the metadata table TypeDef, the value of bytes 5 and 6 is 0x0019; after the "#Strings" stream of the .net file is located, 0x0019 bytes are skipped and data is read until 0x00 is encountered, giving 0x4D79536572766572. This data is the name, MyServer, of the current definition type in the metadata table TypeDef. The name read is then hashed, and the sixth agreed bytes are taken from the hash result as the compression result of the definition type name.
- The sixth agreed bytes are the first two bytes of the hash value.
- The hash algorithm used can be MD5, SHA-1, SHA-2, and so on.
- In this embodiment the MD5 algorithm is used to hash MyServer, giving 0x0CBEFBC1EF0639BA18485104440F399C; the sixth agreed bytes, 0x0CBE, are taken as the compression result of the name of the current definition type MyServer in the metadata table TypeDef.
- For a nested type, the name of the type in which it is nested is combined with its own name before hashing. The two names may be joined with a connector, concatenated directly, or combined by a mathematical operation such as addition, subtraction, or XOR.
- Taking the code given in this embodiment as an example:
- The type ClassC is nested in the type ClassA.
- The type name of ClassA is first joined to the type name of ClassC, using the connector "+" in this embodiment; ClassA+ClassC is then hashed, and the first two bytes of the hash value are taken as the compression result of the type name of ClassC, so as to avoid duplicate compression results for definition type names.
- the name of the ClassA is 0x0042
- the name of the ClassC is 0x0056
- The MD5 algorithm is used to hash ClassA+ClassC.
- The result is 0x9B8DE23910B330AD80BDB76E7AC19092, and the first two bytes, 0x9B8D, are taken as the compression result of the name of the current definition type ClassC in the metadata table TypeDef.
- The "#Strings" stream is located as follows: after the address 0x0000040c of the metadata header is obtained in step 1401, data is read onward from the metadata header; when the tag "#Strings" (0x23537472696E6773) is found, the 8 bytes preceding "#Strings" are read, giving the data 0x0004000080040000.
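The name compression described in this step, including the nested-type case, can be sketched with the standard `hashlib` module. The function name `compress_type_name` is hypothetical, MD5 is used because it is the algorithm chosen in this embodiment, and ASCII encoding of the name is an assumption. No expected digest is printed here, since the values quoted in the text were not recomputed.

```python
import hashlib
from typing import Optional


def compress_type_name(name: str, enclosing: Optional[str] = None,
                       connector: str = "+", keep: int = 2) -> bytes:
    """Hash a type name and keep the first `keep` bytes of the digest.

    For a nested type, the enclosing type name is joined to the nested
    name with the connector before hashing.
    """
    full_name = name if enclosing is None else f"{enclosing}{connector}{name}"
    return hashlib.md5(full_name.encode("ascii")).digest()[:keep]


plain = compress_type_name("MyServer")
nested = compress_type_name("ClassC", enclosing="ClassA")
print(plain.hex(), nested.hex())
```

Truncating to two bytes trades collision resistance for size; joining the enclosing name lowers the chance that two nested types with the same short name collide.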
- Step 1404 Obtain the method overload information in the current definition type, and obtain the method count and the field count of the method included in the current definition type.
- The method count is obtained in this embodiment as follows: read the data in bytes 13 and 14 of the data row of the current definition type in the metadata table TypeDef; this is the data row number, in the metadata table Method, of the first method contained in the definition type. Then read the data row number, in the metadata table Method, of the first method contained in the next definition type. The row number of the first method of the next definition type minus the row number of the first method of the current definition type is the count of all the methods contained in the current type.
- For the last data row of the metadata table TypeDef, the method count is obtained as follows: the number of data rows in the metadata table Method minus the data row number of the first method of the last definition type is the count of all the methods contained in the last data row of the metadata table TypeDef.
- For example, the first method contained in the definition type ClassC has data row number 0x0008 in the metadata table Method, and the first method contained in the next definition type after ClassC, StructB, has data row number 0x000A in the metadata table Method. From 0x000A - 0x0008, the number of methods contained in the definition type ClassC is 0x0002.
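The count-by-difference rule for two consecutive definition types can be sketched as below. The function name `member_count` is hypothetical; the sketch covers only the case where a next definition type exists, and the same subtraction applies to the field counts discussed later in this step.

```python
def member_count(first_row_current: int, first_row_next: int) -> int:
    """Members owned by a type = first row of next type - first row of this type."""
    if first_row_next < first_row_current:
        raise ValueError("row numbers must not decrease between consecutive types")
    return first_row_next - first_row_current


# ClassC and the following type StructB, values quoted in this embodiment.
classc_methods = member_count(0x0008, 0x000A)
classc_fields = member_count(0x0007, 0x0009)
print(classc_methods, classc_fields)  # 2 2
```

A type that owns no members simply has the same first-row number as the type that follows it, which yields a count of 0.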
- the method overloaded information of the defined type is used to describe whether there is a virtual method in the current type. If it exists, the method overload information is incremented by 1, wherein the initial value of the overloaded information of each defined type method is 0.
- The metadata table Method is then located, in a way similar to the locating of the metadata table TypeDef in step 1402. Briefly: the bits of the existing-table bit vector read in step 1401 record, from low to high, whether the corresponding metadata table exists in the .net file, and the seventh bit indicates whether the metadata table Method exists. In this embodiment the 7th bit is 1, so the metadata table Method exists.
- Starting at the ninth byte after the existing-table bit vector 0x0000020920021f57, every four bytes form one unit that records the number of data rows of a metadata table present in the .net file. According to the bit vector, four other metadata tables precede the metadata table Method; skipping these four tables, 4 bytes are read starting at the 25th byte after the data 0x0000020920021f57, giving 0x00000009.
- This data indicates that the metadata table Method has 9 data rows. The contents of the metadata table Method are then read: from the bit vector and the row-count data it is known that, before the metadata table Method, the metadata table Module occupies 10 bytes, the metadata table TypeRef 186 bytes, the metadata table TypeDef 140 bytes, and the metadata table Field 54 bytes; after these preceding tables are skipped, the 9 data rows (126 bytes) of the metadata table Method are read. Then, according to the data row number in the metadata table Method of the first method contained in the current definition type, the corresponding data row is read from the metadata table Method; bytes 7 and 8 of each data row in the metadata table Method are the method identifier.
- From the method identifier it is determined whether the method in the current type is a virtual method; if so, the method overload information is incremented by one. If the method count of the current definition type is greater than 1, the next data row of the metadata table is read and handled in the same way, until all the methods contained in the current definition type have been read and judged. Specifically, whether a method in the current type is a virtual method that needs a new storage slot is determined as follows: the identifier of the method is read and ANDed with 0x0100; if the result is 0x0100, the method is a virtual method and a new storage slot needs to be opened.
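The overload counting can be sketched as follows. The names are hypothetical and the sample method identifiers are made up for illustration; only the mask 0x0100, the 14-byte row length, and the position of the identifier (bytes 7 and 8 of a Method row) come from the text.

```python
import struct
from typing import Iterable

NEW_SLOT_MASK = 0x0100  # mask quoted in the text


def method_identifier(method_row: bytes) -> int:
    """Bytes 7 and 8 of a 14-byte Method row, read as little-endian."""
    return struct.unpack_from("<H", method_row, 6)[0]


def method_overload_info(identifiers: Iterable[int]) -> int:
    """Start at 0 and add 1 for every method whose identifier has the mask set."""
    return sum(1 for ident in identifiers if (ident & NEW_SLOT_MASK) == NEW_SLOT_MASK)


# Hypothetical identifiers for the methods of one type.
sample_identifiers = [0x01C6, 0x0086, 0x1886]
print(method_overload_info(sample_identifiers))  # 1
```

In use, `method_identifier` would be applied to each row of the current type, starting at its first method row and covering as many rows as its method count.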
- The field count of the current definition type is obtained in a way similar to the method count.
- Briefly: read the data in bytes 11 and 12 of the data row of the current definition type in the metadata table TypeDef to get the data row number, in the metadata table Field, of the first field contained in the current definition type; then read bytes 11 and 12 of the next data row to get the data row number, in the metadata table Field, of the first field contained in the next definition type. The latter row number minus the former is the field count of the current type.
- For example, the first field contained in the definition type ClassC has data row number 0x0007 in the metadata table Field, and the first field contained in the next definition type after ClassC, StructB, has data row number 0x0009. From 0x0009 - 0x0007, the number of fields contained in the current definition type ClassC is 0x0002, that is, the field count of the definition type ClassC is 0x0002.
- Step 1405 Acquire information corresponding to the field in the currently defined type and compress it.
- the method of reading the metadata table Field is similar to step 1402; the length of each data row in the metadata table Field is 6 bytes; wherein the first and second bytes are stored in the data row in the metadata table Field.
- the process of compressing the information corresponding to the fields in the currently defined type is as follows:
- The seventh agreed bytes are the first two bytes of the hash value. In this embodiment ClassC contains two fields, named strField and iField; the compression of the field name "strField" is taken as an example.
- The hash value obtained is: 0x846461722F82E1CAB3D95632E8424089
- The first two bytes of the hash value, 0x8464, are taken as the compression result of the field name "strField".
- the flag information of the field can be determined according to the Flags of the field, and the field is in this embodiment.
- The third byte of the data indicates the field type, or the field type information is contained in the fourth byte. The corresponding type is found in the metadata tables according to the field type indicated by the third byte, or the field type information contained in the 4th byte is parsed to obtain the type of the field in the metadata tables; the compressed offset of that type is saved as the type of the field.
- The "#Blob" stream is located in a way similar to the locating of the "#Strings" stream in the metadata in step 1402: after the address 0x0000040c of the metadata header is obtained in step 1401, data is read onward from the metadata header; when the tag "#Blob" (0x23426C6F62) is found, the 8 bytes preceding "#Blob" are read, giving the data 0xD4080000C8010000. Converted to big-endian notation, the high 4 bytes are 0x000008D4, the offset of the "#Blob" stream relative to the metadata header, and the low 4 bytes are 0x000001C8, the length of the "#Blob" stream. The data area of the "#Blob" stream is therefore reached by offsetting 0x000008D4 from the metadata header address 0x0000040c.
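The 8 bytes that precede a stream name can be decoded in one call. A minimal sketch with a hypothetical function name, using the "#Blob" values quoted in this embodiment and assuming little-endian storage as stated in the text.

```python
import struct
from typing import Tuple


def parse_stream_header(eight_bytes: bytes) -> Tuple[int, int]:
    """Return (offset relative to the metadata header, stream length)."""
    offset, length = struct.unpack("<II", eight_bytes)
    return offset, length


METADATA_HEADER_ADDRESS = 0x0000040C  # address quoted in this embodiment

blob_offset, blob_length = parse_stream_header(bytes.fromhex("D4080000C8010000"))
blob_data_start = METADATA_HEADER_ADDRESS + blob_offset

print(hex(blob_offset), hex(blob_length), hex(blob_data_start))
# 0x8d4 0x1c8 0xce0
```

The same helper applies to the 8 bytes read before the "#Strings" tag.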
- The field type of the field strField contained in the definition type ClassC is obtained as follows: read the data of the 4th byte of strField in the metadata table Field to get 0x000A, then read data at offset 0x000A in the data area of the "#Blob" stream. The first byte read is 0x02, indicating that 2 bytes of data must be read after it. The second byte is 0x06, indicating that the data after it gives the field type.
- The third byte is then read; according to the language specification, its value indicates that the field type is the string type. The compressed offset of the string type in the metadata table TypeRef is looked up; the result is 0x03, and 0x03 is saved as the field type of the field strField.
- This embodiment defines the compressed information of a field as three parts: a 2-byte compressed value of the name (field name), a 1-byte compressed value of the Flags (field identifier), and a 1-byte compressed value of the Signature information. If the current type contains several fields, each field is compressed and saved in order.
- In this embodiment, the information of the fields strField and iField contained in ClassC is compressed to: 0x84641103F1EC0106. If the current definition type has no fields, this item does not exist in the compressed definition type.
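The 4-byte field record (2-byte name value, 1-byte Flags value, 1-byte Signature value) can be assembled as below. The function names are hypothetical; the strField values come from the text, while the iField values are read off the combined result quoted above and are used here only to show the layout.

```python
from typing import Iterable, Tuple


def pack_field(name_hash: bytes, flags: int, type_offset: int) -> bytes:
    """Build one 4-byte compressed field record."""
    if len(name_hash) != 2:
        raise ValueError("the compressed field name must be exactly 2 bytes")
    return name_hash + bytes([flags & 0xFF, type_offset & 0xFF])


def pack_fields(fields: Iterable[Tuple[bytes, int, int]]) -> bytes:
    """Concatenate the records of all fields of a type, in order."""
    return b"".join(pack_field(*field) for field in fields)


classc_fields_packed = pack_fields([
    (bytes.fromhex("8464"), 0x11, 0x03),  # strField
    (bytes.fromhex("F1EC"), 0x01, 0x06),  # iField (values taken from the quoted result)
])
print(classc_fields_packed.hex().upper())  # 84641103F1EC0106
```

A type without fields produces an empty byte string, which matches the statement that the item is then absent from the compressed definition type.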
- Step 1406: Determine whether the current definition type has a parent class. If it has a parent class and the parent class has not been compressed, step 1403 is executed and the parent class of the current class is compressed recursively; otherwise step 1407 is executed.
- Data is read from bytes 9 and 10 of the data row of the current type in the metadata table TypeDef. If the data read is 0x0000, the current type has no parent class, and step 1407 is executed.
- If the data read from bytes 9 and 10 is not 0x0000, the data, which is stored in little-endian order, is converted to big-endian order, that is, with the high byte first and the low byte after it. The converted data, taken as a binary number, is then shifted right by two bits to obtain the data row number of the parent class of the current type in the metadata table TypeRef or TypeDef. The data row number obtained after the shift is ANDed with 0x03: if the result is 0, the parent class of the current type is in the metadata table TypeRef, and step 1407 is executed; if the result is 1, the parent class is in the metadata table TypeDef.
- Step 1407: Obtain the compressed offset of the parent class of the definition type. Data is read from bytes 9 and 10 of the data row of the current type in the metadata table TypeDef. If the data read is 0x0000, the current type has no parent class, and 0xFF is saved to indicate that the current type has no parent type. If the data read is not 0x0000, the data stored in little-endian order is converted to big-endian order, that is, with the high byte first and the low byte after it, and the converted data, taken as a binary number, is shifted right by two bits to obtain the data row number of the parent class of the current type.
- For example, the parent class of ClassC is obtained as follows: bytes 9 and 10 of the data row of ClassC in the metadata table TypeDef are read; converted to big-endian order the data is 0x0014, which in binary is 10100. Shifting the binary number 10100 right by 2 bits gives 101, that is, 0x05. Since the result of the AND operation between 0x05 and 0x03 is 0x01, the parent class of ClassC is data row 0x05 of the metadata table TypeDef; the compressed offset of data row 0x05 of the metadata table TypeDef is then looked up and stored as the offset of the parent class of ClassC.
- Step 1408: Assign an offset to the current definition type. After each definition type has been compressed by the steps above, the compression information of the metadata table TypeRef (reference type) of the .net file is obtained, and offsets for the definition types are assigned consecutively, continuing from the offsets of the compressed reference types. For example, if the offset of the last compressed reference type in the .net file is 0x1A, the offset of the first compressed definition type is 0x1B, and the offset of the second compressed definition type is 0x1C.
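The consecutive numbering in Step 1408 amounts to continuing a counter. A small sketch with a hypothetical function name, using the example offsets from the text:

```python
from typing import List


def assign_definition_offsets(last_reference_offset: int, count: int) -> List[int]:
    """Offsets for `count` definition types, continuing after the reference types."""
    return [last_reference_offset + 1 + i for i in range(count)]


offsets = assign_definition_offsets(0x1A, 2)
print([hex(o) for o in offsets])  # ['0x1b', '0x1c']
```

Numbering reference types and definition types in one shared sequence lets a single offset identify a type regardless of which table it came from.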
- Step 1409: Determine whether the cache holds a definition type that is waiting for a parent class offset; if so, execute step 1407, otherwise execute step 1410.
- Step 1410: Determine whether the data of all definition types has been analyzed; if so, execute step 1411; otherwise execute step 1403 to continue compressing the remaining definition types in the metadata table TypeDef.
- Step 1411: Obtain the offsets and the number of the interfaces inherited by the currently defined type. After the data of all defined types has been processed as above, the metadata table InterfaceImpl can be queried to obtain the interface offsets and the interface count for each defined type.
- The metadata table InterfaceImpl (interface implementation) is located and read in a manner similar to step 1402.
- Each row of InterfaceImpl has 4 bytes. The first and second bytes, Class, indicate the data row in the metadata table TypeDef of the type that inherits the interface. The value in the third and fourth bytes, Interface, is converted to binary, and the value obtained by shifting that binary number right by 2 bits is the data row of the interface class in the metadata table TypeDef.
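The row layout described for InterfaceImpl can be read with a few lines of Python. This sketch assumes 2-byte little-endian columns, as in the rest of the example, and simply drops the two low tag bits of the Interface column as the text describes; the function name and sample bytes are hypothetical.

```python
import struct
from collections import defaultdict


def interfaces_by_class(table: bytes):
    """Map the TypeDef row of each class to the rows of its interfaces."""
    result = defaultdict(list)
    usable = len(table) - len(table) % 4  # ignore a trailing partial row
    for offset in range(0, usable, 4):
        # Bytes 1-2: Class; bytes 3-4: Interface (coded, shifted right by 2).
        class_row, interface = struct.unpack_from("<HH", table, offset)
        result[class_row].append(interface >> 2)
    return dict(result)


# Two hypothetical rows: the type in TypeDef row 3 inherits two interfaces.
sample = struct.pack("<HHHH", 3, 0x0010, 3, 0x0014)
print(interfaces_by_class(sample))  # {3: [4, 5]}
```

The interface count for a type is then the length of its list, and each listed row can be mapped to a compressed offset in the same way as the parent class.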
- Step 1412: Obtain the compressed offset of the type that encloses a nested type. Determine from the identifier information of the currently defined type whether it is a nested type. If it is, obtain the offset of the type that encloses it; if it is not, this item does not exist in the compressed defined type.
- Whether the current type is nested is judged from the four bytes of type identifier information read from the metadata table TypeDef in step 1403. If it is a nested type, the metadata table NestedClass (nested types) is also read. The table NestedClass is located from the bit vector that marks which tables exist, in a manner similar to step 1402.
- Each row of the metadata table NestedClass has 4 bytes. The first and second bytes, NestedClass, indicate the data row of the current nested type in the metadata table TypeDef. The third and fourth bytes, EnclosingClass, indicate the data row in TypeDef of the type that encloses the current type. Find the data row of the enclosing type in TypeDef, obtain its compressed offset, and use it as the enclosing-type offset of the current nested type.
- Step 1413: Organize and store the compressed defined type data.
- The compressed data for each defined type is organized in the following format: the hash value of the defined type's name, the compression result of the type identifier, the count of interfaces inherited by the type, the offset of the type's parent class, the count of fields contained in the type, the method overload information of the type, the offset of the type that encloses a nested type, the offsets of the interfaces inherited by the type, and the information corresponding to the fields in the type. The last three parts (the enclosing-type offset, the interface offsets, and the field information) may or may not be present; a part that is absent is not written to the compression result.
- The compression result for the defined type ClassC is:
- The data compressed in this embodiment is the binary data produced by compiling code written for the .net framework. With the method provided in this embodiment, the compression rate for defined types can reach 30%, which effectively reduces the storage space occupied by .net files so that they can be stored and run on small-capacity storage media (such as smart cards) and enhances the capability of such media.
- The compression methods and devices provided by the above embodiments effectively reduce the storage space occupied by .net files, which makes .net files easier to use on a variety of devices, saves system resources, and improves resource utilization.
- The modules or steps of the present invention can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can be fabricated as individual integrated circuit modules; or several of the modules or steps can be fabricated as a single integrated circuit module.
- The invention is not limited to any specific combination of hardware and software.
- The above are only preferred embodiments of the present invention and are not intended to limit it. Various modifications and changes can be made to the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method for compressing a .net document is provided. The method includes at least one of the following steps: obtaining a reference type in the .net document and compressing the reference type; obtaining a definition method in the .net document and compressing the definition method; obtaining a method body of the definition method in the .net document and compressing the method body; obtaining a name space in the .net document and compressing the name space; obtaining a definition type in the .net document and compressing the definition type. Compressing the .net document effectively reduces the memory capacity it occupies, so that the .net document can be stored and run on a small-capacity memory medium such as an intelligent card.
Description
.net File Compression Method
Technical Field
The present invention relates to the field of computer applications, and in particular to a .net file compression method.
Background
.net is Microsoft's new-generation technology platform: a new, Internet-based, cross-language software development platform that follows the major trends in today's software industry toward distributed computing, component orientation, enterprise-level applications, software as a service, and Web-centric design. .net is not a development language, but the .net development platform supports multiple development languages, such as C#, C++, Visual Basic, and JScript.
A smart card is a plastic card about the size of an ordinary business card. It contains a silicon chip about 1 cm in diameter and can store information and perform complex calculations. It is widely used in telephone cards, financial cards, identification cards, mobile phones, pay TV, and other fields. The smart card chip integrates a microprocessor, memory, and an input/output unit, so the smart card is regarded as the world's smallest electronic computer. A smart card also has a complete and strong set of security and confidentiality control mechanisms; the security control program is fixed in read-only memory, which provides reliable security guarantees such as passwords that cannot be copied. Compared with an ordinary magnetic card, a smart card also offers a large information storage capacity and the ability to extend card functions through its microprocessor.
A .net card is a microprocessor smart card that contains a .net card virtual machine capable of running .net programs. A virtual machine can be thought of as a machine simulated in software: it has a processor, memory, registers, and other hardware, and it simulates the execution of instructions. Software running on it has no special requirements for its operating environment, so the virtual machine is transparent to the programs that run on it. For example, an x86 virtual machine simulates the operating environment of x86 instruction programs, and a c51 virtual machine simulates the operating environment of c51 instruction programs.
.net programs include namespaces, reference types, definition types, definition methods, reference methods, IL (Intermediate Language) code, and so on.
However, because of limits on physical size and on memory chips, the storage space of current smart cards is still limited. As software develops, some feature-rich programs occupy a large amount of storage, so many .net programs cannot be stored and run on a smart card.
In summary, .net programs in the related art compress poorly and cannot readily be stored and run on small-capacity storage media (for example, smart cards), and no effective solution to this problem has yet been proposed.
Summary of the Invention
The present invention aims to provide a compression method for .net files that addresses the problems that .net programs compress poorly and cannot readily be stored and run on small-capacity storage media (for example, smart cards).
In an embodiment of the present invention, a method for compressing a .net file is provided. The method comprises at least one of the following steps:
obtaining a reference type in the .net file and compressing the reference type;
obtaining a definition method in the .net file and compressing the definition method;
obtaining a method body of a definition method in the .net file and compressing the method body;
obtaining a namespace in the .net file and compressing the namespace;
obtaining a definition type in the .net file and compressing the definition type.
By compressing the .net file, the invention effectively reduces the storage capacity the file occupies, makes .net files easier to use on devices with little storage, saves resources, and improves resource utilization.
Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it.
FIG. 1 is a structural block diagram of the compression device for reference types in a .net file provided by Embodiment 1;
FIG. 2 is a detailed structural block diagram of the reference type name acquisition module provided by Embodiment 1;
FIG. 3 is a flowchart of the compression method for reference types in a .net file provided by Embodiment 2;
FIG. 4 is a flowchart of the compression method for reference types in a .net file provided by Embodiment 3;
FIG. 5 is a schematic diagram of the structure of the .net file provided by Embodiment 3;
FIG. 6 is a flowchart of the method for counting the methods and fields of a reference type provided by Embodiment 3;
FIG. 7 is a flowchart of the compression method for definition methods of a .net file provided by Embodiment 4;
FIG. 8 is a flowchart of the compression method for definition methods of a .net file provided by Embodiment 5;
FIG. 9 is a flowchart of the method for constructing a string from the definition method information in a .net file provided by Embodiment 5;
FIG. 10 is a flowchart of the method for compressing the data items specific to large-header methods provided by Embodiment 5;
FIG. 11 is a schematic diagram of the structure of the large-header method identifier provided by Embodiment 5;
FIG. 12 is a structural block diagram of the compression device for method bodies of definition methods of a .net file provided by Embodiment 6;
FIG. 13 is a structural block diagram of the compression device for method bodies of definition methods of a .net file provided by Embodiment 7;
FIG. 14 is a flowchart of the compression method for method bodies of definition methods of a .net file provided by Embodiment 8;
FIG. 15 is a flowchart of the compression method for method bodies of definition methods of a .net file provided by Embodiment 9;
FIG. 16 is a flowchart of the compression method for method bodies of definition methods of a .net file provided by Embodiment 10;
FIG. 17 is a schematic diagram of the structure of the .net file provided by Embodiment 10;
FIG. 18 is a flowchart of the method for obtaining a method header provided by Embodiment 10;
FIG. 19 is a schematic diagram of the data format of a large-header method provided by Embodiment 10;
FIG. 20 is a schematic diagram of the data format of a small-header method provided by Embodiment 10;
FIG. 21 is a flowchart of the method for compressing ILcode provided by Embodiment 10;
FIG. 22 is a flowchart of the compression method for namespaces in a .net file provided by Embodiment 11;
FIG. 23 is a flowchart of the compression method for namespaces in a .net file provided by Embodiment 12;
FIG. 24 is a schematic diagram of the structure of the .net file provided by Embodiment 12;
FIG. 25 is a structural diagram of the compression device for definition types in a .net file provided by Embodiment 13;
FIG. 26 is a flowchart of the compression method for definition types in a .net file provided by Embodiment 14;
FIG. 27 is a schematic diagram of the structure of the .net file provided by Embodiment 14;
FIG. 28 is a flowchart of the compression method for definition types in a .net file provided by Embodiment 15.
Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features of those embodiments may be combined with one another.
The implementation of the technical solution is described in further detail below with reference to the drawings.
Embodiment 1
This embodiment provides a compression device for reference types in a .net file. As shown in FIG. 1, the device includes a reference type name acquisition module 102, a compression module 104, a statistics module 106, and a combination module 108, whose functions are as follows:
the reference type name acquisition module 102 obtains the names of the reference types used in the .net file;
the compression module 104 compresses the reference type names obtained by module 102 to produce compressed reference type names;
the statistics module 106 counts the methods and fields of the reference types obtained by module 102;
the combination module 108 combines, in a predetermined format, the reference type name compressed by module 104 with the method count and field count produced by module 106 to obtain the compression result of the reference type.
The reference type name acquisition module 102 can obtain reference type names in several ways; this embodiment takes FIG. 2 as an example. As shown in FIG. 2, the module includes:
a first metadata table acquisition unit 1022, which obtains the first metadata table in the .net file. A .net file contains multiple tables; in this embodiment the metadata table TypeRef (the reference type or interface table) serves as the first metadata table. It records the names of the reference types used in the .net file and the namespaces to which they belong;
an address information reading unit 1024, which reads, from the first metadata table obtained by unit 1022, the address information of the names of the reference types used in the .net file;
a reference type name reading unit 1026, which reads the reference type names according to the address information read by unit 1024.
Preferably, the algorithm used by the compression module 104 may be a hash algorithm, specifically MD5, SHA-1, SHA-2, or the like.
When counting methods and fields, the statistics module 106 can work from the information recorded in the second metadata table, which is the metadata table MemberRef in the .net file. Each row of this table records reference type information and a feature identification value. From the reference type information it can be determined which row of the first metadata table TypeRef the current row points to, and hence the name of the reference type it points to. The feature identification value indicates whether the current row records a method or a field, so the method count and field count for each reference type name can be tallied from it.
Preferably, the predetermined format used by the combination module 108 is a fixed-length sequence of bytes with three parts: the compressed reference type name, the method count, and the field count. These three parts may be combined in any order.
The method count and field count are included in the compression result so that, when other parts of the .net file are compressed correspondingly, the matching methods can be found from the method count and the matching fields from the field count, and the compressed reference type remains usable.
In this embodiment, the compression module 104 compresses the reference type names obtained by module 102, and the combination module 108 combines the compressed name with the method count and field count produced by the statistics module 106 to obtain the compressed reference type. This effectively reduces the storage space occupied by the .net file, so that it can be stored and run on small-capacity storage media (for example, smart cards), which enhances the capability of such media.
Embodiment 2
This embodiment provides a compression method for reference types in a .net file, described as running in the compression device of Embodiment 1. As shown in FIG. 3, the method includes:
Step 202: obtain the name of a reference type used in the .net file;
Step 204: compress the name of the reference type to obtain the compressed reference type name;
Step 206: count the methods and fields of the reference type;
Step 208: combine the compressed reference type name, the method count, and the field count in a predetermined format to obtain the compression result of the reference type.
Preferably, step 202 specifically includes: obtaining the first metadata table in the .net file; reading from it the address information of the names of the reference types used in the .net file; and reading the reference type names according to that address information.
Preferably, step 202 may further include generating a reference type name string from the obtained reference type name. The string may be produced in either of the following two ways:
1) Convert the obtained reference type name into a predetermined encoding, such as ASCII, to generate the reference type name string.
2) First obtain the name of the namespace to which the reference type belongs, then combine the namespace name with the reference type name to generate the reference type name string.
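The two ways of building the name string, followed by the hash-based compression of Embodiment 1, can be sketched in Python. This is a sketch under stated assumptions rather than the patent's implementation: the "." separator between namespace and name, the choice of MD5 among the algorithms named, and keeping the first two bytes of the digest are all assumptions, and both function names are hypothetical.

```python
import hashlib


def build_name_string(type_name: str, namespace: str = "") -> bytes:
    """Build the reference type name string in ASCII."""
    # Way 1: the bare type name. Way 2: namespace combined with the name.
    # Joining with "." is an assumption; the text does not fix a separator.
    full_name = f"{namespace}.{type_name}" if namespace else type_name
    return full_name.encode("ascii")


def compress_name(name_string: bytes, keep: int = 2) -> bytes:
    """Hash the name string and keep a fixed number of leading bytes."""
    # MD5 is one of the algorithms the text names; `keep` is hypothetical.
    return hashlib.md5(name_string).digest()[:keep]


name = build_name_string("MarshalByRefObject", "System")
print(name, compress_name(name).hex())
```

Keeping only a few bytes of the digest is what makes the name smaller, at the cost of a small chance that two different names compress to the same value.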
The step of compressing the name of the reference type includes: performing a hash operation on the name of the reference type (or on the generated reference type name string) to obtain a hash value, and taking predetermined bytes of the hash value as the compressed reference type name. The hash algorithm may be MD5, SHA-1, SHA-2, or the like.
Preferably, the step of counting methods and fields specifically includes:
obtaining the second metadata table;
performing the following operations on each row of the second metadata table:
reading the reference type that the current row points to;
when the name of the reference type that the current row points to matches the obtained reference type name, determining from the feature identification value of the current row whether the row records a method; if so, incrementing the method count of that reference type by 1; otherwise, incrementing its field count by 1.
Preferably, the predetermined format mentioned in step 208 is a fixed-length sequence of bytes with three parts: the first part is the compressed reference type name, the second part is the method count, and the third part is the field count.
The first and second metadata tables may be the metadata tables described in Embodiment 1 and are not described again here.
The method of this embodiment is described as implemented in the compression device of Embodiment 1. The obtained reference type name is compressed and then combined with the method count and field count to obtain the compressed reference type. This effectively reduces the storage space occupied by the .net file, so that it can be stored and run on small-capacity storage media (for example, smart cards), which enhances the capability of such media.
Embodiment 3
This embodiment provides a compression method for reference types in a .net file. In this method, an uncompressed file compiled by the .net platform is called a .net file.
FIG. 4 is a flowchart of the compression method for reference types in a .net file provided by this embodiment. It includes steps 302 to 310, as follows:
Step 302: Obtain the first metadata table in the .net file. In this embodiment the first metadata table is the metadata table TypeRef.
A .net file contains multiple tables. The metadata table TypeRef (the reference type or interface table) records the names of the reference types used in the .net file and information about the namespaces to which they belong.
The metadata tables are part of the PE (Portable Executable) file. This embodiment is illustrated with the .net file obtained by compiling the following code:

    using System;

    namespace MyCompany.MyOnCardApp
    {
        public class MyService : MarshalByRefObject
        {
            static Version ver = new Version(1, 1, 1, 1);
            static Int32 callCount = 0;
            static ClassB classb = new ClassB();
            String strResult = Boolean.FalseString;

            public String MySampleMethod()
            {
                String strHello = "Hello World!";
                return strHello + callCount.ToString();
            }
        }
        public class ClassB {}
        public struct StructB { }
    }

Compiling the above code on the .net platform produces the file helloworld.exe, which is stored in binary form on the hard disk. This binary file is a .net file. FIG. 5 is a schematic diagram of the structure of the .net file provided in this embodiment. The file includes the DOS header, the PE signature, and the metadata (MetaData); the metadata includes the metadata header (MetaData Header), the metadata tables (tables), and so on.
The process of obtaining the metadata table TypeRef is as follows:
a. Locate the DOS header of the .net file. In this embodiment the DOS header is 0x5a4d;
b. Skip the first agreed number of bytes after the DOS header and read the offset address of the PE signature, which gives 0x00000080. In this embodiment, the first agreed number of bytes is 0x003a;
c. Locate the PE signature at the offset address 0x00000080, which gives the PE signature 0x4550;
d. Offset the second agreed number of bytes from the PE signature and read four bytes. This embodiment takes a 32-bit machine as an example: the second agreed number is 0x0074 bytes from the PE signature, and the 4 bytes read are 0x00000010. This value indicates that the binary file has 0x10 directories and contains .net data. The relative virtual address of the metadata header of the .net file is written in directory number 0x0F. On a 64-bit machine the second agreed number of bytes is 0x0084;
e. From the data 0x00000010, offset the third agreed number of bytes and read eight bytes. In this embodiment, the third agreed number is preferably 112 bytes. Of these eight bytes, the first four are 0x00002008, the relative virtual address of the .net data header, and the last four are 0x00000048, the length of the .net data header;
f. Convert the relative virtual address of the .net data header into the linear address 0x00000208, and read the .net data header to obtain the following data:
48000000020005008C210000A0090000
090000000500000600000000000000005
020000080000000000000000000000000
0000000000000
It should be noted that the above data is stored in little-endian order. For example, the first 4 bytes of the data, 0x48000000, are the length of the data, which is 0x00000048 in big-endian notation.
In this embodiment, the linear address is the address of the .net data within the .net file, and the relative virtual address is the memory offset relative to the PE load point. The two are related as follows: linear address = relative virtual address - section relative virtual address + section file offset. In this embodiment, the relative virtual address of the section that holds the .net data directory is 0x00002000 and the file offset of that section is 0x00000200, so the linear address = 0x00002008 - 0x00002000 + 0x00000200 = 0x00000208;
g. Skip a fourth agreed number of bytes from the .net data header and read eight bytes of data. In this embodiment, the fourth agreed number of bytes is 8: after skipping 8 bytes from the start of the .net data header, 8 bytes are read. Of these 8 bytes, the first four, 0x0000218c, are the relative virtual address of the metadata header (MetaData Header), and the last four, 0x000009a0, are the length of the metadata;
h. Convert the relative virtual address 0x0000218c of the metadata header into the linear address 0x0000038c, and obtain the metadata content from the linear address and the metadata length;
i. Read forward from the metadata header. When the marker "#~" is found, read the eight bytes that precede it; the first four of these bytes are the address of "#~", through which the "#~" stream is obtained. Starting at a fifth agreed byte of the "#~" stream, read 8 bytes of data, namely 0x0000000920021c57, whose binary form is 100100100000000000100001110001010111. In this embodiment, the fifth agreed byte is the 9th byte counted from the start of the "#~" stream;
j. Read the binary data obtained in step i starting from the lowest bit; counted from the lowest bit, each bit indicates whether the corresponding table exists in the .net file. For example, the 1st bit indicates whether the metadata table Module exists: 1 means that it exists and 0 means that it does not. In this embodiment, the metadata table Module exists, and the 2nd bit being 1 indicates that the metadata table TypeRef exists;
k. Skip a sixth agreed number of bytes after the data 0x0000000920021c57 and read the number of data rows of the metadata table TypeRef. In this embodiment, 4 bytes are read after skipping 12 bytes, giving 0x0000001e, so the metadata table TypeRef contains 30 data rows. In the metadata, the data that begins 8 bytes after 0x0000000920021c57 stores, in units of 4 bytes, the number of data rows of each metadata table present in the .net file; after the row counts, the contents of each metadata table are stored in order, forming the metadata table area;
l. Read the contents of the metadata table TypeRef according to the agreed method. The agreed method is as follows, taking the .net file of this embodiment as an example: step j determined that the metadata table Module exists, and its number of data rows is read as 1. Each data row of Module is 10 bytes, so in the metadata table area, after skipping 10 bytes, the contents of the metadata table TypeRef begin at the 11th byte. TypeRef has 30 data rows of 6 bytes each, so the TypeRef data is 30*6=180 bytes long.
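The arithmetic in steps f to l can be sketched in Python. This is only an illustration under stated assumptions: the function names (`rva_to_file_offset`, `present_tables`, `row_count_position`, `table_offset`) are hypothetical, and the constants are the values quoted in this embodiment rather than values read from a real file.

```python
def rva_to_file_offset(rva, section_rva, section_file_offset):
    """Steps f and h: linear address = RVA - section RVA + section file offset."""
    return rva - section_rva + section_file_offset


def present_tables(valid_mask):
    """Step j: 0-based positions of the bits set in the 8-byte mask.

    Position 0 is the "1st bit" (Module), position 1 the "2nd bit" (TypeRef).
    """
    return [bit for bit in range(64) if (valid_mask >> bit) & 1]


def row_count_position(target, tables):
    """Step k: bytes to skip after the mask to reach the row count of `target`.

    8 bytes follow the mask before the counts start; each count is 4 bytes.
    """
    return 8 + 4 * tables.index(target)


def table_offset(target, tables, row_counts, row_sizes):
    """Step l: byte offset of table `target` inside the metadata table area."""
    offset = 0
    for bit in tables:
        if bit == target:
            return offset
        offset += row_counts[bit] * row_sizes[bit]
    raise ValueError("table %d is not present" % target)


MODULE, TYPEREF = 0, 1  # bit positions used in this embodiment

header_offset = rva_to_file_offset(0x2008, 0x2000, 0x200)
metadata_offset = rva_to_file_offset(0x218C, 0x2000, 0x200)
tables = present_tables(0x0000000920021C57)
row_counts = {MODULE: 1, TYPEREF: 30}   # row counts quoted in steps k and l
row_sizes = {MODULE: 10, TYPEREF: 6}    # row widths quoted in step l
typeref_count_pos = row_count_position(TYPEREF, tables)
typeref_offset = table_offset(TYPEREF, tables, row_counts, row_sizes)
typeref_length = row_counts[TYPEREF] * row_sizes[TYPEREF]

print(hex(header_offset), hex(metadata_offset),
      typeref_count_pos, typeref_offset, typeref_length)
# prints: 0x208 0x38c 12 10 180
```

The row widths are passed in as data because the text gives them as agreed values for this file; a different file could use different widths.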
Part of the data of the metadata table TypeRef is as follows:
0600 6100 5A00
0600 7400 5A00
0600 7B00 5A00
0600 8500 5A00
0600 EF00 DD00
0600 0801 DD00
0600 2101 DD00
0600 3C01 DD00
The above data are the first 8 rows of the metadata table TypeRef in the .net file provided in this embodiment. In this embodiment TypeRef has 30 rows of data in total; the remaining rows are processed in the same way and are not listed one by one.
In each row of the above data, from the high end to the low end, the first two bytes are the coded identifier of the resolution scope of the type, the 3rd and 4th bytes are the offset of the name of the reference type in the "#Strings" stream, and the last two bytes are the offset in the "#Strings" stream of the name of the namespace to which the reference type belongs. For ease of explanation, see Table 1:
Table 1
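The row layout just described can be illustrated with a short Python sketch that splits the eight listed rows into their three little-endian 16-bit fields. The names `decode_typeref_rows` and `TYPEREF_ROWS` are hypothetical, and the 6-byte row width is the one used in this embodiment.

```python
import struct

# The eight rows listed above, with the spaces removed.
TYPEREF_ROWS = bytes.fromhex("".join([
    "060061005A00",
    "060074005A00",
    "06007B005A00",
    "060085005A00",
    "0600EF00DD00",
    "06000801DD00",
    "06002101DD00",
    "06003C01DD00",
]))


def decode_typeref_rows(data, row_size=6):
    """Split TypeRef data into (resolution scope, name offset, namespace offset).

    Each field is a 2-byte little-endian value, as stored in the file.
    """
    return [struct.unpack_from("<HHH", data, pos)
            for pos in range(0, len(data), row_size)]


typeref_rows = decode_typeref_rows(TYPEREF_ROWS)
for scope, name, namespace in typeref_rows:
    # Printed in big-endian notation, e.g. 0x0006 0x0061 0x005a for row 1.
    print("0x%04x 0x%04x 0x%04x" % (scope, name, namespace))
```

The two offsets in each row are positions inside the "#Strings" stream, so they only become readable names once that stream has been located.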
It should be noted that the data in the table is written in big-endian notation. For example, the data of the first data row is 0006, 0061, 005a, and the corresponding little-endian representation is 0600 6100 5a00.
When the compression method provided by this embodiment is used, the method count of every reference type is initialized to 0 and the field count of every reference type is initialized to 0. The method count of a reference type is the number of methods of that reference type contained in the .net file, and the field count of a reference type is the number of fields of that reference type;
Step 304: Obtain the name of a reference type in the .net file, and convert the name into a reference type name string;
Step 302 obtained the offset of the name of the reference type relative to the "#Strings" stream. Taking the first reference type in the metadata table TypeRef, that is, the reference type whose relative offset is 0x0061, as an example, the name of the reference type is obtained as follows:
After the address 0x0000038c of the metadata header has been obtained in sub-step h of step 302, read forward from the metadata header; when the marker "#Strings" is found, read the 8 bytes that precede "#Strings", obtaining the data 0x5C0300003C040000;
The high 4 bytes of the data 0x5C0300003C040000 are the offset of the "#Strings" stream relative to the metadata header, and the low 4 bytes are the length of the "#Strings" stream. In big-endian notation the high 4 bytes are 0x0000035c and the low 4 bytes are 0x0000043c;
Offset 0x0000035c from the address 0x0000038c of the metadata header to reach the data area of the "#Strings" stream. According to the offset of the first reference type in the metadata table TypeRef, skip 0x0061 from the start of the "#Strings" stream and read forward until a 00 byte is reached; the name of the first reference type obtained is 0x4D61727368616C42795265664F626A656374;
Converting the name of the first reference type by ASCII code gives the reference type name string MarshalByRefObject. The names of the other reference types are read and converted in the same way as the first, and are not described again;
The reference type name string may also be obtained as follows: obtain the name of the reference type and the name of the namespace to which the reference type belongs, and join the namespace name to the name of the reference type with the connector "." to obtain the reference type name string. For example, the namespace of the first reference type is System, so the reference type name string obtained in this way is System.MarshalByRefObject;
Step 306: Perform a hash operation on the reference type name string, and take predetermined bytes of the result as the compressed name of the reference type;
The hash algorithm may be MD5, SHA-1, SHA-2, and so on. This embodiment preferably uses the MD5 algorithm for description: performing the MD5 operation on the reference type name string "MarshalByRefObject" obtained in step 304 gives a 128-bit MD5 value "3064AB63C4B4DC57770E9BDF25B7547D";
In this embodiment, the first two bytes of the above MD5 value are preferably taken as the compressed name of the reference type, namely "3064";
It should be noted that, in this embodiment, the names of the reference types may be read one at a time in the order of steps 302-306 and compressed to obtain the compression result; alternatively, the names of all reference types in the .net file may be obtained first and then compressed one by one, and the compression results cached;
Step 308: Count the method count and field count of the reference types obtained above;
Referring to Figure 6, which is a flowchart of the method provided in this embodiment for counting the method count and field count of reference types, the counting proceeds as follows:
Step 3081: Obtain the second metadata table; in this embodiment the second metadata table is specifically the metadata table MemberRef;
In this embodiment, the process of obtaining the metadata table MemberRef includes:
From the binary data 100100100000000000100001110001010111 obtained in sub-step i when the metadata table TypeRef was obtained in step 302, reading from the lowest bit, the 11th bit is 1, which indicates that the metadata table MemberRef exists. There are five 1s before the 11th bit, so five other metadata tables precede MemberRef;
From sub-step k, the metadata table MemberRef is found to contain 23 data rows, and the five tables that precede MemberRef contain 50 data rows in total: the metadata table Module contains 1 data row, TypeRef contains 30 data rows, TypeDef contains 6 data rows, Field contains 7 data rows, and Method contains 6 data rows;
According to the agreed method in sub-step l of step 302, each data row of the metadata table Module is 10 bytes, each data row of TypeRef is 6 bytes, each data row of TypeDef is 14 bytes, each data row of Field is 6 bytes, and each data row of Method is 14 bytes;
The offset of the metadata table MemberRef within the metadata table area is therefore 1*10+30*6+6*14+7*6+6*14=400. After skipping 400 bytes from the start of the metadata table area, the contents of MemberRef are obtained. Each data row of MemberRef is 6 bytes, so the length of MemberRef is 23*6=138 bytes.
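Steps 304 and 306 can be sketched as follows. This is an illustration only: `read_name` and `compress_name` are hypothetical names, the small `heap` is a made-up stand-in for a "#Strings" stream (its offsets are not those of the real file), and no digest value is asserted here, since the example only demonstrates the mechanics of hashing and truncation.

```python
import hashlib


def read_name(strings_heap, offset):
    """Step 304: read bytes from `offset` up to the next 00 byte, as ASCII."""
    end = strings_heap.index(b"\x00", offset)
    return strings_heap[offset:end].decode("ascii")


def compress_name(name_string, keep=2):
    """Step 306: hash the name string with MD5 and keep the first `keep` bytes."""
    return hashlib.md5(name_string.encode("ascii")).digest()[:keep]


# Hypothetical heap: "System" at offset 1, the type name at offset 8.
heap = (b"\x00System\x00"
        + bytes.fromhex("4D61727368616C42795265664F626A656374")
        + b"\x00")

namespace = read_name(heap, 1)
type_name = read_name(heap, 8)
full_name = namespace + "." + type_name   # optional namespace-qualified form
compressed = compress_name(full_name)

print(type_name)          # prints: MarshalByRefObject
print(full_name)          # prints: System.MarshalByRefObject
print(compressed.hex())   # two bytes, i.e. four hex digits
```

Keeping only two bytes makes collisions between different names possible in principle, which is one reason the number of retained bytes is described as a predetermined choice rather than a fixed one.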
Part of the data of the metadata table MemberRef is as follows:
2900 C000 4300
3100 C000 4300
3900 C000 4800
4100 C000 4300
4900 C000 4300
5100 C000 4300
5900 C000 4300
6100 C000 4300
The above data are the first 8 rows of the metadata table MemberRef of the .net file provided in this embodiment; the other rows are processed in the same way and are not listed one by one here.
The metadata table MemberRef stores the method and field information of the reference types. Each row of the above data records the reference characteristics of one method or field, namely its signature value (signature). The high 2 bytes of each row indicate the reference type that the field or method points to, the middle 2 bytes are the name of the field or method, and the low 2 bytes are the offset of the signature value of the method or field relative to the "#Blob" stream. The signature value records whether the data row represents a method or a field, as well as information such as the return value.
For ease of explanation, see Table 2, which lists the above 8 rows of data in tabular form:
Table 2
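As a rough illustration of how MemberRef is located and split into columns, the sketch below recomputes the offset and length quoted in step 3081 and decodes the eight listed rows. The names `MemberRefRow`, `PRECEDING_TABLES` and `decode_memberref_rows` are hypothetical; row counts and row widths are the values given for this embodiment.

```python
import struct
from collections import namedtuple

MemberRefRow = namedtuple("MemberRefRow", "cls name signature")

# (table, number of rows, bytes per row) for the tables that precede MemberRef.
PRECEDING_TABLES = [
    ("Module", 1, 10),
    ("TypeRef", 30, 6),
    ("TypeDef", 6, 14),
    ("Field", 7, 6),
    ("Method", 6, 14),
]

memberref_offset = sum(rows * size for _, rows, size in PRECEDING_TABLES)
memberref_length = 23 * 6  # 23 rows of 6 bytes each

# The eight rows listed above, with the spaces removed.
MEMBERREF_ROWS = bytes.fromhex("".join([
    "2900C0004300",
    "3100C0004300",
    "3900C0004800",
    "4100C0004300",
    "4900C0004300",
    "5100C0004300",
    "5900C0004300",
    "6100C0004300",
]))


def decode_memberref_rows(data, row_size=6):
    """Split MemberRef data into (class, name offset, signature offset) rows."""
    return [MemberRefRow(*struct.unpack_from("<HHH", data, pos))
            for pos in range(0, len(data), row_size)]


memberref_rows = decode_memberref_rows(MEMBERREF_ROWS)
print(memberref_offset, memberref_length)  # prints: 400 138
print(["0x%04x" % row.cls for row in memberref_rows])
```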
Step 3082: Read the reference type that each method or field in the metadata table MemberRef points to;
As shown in Table 2, the high 2 bytes of a row record the reference type that the method or field of that row points to. The first row of MemberRef is used as an example: Class is 0x0029; converting 0x0029 to binary gives 101001, shifting right by 3 bits gives 101, which is 5 in decimal, so the field or method in the first row of MemberRef points to the reference type recorded in row 5 of the metadata table TypeRef. With the methods of step 302 and step 304, the name of the reference type recorded in row 5 of TypeRef is found to be AssemblyKeyNameAttribute;
Step 3083: Obtain in turn the signature value (signature) of each method or field in the metadata table MemberRef;
As shown in Table 2, the low 2 bytes of each row of MemberRef are the offset of the signature value of the method or field relative to the "#Blob" stream. The signature of each method or field in MemberRef is obtained as follows:
Locate the "#Blob" stream: after the address 0x0000038c of the metadata header has been obtained in sub-step h of step 302, read forward from the metadata header; when the marker "#Blob" is found, read the 8 bytes that precede "#Blob", obtaining the data 0xE4070000BC010000. The high 4 bytes are the offset of the "#Blob" stream relative to the metadata header and the low 4 bytes are the length of the "#Blob" stream; in big-endian notation the high 4 bytes are 0x000007e4 and the low 4 bytes are 0x000001bc;
The compression program offsets 0x000007e4 from the address 0x0000038c of the metadata header to reach the data area of the "#Blob" stream;
This embodiment takes the first data row of MemberRef as an example: the signature offset of the first row is 0x0043, and at offset 0x0043 in the "#Blob" stream the signature value of the first row is found to be 0x2001010e;
Step 3084: Determine from the signature value that was read whether the current data row records a method or a field; if it is a method, perform step 3085; if it is a field, perform step 3086;
This embodiment takes the signature value read for the first data row of MemberRef as an example. The signature value read for the first row is 0x2001010e. The criterion in this embodiment is that when the last 4 digits of the signature value are 0x060e, the current data row records a field; otherwise, it records a method. It follows that, in this embodiment, the first row of the metadata table records a method, that is, the reference type AssemblyKeyNameAttribute references the method recorded in the first row of MemberRef;
Step 3085: Add 1 to the method count of the reference type that the current data row of MemberRef points to, then return to step 3083;
Step 3086: Add 1 to the field count of the reference type that the current data row of MemberRef points to, then return to step 3083;
With the method of steps 3081-3086, once the whole of MemberRef has been read, the method count and field count of every reference type in the .net file are obtained;
It should be noted that, in this embodiment, steps 302-306 and step 308 may be performed in either order: step 308 may be performed first and steps 302-306 afterwards, with the same effect;
Step 310: Combine the compressed name of the reference type, the method count of the reference type and the field count of the reference type according to a predetermined format to obtain the compression result of the reference type;
In this embodiment, the predetermined format may be the format shown in Table 3: a run of bytes of fixed length, where the fixed length can be set as required. The run of bytes has three parts: the first part is the compressed name of the reference type, the second part is the method count of the reference type, and the third part is the field count of the reference type.
Table 3
Compressed name of the reference type | Method count | Field count
Following the structure shown in Table 3, the compressed name of the reference type obtained in step 306 is combined with its method count and field count to obtain the compressed reference type. When the method count or field count of the reference type is 0, it is filled with 0x00; for example, the compression result of the reference type MarshalByRefObject is 0x30640100.
The compression structure shown in Table 3 is merely the preferred structure. The structure may also be varied, for example by placing the method count and field count after the compressed name of the reference type, or by applying an equivalent encoding transformation to the values of the method count and field count.
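Steps 3082 to 310 can be sketched as below. Several points are assumptions made for the example and are not confirmed by the text: the field test reads "the last 4 digits are 0x060e" as a comparison on the low 16 bits of the signature value; the record layout (2-byte name, 1-byte method count, 1-byte field count) is inferred from the 0x30640100 example; the second entry of `members` is invented to exercise the field branch; and all function names are hypothetical.

```python
from collections import defaultdict


def typeref_row(class_word):
    """Step 3082: shift the Class value right by 3 bits to get the TypeRef row."""
    return class_word >> 3


def is_field(signature):
    """Step 3084, as read here: a field if the low 16 bits equal 0x060e."""
    return (signature & 0xFFFF) == 0x060E


def count_members(members):
    """Steps 3083-3086: tally methods and fields per TypeRef row."""
    method_counts = defaultdict(int)
    field_counts = defaultdict(int)
    for class_word, signature in members:
        row = typeref_row(class_word)
        if is_field(signature):
            field_counts[row] += 1
        else:
            method_counts[row] += 1
    return method_counts, field_counts


def pack_record(name_hash, method_count, field_count):
    """Step 310: compressed name followed by the two counts (assumed 1 byte each)."""
    return name_hash + bytes([method_count, field_count])


members = [
    (0x0029, 0x2001010E),  # first MemberRef row of this embodiment: a method
    (0x0029, 0x0000060E),  # hypothetical row, included to show a field
]
method_counts, field_counts = count_members(members)
record = pack_record(bytes.fromhex("3064"), 1, 0)

print(typeref_row(0x0029), method_counts[5], field_counts[5])  # prints: 5 1 1
print(record.hex())                                            # prints: 30640100
```

A zero count is written as a 0x00 byte, so every record has the same length and can be located by index.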
The compression result of the reference types in a .net file is given below:
8E3B 0100 0000 //System.Runtime.CompilerServices.RuntimeHelpers
(1) DB8C 0100 0000 //.HelloWorld.exe.3805F8269D52A5B2.
(2) 81CE 0000 0000 //System.Void
(3) 2711 0100 0000 //System.String
(4) 2722 0000 0100 //System.Boolean
(5) 3064 0100 0000 //System.MarshalByRefObject
(6) 34B6 0100 0000 //System.Version
(7) C061 0100 0000 //System.Int32
(8) A245 0000 0000 //System.Byte
(9) 4410 0000 0000 //System.Array
(10) 4970 0100 0000 //System.Object
(11) ED88 0000 0000 //System.ValueType
(12) A1FA 0100 0000 //System.Type
(13) CB0F 0100 0000 //SmartCard.Runtime.Remoting.Channels.APDU.APDUServerChannel
(14) D9F8 0100 0000 //System.Runtime.Remoting.Channels.ChannelServices
(15) F55B 0100 0000
〃Sy stem. Runtime . Remoting . RemotingConfiguration 上述数据均釆用的是小端的表示方式, 前两个字节为压缩后的引用类型 的名称, 中间两个字节为方法计数, 后两个字节为字段计数, "//"后为. net文 件中引用类型名称字符串, 通过对比可以得出, 本实施例所提供的方法具有 较好的压缩效果, 可以通过对引用类型的压缩使. net 文件占用的存储空间减 小, 可以降低. net文件的使用受存储空间的限制。 需要说明的是, 在附图的流程图示出的步骤可以在诸如一组计算机可执 行指令的计算机系统中执行, 并且, 虽然在流程图中示出了逻辑顺序, 但是 在某些情况下, 可以以不同于此处的顺序执行所示出或描述的步骤。 实施例 4 参见图 7,本实施例提供了一种 .net文件的定义方法的压缩方法,该方法 包括: 步 4聚 401: 定位到 .net文件, 才艮据该. net文件定位到 .net文件中的元数据 表中的定义方法表及相关的流;
步骤 402: 才艮据上述定义方法表将上述流中每个定义方法的相应数据项 的内容构造字符串, 所述数据项的内容包括参数计数; 步骤 403 : 对构造得到的字符串进行散列运算以转换成名称散列值; 步骤 404: 将上述定义方法的执行标识和访问标识进行压缩; 步骤 405 : 将上述流中定义方法的参数表进行压缩; 步骤 406: 组织上述名称散列值、 压缩的执行标识和访问标识、 参数计 数和压缩的参数表, 得到压缩结构。 其中, 上述名称散列值、 压缩的执行标识和访问表识、 参数计数和压缩 的参数表是按照预设规则进行组织的。 上述. net文件带有一个 .net文件头,其中存储了上述元数据结构表的起始 位置、 内容描述及各项内容所占的字节大小, 有了起始位置和各项内容所占 的字节大小就可以计算出本实施例中要用到的元数据流、字符串流、 blob流、 GUID 流和用户字符串流等的具体地址, 才艮据计算出的地址可以定位对应的 流。 本实施例通过对. net文件中的定义方法部分压缩,有效地减小了 .net文件 占用的存储容量, 利于. net文件在小存储量的设备上使用; 同时节省了资源, 提高了资源的利用率。 实施例 5 本实施例提供了一种 .net文件的定义方法的压缩方法, 参见图 8,该方法 具体包括下列步骤: 步骤 501 : 才艮据 PE文件结构及. net文件, 定位到元数据表中的定义方法 表及相关的流, 读取表头中的方法表的行数, 该行数代表了方法的个数; 设 定方法计数值; 定位的过程包括:由 PE文件的文件头中的内容定位到 .net文件,再由 .net 文件头中的内容定位到 .net文件中的各个流(包括元数据流、 字符串流、 blob 流、 GUID流和用户字符串流 )的地址和大小。 .net文件如上表所示: 由存储 签名、 存储头、 流头和六个数据流组成。 其中, 存储签名的大小是固定的,
存储头的大小也是固定的, 流头中存储有当前. net 文件中包含的各个流的名 称、 大小和偏移地址, 有了这些数据, 就可以定位到 .net 文件的六个流, 本 实施例中需要用到的流有字符串流、 blob流和元数据流。 定位到元数据流后, 还要继续定位到其中的元数据表。 元数据流由元数 据头、 表记录计数和元数据表构成。 其中, 元数据头的长度为固定值, 其中 带有一个 8个字节的 MaskValid字段, 标识了现存所有表的位向量, 位向量 的每一位代表一个表, 每一位位向量的取值有 0和 1两个选择, 0代表该位 向量指向的表不存在, 反之 1 代表存在。 目前, 元数据模式下共定义了 44 个类型的表, 上面提到的位向量分别指向每个类型的表, 根据其中的值就可 以确定这些表是否存在, 比如位向量的第七位对应着定义方法表, 该位向量 的值为 1 , 说明元数据表中确实存在定义方法表, 因此元数据头标识出了元 数据表中共有多少个表。 表记录计数为上述标识的每张表定义了 4个字节的数据, 这 4个字节的 数据表明了其中每张表的行数, 每张表中的每一行的数据宽度是指定的。 由 于元数据头的长度是固定的, 表记录计数的字节长度也是确定的 (该长度等 于元数据头中确定的表张数乘以 4 ), 因此能够从元数据流定位到元数据表。 从元数据表定位到定义方法表的过程如下: 元数据表中依次存放了多个 不同类型的表(具体有多少个表,可以根据在上面的元数据表头的 MaskValid 字段算出), 每张表中列有很多行, 行数已经由表记录计数规定好, 且每一行 所占的长度是事先规定好的,因此通过计算定义方法表前所有表所占的长度, 就可以从元数据表定位到定义方法表。 元数据表中的每一行对应着一个方法, 每个方法都具有一张数据项表, 表中的部分项与预先定义的流相关联, 如: 名称项与字符串流相关联。 表中 每个方法占用的空间是确定的, 由此可以定位到某个方法的数据项地址, 从 该数据项地址可以读取数据项的值。 设定方法计数值具体为设定参数 i初始值为 1 , 该值代表执行压缩的方 法的个数。 步骤 502: 读取定义方法表中每个方法中的数据项; 根据获取的数据项 中的内容构造字符串; 方法表中每个方法都包括以下数据项:
( 1 ) RVA ( 4字节无符号整数)为模块中方法体的相对虚拟地址( relative virtual address) , RVA转向 PE的只读段; 〃Sy stem. Runtime . Remoting . RemotingConfiguration The above data is used in the little endian representation, the first two bytes are the name of the compressed reference type, the middle two bytes are the method count, the last two bytes are The field count, after "//" is the reference type name string in the .net file. It can be concluded by comparison. The method provided in this embodiment has a good compression effect, and can be made by compressing the reference type. The occupied storage space is reduced, which can be reduced. The use of net files is limited by the storage space. It should be noted that the steps shown in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer executable instructions, and, although the logical order is shown in the flowchart, in some cases, The steps shown or described may be performed in an order different than that herein. Embodiment 4 Referring to FIG. 7, this embodiment provides a compression method for defining a .NET file, and the method includes: Step 4: 401: Locate the .net file, and then locate the .net file to .net. The definition method table and related streams in the metadata table in the file; Step 402: Construct a character string of the content of the corresponding data item of each definition method in the stream according to the definition method table, and the content of the data item includes a parameter count; Step 403: hash the constructed character string Computation to convert to a name hash value; Step 404: Compress the execution identifier and the access identifier of the above defined method; Step 405: Compress the parameter table of the method defined in the stream; Step 406: Organize the name hash value, The compressed execution ID and access identification, parameter count, and compressed parameter tables yield a compressed structure. 
The parameter hash value, the compressed execution identifier, the access list, the parameter count, and the compressed parameter table are organized according to preset rules. The above .net file has a .net file header, which stores the starting position of the metadata structure table, the content description, and the byte size occupied by each content, with the starting position and contents. The byte size can be used to calculate the specific addresses of the metadata stream, the string stream, the blob stream, the GUID stream, and the user string stream to be used in this embodiment, so that the calculated address can locate the corresponding stream. . In this embodiment, by partially compressing the definition method in the .net file, the storage capacity occupied by the .net file is effectively reduced, and the .net file is used on a small storage device; at the same time, resources are saved, and resources are improved. Utilization rate. Embodiment 5 This embodiment provides a compression method for defining a .NET file. Referring to FIG. 8, the method specifically includes the following steps: Step 501: According to the PE file structure and the .net file, locate the metadata table. In the definition method table and related stream, read the number of rows in the method table in the header, the number of rows represents the number of methods; set the method count value; the positioning process includes: by the file header of the PE file The content is located in the .net file, and the content in the .NET file header is located at the address of each stream in the .NET file (including the metadata stream, the string stream, the blob stream, the GUID stream, and the user string stream). size. The .net file is shown in the following table: Consists of a storage signature, a storage header, a stream header, and six data streams. Where the size of the storage signature is fixed, The size of the storage header is also fixed. 
The stream headers store the name, size, and offset address of each stream contained in the current .net file. With this data, the six streams of the .net file can be located. The streams used in this embodiment are the string stream, the blob stream, and the metadata stream.

Once the metadata stream has been located, the metadata tables in it are located. A metadata stream consists of a metadata header, the table record counts, and the metadata tables. The metadata header has a fixed length and contains an 8-byte MaskValid field, a bit vector that identifies which tables are present. Each bit of the bit vector represents one table and takes the value 0 or 1: 0 means that the table does not exist, and 1 means that it exists. At present, 44 types of tables are defined in the metadata schema, and each bit of the bit vector points to one table type, so the value of a bit determines whether that table exists. For example, the seventh bit of the bit vector corresponds to the definition method table; if its value is 1, the definition method table exists in the metadata tables. The metadata header therefore identifies how many tables the metadata tables contain.

The table record counts hold 4 bytes of data for each table identified above; these 4 bytes give the number of rows in the table, and the data width of each row in each table is specified in advance. Since the length of the metadata header is fixed and the byte length of the table record counts is determined (it equals the number of tables identified in the metadata header multiplied by 4), the metadata tables can be located within the metadata stream.
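The use of the MaskValid bit vector and the table record counts can be sketched as follows. The function names are hypothetical, the sample mask and counts are invented for illustration, and the little-endian layout and the table number 6 for the definition method table are taken from the ECMA-335 metadata format rather than from the text above.

```python
import struct

METHOD_DEF_TABLE = 6  # "seventh bit" of MaskValid when bits are counted from 1


def present_tables(mask_valid: int) -> list:
    """Return the numbers of the tables whose bit is set in the 8-byte bit vector."""
    return [n for n in range(64) if (mask_valid >> n) & 1]


def read_row_counts(mask_valid: int, counts: bytes) -> dict:
    """Map each present table number to its 4-byte little-endian row count."""
    tables = present_tables(mask_valid)
    if len(counts) < 4 * len(tables):
        raise ValueError("record-count area is shorter than 4 bytes per present table")
    values = struct.unpack_from(f"<{len(tables)}I", counts)
    return dict(zip(tables, values))


# Invented sample: tables 0, 2 and 6 are present, with 1, 4 and 9 rows.
mask = 0b1000101
raw_counts = struct.pack("<3I", 1, 4, 9)

rows = read_row_counts(mask, raw_counts)
print(present_tables(mask))           # [0, 2, 6]
print(rows[METHOD_DEF_TABLE])         # 9
print(4 * len(present_tables(mask)))  # 12 bytes of record counts precede the tables
```

The last value is the length that has to be skipped, after the fixed-length metadata header, to reach the first table.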
The process of locating the definition method table within the metadata tables is as follows: the metadata tables store a number of tables of different types in turn (how many can be calculated from the MaskValid field in the metadata header described above). Each table has a number of rows; the number of rows is given by the table record counts, and the length of each row is predefined. Therefore, by calculating the total length of all the tables that precede the definition method table, the definition method table can be located within the metadata tables.

Each row in the definition method table corresponds to one method. Each method has a set of data items, some of which are associated with a predefined stream; for example, the name item is associated with the string stream. The space each method occupies in the table is fixed, so the address of a data item of a method can be located and the value of the data item read from that address.

Setting the method count value means setting the initial value of a parameter i to 1; this value represents the number of methods being compressed.

Step 502: Read the data items of each method in the definition method table, and construct a character string from the contents of the data items obtained.

Each method in the method table includes the following data items:
(1) RVA (4-byte unsigned integer): the relative virtual address of the method body in the module; the RVA points into a read-only section of the PE file;
(2) ImplFlags (2-byte unsigned integer): binary flags describing the implementation, indicating how the method is implemented;
(3) Flags (2-byte unsigned integer): binary flags describing the accessibility and other characteristics of the method;
(4) Name (offset into #Strings): the name of the method, associated with the string stream. This item indexes a UTF-8 encoded string whose length is greater than 0 and less than 1023 bytes;
(5) Signature (offset into #Blob): the signature of the method, associated with the blob stream. This item indexes a blob whose length is greater than 0;
(6) ParamList (RID into the parameter table): a record index that marks where the parameter list belonging to the method starts. The end of the parameter list is determined by the start of the next method's parameter list or by the end of the Param table.

Referring to FIG. 9, which is a flowchart of the method for constructing a character string from the defined-method information of a .net file, the method for constructing the character string includes:

Step 5021: Read the value of the name item among the above data items and, according to this value, read the corresponding data in the string stream to obtain the name of the method.
The name item corresponds to the "Name" item among the above data items; its value gives the offset of the item in the string stream, and the method name, for example MySampleMethod, is read from the string stream at that offset.

Step 5022: Read the value of the signature item and, according to this value, read the corresponding data in the blob stream to obtain and analyze the parameter information of the defined method and the type of its return value, where the parameter information includes the number of parameters, the type of each parameter, and so on.
The signature item corresponds to the Signature item listed above. Its value gives the offset of the item in the blob stream; the signature information, namely the parameter information used by the method (the number of parameters, the type of each parameter, and so on) and the return value information (including the return value type), is read from the blob stream at that offset. The return value type points to a specific type in the other type tables that are needed when the defined method is used.

Step 5023: According to the return value type, find the type information that the return value type points to in the definition type table or the reference type table of the metadata tables, and read information such as the type name and the namespace name from the string stream at the offsets recorded in the name item and the namespace item of that type's data items.

Step 5024: Construct the return value full-name string from the namespace name and type name obtained above; the preferred format of the return value full-name string in this embodiment is: namespace name.type name. According to the parameter types obtained, read the type name and namespace name of each parameter, and construct the parameter full-name string from this data; the preferred format of the parameter full-name string in this embodiment is: namespace name.type name.

Step 5025: Determine whether the number of parameters obtained in step 5022 is greater than 1. If so, return to step 5024 and continue to obtain the information of each parameter; otherwise, perform step 5026.

Step 5026: Construct the character string from the return value full-name string and the parameter full-name strings obtained in step 5024, the method name obtained in step 5021, and the information of each parameter obtained in step 5022. The preferred format of the constructed character string is:
return value full-name string method name(parameter full-name string, ...)
Note: the ellipsis stands for the information of each of the remaining parameters.

Step 503: Perform a hash operation on the character string constructed in step 502, take the first two bytes of the result, and store them converted to a value type; this data is the name hash value of the method.
Hashing the series of data items obtained in step 502 and keeping only part of the result compresses the defined method.

Step 504: Obtain the execution identifier and the access identifier of the defined method and compress them.
The compression reorganizes the data items of the execution identifier (ImplFlags) and the access identifier (Flags) of the original defined method obtained in step 502 and discards some of them, so that the 4 bytes of identifier items are finally merged into a 2-byte identifier item, which compresses the identifier items.

Step 505: Determine the type of the method. If the method is a fat header method, set the method-type item in the identifier combined in step 504 to 1, indicating that the method is a fat header method, and perform step 506; if the method is a tiny header method, perform step 507.
The method type is determined as follows: analyze the RVA value among the method data items obtained in step 502, locate the method header through the RVA, and analyze the first byte of the method header. The low two bits of this byte give the method header type: if their value is 2 (binary 0010), the method is a tiny header method; if their value is 3 (binary 0011), the method is a fat header method.

Step 506: Obtain the data items specific to the fat header method and compress their contents.
Referring to FIG. 10, which is a flowchart of the method for compressing the data items specific to the fat header method, the method includes:

Step 5061: Analyze the fat header information obtained in step 505 to obtain the maximum stack size and the fat header method identifier, and compress the data describing the maximum stack size: this data occupies 2 bytes (16 bits) in the original structure; the low 8 bits of these 16 bits are kept and the high 8 bits are discarded.

Step 5062: Analyze the fat header method identifier obtained in step 505 and obtain the local variable signature token in order to obtain the number of local variables.
Referring to FIG. 11, which is a schematic structural diagram of the fat header method identifier provided by this embodiment, the fat header method identifier consists of 12 bytes: the first two bytes are the flag information, corresponding to the Flags item in the figure; the next two bytes are the maximum stack size, corresponding to the MaxStack item; the next four bytes are the code size, corresponding to the Code Size item; and the last four bytes are the local variable signature token, corresponding to the Local Variables Signature Token item. Analyze the local variable signature token in the fat header method identifier. If it is 0, the number of local variables is 0; otherwise, use its value to locate a row in the StandAloneSig table of the metadata tables (the stand-alone signature descriptor table, which holds composite signatures used as method local variables), read the offset of the signature from the Value item of that row, and read the number of local variables.

Suppose that the compiled fat header method has the following hexadecimal code:
1B 30 02 00 38 00 00 00 03 00 00 11 00 14 0A 00 72 21 00 00 70 0A 00 DE 12 0B 00 06 07 6F 14 00
00 0A 28 0D 00 00 0A 0A 00 DE 00 00 DE 0F 00 06
72 3F 00 00 70 28 0D 00 00 0A 0A 00 DC 00 06 0C
2B 00 08 2A
Step 5063: Analyze the fat header method identifier to obtain the exception structure count and the exception information, and compress the exception information.
The Flags item of the fat header method identifier occupies 2 bytes (301B in the code above): 0x301B = (binary 0011 0000 0001 1011). If the fourth bit is 0, the fat header method has no exception information, the number of exception structures is 0, and step 507 is performed; otherwise, if the fourth bit is 1, the method has further sections after the IL code, namely the exception structure handling table, and the following steps are performed:

a) Analyze the fat header method identifier and obtain the exception structure handling table.
The structured exception handling table consists of a number of sections, each of which holds at least one piece of exception information.
Analyze the fourth bit of the first byte of the Flags item of the fat header method identifier obtained in step 505; if it is 1, the method has further sections. Analyze the 5th and 6th bytes of the fat header method identifier to obtain the code size. From the method header byte width and the code size obtained above, the exception structure handling table can be located by offset.
The byte width of the method header is specified in advance: the method header of a fat header method is 12 bytes wide, and the method header of a tiny header method likewise has a fixed byte width.
A method holds at least one section; if there are several sections, they are stored consecutively in one storage area. Each section stores at least one exception structure.

b) Locate the contents of a section from the offset address of the storage area holding the sections, and analyze the first byte of the data stored there. If the 7th bit of this byte is 1, the exception structures of this section of the fat header method are in FatFormat, and step c is performed; otherwise, they are in TinyFormat, and step d is performed.
Specifically, the contents of a section are located from the position of the section in the structured exception handling table, and the first byte of the data stored in the section is analyzed.

c) Locate the contents of the section from the offset address of the storage area holding the sections, and analyze the 2nd to 4th bytes of the data stored there; these 3 bytes give the size of the storage space occupied by all the exception structures in the section. If the 8th bit of the 2nd byte is 1, other sections follow this section. If there are other sections, repeat the operations below until all sections have been compressed; otherwise, after performing the operations below, organize the compressed structure of the exception structure information and then perform step 5064.
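The header analysis of steps 505, 5061 and 5063 can be sketched on the method body listed above. This is an illustrative sketch with hypothetical function names. Two details are taken from the ECMA-335 method body format rather than from the text: the code size is read as the full 4-byte field, and the first section is assumed to start on the next 4-byte boundary after the IL code.

```python
import struct

# Method body from the listing above: 12-byte fat header followed by the IL code.
BODY = bytes.fromhex(
    "1B 30 02 00 38 00 00 00 03 00 00 11 00 14 0A 00 "
    "72 21 00 00 70 0A 00 DE 12 0B 00 06 07 6F 14 00 "
    "00 0A 28 0D 00 00 0A 0A 00 DE 00 00 DE 0F 00 06 "
    "72 3F 00 00 70 28 0D 00 00 0A 0A 00 DC 00 06 0C "
    "2B 00 08 2A"
)


def parse_method_header(body: bytes) -> dict:
    """Classify the header by the low two bits of its first byte and read its fields."""
    kind = body[0] & 0x03
    if kind == 0x02:  # tiny header: a single byte, code size in the upper six bits
        return {"fat": False, "header_size": 1,
                "code_size": body[0] >> 2, "more_sections": False}
    if kind != 0x03:
        raise ValueError("not a tiny or fat method header")
    flags, max_stack, code_size, local_sig = struct.unpack_from("<HHII", body)
    return {
        "fat": True,
        "header_size": (flags >> 12) * 4,     # upper four bits: size in 4-byte units
        "max_stack_low8": max_stack & 0xFF,   # step 5061 keeps only the low 8 bits
        "code_size": code_size,
        "local_var_sig_token": local_sig,
        "more_sections": bool(flags & 0x08),  # the "fourth bit" of the Flags item
    }


def first_section_offset(header: dict) -> int:
    """Offset of the first section: header plus code, rounded up to 4 bytes (assumed)."""
    end_of_code = header["header_size"] + header["code_size"]
    return (end_of_code + 3) & ~3


def section_kind(first_byte: int) -> dict:
    """Step b: the 7th bit selects FatFormat; the 8th bit announces further sections."""
    return {"fat_format": bool(first_byte & 0x40),
            "more_sections": bool(first_byte & 0x80)}


header = parse_method_header(BODY)
print(header["header_size"], header["code_size"])  # 12 56
print(hex(header["local_var_sig_token"]))          # 0x11000003
print(header["more_sections"])                     # True
print(first_section_offset(header))                # 68
```

The listing contains exactly the 12 header bytes and the 56 code bytes, so the section that the flag announces would begin at offset 68, just past the bytes shown.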
The process of compressing each section is as follows: if the exception structures of the section are of the fat type, the storage space occupied by the exception structures in the section is a first length, preferably n*24+4 in this embodiment, where n is the number of exception structures. Read the Flags (4 bytes), TryOffset (4 bytes), TryLength (4 bytes), HandlerOffset (4 bytes), HandlerLength (4 bytes), and ClassToken (4 bytes) items in the method body section; discard HandlerLength; compress the values of TryOffset, TryLength, and HandlerOffset (4 bytes each) to 2 bytes each by discarding the high-order bytes and keeping the low-order bytes; and compress the 4 bytes of ClassToken to 1 byte by discarding the high-order bytes and keeping only the low 8 bits. The ClassToken obtained in this way is used to find the corresponding type information in the definition type table and the reference type table. See Table 4 for the organized exception structure information:

Table 4
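The fat-format compression just described can be sketched as follows. The names are hypothetical, the sample clause is invented, and the result is returned as separate fields because the byte layout of the compressed record is not reproduced in the text; leaving Flags unchanged is likewise an assumption.

```python
import struct
from typing import NamedTuple

FAT_CLAUSE = struct.Struct("<6I")  # Flags, TryOffset, TryLength, HandlerOffset, HandlerLength, ClassToken


class CompressedClause(NamedTuple):
    flags: int           # kept as read (assumption)
    try_offset: int      # low 2 bytes of the original 4
    try_length: int      # low 2 bytes of the original 4
    handler_offset: int  # low 2 bytes of the original 4
    class_token: int     # low 8 bits of the original 4 bytes


def clause_count(data_size: int, clause_size: int = 24) -> int:
    """Invert data_size = n * clause_size + 4 to recover n."""
    n, rest = divmod(data_size - 4, clause_size)
    if n < 0 or rest:
        raise ValueError("section size does not match n * clause_size + 4")
    return n


def compress_fat_clause(raw: bytes) -> CompressedClause:
    """Drop HandlerLength and keep only the low-order part of the other fields."""
    flags, try_off, try_len, handler_off, _handler_len, token = FAT_CLAUSE.unpack(raw)
    return CompressedClause(flags, try_off & 0xFFFF, try_len & 0xFFFF,
                            handler_off & 0xFFFF, token & 0xFF)


# Invented sample clause: a finally handler with class token 0x0100001A.
sample = FAT_CLAUSE.pack(2, 0x04, 0x12, 0x16, 0x0F, 0x0100001A)
print(clause_count(28))             # 1
print(compress_fat_clause(sample))
# CompressedClause(flags=2, try_offset=4, try_length=18, handler_offset=22, class_token=26)
```

Keeping only the low 8 bits of ClassToken discards the table identifier in its top byte, which is why the text looks the token up in both the definition type table and the reference type table.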
The order of the data items in the table is arbitrary and can be adjusted freely.

d) Analyze the data at the offset address obtained in step b. The 2nd byte gives the size of the storage space occupied by all the exception structures in the section; if the 8th bit of this byte is 1, other sections follow this section. If there are other sections, repeat the operations below until all sections have been compressed; otherwise, after performing the operations below, perform step 5064 directly.
The process of compressing each section is as follows: when the type of the exception structures in the current section is determined to be TinyFormat (the tiny format), the storage space occupied by the exception structures in the section is a second length, preferably n*12+4 in this embodiment. Read in turn the Flags (2 bytes), TryOffset (2 bytes), TryLength (1 byte), HandlerOffset (2 bytes), HandlerLength (1 byte), and ClassToken items, and delete the HandlerLength item from the items obtained. The ClassToken item obtained in this way is used to find the corresponding exception type information in the definition type table and the reference type table.

Step 5064: Obtain the Finally count value.
After the number of exception structures in the current section has been calculated from the expression n*24+4 (see step c above), each exception structure, whether of the fat type or of the tiny type, has the following structure:

Offset  Size  Field          Description
0       2     Flags          Flags; see below
2       2     TryOffset      Offset of the try block from the start of the method body, in bytes
4       1     TryLength      Length of the try block, in bytes
5       2     HandlerOffset  Address of the handler for the above try block
7       1     HandlerLength  Size of the above handler code, in bytes
8       4     ClassToken     Metadata token of the base type of the exception handler
8       4     FilterOffset

The flags can be read from this structure; they are represented by 4 bytes in the fat type and by 2 bytes in the tiny type. See Table 5 for the possible values; the following table lists the flag values used for each exception handling clause:

Table 5
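The flag test behind the Finally count of step 5064 can be sketched as follows. The flag constants are the exception clause flag values defined in ECMA-335 and are not copied from Table 5, which is not reproduced in the text; the function name is hypothetical.

```python
# Exception clause flag values as defined in ECMA-335 (assumed to match Table 5).
CLAUSE_EXCEPTION = 0x0000  # typed exception clause
CLAUSE_FILTER = 0x0001     # exception filter and handler clause
CLAUSE_FINALLY = 0x0002    # finally clause
CLAUSE_FAULT = 0x0004      # fault clause


def finally_count(clause_flags) -> int:
    """Count the exception structures whose flag marks a filter or a finally clause."""
    return sum(1 for flag in clause_flags
               if flag in (CLAUSE_FILTER, CLAUSE_FINALLY))


# Invented sample: one typed clause, two finally clauses, one filter, one fault.
print(finally_count([CLAUSE_EXCEPTION, CLAUSE_FINALLY, CLAUSE_FILTER,
                     CLAUSE_FAULT, CLAUSE_FINALLY]))  # 3
```

The same loop serves fat and tiny exception structures, because only the width of the Flags field differs between them, not its values.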
If the flag of the current exception structure indicates a filter or a finally clause, the Finally count is incremented by 1; this continues until all exception structures have been analyzed, which yields the Finally count value.

Step 5065: Obtain the garbage collection control attribute. This step includes:
a) Analyze the CustomAttribute table in the metadata tables. If the Parent data item of a row in this table corresponds to the method being analyzed (including the method header and the parameters of the method), analyze the value of the Type item to obtain the type information, the type name, its constructor, and so on; use the offset into the blob stream held in the Value item to locate the position in the blob stream; analyze the first byte of the data to obtain the length; and skip the two-byte Prolog to obtain the parameter values of the constructor of the custom attribute type.
b) If the information obtained in step a shows a custom attribute of type Transaction, set the attribute flag to 0x40, that is, set the 7th bit to 1.
c) If the types obtained in step a include the custom attribute type GCControl, take the value of the constructor parameter obtained in step a, that is, the value of type GCControlMode, and set the garbage collection control flag accordingly. The correspondence is:
Force = 1,
Skip = 2,
Step 5066: Using the values obtained above, organize the compressed fat header method header structure according to the structure shown in Table 6:

Table 6
The garbage collection flag is the garbage collection control flag. The data items in the above fat header method header structure table are in no fixed order and can be rearranged freely.

Step 507: Compress the parameter table.
Use the value of the ParamList (parameter table) data item of the defined method obtained in step 502 to locate the corresponding parameter rows in the parameter table of the metadata tables, and read the corresponding parameter rows according to the number of parameters obtained in step 502. Each row includes the data items Flags (2 bytes), Sequence, and Name. Compress these data items by discarding the Sequence and Name items and compressing the contents of the Flags item to 1 byte. The flag of the parameter is derived from the value of the Flags item of the parameter row read. The flag and the offset, within the type storage area of the compressed file, of the parameter type obtained in step 502 are combined into the parameter information, which consists of the parameter flag and the offset of the parameter type.

Step 508: If the number of parameters obtained in step 502 is greater than 1, return to step 507; otherwise, perform step 509.

Step 509: Organize the data compressed in steps 503, 504, 506, and 507 according to a preset rule to obtain the compressed structure of the defined method. The preferred preset rule in this embodiment is shown in Table 7 (the parameter count is the number of parameters):

Table 7
The order of the data items in Table 7 is arbitrary and can be adjusted freely.
The fat header data block is present only when the identifier marks the method as a fat header method; the compressed fat header method header structure is shown in Table 4.
The exception information is determined by the exception structure count: all the exception structures are listed in turn. The compressed structure of the exception structure information is shown in Table 8:

Table 8
After the current defined method has been compressed, the method count value is incremented by 1.

Step 510: If the method count value is less than the number of rows read from the table header in step 501, return to step 502; otherwise, end all operations.
The number of rows read from the table header in step 501 is the number of defined methods stored in the table, so a method count value less than that number of rows means that some defined methods have not yet been compressed.

By compressing the defined-method part of the .net file, this embodiment saves storage space for the .net file, so that the .net file can run on small-capacity storage devices; it also saves resources and improves resource utilization.

Embodiment 6

This embodiment provides a device for compressing the method bodies of the defined methods of a .net file. As shown in FIG. 12, the device includes:
a method obtaining module 602, configured to obtain the method header of a defined method used in the .net file. The method header can be obtained by first reading the information in the MethodDef metadata table to get the location of the method header, for example the RVA (Relative Virtual Address), and then reading the data of the method header at that location; the ILcode recorded after the method header is read according to the ILcode length recorded in the method header;
a compression module 604, configured to compress the ILcode obtained by the method obtaining module 602; compressing the ILcode compresses the method body and yields the compression result of the method body.
By compressing the method bodies of the defined methods of a .net file, this embodiment effectively reduces the storage space the .net file occupies, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn extends the functionality of such media.

Embodiment 7

This embodiment provides a device for compressing the method bodies of the defined methods of a .net file. As shown in FIG. 13, the device includes a method obtaining module 702 and a compression module 704, where the compression module 704 includes a local variable offset determining unit 7042, an instruction compression and calculation unit 7044, and a combining unit 7046. The functions of the modules are as follows:
The method obtaining module 702 is configured to obtain the method header of a defined method used in the .net file and to obtain, according to the method header, the ILcode corresponding to the defined method. The method header is obtained in the same way as in Embodiment 6 and is not described again here.
The compression module 704 is configured to compress the ILcode obtained by the method obtaining module 702; compressing the ILcode compresses the method body and yields the compression result of the method body.
The compression module 704 specifically includes: a local variable offset determining unit 7042, configured to acquire a local variable of the defined method according to a method header acquired by the method obtaining module 702, and determine an offset of the local variable according to the type of the local variable, where The offset of the local variable refers to the offset of the local variable in the compression structure corresponding to the above .net file; when the definition method is the Tiny Headers method, there is no local variable in the defined method, and at this time, the obtained The local variable is empty, and the offset of the corresponding local variable is also empty. When the definition method is a Fat Headers, there is a local variable in the defined method. At this time, the local variable is obtained, and the local variable is determined to be The offset in the compression structure; the basis for judging whether the definition method is a small header method is: reading the first byte of the method header, and determining whether the definition method is a big header method or a small header method according to the first byte above When the two bytes of the first byte read are 10, the definition method is a small header method, otherwise the definition method is a big header. ; Compression instruction computing unit 7044, a method for acquiring module 702 acquires the compressed ILcode, ILcode length after compression calculation; The ILcode mentioned in this embodiment includes an operation instruction and an operation parameter, wherein the operation parameter may be empty, or may be an identifier token, an offset of a local variable in the method body, or an offset of mega-transition, for ILcode. When compressing, the compression mode can be determined according to the specific ILcode. 
For an operation instruction in the ILcode that has no operation parameter, the operation instruction can be recorded directly. An operation instruction that has an operation parameter is handled according to one of the following three cases:
(1) When the operation parameter is a jump offset (that is, the operation instruction is a jump instruction), the skipped ILcode portion is determined according to the jump offset, the jump offset of this operation instruction is recalculated according to the skipped ILcode portion, and the operation instruction and the recalculated jump offset are recorded. For example, suppose the ILcode contains 10 operation instructions, the third of which is a jump instruction whose operation parameter is 2, that is, a jump forward by two bytes, and suppose that jumping two bytes lands on the fifth operation instruction. The skipped ILcode portion is then the fourth operation instruction together with its operation parameter. According to the method of this embodiment, the skipped ILcode portion is compressed first, and the operation parameter of the third operation instruction is then changed from the original jump offset to the length of the compressed skipped portion. In this way the meaning of the original ILcode is unchanged after all of its operation instructions and operation parameters have been compressed.
(2) When the operation parameter is the offset of a local variable in the method body, the operation instruction and the operation parameter are recorded.
(3) When the operation parameter is an identifier token, the offset corresponding to that token in the compressed structure is determined, and the operation instruction and the determined offset are recorded.

The content recorded in this way is the compression result of the IL instructions.

The combining unit 7046, configured to combine, according to a predetermined format, the length of the compressed ILcode calculated by the instruction compression and calculation unit 7044, the local variable offsets determined by the local variable offset determining unit 7042, and the ILcode compressed by the instruction compression and calculation unit 7044, to obtain the compression result of the method body.

Preferably, the predetermined format arranges the calculated length of the compressed ILcode, the local variable offsets, and the compressed ILcode in that order; the order of these three parts may also be changed arbitrarily.

The compressed structure corresponding to the .net file mentioned in this embodiment means that the data in the .net file is arranged in order and that the offset of each row of data corresponds to the identifier token of that row. For example, in the namespace section the data in the first row has offset 0 and corresponding token 0, the data in the second row has offset 1 and corresponding token 1, and so on.

It should be noted that three kinds of data (types, methods, and fields) come in both reference and definition forms. In the compressed structure the reference data is placed before the definition data, the tokens of the definition data likewise follow those of the reference data, and each offset corresponds to its token. Preferably, the offset is one byte. For example, if there are already 6 reference types, placed at offsets 0 to 5, then the first definition type in the definition type section has token 6 and offset 6.

For example, the source code of a .net file is as follows:

public String MySampleMethod()
{
    String strHello = "Hello World!";
    if (strHello != null)
        strHello += "Y";
    return strHello + callCount.ToString();
}

The converted compressed structure is:

Namespace:
(0) 0100 5E5F8700
(1) 0100 35F47B00
(2) 0100 A17EC300
(3) 0700 00F64D00
(A) 0200 ACE6EB00

Reference type TypeRef:
(0) F55B 0100 0000
(1) D9F8 0100 0000
(2) CB0F 0100 0000
(3) 81CE 0000 0000
(4) 2711 0100 0000
(5) 2722 0000 0000
(6) 4970 0100 0000
(7) C061 0100 0000
(8) A1FA 0100 0000
(9) 3064 0100 0000

Reference method MethodRef:
(0) E8CB
(1) CE72
(2) 9080
(3) 3973
(4) 9080
(5) 6A4D
(6) 6AF2
(7) 9080

Reference field FieldRef:

Blob:
(0) 0D00
(1) 0100
(2) 0D00

Definition type TypeDef:
(A) A70914 0009000200

Definition method MethodDef:
(8) DFEE000600040104
(9) 9080020600000103

In the above data, the value in the leading parentheses of each row is the offset of that row, and what follows is the row's actual data. For example, in Namespace each row records the information of one namespace. In the first row, (0) 0100 D93DEB00, the 0 in parentheses is the offset of the namespace (the offset itself is not actually present in the data) and 0100 D93DEB00 is the information of the namespace. The other sections are structured in the same way.

The above data is only part of the compressed structure of the .net file and serves only to illustrate the correspondence between the identifier tokens in the .net file and the offsets in the compressed structure.

In this embodiment, the instruction compression and calculation unit 7044 compresses the ILcode and calculates the length of the compressed ILcode, and the combining unit 7046 combines the compressed ILcode, its length, and the local variable offsets into the compression result of the method body. This effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn enhances the functionality of such media.
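The three operand cases of Embodiment 7 and the combining format can be sketched as follows. This is only an illustration under stated assumptions, not the patented encoding: every opcode is taken to be one byte, the operand sizes in the two tables are invented for the example, only forward jumps are handled, the length field is a single byte, and all names (`compress_il`, `compress_method_body`, `token_to_offset`, the sample data) are hypothetical.

```python
# Assumed operand sizes in bytes, before and after compression (illustrative only).
ORIGINAL_OPERAND_SIZE = {"none": 0, "jump": 1, "local": 1, "token": 4}
COMPRESSED_OPERAND_SIZE = {"none": 0, "jump": 1, "local": 1, "token": 1}


def compress_il(instructions, token_to_offset):
    """Compress a list of (opcode, kind, operand) tuples into bytes."""
    out = bytearray()
    for index, (opcode, kind, operand) in enumerate(instructions):
        out.append(opcode)
        if kind == "jump":
            if operand < 0:
                raise ValueError("this sketch handles forward jumps only")
            # Case (1): walk the skipped instructions, consuming their original
            # sizes and accumulating their compressed sizes.
            remaining, new_offset, j = operand, 0, index + 1
            while remaining > 0:
                skipped_kind = instructions[j][1]
                remaining -= 1 + ORIGINAL_OPERAND_SIZE[skipped_kind]
                new_offset += 1 + COMPRESSED_OPERAND_SIZE[skipped_kind]
                j += 1
            if remaining != 0:
                raise ValueError("jump target is not an instruction boundary")
            out.append(new_offset)
        elif kind == "local":
            # Case (2): the local variable offset is recorded unchanged.
            out.append(operand)
        elif kind == "token":
            # Case (3): the token is replaced by its offset in the compressed structure.
            out.append(token_to_offset[operand])
    return bytes(out)


def compress_method_body(instructions, local_offsets, token_to_offset):
    """Combine length of compressed IL, local variable offsets, and compressed IL."""
    il = compress_il(instructions, token_to_offset)
    return bytes([len(il)]) + bytes(local_offsets) + il


# Hypothetical input: the jump skips one 5-byte token instruction.
EXAMPLE_IL = [
    (0x00, "none", None),
    (0x2B, "jump", 5),
    (0x28, "token", 0x0A000003),
    (0x06, "none", None),
    (0x11, "local", 1),
    (0x2A, "none", None),
]
EXAMPLE_TOKENS = {0x0A000003: 3}
```

In `EXAMPLE_IL` the jump originally skips 5 bytes (one opcode plus a 4-byte token). After compression the skipped instruction is 2 bytes long, so the recorded jump offset becomes 2, which keeps the jump landing on the same instruction as before.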
Embodiment 8

This embodiment provides a method for compressing the method body of a definition method in a .net file. The method is described using the compression device provided in Embodiment 6 as an example. As shown in FIG. 14, the method includes:

Step S20: obtain the method header of a definition method used in the .net file, and obtain the ILcode of the definition method according to the method header;

Step S30: compress the ILcode of the method; compressing the ILcode compresses the method body and yields the compression result of the method body.

In this embodiment, compressing the method body of each definition method in the .net file effectively reduces the storage space that the .net file occupies, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn enhances the functionality of such media.

Embodiment 9

This embodiment provides a method for compressing the method body of a definition method in a .net file. As shown in FIG. 15, the method includes:

Step 802: obtain the method header of a definition method used in the .net file, and obtain the ILcode of the definition method according to the method header.

A .net file compiled by the .net platform includes namespaces (Namespace), reference types (TypeRef), definition types (TypeDef), definition methods (MethodDef), strings, and so on, stored in the form of tables and streams. Before the IL in the .net file is compressed, the structure of the .net file can first be converted into the compressed structure. The compressed structure in this embodiment is the same as that in Embodiment 1 and is not described again here.
The IL consists of operation instructions and operation parameters. An operation parameter may be empty, an identifier token, an offset pointing to a local variable, or a jump offset. The leading part of an identifier token indicates the table that the operation instruction refers to, and the trailing part indicates the row within that table.

The method header can be obtained as follows: first obtain the metadata table MethodDef in the .net file; read from MethodDef the address information of the definition method used in the .net file; and read the method header of the definition method according to that address information.

Reading the ILcode according to the method header includes: determining the length of the ILcode from the information in the method header, and reading the ILcode according to the determined length. When the definition method is a big header method, bytes 5 to 8 of the method header are the length of the IL of the method and bytes 9 to 12 are the identifier token of the method's local variables; when the definition method is a small header method, the upper 6 bits of the method header are the length of the ILcode.

Step 804: obtain the local variables of the definition method according to the method header, and determine the offset of each local variable according to its type, where the offset of a local variable is its offset in the compressed structure corresponding to the .net file.

Because a small header method has no local variables, in that case the obtained local variables are empty and the corresponding offsets are empty as well. Whether the definition method is a small header method or a big header method can be determined as in Embodiment 6, which is not detailed again here.

Step 806: compress the ILcode that was read, and calculate the length of the compressed ILcode.

When the ILcode is compressed, the compression mode can be determined according to the specific ILcode; for the specific compression modes, see the ILcode compression method in Embodiment 7, which is not detailed again here.

Preferably, after one operation instruction has been read and compressed, it is determined whether all operation instructions and operation parameters in the ILcode have been read and compressed. If so, the length of the compressed IL is calculated; otherwise, the next operation instruction is read and compressed.

Step 808: combine, according to a predetermined format, the length of the compressed ILcode, the local variable offsets, and the compressed ILcode to obtain the compression result of the IL.

Preferably, the predetermined format arranges the calculated length of the compressed ILcode, the local variable offsets, and the compressed ILcode in that order; the order of these three parts may also be changed arbitrarily.

Preferably, the compression method provided in this embodiment may further include: determining whether the method headers of all definition methods used in the .net file have been read and their IL compressed; if not, reading the method header of the next definition method and performing steps 802 to 808 again; otherwise, ending the compression.

In this embodiment, the ILcode is compressed, the length of the compressed ILcode is calculated, and the compressed ILcode, its length, and the local variable offsets are combined into the compression result of the method body. This effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn enhances the functionality of such media.

Embodiment 10

This embodiment provides a method for compressing the method body of a definition method in a .net file. As shown in FIG. 16, the method includes:

Step 902: obtain the definition method metadata table MethodDef.

A .net file contains multiple tables, among which the definition method metadata table MethodDef records the location of the method header of each definition method in the .net file. This embodiment takes the .net file obtained by compiling the following code as an example to describe how its MethodDef table is obtained:

public String MySampleMethod()
{
    String strHello = "Hello World!";
    if (strHello != null)
        strHello += "Y";
    return strHello + callCount.ToString();
}

Compiling the above code with the .net platform yields the file helloworld.exe, which is stored on the hard disk in binary form; this binary file is the .net file. FIG. 17 is a schematic diagram of the structure of the .net file provided in this embodiment. The file includes the DOS header, the PE signature, and the metadata (MetaData); the metadata includes the metadata header (MetaData Header), the metadata tables (tables), and so on.

The metadata table MethodDef is read as follows:

a. Locate the DOS header of the .net file; the DOS header obtained in this embodiment is 0x5a4d.

b. Skip a first agreed number of bytes after the DOS header and read the offset address of the PE signature, obtaining 0x00000080. In this embodiment, the first agreed number of bytes is 0x003a.

c. Locate the PE signature according to its offset address 0x00000080; the PE signature obtained is 0x4550.

d. Read four bytes at a second agreed number of bytes after the PE signature. This embodiment takes a 32-bit machine as an example: the second agreed number of bytes is 0x0074 bytes after the PE signature, and the 4 bytes read are 0x00000010, which indicates that the binary file contains 0x10 directories and contains .net data; the relative virtual address of the .net data header is recorded in the 0x0F-th of these directories. On a 64-bit machine the second agreed number of bytes is 0x0084.

e. From the above data 0x00000010, skip a third agreed number of bytes and read eight bytes of data. In this embodiment the third agreed number of bytes is preferably 112. Of these eight bytes, the first four, 0x00002008, are the relative virtual address of the .net data header, and the last four, 0x00000048, are the length of the .net data header.
f. Obtain the linear address 0x00000208 from the relative virtual address of the .net data header, and read the .net data header, obtaining the following data:

48000000020005008C210000A0090000 090000000500000600000000000000005
020000080000000000000000000000000
0000000000000

It should be noted that the above data is stored in little-endian order. For example, the first 4 bytes, 0x48000000, are the length of the data, which is 0x00000048 when converted to big-endian notation.

In this embodiment, the linear address is the address of the .net data within the .net file, and the relative virtual address is the memory offset relative to the PE load point. The two are related by: linear address = relative virtual address - section relative virtual address + section file offset. In this embodiment, the section that holds the .net data directory has relative virtual address 0x00002000 and file offset 0x00000200, so the linear address = 0x00002008 - 0x00002000 + 0x00000200 = 0x00000208.

g. Read eight bytes of data at a fourth agreed number of bytes after the start of the .net data header. In this embodiment the fourth agreed number of bytes is 8: after skipping 8 bytes from the start of the .net data header, 8 bytes are read, of which the first four, 0x0000218c, are the relative virtual address of the metadata header (MetaData Header) and the last four, 0x000009a0, are the length of the metadata.

h. Convert the relative virtual address 0x0000218c of the metadata header into the linear address 0x0000038c, and obtain the metadata content according to the linear address and the metadata length.

i. Read onward from the metadata header. When the marker "#~" is reached, read the eight bytes before it, of which the first four bytes are the address of the "#~" stream; obtain the "#~" stream through this address, and read 8 bytes of data starting at a fifth agreed byte of the stream, namely 0x0000000920021c57, whose binary form is 100100100000000000100001110001010111. In this embodiment, the fifth agreed byte is the 9th byte counted from the start of the "#~" stream.

j. Read the binary data obtained in step i starting from the lowest bit; each bit indicates whether the corresponding table exists in the .net file. For example, the 1st bit indicates whether the metadata table Module exists: 1 means it exists and 0 means it does not. In this embodiment the Module table exists; the 2nd bit is 1, indicating that the TypeRef table exists; and, reading bit by bit from low to high in the same way, the 7th bit is 1, indicating that the MethodDef table exists.

k. Skip a sixth agreed number of bytes after the data 0x0000000920021c57 and read the number of data rows of MethodDef. In this embodiment the sixth agreed number of bytes is 24: after skipping 24 bytes, 4 bytes are read, giving 0x00000002, so the MethodDef table has 2 data rows.

In the metadata, the row counts of the metadata tables present in the .net file are stored one after another, 4 bytes each, starting 8 bytes after the data 0x0000000920021c57; the row counts are followed by the contents of each metadata table in turn, which form the metadata table area. In this embodiment, 4 metadata tables precede MethodDef, so the row count of MethodDef is read 8 + 4*4 = 24 bytes after the data 0x0000000920021c57; this is how the sixth agreed number of bytes is calculated.

l. Read the contents of the MethodDef table according to an agreed method.

The agreed method is described using the .net file of this embodiment as an example. By the method of step k, the four tables before MethodDef contain 28 data rows in total: the Module table has 1 row, the TypeRef table has 23 rows, the TypeDef table has 3 rows, and the Field table has 1 row. Each row of Module is 10 bytes, each row of TypeRef is 6 bytes, each row of TypeDef is 14 bytes, and each row of Field is 6 bytes. The offset of MethodDef within the metadata table area is therefore 1*10 + 23*6 + 3*14 + 1*6 = 196 bytes. MethodDef has 2 rows of 14 bytes each, so its data is 2*14 = 28 bytes long.

The data of the metadata table MethodDef is as follows:
0C210000 0000 8600 8900 3000 0100
47210000 0000 8618 8300 2C00 0100

Each row of data holds the information of one definition method, and the first four bytes of each row are the relative virtual address (RVA) of the method header of that definition method. For ease of description, the above data is presented in tabular form; see Table 9.

Table 9

The data in Table 9 is written in big-endian notation. For example, the first data row is stored in little-endian order as 0C210000 0000 8600 8900 3000 0100, which is 0000210C 0000 0086 0089 0030 0001 in big-endian notation.

Step 904: read the method header of the definition method, and obtain the local variable information according to the information in the method header. This embodiment takes the first definition method in the metadata table MethodDef as an example.
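The arithmetic of step 902 (the table-presence mask, the position of the MethodDef row count, the offset of MethodDef in the table area, and the address conversion) can be checked with a short sketch. It uses only values quoted in this embodiment; the function and constant names are hypothetical, and the fixed 14-byte row with a leading 4-byte little-endian RVA is taken from the description rather than from a general metadata parser.

```python
import struct

VALID_MASK = 0x0000000920021C57   # 8-byte value read in step i
METHODDEF_BIT = 6                 # MethodDef is the 7th bit counted from 1

# (row count, bytes per row) of the tables that precede MethodDef (step l).
PRECEDING_TABLES = [(1, 10), (23, 6), (3, 14), (1, 6)]


def tables_present(valid_mask: int) -> list:
    """Return the bit positions (table numbers) that are set in the mask."""
    return [bit for bit in range(64) if (valid_mask >> bit) & 1]


def row_count_offset(valid_mask: int, table_bit: int) -> int:
    """Bytes to skip after the 8-byte mask before a table's 4-byte row count."""
    preceding = sum(1 for bit in range(table_bit) if (valid_mask >> bit) & 1)
    return 8 + 4 * preceding


def rva_to_file_offset(rva: int, section_rva: int, section_file_offset: int) -> int:
    """Linear address = RVA - section RVA + section file offset."""
    return rva - section_rva + section_file_offset


def methoddef_rvas(table_data: bytes, row_size: int = 14) -> list:
    """Read the leading little-endian RVA of every MethodDef row."""
    return [struct.unpack_from("<I", table_data, start)[0]
            for start in range(0, len(table_data), row_size)]


METHODDEF_DATA = bytes.fromhex(
    "0C21000000008600890030000100"
    "472100000000861883002C000100"
)
METHODDEF_OFFSET = sum(rows * size for rows, size in PRECEDING_TABLES)
```

Here `row_count_offset(VALID_MASK, METHODDEF_BIT)` gives the 24 bytes of step k, `METHODDEF_OFFSET` gives the 196 bytes of step l, and converting the first RVA returned by `methoddef_rvas(METHODDEF_DATA)` with section RVA 0x2000 and section file offset 0x200 gives 0x30C, the address used in step 904a.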
Step 904a: obtain the address of the method header of the definition method, and read the first byte of the method header.

According to the MethodDef data obtained in step 902, the RVA of the method header of the first definition method is 0x0000210C, which converts to the linear address 0x0000030C. One byte of data is read from the .net file at address 0x0000030C; in this embodiment, the first byte read is 0x13.

Step 904b: determine whether the definition method is a big header method or a small header method; if it is a big header method, perform step 904d; if it is a small header method, perform step 904c.

The method for determining whether the definition method is a big header method or a small header method is the same as in Embodiments 6 and 7 and is not detailed again here. For example, in this embodiment the data 0x13 obtained in step 904a is 00010011 in binary; its lowest two bits are 11, so the method is a big header method.

Step 904c: the definition method is a small header method and has no local variables; set the local variables to empty, and obtain the complete method header information and the ILcode corresponding to the method.

Step 904d: obtain the complete method header information and the ILcode corresponding to the method, and obtain the local variable token.

FIG. 19 is a schematic diagram of the data format of the big header method provided in this embodiment. As FIG. 19 shows, the big header consists of 12 bytes: the first two bytes are flag information, corresponding to the Flags item in the figure; the next two bytes are the maximum stack size, corresponding to the MaxStack item; the following 4 bytes are the size of the ILcode corresponding to the method, corresponding to the Code Size item; and the last 4 bytes are the local variable token, corresponding to the Local Variables Signature Token item. The local variable token in the big header is examined: if it is 0, the number of local variables is 0; otherwise the value is used to locate a row in the metadata table StandAloneSig (the standalone signature descriptor table, which holds the composite signatures used for method local variables), the offset of the signature is read from the Value item of that row, and the number of local variables is read.

As the big header format shows, bytes 5 to 8 of the method header are the ILcode length, bytes 9 to 12 are the local variable token, and the ILcode follows the local variable token. In this embodiment the local variable token is 0x11000002, written in big-endian notation. In this embodiment, the method header is obtained, and the ILcode corresponding to the method is obtained according to the method header, as follows:
Referring to FIG. 20, which is a schematic diagram of the data format of the small-header method provided by this embodiment, the upper 6 bits of the method header of a small-header method are the length of the ILcode, and the lower 2 bits are the flag bits identifying it as a small-header method.
Step 904e: Obtain the local variables according to the local variable token, and determine the offset of each local variable in the compressed structure of the .net file according to its type;
The local variables are obtained as follows: the data row of the metadata table StandAloneSig is located according to the local variable token; this table records the offset, in the Blob stream, of the local variable information of the definition method, and the local variable information is read according to that offset;
In this embodiment, the local variable token is 0x11000002. 0x11 indicates the metadata table numbered 0x11, that is, the metadata table StandAloneSig, and 0x000002 indicates the second row of that table, so the local variables of the definition method correspond to the second row of StandAloneSig. The data obtained is 0x0100, which is the offset of the local variables in the Blob stream. The local variable information read from the Blob stream at that offset is 0x0607040E080E02, where 0x06 is the length of the local variable information, 0x07 is the local variable marker, 0x04 is the number of local variables, and 0x0E indicates that the type of the local variable is the reference type String. According to the compressed structure of the .net file, and taking an offset of 0x04 for String in the reference type table TypeRef as an example, the compression program replaces this local variable of the definition method with 0x04. Similarly, 0x08 indicates that the type of the local variable is the reference type Int, whose offset in the compressed structure of the .net file is 0x07; 0x02 indicates the reference type Boolean, whose offset in the compressed structure is 0x05. The two occurrences of 0x0E show that the definition method has two local variables of type String.
The Blob stream is located as follows: after obtaining the address 0x0000038c of the metadata header in step h of step S302, the compression program reads backward from the metadata header; when the marker "#Blob" is found, it reads the 8 bytes preceding "#Blob" and obtains the data 0x2006000098010000, in which the high 4 bytes are the offset of the Blob stream relative to the metadata header and the low 4 bytes are the length of the Blob stream. Converted to big-endian representation, the high 4 bytes are 0x00000620 and the low 4 bytes are 0x00000198. Starting from the metadata header address 0x0000038c, the compression program offsets backward by 0x00000620 to reach the data area of the Blob stream;
Step 906: Read the ILcode and compress it;
In this embodiment, the method header and ILcode information of the above definition method obtained in step 904d are as follows:
The ILcode follows the local variable token 0x11000002, and 0x0000002F is the length of the ILcode; the ILcode is determined to be:
061201281300000A281200000A0C2B00082A
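The header layouts, the token split and the local variable substitution described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the low-two-bit values 0x2 and 0x3 for the small and big headers are taken from the ECMA-335 CLI specification rather than from this text, and the Flags and MaxStack values in the sample header are made up. Only the code size 0x2F, the token 0x11000002, the signature 0607040E080E02 and the type-to-offset mapping come from the embodiment.

```python
import struct

SMALL_HEADER = 0x2  # assumption: flag values from ECMA-335, not from this text
BIG_HEADER = 0x3


def parse_method_header(data):
    """Return (header_size, code_size, local_var_token) for a method body."""
    kind = data[0] & 0x03              # low 2 bits identify the header kind
    if kind == SMALL_HEADER:
        return 1, data[0] >> 2, 0      # high 6 bits: ILcode length, no locals
    if kind == BIG_HEADER:
        # Flags (2), MaxStack (2), Code Size (4), local variable token (4)
        _flags, _max_stack, code_size, token = struct.unpack_from("<HHII", data, 0)
        return 12, code_size, token
    raise ValueError("unrecognised method header")


def split_token(token):
    """Split a 4-byte token into (table number, row number)."""
    return token >> 24, token & 0x00FFFFFF


def local_variable_offsets(signature, type_offsets):
    """Map each local's type byte to its offset in the compressed structure."""
    length, marker, count = signature[0], signature[1], signature[2]
    if marker != 0x07 or length != len(signature) - 1:
        raise ValueError("not a local variable signature")
    return bytes(type_offsets[t] for t in signature[3:3 + count])


# Sample big header: Flags and MaxStack are illustrative values only.
fat_header = bytes.fromhex("133002002F00000002000011")
header_size, code_size, token = parse_method_header(fat_header)
table, row = split_token(token)

# Type byte -> offset in the compressed structure, as given in the embodiment.
type_offsets = {0x0E: 0x04, 0x08: 0x07, 0x02: 0x05}
offsets = local_variable_offsets(bytes.fromhex("0607040E080E02"), type_offsets)
print(hex(code_size), hex(table), row, offsets.hex())
```

The four offsets produced here are the same (04)(07)(04)(05) that appear in the compressed method body later in this embodiment.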
The IL in a .net file consists of one or more operation instructions and operation parameters, in two forms: "operation instruction (opcode) + operation parameter", or an operation instruction alone. There are three main kinds of operation parameter: the token of the operated object, a jump offset, or the offset of a local variable within the method's ILcode. In general, the operation parameter after a jump instruction is an offset, and the offset of a local variable is its sequence number among the local variables of the definition method.
Referring to FIG. 21, which is a flowchart of the method for compressing ILcode in this embodiment, the above IL instructions are compressed as follows:
Step 906a: Obtain an operation instruction in the ILcode;
Step 906b: Determine whether the operation instruction is followed by an operation parameter; if not, perform step 906c; if so, perform step 906d;
Step 906c: Record the operation instruction directly, and perform step 906h;
Step 906d: Determine the type of the following operation parameter according to the operation instruction; if the operation parameter is a jump offset, perform step 906e; if it is the offset of a local variable within the method it belongs to, perform step 906f; if it is a token, perform step 906g;
Step 906e: Recalculate the offset of the operation instruction, replace the original offset, and perform step 906h;
In this embodiment, the operation parameter after a jump instruction is a jump offset; because compressing the ILcode changes the positions of the data, the offset must be recalculated and the original offset replaced;
For example, the ILcode in this embodiment contains the jump operation 0x2B00, where 0x2B is the jump instruction and 0x00 is the offset, meaning a jump to the position immediately after 0x00. After the ILcode is compressed, the offset is calculated to still be 0x00, so this ILcode remains 0x2B00 after compression;
Another jump operation in this embodiment is 0x2D0C, where 0x2D is the jump instruction and 0x0C is the jump offset, meaning a jump past the 12 bytes that follow 0x0C, that is, the ILcode portion 0x0C067239000070281200000A0A is skipped. After the skipped portion is compressed, the ILcode 0x0C067239000070281200000A0A becomes 06720128030A, which is 6 bytes, so the offset after the operation instruction 0x2D is recalculated as 0x06, and the ILcode 0x2D0C is compressed to 0x2D06;
Step 906f: Record the operation instruction and the operation parameter, and perform step 906h;
One special case applies to this step: the operation instructions 0x0a, 0x0b, 0x0c and 0x0d also refer to local variables, but the instruction itself already specifies the local variable offset. 0x0a refers to the first local variable of the method the instruction belongs to, 0x0b to the second, 0x0c to the third, and 0x0d to the fourth. These instructions therefore have no operation parameter and are handled according to step 906c. When the definition method has more than 4 local variables, the original operation instruction and operation parameter are recorded directly; for example, in the ILcode 0x1104, the operation instruction 0x11 refers to the 5th local variable of the method, and such ILcode is recorded directly during compression;
It should be noted that in steps 906d and 906f, the offset of a local variable means its sequence number among all the local variables of the definition method it belongs to;
Step 906g: Record the operation instruction, replace the token in the operation parameter with its offset in the compressed structure, and perform step 906h;
Specifically, the token in the operation parameter that points to specific data is replaced with an offset. In a .net file, a token is four bytes: the high byte records the metadata table pointed to, and the low three bytes indicate the row of that table;
For example, an ILcode in this embodiment is, in big-endian representation, 0x72 1D000007, where 0x72 is the operation instruction and 0x1D000007 is the token of the data pointed to. The high byte 0x1D is the metadata table number and points to the metadata table FieldRVA; 0x000007 is the row number, that is, row 7. The IL instruction therefore points to row 7 of the metadata table FieldRVA in the .net file. The offset of that row is determined from the compressed structure of the .net file, for example 0x02, so the original ILcode 72 1D000007 can be replaced with 0x72 02;
Step 906h: Determine whether all operation instructions and operation parameters in the ILcode of the definition method have been read and compressed; if so, perform step 906i; if not, return to step 906a;
In a .net file, the method header records the ILcode length, for example 0x0000002F in this embodiment; once ILcode data of that length has been read, all operation instructions and operation parameters in the ILcode of the method are considered read and compressed;
Step 906i: Calculate the length of the compressed ILcode of the definition method;
In this embodiment, the data obtained after compressing the ILcode of the above definition method is:
[00] [72](02) [0A] [19] [0B] [06] [14] [FE01] [0D] [09] [2D] [06] [06] [72](01) [28](03) [0A] [06] [12] [01] [28](05) [28](03) [0C] [2B] [00] [08] [2A]
For ease of description, operation instructions in the data are shown in square brackets and operation parameters in parentheses; the calculated length is 0x0020;
Step 908: Determine whether the method headers of all definition methods have been read and the corresponding ILcode compressed; if so, perform step 910; if not, return to step 904 and continue by reading the next method header;
In a .net file, as described in step k of step 902, the number of rows of definition methods is recorded, that is, the number of definition methods, which is 4 in this embodiment; when all 4 method headers have been read and the corresponding ILcode compressed, all method headers are considered read and the corresponding ILcode compressed;
Step 910: Organize the compressed ILcode, the length of the compressed ILcode and the local variable offsets in a predetermined format to obtain the compressed method body;
In this embodiment, preferably, the predetermined format is as shown in Table 10:
Table 10
Length of the compressed ILcode | Offsets of the local variables | Compressed ILcode
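Steps 906a to 906i and the Table 10 layout can be sketched as follows. This is a hedged illustration rather than the patent's code: `compress_il`, `pack_method_body` and the operand-kind table are hypothetical names, the table covers only the opcodes that occur in this embodiment's example, and two-byte opcodes starting with 0xFE are assumed to carry no parameter, which holds for FE01 here but not in general. The token-to-offset mapping is read off by aligning the original ILcode fragments in the text with their compressed forms.

```python
NONE, TOKEN, JUMP8, LOCAL8 = range(4)

# Partial opcode table (assumption: only opcodes appearing in this example).
OPERAND_KIND = {
    0x00: NONE, 0x02: NONE, 0x06: NONE, 0x08: NONE, 0x09: NONE,
    0x0A: NONE, 0x0B: NONE, 0x0C: NONE, 0x0D: NONE, 0x14: NONE,
    0x19: NONE, 0x2A: NONE,
    0x11: LOCAL8, 0x12: LOCAL8,
    0x28: TOKEN, 0x72: TOKEN,
    0x2B: JUMP8, 0x2D: JUMP8,
}


def compress_il(il, token_offsets):
    """Shrink 4-byte tokens to 1-byte offsets and re-target short jumps."""
    out = bytearray()
    new_pos = {}     # old instruction offset -> offset in compressed output
    fixups = []      # (index of jump operand in out, old jump target)
    i = 0
    while i < len(il):                       # steps 906a and 906h
        new_pos[i] = len(out)
        op = il[i]
        if op == 0xFE:                       # two-byte opcode such as FE01
            out += il[i:i + 2]
            i += 2
            continue
        kind = OPERAND_KIND[op]              # steps 906b and 906d
        out.append(op)
        i += 1
        if kind == TOKEN:                    # step 906g
            token = int.from_bytes(il[i:i + 4], "little")
            out.append(token_offsets[token])
            i += 4
        elif kind == JUMP8:                  # step 906e, patched after the loop
            rel = int.from_bytes(il[i:i + 1], "little", signed=True)
            i += 1
            fixups.append((len(out), i + rel))
            out.append(0)
        elif kind == LOCAL8:                 # step 906f
            out.append(il[i])
            i += 1
    new_pos[len(il)] = len(out)
    for at, old_target in fixups:
        out[at] = (new_pos[old_target] - (at + 1)) & 0xFF
    return bytes(out)


def pack_method_body(compressed_il, local_offsets):
    """Table 10 layout: 2-byte little-endian length, local offsets, ILcode."""
    length = len(compressed_il).to_bytes(2, "little")
    return length + bytes(local_offsets) + compressed_il


tokens = {0x70000039: 0x01, 0x0A000012: 0x03, 0x0A000013: 0x05}
jump = compress_il(bytes.fromhex("2D0C067239000070281200000A0A"), tokens)
tail = compress_il(bytes.fromhex("061201281300000A281200000A0C2B00082A"), tokens)
print(jump.hex(), tail.hex())
```

Jump targets are patched in a second pass because a forward jump's new distance is only known once the instructions it skips have been compressed.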
For example, in this embodiment, the compression result of the method body is:
[2000] [(04)(07)(04)(05)] [00] [72](02) [0A] [19] [0B] [06] [14] [FE01] [0D] [09] [2D] [06] [06] [72](01) [28](03) [0A] [06] [12] [01] [28](05) [28](03) [0C] [2B] [00] [08] [2A]
The compressed method body above uses little-endian representation, where 0x2000 (that is, 0x0020 stored little-endian) is the length of the compressed ILcode, (04)(07)(04)(05) are the offsets of the 4 local variables in the metadata table, and the remaining data is the compressed ILcode, including the compressed operation instructions and their parameters.
The compression result of the method body of the second definition method in this embodiment is given below:
[0400] [] [02] [28](07) [2A]
The compression format in the above result is only the preferred structure and may be varied accordingly, for example by placing the local variable offsets before the length of the compressed IL instructions, or after the compressed IL instructions.
The method provided by this embodiment compresses the method bodies in a .net file, which effectively reduces the storage space occupied by the .net file, relaxes the storage-space constraints on using .net files, and facilitates deploying .net files on devices with small storage capacity.
Embodiment 11
This embodiment provides a method for compressing namespaces in a .net file. As shown in FIG. 22, the method includes:
Step 1002: Obtain the name of the namespace to which the current type in the .net file belongs;
Preferably, the namespace name may be obtained in this embodiment as follows: obtain the metadata table in the .net file that contains namespace name offsets; obtain from that table the namespace name offset of the current type; and read the namespace name from the "#Strings" stream according to that offset.
A .net file contains multiple tables; those containing namespace name offsets are the defined type or interface table TypeDef and the referenced type table TypeRef.
Step 1004: Compress the namespace name according to a predetermined algorithm;
This embodiment compresses the namespace name as follows: form a namespace string from the namespace name; hash the namespace string to obtain a hash value; and take predetermined bytes of the hash value as the compressed namespace name.
The hash algorithm may be MD5, SHA-1, SHA-2 or the like.
Forming the namespace string from the namespace name includes: using a connector to join the public key token of the .net file with the namespace name to obtain the namespace string.
Step 1006: Determine the type count corresponding to the namespace name, where the type count is the number of types contained in the namespace;
Preferably, the type count may be determined as follows: when the namespace name is obtained for the first time (that is, it has not been obtained before), its type count is set to 1; each subsequent time the namespace name is obtained, the type count is incremented by 1, until the metadata table has been traversed.
Step 1008: Combine the compressed namespace name and the type count in a predetermined format to obtain the compression result of the namespace corresponding to the namespace name.
The predetermined format may be a fixed number of bytes in two parts: some of the bytes hold the type count, and the remaining bytes hold the compressed namespace name.
Preferably, after the step of obtaining the name of the namespace to which the current type in the .net file belongs (step 1002), the method further includes: determining whether the currently obtained namespace name has already been obtained; if not, performing step 1004; if so, incrementing the type count of that namespace by 1.
Preferably, the method for compressing namespaces in a .net file provided by this embodiment may further include: determining whether the namespace names of all types have been read; if so, performing the step of combining the compressed namespace name and the type count in the predetermined format (step 1008); otherwise, reading the namespace name of the next type (returning to step 1002).
This embodiment compresses the obtained namespace names and combines each compressed namespace name with its type count to obtain the compressed namespace. This effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), thereby enhancing the functionality of such media.
Embodiment 12
This embodiment provides a method for compressing namespaces in a .net file. In this method, a file that has been compiled by the .net platform and has not undergone namespace compression is called a .net file, and the namespace compression is performed by a compression program. As shown in FIG. 23, the method includes:
Step 1102: Obtain the metadata table in the .net file that contains namespace name offsets;
A .net file contains multiple metadata tables; those containing namespace name offsets are TypeDef (the defined type and interface table) and TypeRef (the referenced type table). The following takes obtaining the metadata table TypeDef from a .net file as an example to illustrate how a metadata table is obtained, based on the following code:
namespace MyCompany.MyOnCardApp
{
    public class MyService : MarshalByRefObject
    {
        static Version ver = new Version(1, 1, 1, 1);
        static Int32 callCount = 0;
        static ClassB classb = new ClassB();
        String strResult = Boolean.FalseString;
        public String MySampleMethod()
        {
            String strHello = "Hello World!";
            return strHello + callCount.ToString();
        }
    }
    public class ClassB {}
    public struct StructB {}
}
After the above code is compiled with the .net platform, the file helloworld.exe is obtained and stored on the hard disk in binary form; this binary file is the .net file. The structure of the .net file is shown in FIG. 24 and includes the Dos header (Dos Header), the PE signature, and so on.
The compression program obtains the metadata table as follows:
a. The compression program locates the Dos header of the .net file and obtains the Dos header 0x5a4d;
b. The compression program skips the first agreed number of bytes after the Dos header and reads the offset address of the PE signature, obtaining 0x00000080; in this embodiment, the first agreed number of bytes is 0x003a;
c. The compression program locates the PE signature according to its offset address 0x00000080 and obtains the PE signature 0x4550;
d. Starting from the PE signature, the program offsets by the second agreed number of bytes and reads four bytes. This embodiment takes a 32-bit machine as an example: after offsetting 0x0074 bytes backward from the PE signature, the 4 bytes read are 0x00000010, which indicates that the binary file contains 0x10 directories and includes .net data; the metadata header address of the .net file is written in the 0x0F-th of these directories. On a 64-bit machine the second agreed number of bytes is 0x0084;
e. Offsetting the third agreed number of bytes backward from the data 0x00000010, the program reads eight bytes of data; in this embodiment, preferably, the third agreed number of bytes is 112. Of these eight bytes, the first four, 0x00002008, are the relative virtual address of the .net data header, and the last four, 0x00000048, are the length of the .net data header;
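Steps a to e can be sketched as below for the 32-bit layout. The function name is hypothetical, and the reading of step e (the 112 bytes are counted from the end of the 4-byte directory count) is an assumption that matches the standard 32-bit PE layout rather than something this text states explicitly. The test image is synthetic and contains only the fields the steps touch.

```python
import struct


def find_net_data_directory(image):
    """Follow steps a-e for a 32-bit PE image; returns (dir_count, rva, size)."""
    if image[0:2] != b"MZ":                         # step a: bytes 4D 5A
        raise ValueError("missing Dos header")
    # step b: skip 0x3a bytes after the 2-byte Dos header -> file offset 0x3c
    (pe_offset,) = struct.unpack_from("<I", image, 2 + 0x3A)
    if image[pe_offset:pe_offset + 2] != b"PE":     # step c: bytes 50 45
        raise ValueError("missing PE signature")
    count_at = pe_offset + 0x74                     # step d (32-bit layout)
    (dir_count,) = struct.unpack_from("<I", image, count_at)
    # step e: 112 bytes past the 4-byte count lies the .net directory entry
    rva, size = struct.unpack_from("<II", image, count_at + 4 + 112)
    return dir_count, rva, size


# Synthetic image holding only the fields the steps read (illustrative).
image = bytearray(0x178)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)
image[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", image, 0x80 + 0x74, 0x10)
struct.pack_into("<II", image, 0x80 + 0x74 + 4 + 112, 0x2008, 0x48)
result = find_net_data_directory(bytes(image))
print([hex(v) for v in result])
```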
f. The compression program obtains the linear address 0x00000208 from the relative virtual address of the .net data header, and reads the .net data header to obtain the following data:
48000000020005008C210000A0090000 090000000500000600000000000000005 020000080000000000000000000000000
It should be noted that the above data is stored in little-endian form; for example, the first 4 bytes, 0x48000000, are the length of the data, which is 0x00000048 in big-endian form;
In this embodiment, the linear address is the address of the .net data within the .net file, and the relative virtual address is the memory offset relative to the PE load point. The two are related by:
linear address = relative virtual address - relative virtual address of the section + file offset of the section
In this embodiment, the relative virtual address of the section containing the .net data directory is read as 0x00002000 and the file offset of the section is 0x00000200, so linear address = 0x00002008 - 0x00002000 + 0x00000200 = 0x00000208;
g. Offsetting the fourth agreed number of bytes backward from the .net data header, the compression program reads 8 bytes of data; in this embodiment the fourth agreed number of bytes is 8. Of the 8 bytes read, the first four, 0x0000218c, are the relative virtual address of the metadata header (MetaData Header), and the last four, 0x000009a0, are the length of the metadata;
h. The linear address 0x0000038c is obtained from the relative virtual address 0x0000218c of the metadata header, and the metadata content is obtained according to the linear address and the metadata length;
i. The compression program reads backward from the metadata header; when it reads the marker "#~", it reads the eight bytes preceding the marker, of which the first four bytes are the address of "#~". The "#~" stream is obtained through this address, and 8 bytes of data are read starting from the fifth agreed byte of the "#~" stream, namely 0x0000000920021c57, whose binary form is
100100100000000000100001110001010111;
In this embodiment, the fifth agreed byte is the 9th byte counted from the start of the "#~" stream;
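The address conversion in steps f to h can be checked with a short sketch. The function names are hypothetical; the input is the first 16 bytes of the .net data header quoted above, and the section values 0x2000 and 0x200 are the ones given in this embodiment.

```python
import struct


def rva_to_file_offset(rva, section_rva, section_file_offset):
    """linear address = RVA - section RVA + section file offset."""
    return rva - section_rva + section_file_offset


def parse_net_data_header(header):
    """Return (header_length, metadata_rva, metadata_size)."""
    (length,) = struct.unpack_from("<I", header, 0)
    # step g: the metadata RVA and length sit 8 bytes into the header
    metadata_rva, metadata_size = struct.unpack_from("<II", header, 8)
    return length, metadata_rva, metadata_size


header_start = bytes.fromhex("48000000020005008C210000A0090000")
length, metadata_rva, metadata_size = parse_net_data_header(header_start)
metadata_offset = rva_to_file_offset(metadata_rva, 0x2000, 0x200)
print(hex(length), hex(metadata_rva), hex(metadata_size), hex(metadata_offset))
```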
j. The binary data obtained in step i is read starting from the low bit, and each bit indicates whether the corresponding table exists in the .net file. For example, bit 1 indicates whether the metadata table Module exists: 1 means it exists and 0 means it does not. In this embodiment, the metadata table Module exists; bit 2 is 1, so the metadata table TypeRef exists; bit 3 is 1, so the metadata table TypeDef exists;
k. After the data 0x0000000920021c57, the compression program offsets by the sixth agreed number of bytes and reads the number of data rows of the metadata table TypeDef; in this embodiment it offsets 16 bytes backward and reads 4 bytes, obtaining 0x00000006, so the metadata table TypeDef has 6 data rows;
In the metadata, starting 8 bytes after the data 0x0000000920021c57, the numbers of data rows of the metadata tables present in the .net file are stored in sequence, 4 bytes each; after the row counts, the contents of each metadata table are stored in sequence, forming the metadata table area;
l. The compression program reads the contents of the metadata table TypeDef according to the agreed rules.
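Steps i to k can be sketched as follows. The function name is hypothetical, and table numbers are zero-based here (0 for Module, 1 for TypeRef, 2 for TypeDef), whereas the text counts bits from 1. The stream is synthetic: only the bit vector and the row counts for Module (1) and TypeDef (6) come from the text, and every other row count is a placeholder.

```python
import struct


def read_row_counts(tables_stream):
    """Return {table number: row count} for a "#~" stream."""
    (valid,) = struct.unpack_from("<Q", tables_stream, 8)   # 9th byte onward
    present = [n for n in range(64) if (valid >> n) & 1]    # low bit first
    # Row counts start 8 bytes after the 8-byte bit vector, 4 bytes per table.
    counts = struct.unpack_from("<%dI" % len(present), tables_stream, 24)
    return dict(zip(present, counts))


placeholder = 1                      # made-up row count for the other tables
others = [placeholder] * 9
stream = (bytes(8) + struct.pack("<Q", 0x0000000920021C57) + bytes(8)
          + struct.pack("<12I", 1, placeholder, 6, *others))
rows = read_row_counts(stream)
print(sorted(rows), rows[2])
```

Tables 17 (0x11) and 29 (0x1D) are among those present, which agrees with the StandAloneSig and FieldRVA table numbers used earlier in this document.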
The rules agreed in this embodiment are as follows. The compression program reads in sequence the row-count data recorded after 0x0000000920021c57, namely 0x00000001 and 0x0000001d; adding them shows that the two metadata tables before TypeDef, Module and TypeRef, contain 31 data rows in total. Each data row of the metadata table Module is 10 bytes and each data row of the metadata table TypeRef is 6 bytes, so in the metadata table area, after offsetting backward by 10*1 + 6*30 = 190 bytes, the content of the metadata table TypeDef begins at the 191st byte. Each data row of TypeDef is 14 bytes, so the length of the metadata table TypeDef is 14*6 = 84 bytes. The data of the metadata table TypeDef is as follows:
00000000 0100 0000 0000 0100 0100
01001000 1900 2300 0500 0100 0100
01001000 3900 2300 0900 0600 0400
01011000 4000 2300 0D00 0600 0500
01011000 4000 2300 0D00 0600 0500
00000000 7102 0000 0900 0700 0700
In the above data, designated bytes carry different information. Each row is one data row of the metadata table TypeDef and records the name and attributes of one type. For each row, reading from the high end, the first 4 bytes are Flags (the defined-type flags), bytes 5 and 6 are the offset of the defined type's name in the "#Strings" stream, and bytes 7 and 8 are the offset, in the "#Strings" stream, of the name of the namespace to which the defined type belongs; see Table 11 for details:
Table 11
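The row layout just described can be sketched as a small parser over the six TypeDef rows quoted above. The function name is hypothetical, and the last 6 bytes of each row are skipped because this passage does not describe them.

```python
import struct

# The six 14-byte TypeDef rows from this embodiment, concatenated.
TYPEDEF_ROWS = bytes.fromhex(
    "0000000001000000000001000100"
    "0100100019002300050001000100"
    "0100100039002300090006000400"
    "01011000400023000D0006000500"
    "01011000400023000D0006000500"
    "0000000071020000090007000700"
)


def parse_typedef_rows(table):
    """Return (flags, name_offset, namespace_offset) for each 14-byte row."""
    # "<IHH6x": 4-byte Flags, 2-byte name offset, 2-byte namespace offset,
    # then 6 bytes that are skipped.
    return list(struct.iter_unpack("<IHH6x", table))


rows = parse_typedef_rows(TYPEDEF_ROWS)
flags, name_offset, namespace_offset = rows[1]      # the second type
print(len(rows), hex(name_offset), hex(namespace_offset))
```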
In this embodiment, the metadata tables TypeDef and TypeRef are type tables. The type tables contain namespace information, and the compression program obtains the namespace information from the TypeDef and TypeRef tables.

In this embodiment, the type count of a namespace is the number of types that exist in that namespace. Before the compression program starts the compression operation, the type count of every namespace is set to 0. Within one .net file the type counts are independent, that is, each namespace has its own type count.

Step 1104: the compression program obtains the namespace name of a type in the metadata table.

This embodiment takes reading the TypeDef table as an example. The compression program reads the second type in the TypeDef table above and, following the table structure, reads the offset of the namespace to which the type belongs, 0x0023. This offset is relative to the "#Strings" stream in the metadata. The compression program obtains the namespace name from the offset as follows: it locates the "#Strings" stream in the metadata and reads the namespace information starting at offset 0x0023, stopping at the first 0x00. The namespace name obtained is:
4D79436F6D70616E792E4D794F6E4361726441707000

The ASCII text corresponding to this namespace name is MyCompany.MyOnCardApp. The compression program reads the namespace name of only one type at a time.

The compression program locates the "#Strings" stream in the metadata as follows. After obtaining the address 0x0000038c of the metadata header in sub-step h of step 1102, it reads onward from the metadata header. When it finds the tag "#Strings", it reads the 8 bytes that precede the tag and obtains the data 0x5C0300003C040000. The high 4 bytes are the offset of the "#Strings" stream relative to the metadata header, and the low 4 bytes are the length of the "#Strings" stream. Converted to big-endian notation, the high 4 bytes are 0x0000035c and the low 4 bytes are 0x0000043c. Starting from the metadata header address 0x0000038c, the compression program moves forward by 0x0000035c to reach the data area of the "#Strings" stream.

Step 1106: determine whether the namespace name just read is the same as the namespace of a type that has already been read. If so, go to step 1108; if not, go to step 1110.

The name read in step 1104 is the namespace name of the second type in the metadata table, and in this embodiment the first type in the TypeDef table has no namespace name. There is therefore no duplicate of an already-read namespace, and execution continues with step 1110 and the steps that follow it.

Step 1108: increase the type count of the duplicated namespace by 1, then go to step 1114.

Step 1110: assemble the namespace name into a namespace string in the agreed format.

To make it possible to tell which file a namespace belongs to and to reduce the collision rate of the data, the namespace name obtained in step 1104 is assembled into a namespace string in an agreed format. In this embodiment, the agreed namespace string format is preferably as shown in Table 12:

Table 12
PublicKeyToken    Connector (.)    Namespace
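Steps 1104 to 1110 can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function names are hypothetical, the "#Strings" stream is assumed to be available as a byte string, and the names are assumed to be ASCII as in the example.

```python
def decode_stream_header(entry: bytes):
    """Split the 8 bytes before the "#Strings" tag into (offset, size).

    Both values are stored little-endian, as in 5C030000 3C040000.
    """
    return (int.from_bytes(entry[:4], "little"),
            int.from_bytes(entry[4:8], "little"))


def read_heap_string(strings_heap: bytes, offset: int) -> str:
    """Step 1104: read from offset up to, but not including, the first 0x00."""
    end = strings_heap.index(b"\x00", offset)
    return strings_heap[offset:end].decode("ascii")


def namespace_strings(names, public_key_token: str, connector: str = "."):
    """Steps 1106 to 1110: build Table 12 strings and count types per namespace.

    A name seen for the first time gets a type count of 1 (step 1110);
    a repeated name has its count increased by 1 (step 1108).
    """
    counts = {}
    for name in names:
        key = f"{public_key_token}{connector}{name}"
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical heap: 0x23 filler bytes, then the name bytes quoted in the text.
heap = b"\x00" * 0x23 + bytes.fromhex(
    "4D79436F6D70616E792E4D794F6E4361726441707000")
print(read_heap_string(heap, 0x23))
print(decode_stream_header(bytes.fromhex("5C0300003C040000")))
```

With the values from the text, the stream offset decodes to 0x35c, so the data area of the stream starts at 0x38c + 0x35c = 0x6e8.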
Here, PublicKeyToken is the public key token. When the .net compiler strong-name signs the HelloWorld program, a HelloWorld.snk file is generated, which contains a public key and a private key. The HelloWorld code above is compiled into a PE file and a hash value is computed over the PE file; the compiler signs the hash value with the private key and embeds the public key in the PE file. The embedded public key is the PublicKey, and the PublicKeyToken is obtained by hashing the PublicKey and taking the last eight bytes. This embodiment uses the PublicKeyToken 38 05 F8 26 9D 52 A5 B2 as an example, and the connector is preferably ".", which gives the namespace string 3805F8269D52A5B2.MyCompany.MyOnCardApp. The connector may also be "-", "_", a space, and so on; it is not limited to ".".

In this step, the type count of the namespace is also set to 1, to indicate that the namespace contains at least one type.

Step 1112: hash the namespace string and take the agreed number of bits from the result as the compressed namespace name.

The hash operation may use an algorithm such as MD5, SHA-1 or SHA-2. This embodiment preferably uses MD5: hashing the namespace string "3805F8269D52A5B2.MyCompany.MyOnCardApp" gives a 128-bit result. In this embodiment, the first three bytes of this result are preferably taken as the compression result. To keep the data byte-aligned, "00" may be appended at the end, giving the compressed namespace name "ACE6EB00".

The compressed namespace name above is arranged in little-endian order; in big-endian order it is 00EBE6AC. Storage in a computer generally uses big-endian order, while x86-based smart card chips use little-endian order.

Note that the PublicKeyToken in the namespace string is case-sensitive. Its case must be kept consistent, so that the same PublicKeyToken does not produce different hash results merely because of a difference in case.

Step 1114: determine whether the namespace names of all types have been read. If so, go to step 1116; if not, read the namespace name of the namespace to which the next type belongs, that is, return to step 1104.

Whether the namespace names of all types have been read is decided by whether all rows of the metadata table have been read. If all rows have been read, the namespace names of all types have been read; otherwise, not all of them have been read.

Step 1116: the compression program organizes the compressed namespace name and the type count of the namespace in the agreed format to obtain the compression result of the namespace.

In this embodiment, the agreed format, that is, the structure of a namespace after compression by the compression program, is preferably:
Table 13
Type count    Namespace name

The type count in Table 13 is the number of types contained in the namespace.

In step 1114, once the namespace names of all types contained in the metadata table have been read and compressed, the compression results of all namespace names are available, together with the type counts of all namespaces.

Take the namespace compression result obtained in step 1112 as an example. The namespace MyCompany.MyOnCardApp contains 3 types, so the compressed namespace is:

0300 ACE6EB00

The compressed namespace above uses little-endian representation.

The structure of the compressed namespace shown above is only the preferred structure. It can be transformed accordingly, for example by placing the type count after the compressed namespace name, or by applying an equivalent encoding to the value of the type count; this is not described further here.

This embodiment describes the namespace compression process only in terms of reading the namespace of one type in the metadata table. In practice, a .net file may contain one or more namespaces, and each namespace may contain multiple types. In a real application, the namespace of every type in the metadata table is read one by one, its namespace name is extracted, each distinct namespace name is compressed, and the types of each namespace are counted, giving the compressed namespaces as described above. The method of this embodiment can be used whenever the namespace corresponding to a type in the metadata table of the .net file is read.
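Steps 1112 and 1116 can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the namespace string is assumed to be hashed as ASCII bytes (the text does not state the encoding), and the type count is assumed to occupy 2 bytes in little-endian order, as the example 0300 suggests.

```python
import hashlib
import struct


def compress_namespace_name(namespace_string: str, keep: int = 3) -> bytes:
    """Step 1112: MD5 the string, keep the first `keep` bytes, pad with 00."""
    digest = hashlib.md5(namespace_string.encode("ascii")).digest()  # 16 bytes
    return digest[:keep] + b"\x00"


def pack_namespace_record(type_count: int, compressed_name: bytes) -> bytes:
    """Step 1116: Table 13 layout, type count first, then the compressed name."""
    return struct.pack("<H", type_count) + compressed_name


name = compress_namespace_name("3805F8269D52A5B2.MyCompany.MyOnCardApp")
record = pack_namespace_record(3, name)
print(record.hex().upper())

# Packing the compressed name quoted in the text with a type count of 3.
quoted = pack_namespace_record(3, bytes.fromhex("ACE6EB00"))
```

Keeping only three bytes of the digest trades a higher collision probability for a smaller record, which is why the text prefixes the name with the PublicKeyToken before hashing.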
The following is the result obtained after compressing the namespaces of a .net file:
Namespace:
(0) 0100 D93DEB00 // 367DB8A346085E5D.System.Runtime.Remoting
(1) 0100 6E880000 // 367DB8A346085E5D.System.Runtime.Remoting.Channels
(2) 0100 1178D900 // 367DB8A346085E5D.SmartCard.Runtime.Remoting.Channels.APDU
(3) 0200 00F64D00 // D9E1E811B0CFFB39.System
(5) 0100 1C5DD200 // 367DB8A346085E5D.System
(6) 0400 00F64D00 // D9E1E811B0CFFB39.System
(A) 0100 1C5DD200 // 367DB8A346085E5D.System
(B) 0200 C438E300 // CB18F1DFA0E7655B.MyCompany.MyOnCardApp
(D) 0100 00F64D00 // D9E1E811B0CFFB39.System

All of the results above use little-endian representation. Take the first compression result as an example: 0100 is the type count of the namespace System.Runtime.Remoting, that is, the namespace contains 0x0001 types, or one type. The following 4 bytes are the compression result of the namespace name System.Runtime.Remoting in this .net file, and the text after "//" is the namespace string that was assembled. Entries (1) to (D) all use the same structure and are not explained one by one.

The namespace compression method provided by this embodiment obtains the namespace name and compresses it in the agreed format. This compresses the namespace well and therefore saves the space needed to store the .net file. In particular, when a .net file is to run on a smart card whose data storage is limited, the compression method of this embodiment makes it possible to run the .net file and so enhances the capability of the smart card.
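The records in the listing can be decoded with the short sketch below. One reading of the listing, which the text does not state explicitly and which is therefore an assumption here, is that the label in parentheses is the running total, in hexadecimal, of the type counts of the preceding records. The function names are hypothetical.

```python
import struct

# The nine records from the listing, with the spaces removed.
RECORDS = [
    "0100D93DEB00", "01006E880000", "01001178D900",
    "020000F64D00", "01001C5DD200", "040000F64D00",
    "01001C5DD200", "0200C438E300", "010000F64D00",
]


def decode_record(hex_record: str):
    """Return (type count, 4-byte compressed namespace name)."""
    raw = bytes.fromhex(hex_record)
    (count,) = struct.unpack_from("<H", raw, 0)  # little-endian 16-bit count
    return count, raw[2:6]


def running_labels(records):
    """Label of each record as the sum of the type counts before it."""
    labels, total = [], 0
    for rec in records:
        labels.append(total)
        total += decode_record(rec)[0]
    return labels


print([format(label, "X") for label in running_labels(RECORDS)])
```

Under this reading, the computed labels reproduce the sequence 0, 1, 2, 3, 5, 6, A, B, D shown in the listing.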
Embodiment 13

Referring to FIG. 25, this embodiment provides an apparatus for compressing the defined types in a .net file. The apparatus includes:

a defined-type information acquisition module 1202, configured to acquire the information contained in a defined type used in the .net file, and to obtain the specified information and the count information of the defined type from that information;

a compression module 1204, configured to compress the specified information obtained by the defined-type information acquisition module 1202;

a compression result storage module 1206, configured to store the specified information compressed by the compression module 1204 and the count information obtained by the defined-type information acquisition module 1202 as the compression result of the defined type.

The defined-type information acquisition module 1202 may be implemented as follows: first read the metadata table in which the defined types of the .net file are located, namely the TypeDef table; then obtain from the TypeDef table the information contained in the defined types used in the .net file.

In this embodiment, the information contained in a defined type includes: the identifier of the defined type, the offset of the name of the defined type, and the offset of the methods in the defined type. When fields are used in the defined type, the information may also include the offset of the fields in the defined type. All of this information can be obtained by reading the data in the TypeDef table, in which each row represents one defined type and is 14 bytes long. The 14 bytes record the following information:

The first 4 bytes are Flags (the defined-type identifier); bytes 5 and 6 are the offset of the defined type's name in the "#Strings" stream; bytes 7 and 8 are the offset, in the "#Strings" stream of the .net file, of the name of the namespace to which the defined type belongs; bytes 9 and 10 are the offset of the parent class that the defined type inherits; bytes 11 and 12 are the offset, in the Field metadata table, of the fields contained in the defined type; bytes 13 and 14 are the offset, in the Method metadata table, of the methods contained in the defined type.

Given the information contained in the defined type, the specified information includes the identifier of the defined type and the name of the defined type, and may also include the information corresponding to the fields in the defined type. The name of the defined type can be found in the data stream of the .net file by means of the name offset above, and the information corresponding to the fields can be found in the Field metadata table by means of the field offset above.

The count information includes the method overload information of the defined type, the method count and the field count of the defined type, and so on.

When the compression module 1204 of this embodiment compresses the specified information, the compression method can be chosen according to the object being compressed, for example:

when compressing the identifier of a defined type, the identifier is first divided into a type identifier, an access identifier and a descriptive identifier; the three are then ORed together, and the resulting data is the compression result of the identifier of the defined type;

when compressing the name of a defined type, the name is hashed, and the agreed bytes are extracted from the result as the compression result of the name of the defined type;

when compressing the information corresponding to a field in the defined type (the name of the field, the identifier of the field and the type of the field), there are three cases, depending on the content of the information:
1) hash the name of the field, and extract the agreed bytes from the result as the compression result of the name of the field;

2) divide the identifier of the field into an access identifier and a descriptive identifier, OR the two together, and use the result as the compression result of the identifier of the field;

3) use the offset of the field's type among the compressed types as the compression result of the type of the field.

It follows that a compressed defined type includes: the compression result of the defined type's name, the compression result of the type identifier, the field count of the type, the method count, the method overload information, the information corresponding to the fields in the type, and so on. These items may be arranged in a predetermined format or in any order.

By compressing each part of the defined types in the .net file and storing the compression results of the parts in a predetermined format, this embodiment effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn enhances the functionality of the small-capacity storage medium.
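The 14-byte TypeDef row layout described in this embodiment can be decoded with the sketch below. It is an illustration only: the class and field names are hypothetical, and all values are assumed to be little-endian, consistent with the example data given earlier in the document.

```python
import struct
from typing import NamedTuple


class TypeDefRow(NamedTuple):
    flags: int             # bytes 1-4: defined-type identifier (Flags)
    name_offset: int       # bytes 5-6: type name offset in #Strings
    namespace_offset: int  # bytes 7-8: namespace name offset in #Strings
    extends: int           # bytes 9-10: offset of the inherited parent class
    field_list: int        # bytes 11-12: offset into the Field table
    method_list: int       # bytes 13-14: offset into the Method table


def parse_typedef_row(row: bytes) -> TypeDefRow:
    """Decode one 14-byte TypeDef row (one 32-bit and five 16-bit values)."""
    if len(row) != 14:
        raise ValueError("a TypeDef row in this example is 14 bytes long")
    return TypeDefRow(*struct.unpack("<IHHHHH", row))


# Second row of the TypeDef data shown earlier in the document.
row = parse_typedef_row(bytes.fromhex("01001000190023000500 01000100"))
print(row)
```

The name and namespace offsets obtained this way are the values that are then looked up in the "#Strings" stream before hashing.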
Embodiment 14

Referring to FIG. 26, this embodiment provides a method for compressing the defined types in a .net file. The method is described taking its implementation by the compression apparatus of Embodiment 13 as an example, and includes:

Step 1302: the defined-type information acquisition module 1202 acquires the information contained in a defined type used in the .net file;

Step 1304: the defined-type information acquisition module 1202 obtains the specified information and the count information of the defined type from the information contained in the defined type;

Step 1306: the compression module 1204 compresses the specified information;

Step 1308: the compression result storage module 1206 stores the compressed specified information and the count information as the compression result of the defined type.

The information contained in the defined type is obtained by the defined-type information acquisition module 1202 as follows: first read the defined-type metadata table of the .net file, namely the TypeDef table; then obtain from the TypeDef table the information contained in the defined types used in the .net file.

In this embodiment, the information contained in a defined type includes: the identifier of the defined type, the offset of the name of the defined type, and the offset of the methods in the defined type. When fields are used in the defined type, the information may also include the offset of the fields in the defined type. All of this information can be obtained by reading the data in the TypeDef table, in which each row represents one defined type and is 14 bytes long. The 14 bytes record the following information:

The first 4 bytes are Flags (the defined-type identifier); bytes 5 and 6 are the offset of the defined type's name in the "#Strings" stream; bytes 7 and 8 are the offset, in the "#Strings" stream of the .net file, of the name of the namespace to which the defined type belongs; bytes 9 and 10 are the offset of the parent class that the defined type inherits; bytes 11 and 12 are the offset of the fields contained in the defined type; bytes 13 and 14 are the offset of the methods contained in the defined type.

The specified information and the count information are the same as the corresponding information in Embodiment 13 and are not described again here.

Preferably, in this embodiment the identifier of the defined type is divided into a type identifier, an access identifier and a descriptive identifier. Accordingly, the step in which the compression module 1204 compresses the specified information in step 1306 includes: ORing the type identifier, the access identifier and the descriptive identifier and using the resulting data as the compression result of the identifier of the defined type; and hashing the name of the defined type and extracting the agreed bytes from the result as the compression result of the name of the defined type.

When the current defined type in the .net file contains fields, the information contained in the defined type further includes the offset of the fields in the defined type. Accordingly, the specified information further includes the information corresponding to the fields in the defined type, and the count information further includes the field count of the defined type.

Preferably, the information corresponding to a field in this embodiment includes the name of the field, the identifier of the field and the type of the field, where the identifier of the field is divided into an access identifier and a descriptive identifier. Accordingly, the step in which the compression module 1204 compresses the specified information in step 1306 includes: hashing the name of the field and extracting the agreed bytes from the result as the compression result of the name of the field; ORing the access identifier and the descriptive identifier of the field and using the result as the compression result of the identifier of the field; and using the offset of the field's type among the compressed types as the compression result of the type of the field.

Preferably, the information contained in the defined type includes the offset of the parent class that the defined type inherits. Accordingly, the method further includes: determining whether the parent class inherited by the defined type has already been compressed; if so, obtaining the offset of the inherited parent class; otherwise, compressing the inherited parent class and assigning an offset to the compressed parent class. Accordingly, the compression result of the defined type also includes the offset of the compressed parent class.

Preferably, the method further includes: determining whether the defined type has inherited interfaces; if so, obtaining the compressed offsets of the interfaces inherited by the defined type and the number of inherited interfaces. Accordingly, the compression result of the defined type also includes the compressed offsets of the inherited interfaces and the number of inherited interfaces.

Preferably, the method further includes: determining whether the defined type is a nested type; if so, obtaining the compressed offset of the type in which the defined type is located. Accordingly, the compression result of the defined type also includes the compressed offset of the type in which the defined type is located.

With the method above, the compressed data of each defined type is organized in the following format: the hash value of the defined type's name (the compression result of the name), the compression result of the type identifier, the count of interfaces inherited by the type, the offset of the type's parent class, the field count of the type, the method overload information of the type, the compressed offset of the type in which the defined type is located, the compressed offsets of the interfaces inherited by the type, the information corresponding to the fields in the type, and so on.

There may be several compressed offsets of inherited interfaces and several items of field information. When there are several, the interface offsets and the field information of the current defined type are listed one after another. In addition, the compressed offset of the enclosing type, the compressed offsets of the inherited interfaces and the information corresponding to the fields may be partly or entirely absent.
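The parent-class rule above (use the parent's offset if it has already been compressed, otherwise compress the parent first and assign it an offset) can be sketched as follows. This is a hypothetical illustration: the function name is invented, the actual compression of a type is elided, and offsets are assumed to be assigned sequentially in compression order, which the text does not specify.

```python
def offset_of(type_name, parents, offsets):
    """Return the compressed offset of type_name.

    parents maps a type name to its parent's name (or None).
    offsets records the offsets already assigned to compressed types.
    """
    if type_name in offsets:           # already compressed: reuse its offset
        return offsets[type_name]
    parent = parents.get(type_name)
    if parent is not None:             # compress the parent before the child
        offset_of(parent, parents, offsets)
    offsets[type_name] = len(offsets)  # assumed: next free sequential offset
    return offsets[type_name]


parents = {"ClassC": "ClassB", "ClassB": None}
offsets = {}
print(offset_of("ClassC", parents, offsets), offsets)
```

Compressing the parent first guarantees that the child's record can always store a valid parent offset, whatever order the types appear in.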
In this embodiment, each part of a defined type in the .net file is compressed and the compressed results are stored in a predetermined format. This effectively reduces the storage space occupied by the .net file, so that the .net file can be stored and run on a small-capacity storage medium (for example, a smart card), which in turn extends the functionality of such a medium.

Embodiment 15

This embodiment provides a method for compressing the defined types in a .net file, described by way of a specific application example. The example also involves the compression and storage of the referenced types in the .net file; the compression results of the referenced types are treated as known content here and are used directly.

The file obtained by compiling the following code is taken as the example for explaining the compression of defined types in a .net file. Part of the code is as follows:

namespace MyCompany.MyOnCardApp
{
    public class MyService : MarshalByRefObject
    {
        public String MySampleMethod()
        {
            String strHello = "Hello World!";
            return strHello + callCount.ToString();
        }
    }
    public class ClassA
    {
        public class ClassC : ClassB, IA, IB
        {
            static String strField;
            Int32 iField;
            public ClassC(String str1, int i)
            {
                strField = str1;
                iField = i;
            }
            public String TestC()
            {
                return null;
            }
        }
        private struct StructB { }
    }
    public class ClassB { }
    public interface IA { }
    public interface IB { }
}

Compiling the above code on the .net platform yields the file helloworld.exe, which is stored on the hard disk in binary form. This binary file is a .net file. A .net file runs in a Windows environment and conforms to the PE (Portable Executable) file format, which is the format of Windows executables; both .exe and .dll files in Windows are in PE format. Figure 27 is a schematic diagram of the structure of a .net file: the file contains a Dos header, a PE signature, and metadata (MetaData), and the metadata contains a metadata header (MetaData Header), metadata tables (MetaData Tables), and so on.

Referring to Figure 28, the method provided by this embodiment for compressing the defined types in a .net file includes:

Step 1401: Locate the start address of the metadata tables (MetaData Tables) in the .net file and obtain the bit vector of present tables. The metadata tables are part of the .net file, and they are located as follows:
1) Locate the Dos header of the .net file and obtain the Dos header identifier 0x5a4d;

2) Starting after the Dos header identifier, skip a first agreed number of bytes and read the offset address of the PE signature, obtaining 0x00000080; in this embodiment, the first agreed number of bytes is 0x003a;

3) Use the offset address 0x00000080 to locate the PE signature, obtaining 0x00004550;

4) Starting after the PE signature, skip a second agreed number of bytes and read four bytes. This embodiment takes a 32-bit machine as the example, for which the second agreed number of bytes is 0x0074: offsetting 0x0074 bytes from the PE signature and reading yields 0x00000010, which indicates that the binary file contains 0x10 directories and includes .net data. The metadata header address of the .net file is written in the 0x0F-th of these directories. On a 64-bit machine, the second agreed number of bytes is 0x0084;

5) Starting from the data 0x00000010, skip a third agreed number of bytes and read eight bytes; in this embodiment, the third agreed number of bytes is preferably 0x0070. Of these eight bytes, the first four, 0x00002008, are the relative virtual address of the .net data header in the .net file, and the last four, 0x00000048, are the length of the .net data header;

6) From the relative virtual address 0x00002008 of the .net data header, obtain the linear address 0x00000208 and read the .net data header, obtaining the following data:
48000000 02000500 0C220000 9C0A0000
09000000 01000006 00000000 00000000
50200000 80000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000

Note that the data above is stored in little-endian order. For example, the first 4 bytes, 0x48000000, converted to big-endian representation are 0x00000048, which is the length of the data; accordingly, 72 bytes are read, as given by the .net data header length 0x00000048.

In this embodiment, the linear address is the address of the .net data within the .net file, and the relative virtual address is the memory offset relative to the PE load point. The two are related by:

linear address = relative virtual address - section relative virtual address + section file offset

In this embodiment, the section holding the .net data directory is read as having relative virtual address 0x00002000 and file offset 0x00000200, so linear address = 0x00002008 - 0x00002000 + 0x00000200 = 0x00000208;
7) Starting from the .net data header, skip a fourth agreed number of bytes (8 bytes in this embodiment) and read a total of 8 bytes. Of these, the first four bytes, 0x0000220c, are the relative virtual address of the metadata header (MetaData Header), and the last four, 0x00000a9c, are the length of the metadata;

8) From the relative virtual address 0x0000220c of the metadata header, obtain the linear address 0x0000040c, and read the metadata content according to the linear address and the metadata length;

9) Read onward from the metadata header. On reaching the marker "#~" (0x237E), read the eight bytes that precede it: the first four are the offset of the "#~" stream data relative to the metadata header, and the last four are the length of the "#~" stream. The relative offset gives the data area of the "#~" stream. Starting at a fifth agreed byte of the "#~" stream, read 8 bytes to obtain the bit vector of present tables (MaskValid). In this embodiment, its value is 0x0000020920021f57, whose binary form is:

100000100100100000000000100001111101010111

In this embodiment, the fifth agreed byte is the 9th byte counted from the start of the "#~" stream.

The bit vector is read from the low-order bit. Each bit stands for one metadata table: the bit is 1 if the table exists and 0 otherwise. For example, counting from the low-order end, the 1st bit indicates whether the metadata table Module exists: 1 means it exists, 0 means it does not. In this embodiment, Module exists; the 2nd bit is 1, so the metadata table TypeRef exists; and the 3rd bit is 1, so the metadata table TypeDef exists.

Step 1402: Locate the metadata table TypeDef (defined types).

The bit vector read in step 1401 records, from the low-order bit to the high-order bit, whether each metadata table exists in the .net file. The 3rd bit indicates whether TypeDef exists: if it is 1, TypeDef exists; if it is 0, TypeDef does not exist. In this embodiment, TypeDef exists;
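The address arithmetic and the bit-vector test used in step 1401 can be checked with a short Python sketch. The function names and the zero-based table constants are illustrative assumptions, not part of the original; the numeric values are the ones quoted for this embodiment, and both headers are assumed to lie in the same section.

```python
def le_u32(raw: bytes) -> int:
    """Interpret four bytes stored little-endian as an unsigned integer."""
    return int.from_bytes(raw, "little")


def rva_to_file_offset(rva: int, section_rva: int, section_file_offset: int) -> int:
    """linear address = RVA - section RVA + section file offset."""
    return rva - section_rva + section_file_offset


def table_present(mask_valid: int, table_index: int) -> bool:
    """Bit `table_index` (0 = lowest bit) is 1 when that table exists."""
    return (mask_valid >> table_index) & 1 == 1


# Zero-based bit positions; the text counts the same bits starting from 1.
MODULE, TYPE_REF, TYPE_DEF, METHOD = 0, 1, 2, 6

header_length = le_u32(bytes.fromhex("48000000"))            # length of the .net data header
net_header = rva_to_file_offset(0x2008, 0x2000, 0x200)       # .net data header
metadata_header = rva_to_file_offset(0x220C, 0x2000, 0x200)  # metadata header
mask_valid = 0x0000020920021F57

print(header_length, hex(net_header), hex(metadata_header))
print([table_present(mask_valid, t) for t in (MODULE, TYPE_REF, TYPE_DEF, METHOD)])
```

Counting bits from zero keeps the shift expression simple; the "3rd bit" of the text is `TYPE_DEF = 2` here.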
Starting at the 9th byte after the bit vector 0x0000020920021f57, the number of rows in each metadata table present in the .net file is recorded in four-byte units. Skipping the row counts of the two metadata tables that precede TypeDef, 4 bytes are read starting at the 17th byte after the data 0x0000020920021f57, giving 0x0000000a; this value means that the metadata table TypeDef has 10 rows.

The row counts are followed by the contents of the metadata tables, stored one after another; this is the metadata table area. The contents of TypeDef are read as follows. From the bit vector read in step 1401, the metadata tables Module and TypeRef precede TypeDef in this embodiment; Module has 1 row of 10 bytes and TypeRef has 31 rows totalling 186 bytes. After skipping the 60 bytes that record the row counts, and then the Module and TypeRef tables, the 10 rows of TypeDef are read, each 14 bytes long. The data is as follows:
0x0000000001000000000001000100
0x0100100019002200050001000100
0x0100100038002200090002000300
0x0100100042002200050007000600
0x0100100049002200050007000700
0xA100000050002200000007000800
0xA100000053002200000007000800
0x0200100056000000140007000800
0x030110005D0000000D0009000A00
0x00000000C8030000050009000A00

In the metadata table TypeDef, each row describes one defined type. Bytes 1-4 of a row are the Flags (the defined-type flags); bytes 5-6 are the relative offset of the type's name in the "#Strings" stream of the .net file; bytes 7-8 are the relative offset in the "#Strings" stream of the name of the namespace the type belongs to; bytes 9-10 carry the information on the parent class that the type inherits from; bytes 11-12 are the row number, in the metadata table Field, of the first field the type contains; and bytes 13-14 are the row number, in the metadata table Method, of the first method the type contains.

In this embodiment, the rows of each metadata table have a fixed length, so the offset addresses of the other metadata tables in the .net file can be computed from the bit vector of present tables and the row counts, using the locating method described above.
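As an illustration of the row layout just described, the following Python sketch decodes one 14-byte TypeDef row. The class and function names are hypothetical, and the 14-byte width (two-byte indexes throughout) is taken from this embodiment rather than being general.

```python
import struct
from typing import NamedTuple


class TypeDefRow(NamedTuple):
    flags: int        # bytes 1-4
    name: int         # bytes 5-6, offset into "#Strings"
    namespace: int    # bytes 7-8, offset into "#Strings"
    extends: int      # bytes 9-10, parent-class information
    field_list: int   # bytes 11-12, first row in the Field table
    method_list: int  # bytes 13-14, first row in the Method table


def parse_typedef_row(raw: bytes) -> TypeDefRow:
    """Decode one 14-byte, little-endian TypeDef row."""
    return TypeDefRow(*struct.unpack("<IHHHHH", raw))


# 8th row of the listing above (ClassC in this embodiment).
row = parse_typedef_row(bytes.fromhex("0200100056000000140007000800"))
print([hex(value) for value in row])
```

Decoding with an explicit little-endian format string avoids the manual byte swapping that the text performs when it quotes values such as 0x00100002.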
Step 1403: Read the flags and the name of the defined type from the data in the metadata table TypeDef, and compress each of them.

First, read the defined-type flags from the TypeDef data and compress them. Four bytes of flags are read from the TypeDef row; from them the individual flag attributes of the defined type are known.

This embodiment divides the defined-type flags into three parts (type flag, access flag, and descriptive flag) and redefines the values of the flag attributes. The type flags include: predefined type 0x00, value type 0x01, enumeration type 0x02, array type 0x03, class type 0x04, interface type 0x05, unmanaged pointer type 0x06, and so on. The access flags include: non-public access NotPublic (0x00) and public access Public (0x10), and, for nested types, NestedPublic (0x20), NestedPrivate (0x30), NestedFamily (0x40), NestedAssembly (0x50), NestedFamANDAssem (0x60), and NestedFamORAssem (0x70). The descriptive flag describes properties of the fields in the current type: 0x08 if a non-serializable field exists, otherwise 0x00.

The defined-type flags are compressed by OR-ing the type flag, the access flag, and the descriptive flag; the resulting single byte is the compressed flags of the defined type.
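The OR-based packing can be sketched in Python as follows. The constant and function names are illustrative and not taken from the original; the values are the redefined ones listed above.

```python
# Redefined flag values as listed in the text; the names are hypothetical.
TYPE_CLASS = 0x04
TYPE_INTERFACE = 0x05
ACCESS_NOT_PUBLIC = 0x00
ACCESS_PUBLIC = 0x10
ACCESS_NESTED_PUBLIC = 0x20
DESC_NONE = 0x00
DESC_HAS_NOT_SERIALIZED_FIELD = 0x08


def compress_type_flags(type_flag: int, access_flag: int, desc_flag: int) -> int:
    """OR the three parts together into a single byte."""
    packed = type_flag | access_flag | desc_flag
    if not 0 <= packed <= 0xFF:
        raise ValueError("compressed flags must fit in one byte")
    return packed


# A nested public class with no non-serializable field.
print(hex(compress_type_flags(TYPE_CLASS, ACCESS_NESTED_PUBLIC, DESC_NONE)))  # 0x24
```

The three parts occupy disjoint bit ranges (low three bits, high nibble, bit 3), so the OR loses no information.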
For example, for the defined type ClassC in the 8th row of the metadata table TypeDef read in step 1402, the first 4 bytes give the defined-type flags 0x00100002. From these, the type flag is class (0x04), the access flag is NestedPublic (0x20), and the descriptive flag is 0x00. OR-ing these values, 0x04 | 0x20 | 0x00, gives 0x24, which is taken as the compressed flags of the defined type ClassC.

Next, read the name of the defined type from the TypeDef data and compress it; that is, read the name using its relative offset in the "#Strings" stream. For example, in the second row of the TypeDef table that was read, bytes 5-6 hold the value 0x0019. After locating the "#Strings" stream of the .net file, skip 0x0019 bytes and read until a 0x00 byte is reached, obtaining 0x4D79536572766572; this data is the name of the defined type in that row, MyServer.

A hash is computed over the name that was read, and a sixth agreed set of bytes of the hash result is taken as the compressed name of the defined type; in this embodiment, these are the first two bytes of the hash value. The hash algorithm may be MD5, SHA-1, SHA-2, or the like. For example, this embodiment applies MD5 to MyServer, obtaining 0x0CBEFBC1EF0639BA18485104440F399C, and takes the first two bytes, 0x0CBE, as the compressed name of the defined type MyServer.
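The two operations described above, reading a zero-terminated name from the "#Strings" data and keeping the first two bytes of its hash, can be sketched in Python. The function names and the toy heap are hypothetical, and UTF-8 encoding of the name before hashing is an assumption, since the text does not state the exact bytes that are hashed; the result may therefore differ from the value quoted above.

```python
import hashlib


def read_heap_string(strings_heap: bytes, offset: int) -> str:
    """Read the zero-terminated name that starts at `offset` in the "#Strings" data."""
    end = strings_heap.index(b"\x00", offset)
    return strings_heap[offset:end].decode("utf-8")


def compress_name(name: str, algorithm: str = "md5") -> bytes:
    """Return the first two bytes of the hash of `name` (UTF-8 encoding assumed)."""
    digest = hashlib.new(algorithm, name.encode("utf-8")).digest()
    return digest[:2]


# Toy heap: a leading zero byte, filler, and the name placed at offset 0x19.
heap = b"\x00" + b"x" * 0x17 + b"\x00" + bytes.fromhex("4D79536572766572") + b"\x00"
name = read_heap_string(heap, 0x19)
print(name, compress_name(name).hex())
```

Passing the algorithm by name makes it easy to try the SHA-1 or SHA-2 variants that the text also allows, for example `compress_name(name, "sha1")`.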
In addition, if the flags of the current type show that it is a nested type, the name of the type that encloses it is also read, the enclosing type's name is combined with the current type's name, and the hash is computed over the combination; the sixth agreed set of bytes of the hash result is taken as the compressed name of the current type. The combination may be formed by joining the two names with a connector, by concatenating them directly, or by applying a mathematical operation such as addition, subtraction, or XOR to the two names.

Taking the code given in this embodiment as an example: the type ClassC is nested in the type ClassA, so when ClassC is compressed, the type name of ClassA is first joined to the type name of ClassC. This embodiment uses the connector "+": the hash is computed over ClassA+ClassC, and the first two bytes of the hash value are taken as the compressed type name of ClassC. This avoids duplicate compression results for defined-type names. Here, the name read for ClassA is at 0x0042 and the name read for ClassC is at 0x0056; applying MD5 to ClassA+ClassC gives 0x9B8DE23910B330AD80BDB76E7AC19092, and the first two bytes, 0x9B8D, are taken as the compressed name of the defined type ClassC.

In this embodiment, the "#Strings" stream is located in the metadata as follows. After the metadata header address 0x0000040c has been obtained in step 1401, read onward from the metadata header. On finding the marker "#Strings" (0x23537472696E6773), read the 8 bytes that precede it, obtaining 0x0004000080040000. The high-order 4 bytes, converted to big-endian representation, are 0x00000400, which is the offset of the "#Strings" stream relative to the metadata header; the low-order 4 bytes, converted to big-endian representation, are 0x00000480, which is the length of the "#Strings" stream. Offsetting 0x00000400 bytes from the metadata header address 0x0000040c gives the data area of the "#Strings" stream.

Step 1404: Obtain the method overload information of the current defined type, and obtain the method count and the field count of the current defined type.

In this embodiment, the method count is obtained as follows. Read bytes 13-14 of the current defined type's row in the metadata table TypeDef; this is the row number, in the metadata table Method, of the first method the type contains. Then read the same value for the next defined type. The row number of the next type's first method minus the row number of the current type's first method is the number of methods contained in the current type. For the last row of TypeDef, the method count is obtained as the number of rows in the metadata table Method minus the row number of the first method of that last defined type.

In this embodiment, the TypeDef row read for the defined type ClassC gives 0x0008 as the row number of its first method in the metadata table Method, and the next row, for the defined type StructB, gives 0x000A as the row number of its first method.
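The subtraction used for the method count can be sketched in Python. The function name is hypothetical; the list holds bytes 13-14 of the ten TypeDef rows listed for this embodiment, and the last row, which needs the Method table's row count, is deliberately left out of this sketch.

```python
def counts_from_first_rows(first_rows):
    """Return next-minus-current differences for consecutive rows."""
    return [nxt - cur for cur, nxt in zip(first_rows, first_rows[1:])]


# Bytes 13-14 (first method row) of the ten TypeDef rows of this embodiment.
method_first_rows = [0x1, 0x1, 0x3, 0x6, 0x7, 0x8, 0x8, 0x8, 0xA, 0xA]

counts = counts_from_first_rows(method_first_rows)
print(counts[7])  # ClassC is the 8th row: 0xA - 0x8
```

The same helper applies to bytes 11-12 of the rows when counting fields.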
From 0x000A - 0x0008 it follows that the defined type ClassC contains 0x0002 methods.

The method overload information of a defined type describes whether virtual methods exist in the current type; for each virtual method found, the method overload information is incremented by 1. Its initial value for every defined type is 0.

First, locate the metadata table Method. The procedure is similar to locating the metadata table TypeDef in step 1402 and is briefly as follows. The bit vector read in step 1401 records, from the low-order bit to the high-order bit, whether each metadata table exists in the .net file; the 7th bit indicates whether Method exists. In this embodiment, the 7th bit is 1, so Method exists. Starting at the 9th byte after the bit vector 0x0000020920021f57, the row counts of the metadata tables present in the .net file are recorded in four-byte units. The bit vector shows that four other metadata tables precede Method; skipping their row counts, 4 bytes are read starting at the 25th byte after the data 0x0000020920021f57, giving 0x00000009, which means that Method has 9 rows.

The contents of Method are then read. From the bit vector and the row counts that follow it, in this embodiment Method is preceded by the tables Module (10 bytes), TypeRef (186 bytes), TypeDef (140 bytes), and Field (54 bytes). After skipping these four tables, the 9 rows of Method, 126 bytes in total, are read.

Then, using the row number of the current defined type's first method, the corresponding row is read from Method. Bytes 7-8 of each Method row are the method flags; they are used to decide whether the method is a virtual method, and if so, the method overload information is incremented by 1. If the method count of the current defined type is greater than 1, the next row of Method is read and handled in the same way, until all methods of the current defined type have been read and checked.

Specifically, whether a method of the current type is a virtual method that requires a new slot is decided as follows: read the method's flags and AND them with 0x0100; if the result is 0x0100, the method is a virtual method and requires a new slot.

The field count of the current defined type is obtained in a way similar to the method count, briefly as follows. Read bytes 11-12 of the current defined type's row in TypeDef to obtain the row number, in the metadata table Field, of the first field the type contains; then read bytes 11-12 of the next row to obtain the row number of the first field of the next defined type. The latter row number minus the former is the field count of the current type.

In this embodiment, the TypeDef row read for the defined type ClassC gives 0x0007 as the row number of its first field in the metadata table Field, and the next defined type, StructB, has 0x0009 as the row number of its first field. From 0x0009 - 0x0007, the defined type ClassC contains 0x0002 fields; that is, the field count of ClassC is 0x0002.

Step 1405: Obtain the information for the fields of the current defined type and compress it.

The metadata table Field is read in a way similar to step 1402. Each row of Field is 6 bytes long: bytes 1-2 hold the field's Flags (field flags), bytes 3-4 hold the field's name (field name), and bytes 5-6 hold the field's Signature information. The information for the fields of the current defined type is compressed as follows:
1) Obtain the field name and compress it.

A hash is computed over the field's name (field name) that was read, and a seventh agreed set of bytes of the hash value is taken as the compressed field name; in this embodiment, these are the first two bytes of the hash value. In this embodiment, ClassC contains two fields, named strField and iField. Taking the field name "strField" as the example, the hash value obtained is 0x846461722F82E1CAB3D95632E8424089, and its first two bytes, 0x8464, are taken as the compressed result of the field name "strField".

2) Compress the field flags.

The flag information of a field is determined from the field's Flags (field flags) read from the metadata table Field. This embodiment divides the flag information into two kinds, access flags and descriptive flags, and ORs the value of the access flag with the value of the descriptive flag to obtain the compressed Flags of the field.

The access flags are:

PrivateScope = 0x00 (private scope)
Private = 0x01 (private)
FamANDAssem = 0x02 (family and assembly)
Assembly = 0x03 (assembly)
Family = 0x04 (family)
FamORAssem = 0x05 (family or assembly)
Public = 0x06 (public)

The descriptive flags are:

Static = 0x10 (static)
InitOnly = 0x20 (init only)
NotSerialized = 0x80 (not serialized)

In this embodiment, the defined type ClassC contains the two fields strField and iField. When the flags of strField are compressed, analysis of its Flags shows that the access flag is Private = 0x01 and the descriptive flag is Static = 0x10; OR-ing 0x01 with 0x10 gives 0x11, so the compressed flags of the field strField are 0x11.
3) Obtain the type of the field. The type of a field is stored in the "#Blob" stream. Read the Signature information at the 4th byte of the current row of the metadata table Field; this value is the relative offset of the field's type information within the "#Blob" stream, and the corresponding data is read from the "#Blob" stream at that offset. The first byte read gives the length of the data that follows. The second byte gives the kind of data; if it is 0x06, the data describes the type of a field. The third byte either gives the field type directly or indicates that the field type information is contained in the fourth byte. Accordingly, either look up the type in the metadata tables that corresponds to the field type given by the third byte, or parse the field type information contained in the fourth byte to find the corresponding type in the metadata tables, and save the compressed offset of that type as the type of the field.
The "#Blob" stream is located in much the same way as the "#Strings" stream is located in the metadata in step 1402. After the metadata header address 0x0000040c has been obtained in step 1401, read forward from the metadata header. When the marker "#Blob" (0x23426C6F62) is found, read the 8 bytes that precede it, which gives 0xD4080000C8010000. The high 4 bytes, converted to big-endian notation, are 0x000008D4, the offset of the "#Blob" stream relative to the metadata header. The low 4 bytes, converted to big-endian notation, are 0x000001C8, the length of the "#Blob" stream. Offsetting the metadata header address 0x0000040c by 0x000008D4 gives the data area of the "#Blob" stream.
In this embodiment the field type of the field strField contained in the definition type ClassC is obtained as follows. Reading the data at the 4th byte of strField's row in the metadata table Field gives 0x000A. Data is then read at offset 0x000A in the data area of the "#Blob" stream. The first byte is 0x02, which means that 2 more bytes must be read, giving 0x060E. The second byte, 0x06, indicates that the data after it describes a field type. The third byte is 0x0E, which according to the language specification denotes the string type. Looking up the compressed offset of the string type in the metadata table TypeRef gives 0x03, and 0x03 is saved as the field type of strField.
In this embodiment the information for a compressed field of a definition type consists of three parts: a 2-byte compressed name (field name) value, a 1-byte compressed Flags (field flags) value, and a 1-byte compressed Signature value. If the current type contains several fields, the compressed fields are stored one after another.
By the method above, the compressed information of the fields strField and iField contained in ClassC is: 0x84641103 F1EC0106.
If the current definition type defines no fields, this item is absent from the compressed definition type.
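The three pieces of step 3) (locating the "#Blob" data area, reading a simple field signature, and packing the 4-byte field record) can be sketched as below. The helper names are hypothetical, the signature reader handles only the simple case walked through above (a one-byte length and a type given directly by the third byte), and the small `blob` buffer is made-up padding around the three bytes quoted in the text.

```python
import struct


def parse_stream_header(eight_bytes: bytes) -> tuple[int, int]:
    """Split the 8 bytes preceding a stream name into (offset, size).

    Both values are stored little-endian.
    """
    offset, size = struct.unpack("<II", eight_bytes)
    return offset, size


def read_field_signature(blob: bytes, sig_offset: int) -> int:
    """Return the type byte of a simple field signature."""
    length = blob[sig_offset]                     # byte 1: length of what follows
    body = blob[sig_offset + 1:sig_offset + 1 + length]
    if len(body) < 2 or body[0] != 0x06:          # byte 2: 0x06 marks a field
        raise ValueError("not a simple field signature")
    return body[1]                                # byte 3: field type


def pack_field(name_hash: bytes, flags: int, type_offset: int) -> bytes:
    """2-byte name hash + 1-byte flags + 1-byte compressed type offset."""
    if len(name_hash) != 2:
        raise ValueError("name hash must be 2 bytes")
    return name_hash + bytes([flags, type_offset])


METADATA_HEADER = 0x0000040C
offset, size = parse_stream_header(bytes.fromhex("D4080000C8010000"))
blob_start = METADATA_HEADER + offset             # start of the "#Blob" data area

blob = bytearray(16)                              # hypothetical buffer
blob[0x0A:0x0D] = bytes([0x02, 0x06, 0x0E])       # bytes quoted in the text
field_type = read_field_signature(bytes(blob), 0x0A)

record = (pack_field(bytes.fromhex("8464"), 0x11, 0x03)
          + pack_field(bytes.fromhex("F1EC"), 0x01, 0x06))
print(hex(offset), hex(size), hex(blob_start), hex(field_type), record.hex().upper())
```

Mapping the type byte (0x0E for string) to the compressed offset 0x03 requires the already compressed TypeRef table, so that lookup is left out of the sketch.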
Step 1406: Determine whether the current definition type has a parent class. If it has a parent class that has not yet been compressed, perform step 1403 to compress the parent class of the current class recursively; otherwise perform step 1407.
Read bytes 9 and 10 of the row of the current type in the metadata table TypeDef. If the data read is 0x0000, the current type has no parent class; perform step 1407. If the data read is not 0x0000, convert it from its little-endian stored form to big-endian form, that is, high byte first and low byte last, and shift the binary value of the converted data right by two bits to obtain the row number of the current type's parent class in the metadata table TypeRef or TypeDef. AND the row number obtained after the shift with 0x03. If the result is 0, the parent class of the current type is in the metadata table TypeRef; perform step 1407. If the result is 1, the parent class is in the metadata table TypeDef; look up the type in the row of TypeDef with that row number, and perform step 1407 if it has already been compressed, or step 1403 otherwise.
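The decoding in step 1406 can be sketched as follows. The function name is hypothetical and the code follows the wording of the step literally: shift first, then AND the shifted row number with 0x03. Readers comparing this with the TypeDefOrRef coded index of ECMA-335, where the table is selected by the low two bits of the unshifted value, should note that the step describes the test differently. The sample bytes correspond to the value 0x0014 that this embodiment reads for ClassC.

```python
def decode_parent(raw: bytes):
    """Decode bytes 9-10 of a TypeDef row as worded in step 1406.

    Returns None when there is no parent class, otherwise a tuple
    (row_number, table_name).
    """
    value = int.from_bytes(raw, "little")   # stored little-endian
    if value == 0x0000:
        return None                         # no parent class
    row = value >> 2                        # shift right by two bits
    selector = row & 0x03                   # AND the row number with 0x03
    if selector == 0:
        table = "TypeRef"
    elif selector == 1:
        table = "TypeDef"
    else:
        table = "unspecified"               # the step covers only 0 and 1
    return row, table


print(decode_parent(bytes([0x14, 0x00])))   # (5, 'TypeDef')
print(decode_parent(bytes([0x00, 0x00])))   # None
```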
Step 1407: Obtain the compressed offset of the parent class that the definition type inherits.
Read bytes 9 and 10 of the row of the current type in the metadata table TypeDef. If the data read is 0x0000, the current type has no parent class, and 0xFF is saved to indicate that the current type has no parent type.
If the data read is not 0x0000, convert it from its little-endian stored form to big-endian form, that is, high byte first and low byte last, and shift the binary value of the converted data right by two bits to obtain the row number of the current type's parent class in the metadata table TypeRef or TypeDef.
AND the row number obtained after the shift with 0x03. If the result is 0, the parent class of the current type is in the metadata table TypeRef; look up the compressed offset of the type in the row of TypeRef with that row number and store it as the offset of the current type's parent class. If the result is 1, the parent class is in the metadata table TypeDef; look up the compressed offset of the type in the row of TypeDef with that row number and store it as the offset of the current type's parent class.
For example, in this embodiment the parent class of ClassC is obtained as follows. Read bytes 9 and 10 of ClassC's row in the metadata table TypeDef; converted to big-endian form the data is 0x0014, which is 10100 in binary. Shifting 10100 right by 2 bits gives 101, that is, 0x05. Since 0x05 AND 0x03 is 0x01, the parent class of ClassC is in row 0x05 of the metadata table TypeDef. The compressed offset of the definition type in row 0x05 of TypeDef is then looked up and stored as the offset of ClassC's parent class.
Step 1408: Assign an offset to the current definition type.
After each definition type has been compressed by the method of the steps above, obtain the compression information of the metadata table TypeRef (reference types) in the .net file, and assign offsets to the definition types consecutively, continuing from the compressed offsets of the reference types. For example, if the offset of the last compressed reference type in the .net file is 0x1A, the first compressed definition type has offset 0x1B and the second has offset 0x1C.
Step 1409: Determine whether any definition type is waiting in the cache for its parent class offset. If so, perform step 1407; otherwise perform step 1410.
Step 1410: Determine whether the data of all definition types has been analyzed. If so, perform step 1411; otherwise perform step 1403 and continue compressing the remaining definition types in the metadata table TypeDef.
Step 1411: Obtain the offsets of the interfaces inherited by the current definition type and the number of inherited interfaces.
After the data of all definition types has been processed as above, the interface offsets and the number of interfaces inherited by each definition type can be obtained by querying the metadata table InterfaceImpl.
Locate the metadata table InterfaceImpl (interface class) from the bit vector of present tables; the table is read in a way similar to step 1402. Read the metadata table InterfaceImpl.
In the metadata table InterfaceImpl each row has 4 bytes. Bytes 1 and 2, Class, give the row in the metadata table TypeDef of the type that inherits the interface. The value in bytes 3 and 4, Interface, converted to binary and shifted right by 2 bits, gives the row of the interface class in the metadata table TypeDef.
From the information read from InterfaceImpl, obtain the interface classes inherited by each definition type in the metadata table TypeDef and the number of inherited interfaces; the compressed offset of each such interface class is also obtained.
If the current definition type inherits no interface, the interface count in the compressed definition type is 0x00 and no interface information is present.
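The grouping performed in step 1411 can be sketched as follows. The function name and the sample rows are hypothetical, and reading both 2-byte columns as little-endian is an assumption carried over from how the other table values in this embodiment are stored.

```python
import struct
from collections import defaultdict


def collect_interfaces(table: bytes) -> dict:
    """Group InterfaceImpl rows by the TypeDef row of the inheriting class.

    Each row is 4 bytes: Class (2 bytes) then Interface (2 bytes). The row
    of the interface class is the Interface value shifted right by 2 bits.
    """
    if len(table) % 4:
        raise ValueError("InterfaceImpl data must be a multiple of 4 bytes")
    by_class = defaultdict(list)
    for class_row, interface in struct.iter_unpack("<HH", table):
        by_class[class_row].append(interface >> 2)
    return dict(by_class)


# Hypothetical table: the type in TypeDef row 3 inherits the interfaces
# found in rows 1 and 2.
rows = struct.pack("<HH", 3, 1 << 2) + struct.pack("<HH", 3, 2 << 2)
interfaces = collect_interfaces(rows)
print(interfaces)                     # {3: [1, 2]}
print(len(interfaces.get(4, [])))     # 0, i.e. interface count 0x00
```

Each row number found this way still has to be translated into the compressed offset of the interface class, which depends on the offsets assigned in step 1408.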
Step 1412: Obtain the compressed offset of the type that encloses a nested type.
Determine from the flag information of the current definition type whether the current type is a nested type. If it is, obtain the offset of the type that encloses it; if it is not, this item is absent from the compressed definition type.
Whether the current type is nested is judged from the four bytes of definition type flag information read from the metadata table TypeDef in step 1403. If it is a nested type, the information in the metadata table NestedClass (nested types) must also be read. Locate the metadata table NestedClass from the bit vector of present tables, in a way similar to step 1402, and read it.
In the metadata table NestedClass each row has 4 bytes. Bytes 1 and 2, NestedClass, give the row of the current nested type in the metadata table TypeDef. Bytes 3 and 4, EnclosingClass, give the row in TypeDef of the type that encloses the current definition type.
Find the row in the metadata table TypeDef of the type that encloses the current type, obtain the compressed offset of that enclosing type, and use it as the enclosing type of the current nested type.
Step 1413: Organize and store the compressed definition type data.
The compressed data of each definition type is organized in the following format: the hash value of the definition type name, the compression result of the type flags, the count of interfaces inherited by the type, the offset of the type's parent class, the count of fields contained in the type, the method overload information of the type, the offset of the type enclosing a nested type, the offsets of the interfaces inherited by the type, and the information for the fields of the type.
There may be several interface offsets and several field entries; when there are, the interface offsets and the field entries of the current definition type are arranged in order. In addition, the offset of the enclosing type, the offsets of the inherited interfaces and the field information may each be present or absent; whatever is absent is not written into the compression result.
For example, for the code provided in this embodiment, the compression result of the definition type ClassC is:
0x9B8D240216020200171819F1EC010684641103
To make this compression result clearer, the data is laid out in Table 14, which gives the parsed structure of the compressed definition type ClassC.
Table 14
Count of inherited interfaces                    02
Offset of the inherited parent class             16
Count of fields in the definition type           02
Count of methods in the definition type          02
Method overload information                      00
Offset of the type enclosing the nested type     17
Offsets of the inherited interfaces              18, 19
Information for the defined fields               F1EC 01 06, 8464 11 03
After every definition type in the .net file has been compressed, the results are stored sequentially in the order of the offsets assigned after compression.
The data compressed in this embodiment is the binary data produced by compiling code written for the .net framework. With the method of this embodiment the compression ratio for definition types can reach up to 30%, which effectively reduces the storage space occupied by a .net file, so that .net files can be stored and run on small-capacity storage media (for example, smart cards), which in turn extends the capabilities of such media.
The compression method or compression apparatus of the above embodiments effectively reduces the storage space occupied by .net files, makes it easier to use .net files on a variety of devices, and also saves considerable system resources and improves resource utilization.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of several computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
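As an illustration of the record layout in Table 14, the sketch below decodes the ClassC result quoted in this embodiment. The function name is hypothetical. Two points are assumptions rather than statements from the text: the leading three bytes (9B8D and 24) are read as a 2-byte name hash and a 1-byte flags value, following the format order given in step 1413, and the caller tells the decoder whether the type is nested, because the encoding of the compressed flags is not spelled out here.

```python
def parse_definition_type(record: bytes, is_nested: bool) -> dict:
    """Decode one compressed definition type record (layout of Table 14)."""
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        chunk = record[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("record truncated")
        pos += n
        return chunk

    out = {
        "name_hash": take(2).hex().upper(),   # assumed 2-byte name hash
        "flags": take(1)[0],                  # assumed 1-byte type flags
        "interface_count": take(1)[0],
        "parent_offset": take(1)[0],
        "field_count": take(1)[0],
        "method_count": take(1)[0],
        "overload_info": take(1)[0],
    }
    # Optional parts are simply absent from the record when they do not apply.
    out["enclosing_offset"] = take(1)[0] if is_nested else None
    out["interface_offsets"] = list(take(out["interface_count"]))
    out["fields"] = []
    for _ in range(out["field_count"]):
        entry = take(4)                       # 2-byte hash, flags, type offset
        out["fields"].append((entry[:2].hex().upper(), entry[2], entry[3]))
    if pos != len(record):
        raise ValueError("unexpected trailing bytes")
    return out


classc = parse_definition_type(
    bytes.fromhex("9B8D240216020200171819F1EC010684641103"), is_nested=True)
print(classc["interface_offsets"], classc["fields"])
```

Decoding the 19 bytes this way consumes the record exactly, which is consistent with the values listed in Table 14.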
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may be modified and varied in various ways. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims
1. A method for compressing a .net file, characterized in that the method comprises at least one of the following steps:
obtaining the reference types in the .net file and compressing the reference types;
obtaining the definition methods in the .net file and compressing the definition methods;
obtaining the method bodies of the definition methods in the .net file and compressing the method bodies;
obtaining the namespaces in the .net file and compressing the namespaces;
obtaining the definition types in the .net file and compressing the definition types.
2. The method according to claim 1, characterized in that the step of obtaining the reference types in the .net file and compressing the reference types comprises:
obtaining the name of a reference type used in the .net file and compressing it to obtain the compressed name of the reference type;
counting the method count and the field count of the reference type;
combining the compressed name of the reference type, the method count and the field count in a predetermined format to obtain the compression result of the reference type.
3. The method according to claim 2, characterized in that the step of obtaining the name of a reference type used in the .net file comprises:
obtaining a first metadata table in the .net file;
reading, from the first metadata table, the address information of the name of the reference type used in the .net file;
reading the name of the reference type according to the address information.
4. The method according to claim 2, characterized in that the step of obtaining the name of a reference type used in the .net file and compressing it comprises:
obtaining the name of the reference type used in the .net file;
generating a reference type name string from the obtained name of the reference type and then compressing it;
wherein the step of generating a reference type name string from the obtained name of the reference type comprises:
converting the obtained name of the reference type into a predetermined encoding format to generate the reference type name string;
or
obtaining the name of the namespace to which the reference type belongs, and combining the namespace name with the name of the reference type to generate the reference type name string;
and the step of compressing the reference type name string comprises:
performing a hash operation on the reference type name string to obtain a hash value;
taking predetermined bytes of the hash value as the compressed name of the reference type.
5. The method according to claim 2, characterized in that the step of counting the method count and the field count of the reference type comprises:
obtaining a second metadata table;
performing the following operations on each row of the second metadata table:
reading the reference type pointed to by the current row of the second metadata table;
when the name of the reference type pointed to by the current row matches the obtained name of the reference type, determining from the characteristic flag value of the current row whether the row records a method; if so, incrementing the method count of the reference type by 1; otherwise, incrementing the field count of the reference type by 1.
6. The method according to claim 2, characterized in that the predetermined format is a fixed number of bytes comprising three parts, wherein the first part is the compressed name of the reference type, the second part is the method count, and the third part is the field count.
7. The method according to claim 1, characterized in that the step of obtaining the definition methods in the .net file and compressing the definition methods comprises:
locating the .net file;
locating, from the .net file, the definition method table in the metadata tables of the .net file and the related streams;
constructing, according to the definition method table, a character string from the contents of the corresponding data items of each definition method in the streams, the contents of the data items including a parameter count;
performing a hash operation on the character string to convert it into a name hash value;
compressing the execution flags and access flags of the definition method;
compressing the parameter table of the definition method in the streams;
organizing the name hash value, the compressed execution flags and access flags, the parameter count and the compressed parameter table according to a preset rule to obtain a compressed structure.
8. The method according to claim 7, characterized in that locating, from the .net file, the definition method table in the metadata tables of the .net file and the related streams comprises:
locating the address and size of each stream in the .net file from the contents of the file header of the .net file;
locating the offset address of the definition method table data block within the metadata stream from the bit vector and the table record counts in the metadata header of the metadata stream.
9. The method according to claim 7, characterized in that constructing, according to the definition method table, a character string from the contents of the corresponding data items of each definition method in the streams comprises:
reading the value of the name item in the data items, and reading data from the string stream among the streams according to that value, to obtain the name of the definition method;
reading the value of the signature item in the data items, reading data from the Blob stream among the streams according to that value, and analyzing from that data the parameter information of the definition method and the type of its return value;
finding, in the definition type table or the reference type table of the metadata tables, the type information pointed to by the type of the return value, and reading the type name and namespace name of that type from the string stream at the offsets recorded in the corresponding name item and namespace item of that type's data items;
forming a return value full-name string from the namespace name and the type name;
reading the type name and namespace name of each parameter according to the parameter types in the obtained parameter information, the data obtained forming a parameter full-name string;
forming the character string from the return value full-name string, the name of the definition method and the parameter full-name string;
and compressing the parameter table of the definition method comprises:
locating the corresponding parameter row in the parameter table of the metadata tables according to the value of the parameter table item in the data items;
reading the corresponding parameter row information, which comprises a 2-byte Flags item, a Sequence item and a Name item;
compressing the parameter row information, specifically by discarding the Sequence item and the Name item and compressing the contents of the Flags item into 1 byte;
determining the flags of the parameter row from the value of the Flags item;
combining the flags and the offset, in the type storage area of the compressed file, of the parameter type in the data items into parameter information;
adding the parameter information to the compressed structure.
10. The method according to claim 7, characterized in that performing a hash operation on the character string to convert it into a name hash value comprises:
performing a hash operation on the character string;
taking the first two bytes of the result, converting them into a value type for storage, and storing the data as the name hash value of the definition method.
11. The method according to claim 7, characterized in that compressing the execution flags and access flags of the definition method comprises:
regrouping the data items of the execution flags and access flags and discarding some of them, so that the execution flags and access flags are finally merged from 4 bytes into 2 bytes.
12. 根据权利要求 7所述的方法, 其特征在于, 压缩所述. net文件中的定义 方法还包括: 12. The method according to claim 7, wherein the compressing the definition in the .net file further comprises:
确定所述定义方法是大头方法; Determining that the definition method is a big method;
对所述大头方法的局部变量进行压缩; Compressing local variables of the big header method;
在所述压缩结构中加入局部变量计数和所述压缩的局部变量; 对所述大头方法的局部变量进行压缩包括: Adding a local variable count and the compressed local variable to the compression structure; compressing the local variable of the large header method includes:
对所述大头方法的类型信息进行分析, 得到最大栈大小和大头方法 标识;
压缩描述所述最大栈大小的数据, 包括: 取 16位字节的该数据的低 8位, 舍弃高 8位; The type information of the big head method is analyzed to obtain a maximum stack size and a big header method identifier; Compressing data describing the maximum stack size includes: taking a lower 8 bits of the data of 16-bit bytes, and discarding the upper 8 bits;
分析所述大头方法标识, 获取局部变量签名标识, 以获取局部变量 个数; The big head method identifier is analyzed, and the local variable signature identifier is obtained to obtain the number of local variables;
分析所述大头方法标识, 得到异常结构计数和异常信息, 对其中的 异常信息进行压缩; Analyzing the big head method identifier, obtaining an abnormal structure count and abnormal information, and compressing the abnormal information therein;
if the big-header method has a structured exception handling table after the IL code, compressing the exception information therein comprises:
determining the method header size in bytes and the code size of the big-header method by analyzing the method flag information of the big-header method;
locating the structured exception handling table according to the method header size and the code size;
locating the content of each section according to the offset address of the storage area in which the exception handling sections are stored;
compressing each section as follows: if the exception structure of the current section is in the big-header format, the storage occupied by each exception structure in the current section is a first length; reading the data items in the method body section, discarding the HandlerLength item, and compressing the values of the TryOffset, TryLength and HandlerOffset items to 2 bytes each by discarding the high-order bytes and keeping the low-order bytes; compressing the four bytes of the ClassToken into one byte by discarding the high-order bits and keeping only the low 8 bits; if the exception structure of the current section is in the small-header format, the storage occupied by each exception structure in the current section is a second length; reading the data items in the method body section and discarding the HandlerLength item;
finding the corresponding exception type information in the defined type table and the referenced type table of the metadata tables by means of the ClassToken obtained in the above step;
after the local variables of the big-header method are compressed, the method further comprises: obtaining the garbage collection flag and the Finally count of the big-header method; organizing a big-header method structure table according to the garbage collection flag, the maximum stack size, the Finally count, the exception structure count and the compressed exception information; and adding the organized big-header method structure table to the compressed structure.
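The big-header clause compression described above can be sketched as follows. This is only an illustration: it assumes the 24-byte exception clause layout that ECMA-335 calls the fat format (six little-endian 32-bit fields), and the function name `compress_big_clause`, the order of the output fields, and the omission of the Flags field from the output are assumptions, since the claim only states which fields are dropped or truncated.

```python
import struct

# Assumed input layout (ECMA-335 fat exception clause), little-endian:
# Flags, TryOffset, TryLength, HandlerOffset, HandlerLength, ClassToken
BIG_CLAUSE = struct.Struct("<IIIIII")


def compress_big_clause(raw: bytes) -> bytes:
    """Compress one 24-byte big-header exception clause to 7 bytes."""
    _flags, try_offset, try_length, handler_offset, _handler_length, class_token = (
        BIG_CLAUSE.unpack(raw)
    )
    # HandlerLength is discarded; the three offsets/lengths keep only their
    # low 16 bits; ClassToken keeps only its low 8 bits.
    return struct.pack(
        "<HHHB",
        try_offset & 0xFFFF,
        try_length & 0xFFFF,
        handler_offset & 0xFFFF,
        class_token & 0xFF,
    )


# Hypothetical clause: try block at 0x10 of length 0x20, handler at 0x30.
example = BIG_CLAUSE.pack(0, 0x00010010, 0x20, 0x30, 0x08, 0x02000005)
compressed = compress_big_clause(example)
```

Truncation of this kind is lossy in general; it is only safe when the original values already fit in the reduced width, which the claim appears to take for granted on a smart card target.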
13. The method according to claim 1, wherein the step of obtaining the method body of a defined method in the .net file and compressing the method body comprises:
obtaining the method header of a defined method used in the .net file;
compressing the method body of the defined method to obtain a compression result of the method body;
the method further comprises:
determining whether the method headers of all defined methods used in the .net file have been read and the corresponding method bodies compressed; if not, continuing to read the method header of the next defined method; otherwise, ending the compression.
14. The method according to claim 13, wherein the step of compressing the method body of the defined method comprises:
obtaining the local variables of the defined method according to the method header, and determining the offset of the local variables according to the type of the local variables, the offset of a local variable being its offset in the compressed structure corresponding to the .net file;
reading the ILcode of the defined method according to the method header, compressing the ILcode, and calculating the length of the compressed ILcode;
combining the length of the compressed ILcode, the offset of the local variables and the compressed ILcode according to a predetermined format to obtain the compression result of the method body;
the step of obtaining the local variables of the defined method according to the method header comprises: reading the first byte of the method header, and determining from the first byte whether the defined method is a big-header method or a small-header method; if it is a big-header method, obtaining the local variables according to the local variable token in the big-header method; if it is a small-header method, setting the local variables to […];
the data in the compressed structure corresponding to the .net file are arranged in order, and the offset of each row of data corresponds to the token of that row of data;
the step of reading the ILcode of the defined method according to the method header comprises: determining the length of the ILcode of the defined method according to the information in the method header; and reading the ILcode according to the determined length;
the step of compressing the ILcode comprises:
reading an operation instruction in the ILcode, and checking whether the operation instruction is followed by an operand;
if not, recording the operation instruction directly;
otherwise, determining the type of the operand;
when the operand is a jump offset, determining the skipped ILcode according to the jump offset, recalculating the jump offset according to the skipped ILcode, and recording the operation instruction and the recalculated jump offset;
when the operand is an offset pointing to a local variable in the method body, recording the operation instruction and the operand;
when the operand is a token, determining the offset corresponding to the token in the compressed structure, and recording the operation instruction and the determined offset;
the step of compressing the ILcode comprises:
determining whether all operation instructions and operands in the ILcode have been read and compressed; if so, performing the step of calculating the length of the compressed ILcode; otherwise, reading the next operation instruction and compressing it;
the predetermined format means that the length of the compressed ILcode, the offset of the local variables and the compressed ILcode are arranged in that order.
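The first-byte test and the length-driven read of the ILcode can be sketched as below. The sketch relies on the method header layout defined in ECMA-335, where the small-header and big-header formats are called tiny and fat: the low two bits of the first byte are 0x2 for a tiny header and 0x3 for a fat header. The function and key names are illustrative assumptions, not names used by the patent.

```python
import struct


def read_method_header(body: bytes) -> dict:
    """Classify a method header by its first byte and extract the code size."""
    first = body[0]
    kind = first & 0x03
    if kind == 0x02:
        # Tiny (small-header) format: one byte, upper six bits hold the code
        # size; ECMA-335 tiny methods carry no local variable signature.
        return {"big": False, "header_size": 1,
                "code_size": first >> 2, "local_var_token": 0}
    if kind == 0x03:
        # Fat (big-header) format: 12 flag bits + 4 size bits (in 4-byte
        # units), MaxStack, CodeSize, LocalVarSigTok.
        flags_and_size, max_stack, code_size, local_var_token = (
            struct.unpack_from("<HHII", body, 0)
        )
        return {"big": True, "header_size": (flags_and_size >> 12) * 4,
                "max_stack": max_stack, "code_size": code_size,
                "local_var_token": local_var_token}
    raise ValueError("first byte is not a valid method header")


def read_il_code(body: bytes) -> bytes:
    """Return the ILcode bytes that follow the header."""
    header = read_method_header(body)
    start = header["header_size"]
    return body[start:start + header["code_size"]]


# Hypothetical method bodies: a tiny header with 2 code bytes, and a fat
# header (12 bytes) with 3 code bytes and local variable token 0x11000001.
tiny_body = bytes([0x0A, 0x00, 0x2A])
fat_body = bytes.fromhex("13300800030000000100001100002a")
```

Recomputing jump offsets and replacing tokens with offsets in the compressed structure would then operate on the bytes returned by `read_il_code`; that part depends on the compressed table layout and is not sketched here.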
15. The method according to claim 13, wherein the step of obtaining the method header of a defined method used in the .net file comprises:
obtaining the metadata table MethodDef in the .net file;
reading, from the metadata table MethodDef, the address information of the method header of the defined method used in the .net file;
reading the method header of the defined method according to the address information.
16. The method according to claim 1, wherein the step of obtaining a namespace in the .net file and compressing the namespace comprises:
obtaining the name of the namespace to which the current type in the .net file belongs;
compressing the namespace name according to a predetermined algorithm;
determining the type count corresponding to the namespace name, the type count being the number of types included in that namespace;
combining the compressed namespace name and the type count according to a predetermined format to obtain the compression result of the namespace corresponding to the namespace name;
after the step of obtaining the name of the namespace to which the current type in the .net file belongs, the method further comprises:
determining whether the currently obtained namespace name has already been obtained; if not, performing the step of compressing the namespace name according to the predetermined algorithm;
the method further comprises:
determining whether the namespace names to which all types belong have been read; if so, performing the step of combining the compressed namespace name and the type count according to the predetermined format;
otherwise, reading the name of the namespace to which the next type belongs.
17. The method according to claim 16, wherein the step of obtaining the name of the namespace to which the current type in the .net file belongs comprises:
obtaining the metadata table in the .net file that contains namespace name offsets;
obtaining, from the metadata table, the namespace name offset of the namespace to which the current type belongs; and reading the namespace name from the "#Strings" stream according to the namespace name offset;
the step of determining the type count corresponding to the namespace name comprises:
when the namespace name is obtained for the first time, setting the type count corresponding to the namespace name to 1, and thereafter incrementing the type count by 1 each time the namespace name is obtained again, until the metadata table has been traversed;
the metadata table comprises:
the defined type or interface table TypeDef, and the referenced type table TypeRef.
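The counting rule in claim 17 amounts to a single pass over the type rows. A minimal sketch follows; it assumes the namespace name of each TypeDef or TypeRef row has already been resolved from the "#Strings" stream, and the function name is illustrative.

```python
def count_types_per_namespace(namespace_of_each_type):
    """Return {namespace name: number of types}, in first-seen order."""
    counts = {}
    for namespace in namespace_of_each_type:
        if namespace not in counts:
            # First time this namespace name is obtained: count starts at 1.
            counts[namespace] = 1
        else:
            # Seen before: one more type belongs to it.
            counts[namespace] += 1
    return counts


# Hypothetical rows: three types spread over two namespaces.
rows = ["System", "System.IO", "System"]
type_counts = count_types_per_namespace(rows)
```

Because each name is entered into the dictionary only on first sight, the same pass also answers the "has this namespace name already been obtained" check from claim 16.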
18. The method according to claim 16, wherein the step of compressing the namespace name according to a predetermined algorithm comprises:
composing a namespace string from the namespace name;
performing a hash operation on the namespace string to obtain a hash value;
taking predetermined bytes of the hash value as the compressed namespace name;
the step of composing a namespace string from the namespace name comprises: joining the public key token of the .net file and the namespace name with a connector to obtain the namespace string.
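A sketch of the name compression in claim 18 is given below. The passage does not fix the connector, the hash function, or which bytes are kept, so the choices here (a "." connector, MD5, and the first three digest bytes) are assumptions made purely for illustration, as is the function name.

```python
import hashlib


def compress_namespace_name(public_key_token: str, namespace: str,
                            connector: str = ".", keep: int = 3) -> bytes:
    """Hash 'token + connector + namespace' and keep a few digest bytes."""
    namespace_string = public_key_token + connector + namespace
    digest = hashlib.md5(namespace_string.encode("utf-8")).digest()
    # Assumed convention: the "predetermined bytes" are the leading ones.
    return digest[:keep]


# Hypothetical public key token and namespace name.
short_name = compress_namespace_name("b77a5c561934e089", "System.IO")
```

Prefixing the public key token means two assemblies that both declare a namespace with the same name still produce different compressed names, apart from hash collisions, which become more likely the fewer bytes are kept.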
19. The method according to claim 16, wherein the predetermined format is a fixed number of bytes, of which some bytes hold the type count and the remaining bytes hold the compressed namespace name;
the type count is placed before the compressed namespace name.
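The fixed-length layout of claim 19 can be illustrated as follows. The total length of 4 bytes (1 count byte followed by a 3-byte compressed name) is an assumed example, since the claim only says the entry has a fixed length with the type count first; the function name is hypothetical.

```python
def pack_namespace_entry(type_count: int, compressed_name: bytes,
                         total_length: int = 4) -> bytes:
    """Lay out one namespace entry: type count first, then the name bytes."""
    count_length = total_length - len(compressed_name)
    if count_length <= 0:
        raise ValueError("compressed name leaves no room for the type count")
    # to_bytes raises OverflowError if the count does not fit its field.
    return type_count.to_bytes(count_length, "little") + compressed_name


# Hypothetical entry: 2 types in a namespace whose compressed name is AA BB CC.
entry = pack_namespace_entry(2, bytes([0xAA, 0xBB, 0xCC]))
```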
20. The method according to claim 1, wherein the step of obtaining a defined type in the .net file and compressing the defined type comprises:
obtaining the information contained in a defined type used in the .net file;
obtaining the specified information and the count information of the defined type according to the information contained in the defined type;
compressing the specified information;
storing the compressed specified information and the count information as the compression result of the defined type.
21. The method according to claim 20, wherein the step of obtaining the information contained in a defined type used in the .net file comprises:
reading the metadata table in the .net file in which the defined type is located;
obtaining, from the metadata table in which the defined type is located, the information contained in the defined type used in the .net file.
22. The method according to claim 20, wherein the information contained in the defined type comprises: the flags of the defined type, the offset of the name of the defined type, and the offset of the methods in the defined type;
correspondingly, the specified information comprises: the flags of the defined type and the name of the defined type;
the count information comprises: the method overload information of the defined type and the count of the methods contained in the defined type;
the flags of the defined type are divided into a type flag, an access flag and a descriptive flag; correspondingly, the step of compressing the specified information comprises:
performing an OR operation on the type flag, the access flag and the descriptive flag, and taking the resulting data as the compression result of the flags of the defined type;
performing a hash operation on the name of the defined type, and extracting agreed bytes from the result as the compression result of the name of the defined type;
the information contained in the defined type further comprises: the offset of the fields in the defined type; correspondingly, the specified information further comprises: the information corresponding to the fields in the defined type; and the count information further comprises: the field count of the defined type;
the information corresponding to a field in the defined type comprises: the name of the field, the flags of the field and the type of the field, wherein the flags of the field are divided into an access flag and a descriptive flag; correspondingly, the step of compressing the specified information comprises:
performing a hash operation on the name of the field, and extracting agreed bytes from the result as the compression result of the name of the field;
performing an OR operation on the access flag and the descriptive flag of the field, and taking the result as the compression result of the flags of the field;
taking the compressed offset of the type corresponding to the type of the field as the compression result of the type of the field.
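The flag compression in claim 22 is a bitwise OR of the separate flag groups into one value. The sketch below uses made-up one-byte bit assignments; the actual compact flag values are not given in this passage, so every constant and name here is hypothetical.

```python
# Hypothetical compact flag values; each group occupies its own bits.
TYPE_CLASS = 0x00
TYPE_INTERFACE = 0x40
ACCESS_PRIVATE = 0x00
ACCESS_PUBLIC = 0x01
DESC_ABSTRACT = 0x10
DESC_SEALED = 0x20


def compress_type_flags(type_flag: int, access_flag: int,
                        descriptive_flag: int) -> int:
    """OR the three flag groups together into a single byte."""
    return (type_flag | access_flag | descriptive_flag) & 0xFF


# A public abstract interface under the hypothetical encoding above.
flags_byte = compress_type_flags(TYPE_INTERFACE, ACCESS_PUBLIC, DESC_ABSTRACT)
```

The OR is only reversible when the groups use disjoint bit positions, which is why the illustrative constants are chosen not to overlap.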
23. The method according to claim 20, wherein the information contained in the defined type comprises: the offset of the parent class inherited by the defined type;
correspondingly, the method further comprises:
determining whether the parent class inherited by the defined type has already been compressed; if so, obtaining the offset of the parent class; otherwise, compressing the parent class and assigning an offset to the compressed parent class;
correspondingly, the compression result of the defined type further comprises the offset of the compressed parent class.
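The "compress the parent first if it has not been compressed yet" rule of claim 23 can be sketched as a memoized recursive walk. In this sketch an offset is simply the order in which a type is compressed, and the input is a mapping from type name to parent name; both are simplifying assumptions, as are the function names.

```python
def assign_type_offsets(parent_of):
    """Return {type name: offset}, with every parent assigned before its child."""
    offsets = {}

    def offset_of(name):
        if name in offsets:
            # Already compressed: reuse the offset assigned earlier.
            return offsets[name]
        parent = parent_of.get(name)
        if parent is not None:
            # Parent not yet compressed: compress it first.
            offset_of(parent)
        offsets[name] = len(offsets)
        return offsets[name]

    for name in parent_of:
        offset_of(name)
    return offsets


# Hypothetical hierarchy, listed child-first to exercise the recursion.
hierarchy = {"Leaf": "Middle", "Middle": "Base", "Base": None}
type_offsets = assign_type_offsets(hierarchy)
```

Each compressed type can then store `type_offsets[parent]` as the parent offset in its own compression result.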
24. The method according to claim 20, wherein the method further comprises:
determining whether the defined type has inherited interfaces; if so, obtaining the compressed offsets of the interfaces inherited by the defined type and the number of inherited interfaces;
correspondingly, the compression result of the defined type further comprises the compressed offsets of the inherited interfaces and the number of inherited interfaces.
25. The method according to claim 20, wherein the method further comprises:
determining whether the defined type is a nested type; if so, obtaining the compressed offset of the type in which the defined type is nested;
correspondingly, the compression result of the defined type further comprises the compressed offset of the type in which the nested type is located.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/129,296 US8805801B2 (en) | 2009-12-30 | 2010-12-29 | Method for compressing a .net file |
Applications Claiming Priority (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910244160XA CN101770367B (en) | 2009-12-30 | 2009-12-30 | Compressing method and compressing device of .NET file |
CN200910244164A CN101770368B (en) | 2009-12-30 | 2009-12-30 | Compressing method and compressing device of namespace in .net file |
CN200910244160.X | 2009-12-30 | ||
CN 200910244162 CN101794219B (en) | 2009-12-30 | 2009-12-30 | Compression method and device of .net files |
CN2009102441652A CN101794221B (en) | 2009-12-30 | 2009-12-30 | Compression method and device of reference types in .net file |
CN200910244165.2 | 2009-12-30 | ||
CN2009102441633A CN101794220B (en) | 2009-12-30 | 2009-12-30 | Compression method and device of definition types in.net file |
CN200910244164.8 | 2009-12-30 | ||
CN200910244162.9 | 2009-12-30 | ||
CN200910244163.3 | 2009-12-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011079796A1 (en) | 2011-07-07 |
Family
ID=44226180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2010/080459 WO2011079796A1 (en) | 2009-12-30 | 2010-12-29 | Method for compressing.net document |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2011079796A1 (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1405705A (en) * | 2001-08-20 | 2003-03-26 | 北京九州计算机网络有限公司 | Intelligent compression method for file of computer |
CN1494767A (en) * | 2001-02-02 | 2004-05-05 | Method for compressing/decompressing structured document | |
CN101770367A (en) * | 2009-12-30 | 2010-07-07 | 北京飞天诚信科技有限公司 | Compressing method and compressing device of .NET file |
CN101770368A (en) * | 2009-12-30 | 2010-07-07 | 北京飞天诚信科技有限公司 | Compressing method and compressing device of namespace in .net file |
CN101794221A (en) * | 2009-12-30 | 2010-08-04 | 北京飞天诚信科技有限公司 | Compression method and device of reference types in .net file |
CN101794220A (en) * | 2009-12-30 | 2010-08-04 | 北京飞天诚信科技有限公司 | Compression method and device of definition types in.net file |
CN101794219A (en) * | 2009-12-30 | 2010-08-04 | 北京飞天诚信科技有限公司 | Compression method and device of .net files |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WWE | WIPO information: entry into national phase | Ref document number: 13129296; Country of ref document: US |
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10840583; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 10840583; Country of ref document: EP; Kind code of ref document: A1 |