
US20080201567A1 - Method of doing pack unicode zseries instructions - Google Patents


Info

Publication number
US20080201567A1
Authority
US
United States
Prior art keywords
block
byte
memory
pack
instructions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US11/676,568
Other versions
US7408484B1 (en
Inventor
Antonisamy A. Rajendran
Muruganandam Somasundaram
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/676,568 (US7408484B1)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: RAJENDRAN, ANTONISAMY A., SOMASUNDARAM, MURUGANANDAM
Priority to US12/182,377 (US20080301416A1)
Application granted
Publication of US7408484B1
Publication of US20080201567A1
Priority to US12/715,017 (US7868791B2)
Priority to US12/767,207 (US8077062B2)
Current legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • the present invention relates in general to systems, program products and methods for emulating the computer instructions found in a source computer architecture in which sequences of instructions on a target machine architecture are generated so as to produce the same results on both computer architectures. More particularly, the present invention is related to methods and systems for emulating PACK instructions for Unicode character strings. Even more particularly, the present invention is directed to emulation methods in which computer instructions found on zSeries machines are emulated on other computer architectures, notably the Intel x86 series of microprocessors and the PowerPC series of microprocessors. The present description also includes methods and systems for emulating the PACK ASCII instruction.
  • the Unicode standard for character representation is a two byte (“double byte” in some terminology) system in which each character is represented by 16 bits or two bytes of information.
  • This standard provides a vastly expanded range of representable characters including those from languages in which ideographs are employed to represent words and ideas as opposed to the use of individual letters.
  • ASCII or EBCDIC character representations which provide a maximum of 255 characters or signal indicators.
  • each byte (eight bits) in a data processing system can represent two decimal numbers.
  • decimal numbers are provided in a format in which each byte contains a representation of but one decimal number. It is therefore convenient to be able to PACK decimal numbers (or other data) into a packed format, that is, from a one-decimal-digit-per-byte format to a two-decimal-digit-per-byte format.
  • PACK instruction which is a basic member of a computer's instruction set. These instructions usually come in PACK/UNPACK pairs.
  • big-endian and small-endian are also relevant to the present discussion.
  • these concepts relate to the position in the memory architecture where the high order byte portion of an integer (or other data) is stored.
  • big-endian scheme the most significant byte of the integer is stored in the memory location with the lowest address.
  • small-endian scheme the most significant byte of the integer is stored in the memory location with the highest memory address.
  • the Intel x86 processors and chips which seek to duplicate their functionality, such as those produced by Advanced Micro Devices, Inc., use the small-endian (also called little-endian) format.
  • the zSeries of machines and most of the PowerPC devices employ the big-endian format.
  • PKU is an instruction present in the very well known zSeries computer architecture as found in products manufactured and sold by the assignee of the present invention. Descriptions of this and other instructions are found in any of the Principles of Operation (PoP) manuals published as accompanying documentation for the aforementioned data processing machine products.
  • This particular instruction converts a Unicode string to a packed format.
  • the format of the PKU instruction is “PKU TARGET, SOURCE (L2)” where L2 is the Length of the second operand (0≦L2≦64).
  • the length of the target is always 16 bytes.
  • Appendix I provides a description of an approach to providing emulation code for the PKU (Pack Unicode) instruction. Appendix I thus illustrates a block level algorithm that is used herein.
  • the format of the second operand is changed from Unicode to packed, and the result is placed at the first-operand location.
  • the packed format is described in Chapter 8, “Decimal Instructions.”
  • the two-byte second-operand characters are treated as Unicode Basic Latin characters containing decimal digits, having the binary encoding 0000-1001 for 0-9, in their rightmost four bit positions. The leftmost 12 bit positions of a character are ignored. The second operand is considered to be positive.
  • the implied positive sign (1100 binary) and the source digits are placed at the first-operand location.
  • the source digits are moved unchanged and are not checked for valid codes.
  • the sign is placed in the rightmost four bit positions of the rightmost byte of the result field, and the digits are placed adjacent to the sign and to each other in the remainder of the result field.
  • the length of the first operand is 16 bytes.
  • the byte length of the second operand is designated by the contents of the L2 field.
  • the second-operand length must not exceed 32 characters or 64 bytes, and the byte length must be even (L2 must be less than or equal to 63 and must be odd); otherwise, a specification exception is recognized.
  • PKA is a zSeries instruction that converts an ASCII string to packed format.
  • the format of the PKA instruction is “PKA TARGET SOURCE (L2)” where L2 is the Length of the second operand (0≦L2≦32).
  • the length of the target is always 16 bytes.
  • the format of the second operand is changed from ASCII to packed, and the result is placed at the first-operand location.
  • the packed format is described in Chapter 8, “Decimal Instructions.”
  • the second-operand bytes are treated as containing decimal digits, having the binary encoding 0000-1001 for 0-9, in their rightmost four bit positions. The leftmost four bit positions of a byte are ignored. The second operand is considered to be positive.
  • the implied positive sign (1100 binary) and the source digits are placed at the first-operand location.
  • the source digits are moved unchanged and are not checked for valid codes.
  • the sign is placed in the rightmost four bit positions of the rightmost byte of the result field, and the digits are placed adjacent to the sign and to each other in the remainder of the result field.
  • the length of the first operand is 16 bytes.
  • the length of the second operand is designated by the contents of the L2 field.
  • the second-operand length must not exceed 32 bytes (L2 must be less than or equal to 31); otherwise, a specification exception is recognized.
  • the algorithm discussed in detail below in the description section is a block by block algorithm useful for converting UNICODE strings to a packed format.
  • the algorithm works on both big-endian and small-endian machines.
  • block by block algorithms are much faster when compared to byte-by-byte methods.
  • the algorithm that was initially considered was a byte-by-byte approach which was later improved to the block-by-block method using 8 byte blocks at a time.
  • the algorithm discussed in detail below in the description section is a block by block algorithm to convert an ASCII string to a packed format.
  • the algorithm works on both big-endian and small-endian machines. When timed for performance, block by block algorithms are much faster when compared to byte by byte methods.
  • the algorithm that was initially considered was a byte-by-byte approach which was later improved to the block by block method using 8 byte blocks at a time.
  • a method for packing data from a Unicode field to a field which is approximately half as wide. First, input data to be packed is retrieved from memory and stored in an offset position in memory locations which represent a local array of bytes. A source pointer is then set to point to the data at address “local array+2.” Additionally, a target pointer is set to point to a position in memory which represents a local pack array which holds operation results. The pack operation is carried out in blocks instead of in a byte-by-byte operation.
  • a first byte of the present block is determined in accordance with the following C language script as: ((((BLOCK & 0x000F000F000F000F)<<12)|((BLOCK & 0x000F000F000F000F)<<24)) & 0xFF00000000000000), where “BLOCK” represents the current block of input data being processed.
  • the three remaining bytes of the present block are then generated by shifting and then ORing the byte to the 4-byte output block.
  • the pointers are incremented and the steps are repeated four times, once for each block.
  • the last nibble is set to 0xC and the pack array is stored in memory.
  • a nibble is a four bit data portion typically represented by a single hexadecimal digit.
  • FIG. 1 is a block diagram illustrating the environment in which the present invention operates and is employed
  • FIG. 2 is a block diagram illustrating the structure of a 64 byte chunk of Unicode structured data
  • FIG. 3 is a block diagram more particularly illustrating the structure of a single byte of the rightmost portion of a Unicode character, that is, the rightmost of its two bytes and particularly showing its structure as two nibbles;
  • FIG. 4 is a block diagram illustrating the structure of Unicode originated data after it is packed into a digits-only format together with a sign nibble;
  • FIG. 5 is a top view of a typical computer readable medium containing program code which implements the methods of the present invention, for example, a Compact Disc (CD) as shown.
  • CD Compact Disc
  • FIG. 6 is a block diagram illustrating the method employed for emulating the PKU instruction.
  • FIG. 7 is a block diagram illustrating the method employed for emulating the PKA instruction.
  • FIG. 8 is a block diagram illustrating an example of the environment in which the present invention is employed.
  • Emulators such as 320 accept as input instruction streams 305 , representing machine or assembly language instructions which are designed to operate on source machine 300 .
  • Emulator 320 employs memory 315 in target machine 310 to produce a stream of instructions which are capable of executing on target machine 310 .
  • FIG. 1 particularly shows operation within an emulation environment, it is also noted that the present invention contemplates a situation in which emulator 320 operates essentially as an interpreter in which the instructions are not only translated to the new architecture but in which they are also executed at essentially the same time.
  • FIG. 2 is provided so as to illustrate the structure of data which is typically supplied to a pack type of instruction.
  • FIG. 2 illustrates the structure of 64 bytes of data representing 32 Unicode characters.
  • data of this type would be processed in blocks of 16 bytes each.
  • the “x” that is shown in the rightmost portion of each Unicode character (32 of them in all) represents that portion of a Unicode character which includes the encoding for a decimal digit (and more too which is ultimately discarded).
  • the structure of such a rightmost Unicode character is more particularly illustrated in FIG.
  • the method of the present invention is embodied in its preferred form in the programming code set forth in the appendices one and two herein.
  • Such programming is typically provided on a computer readable medium such as disk 600 shown in FIG. 5 .
  • Such programming may also be distributed over any convenient data processing network.
  • zMemory refers to memory locations in the target system which are employed for the purposes of the emulation process especially when the target machine has the zSeries architecture.
  • zMemory is really just an exemplar and that any suitable memory in the target machine may be employed for this purpose.
  • the symbol “&” represents a logical bitwise AND operation
  • the symbol “|” represents a logical bitwise OR operation
  • the symbol “<<” represents a logical bit wise LEFT SHIFT operation with the number to the right of “<<” indicating the number of bit positions to be shifted
  • “0x” represents any single hexadecimal (four bit) digit (0000 through 1111);
  • “0F” represents the four bit hexadecimal digit which is all ones (that is, 1111) with “00” thus representing the hexadecimal digit which is all zeroes (that is, 0000).
  • the PKU instruction is discussed first.
  • Step 105 Fetch input from zMemory to local array of size 33 bytes using proper offset; that is, copy L2 bytes from zMemory into local array+(32-L2).
  • Step 110 Set source pointer to point to (local array+2).
  • the source pointer type is unsigned long long (for 8 byte processing).
  • Step 115 Set target pointer to point to pack array.
  • the target pointer type is int (for 4 byte processing).
  • Block 120 in FIG. 6 represents the return point for this loop which passes through “counter” 150 .
  • Step 130 The first byte of output is computed as follows:
  • Step 135 Get the three remaining bytes of output by shifting appropriately and then OR the byte to the 4-byte output block.
  • Step 140 Set the last nibble to 0xC for a positive sign.
  • Step 145 Store the pack array into zMemory.
  • the present method uses byte reversal functions whenever required to support both big-endian and little-endian architectures.
  • the architecture for the most relevant source architecture, the zSeries of machines, is big-endian.
  • Sample C code is given below in Appendix I where the input is assumed to be 64 bytes in length. Modifications to this code to accommodate other lengths are easily discerned.
  • Step 205 Copy input from zMemory to local array of size 33 bytes using proper offset; that is, copy L2 bytes from zMemory into local array+(32-L2).
  • Step 210 Set source pointer to point to local array+1; the source pointer type is unsigned long long (for 8 byte processing).
  • Step 215 Set target pointer to point to pack array; the target pointer type is int (for 4 byte processing).
  • Block 220 in FIG. 7 represents the return point for this loop which passes through “counter” 250 .
  • Step 225 Each time in the loop operate on the 8-byte input block to get 4 bytes of output;
  • Step 230 The first byte of output is calculated as follows:
  • Step 235 Get the 3 remaining bytes of output by shifting appropriately and OR the byte to the 4-byte output block.
  • Step 240 Set the last nibble to 0xC for positive sign.
  • Step 245 Store pack array into zMemory
  • FIG. 8 the environment in which the present invention operates is shown in FIG. 8 .
  • the present invention operates in a data processing environment which effectively includes one or more of the computer elements shown in FIG. 8 .
  • computer 500 includes central processing unit (CPU) 520 which accesses programs and data stored within random access memory 510 .
  • Memory 510 is typically volatile in nature and accordingly such systems are provided with nonvolatile memory typically in the form of rotatable magnetic memory 540 . While memory 540 is preferably a nonvolatile magnetic device, other media may be employed.
  • CPU 520 communicates with users at consoles such as terminal 550 through Input/Output unit 530 .
  • Terminal 550 is typically one of many, if not thousands, of consoles in communication with computer 500 through one or more I/O units 530 .
  • console unit 550 is shown as having included therein a device for reading media of one or more types such as CD-ROM 600 shown in FIG. 5 .
  • Media 600 may also comprise any convenient device including, but not limited to, magnetic media, optical storage devices and chips such as flash memory devices or so-called thumb drives.
  • Disk 600 also represents a more generic distribution medium in the form of electrical signals used to transmit data bits which represent codes for the instructions discussed herein. While such transmitted signals may be ephemeral in nature they still, nonetheless constitute a physical medium carrying the coded instruction bits and are intended for permanent capture at the signal's destination or destinations.

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Executing Machine-Instructions

Abstract

Emulation methods are provided for two PACK instructions, one for Unicode data and the other for ASCII coded data, in which processing is carried out block by block rather than byte by byte. This provides superior performance in the face of the usual challenges of executing emulated, as opposed to native, data processing machine instructions.

Description

    TECHNICAL FIELD
  • The present invention relates in general to systems, program products and methods for emulating the computer instructions found in a source computer architecture in which sequences of instructions on a target machine architecture are generated so as to produce the same results on both computer architectures. More particularly, the present invention is related to methods and systems for emulating PACK instructions for Unicode character strings. Even more particularly, the present invention is directed to emulation methods in which computer instructions found on zSeries machines are emulated on other computer architectures, notably the Intel x86 series of microprocessors and the PowerPC series of microprocessors. The present description also includes methods and systems for emulating the PACK ASCII instruction.
  • BACKGROUND OF THE INVENTION
  • As is well known, the Unicode standard for character representation is a two byte (“double byte” in some terminology) system in which each character is represented by 16 bits or two bytes of information. This standard provides a vastly expanded range of representable characters including those from languages in which ideographs are employed to represent words and ideas as opposed to the use of individual letters. This is in distinction to ASCII or EBCDIC character representations which provide a maximum of 255 characters or signal indicators.
  • It is also known that each byte (eight bits) in a data processing system can represent two decimal numbers. However, it is often the case that decimal numbers are provided in a format in which each byte contains a representation of but one decimal number. It is therefore convenient to be able to PACK decimal numbers (or other data) into a packed format, that is, from a one-decimal-digit-per-byte format to a two-decimal-digit-per-byte format. This is typically accomplished with some form of PACK instruction which is a basic member of a computer's instruction set. These instructions usually come in PACK/UNPACK pairs.
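  • As a minimal illustration of the packing idea (a sketch, not code from the appendices), the C function below combines two digits stored one per byte into a single byte holding both. The name `pack_two_digits` is a hypothetical helper introduced here.

```c
#include <stdint.h>

/* Hypothetical helper: pack two decimal digits, each stored one per byte
 * with the digit in the low nibble, into a single byte. */
uint8_t pack_two_digits(uint8_t high, uint8_t low)
{
    /* Only the low nibble of each input is kept; the upper four bits
     * (for example the 0x3 of ASCII '0'..'9') are discarded. */
    return (uint8_t)(((high & 0x0F) << 4) | (low & 0x0F));
}
```

  • For instance, the ASCII characters '4' (0x34) and '2' (0x32) would pack into the single byte 0x42.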
  • Also relevant to the present discussion are the notions of big-endian and small-endian. These concepts relate to the position in the memory architecture where the high order byte portion of an integer (or other data) is stored. In the big-endian scheme, the most significant byte of the integer is stored in the memory location with the lowest address. In the small-endian scheme, the most significant byte of the integer is stored in the memory location with the highest memory address. The Intel x86 processors and chips which seek to duplicate their functionality, such as those produced by Advanced Micro Devices, Inc., use the small-endian (also called little-endian) format. The zSeries of machines and most of the PowerPC devices employ the big-endian format.
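  • The difference between the two byte orders can be sketched in C as follows. Both function names are hypothetical and are not taken from the appendices. `load_be64` assembles an 8-byte block in big-endian order regardless of the host, which is one way (assumed here) of obtaining the same integer value on x86 and PowerPC targets.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host and 0 on a small-endian
 * (little-endian) host. */
int host_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    uint8_t first;
    memcpy(&first, &probe, 1);   /* byte stored at the lowest address */
    return first == 0x01;        /* big-endian stores the high byte first */
}

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit integer,
 * independent of the host byte order. */
uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];     /* earlier bytes end up more significant */
    return v;
}
```

  • An alternative with the same effect is to load the block natively and apply a byte reversal only when the host is small-endian.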
  • PKU is an instruction present in the very well known zSeries computer architecture as found in products manufactured and sold by the assignee of the present invention. Descriptions of this and other instructions are found in any of the Principles of Operation (PoP) manuals published as accompanying documentation for the aforementioned data processing machine products. This particular instruction converts a Unicode string to a packed format. The format of the PKU instruction is “PKU TARGET, SOURCE (L2)” where L2 is the Length of the second operand (0≦L2≦64). The length of the target is always 16 bytes. A sample program included herein as Appendix I provides a description of an approach to providing emulation code for the PKU (Pack Unicode) instruction. Appendix I thus illustrates a block level algorithm that is used herein.
  • The format of the second operand is changed from Unicode to packed, and the result is placed at the first-operand location. The packed format is described in Chapter 8, “Decimal Instructions.”
  • The two-byte second-operand characters are treated as Unicode Basic Latin characters containing decimal digits, having the binary encoding 0000-1001 for 0-9, in their rightmost four bit positions. The leftmost 12 bit positions of a character are ignored. The second operand is considered to be positive.
  • The implied positive sign (1100 binary) and the source digits are placed at the first-operand location. The source digits are moved unchanged and are not checked for valid codes. The sign is placed in the rightmost four bit positions of the rightmost byte of the result field, and the digits are placed adjacent to the sign and to each other in the remainder of the result field.
  • The result is obtained as if the operands were processed right to left. When necessary, the second operand is considered to be extended on the left with zeros.
  • The length of the first operand is 16 bytes.
  • The byte length of the second operand is designated by the contents of the L2 field. The second-operand length must not exceed 32 characters or 64 bytes, and the byte length must be even (L2 must be less than or equal to 63 and must be odd); otherwise, a specification exception is recognized.
  • When the length of the second operand is 32 characters (64 bytes), the leftmost character is ignored.
  • Paragraphs [0005] to [0011] above are taken from the published description of the Pack Unicode instruction z/Architecture Principles of Operation having a document number of SA22-7832-03 with a “Build Date” of May 04, 2004 12:13:20 and a “Build Version” of 1.3.1 of “BUILD/VM Version: UG03935” and a Drop Date of Thursday Aug. 8, 2003.
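  • The PKU behavior quoted above can be sketched as a plain byte-by-byte reference routine in C. This is an illustrative reading of the quoted text, not the block algorithm of Appendix I. The function name `pku_reference`, the use of a byte count `len` (the L2 field plus one), and the return codes are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte sketch of PKU. `src` holds two-byte characters with the
 * digit in the low nibble of the second (low-order) byte; `len` is the
 * source length in bytes. Returns 0 on success, or -1 where the
 * instruction would recognize a specification exception. */
int pku_reference(uint8_t result[16], const uint8_t *src, size_t len)
{
    if (len == 0 || len > 64 || (len % 2) != 0)
        return -1;

    memset(result, 0, 16);       /* extend on the left with zeros */
    result[15] = 0x0C;           /* implied positive sign, 1100 binary */

    /* Nibble slots are counted from the right; slot 0 is the sign, so
     * 31 digit slots remain and a 32nd (leftmost) character is ignored. */
    size_t slot = 1;
    for (size_t c = len / 2; c > 0 && slot < 32; c--, slot++) {
        uint8_t digit = src[2 * (c - 1) + 1] & 0x0F;  /* not validated */
        size_t byte = 15 - slot / 2;
        if (slot % 2)
            result[byte] |= (uint8_t)(digit << 4);    /* high nibble */
        else
            result[byte] |= digit;                    /* low nibble */
    }
    return 0;
}
```

  • Packing the three characters “123” this way would leave the last two result bytes as 0x12 0x3C, with the fourteen bytes to their left zero.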
  • PKA is a zSeries instruction that converts an ASCII string to packed format. The format of the PKA instruction is “PKA TARGET SOURCE (L2)” where L2 is the Length of the second operand (0≦L2≦32). The length of the target is always 16 bytes.
  • The format of the second operand is changed from ASCII to packed, and the result is placed at the first-operand location. The packed format is described in Chapter 8, “Decimal Instructions.”
  • The second-operand bytes are treated as containing decimal digits, having the binary encoding 0000-1001 for 0-9, in their rightmost four bit positions. The leftmost four bit positions of a byte are ignored. The second operand is considered to be positive.
  • The implied positive sign (1100 binary) and the source digits are placed at the first-operand location. The source digits are moved unchanged and are not checked for valid codes. The sign is placed in the rightmost four bit positions of the rightmost byte of the result field, and the digits are placed adjacent to the sign and to each other in the remainder of the result field.
  • The result is obtained as if the operands were processed right to left. When necessary, the second operand is considered to be extended on the left with zeros.
  • The length of the first operand is 16 bytes.
  • The length of the second operand is designated by the contents of the L2 field. The second-operand length must not exceed 32 bytes (L2 must be less than or equal to 31); otherwise, a specification exception is recognized.
  • When the length of the second operand is 32 bytes, the leftmost byte is ignored.
  • Paragraphs [0014] to [0020] above are taken from the published description of the Pack ASCII instruction found in the same published Principles of Operation manual cited above.
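  • For comparison, a byte-by-byte sketch of the PKA behavior quoted above is given below. It differs from the Unicode case only in that each source byte carries one digit and the length limit is 32 bytes. As before, the name `pka_reference`, the byte count `len` (the L2 field plus one), and the return codes are assumptions of this sketch rather than part of the appendices.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte sketch of PKA. Returns 0 on success, or -1 where the
 * instruction would recognize a specification exception. */
int pka_reference(uint8_t result[16], const uint8_t *src, size_t len)
{
    if (len == 0 || len > 32)
        return -1;

    memset(result, 0, 16);       /* extend on the left with zeros */
    result[15] = 0x0C;           /* implied positive sign */

    /* Slot 0 (rightmost nibble) is the sign; digits fill slots 1..31,
     * so the leftmost byte of a 32-byte source is ignored. */
    size_t slot = 1;
    for (size_t i = len; i > 0 && slot < 32; i--, slot++) {
        uint8_t digit = src[i - 1] & 0x0F;   /* upper four bits ignored */
        size_t byte = 15 - slot / 2;
        result[byte] |= (slot % 2) ? (uint8_t)(digit << 4) : digit;
    }
    return 0;
}
```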
  • SUMMARY OF THE INVENTION
  • The algorithm discussed in detail below in the description section is a block by block algorithm useful for converting UNICODE strings to a packed format. The algorithm works on both big-endian and small-endian machines. When timed for performance, block by block algorithms are much faster when compared to byte-by-byte methods. The algorithm that was initially considered was a byte-by-byte approach which was later improved to the block-by-block method using 8 byte blocks at a time.
  • The algorithm discussed in detail below in the description section is a block by block algorithm to convert an ASCII string to a packed format. The algorithm works on both big-endian and small-endian machines. When timed for performance, block by block algorithms are much faster when compared to byte by byte methods. The algorithm that was initially considered was a byte-by-byte approach which was later improved to the block by block method using 8 byte blocks at a time.
  • Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.
  • In accordance with a preferred embodiment of the present invention as it relates to the PACK UNICODE instruction, there is provided a method for packing data from a Unicode field to a field which is approximately half as wide. First, input data to be packed is retrieved from memory and stored in an offset position in memory locations which represent a local array of bytes. A source pointer is then set to point to the data at address “local array+2.” Additionally, a target pointer is set to point to a position in memory which represents a local pack array which holds operation results. The pack operation is carried out in blocks instead of in a byte-by-byte operation. In the block operation, a first byte of the present block is determined in accordance with the following C language script as: ((((BLOCK & 0x000F000F000F000F)<<12)|((BLOCK & 0x000F000F000F000F)<<24)) & 0xFF00000000000000), where “BLOCK” represents the current block of input data being processed. The three remaining bytes of the present block are then generated by shifting and then ORing the byte to the 4-byte output block. The pointers are incremented and the steps are repeated four times, once for each block. Lastly, the last nibble is set to 0xC and the pack array is stored in memory. A nibble is a four bit data portion typically represented by a single hexadecimal digit.
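  • The mask-and-shift expression for the first output byte can be tried out with the C sketch below. It assumes that BLOCK holds four two-byte characters in big-endian order (first character in the most significant 16 bits) and that the final mask is the 16-hex-digit constant 0xFF00000000000000, which selects the top byte of a 64-bit value. The function names are hypothetical, and `block_second_byte` is an illustrative guess at one of the "remaining bytes", not a transcription of Appendix I.

```c
#include <stdint.h>

#define DIGIT_MASK 0x000F000F000F000FULL   /* low nibble of each character */

/* First output byte: digits of characters 0 and 1 of the block.
 * Shifting by 12 moves digit 0 to bits 63..60; shifting by 24 moves
 * digit 1 to bits 59..56. */
uint8_t block_first_byte(uint64_t block)
{
    uint64_t digits = block & DIGIT_MASK;
    uint64_t top = ((digits << 12) | (digits << 24)) & 0xFF00000000000000ULL;
    return (uint8_t)(top >> 56);
}

/* Assumed companion step: digits of characters 2 and 3 of the block.
 * Digit 2 sits at bits 19..16 and digit 3 at bits 3..0. */
uint8_t block_second_byte(uint64_t block)
{
    uint64_t digits = block & DIGIT_MASK;
    return (uint8_t)(((digits >> 12) | digits) & 0xFF);
}
```

  • With BLOCK equal to 0x0031003200330034 (the characters “1234”), the first function yields 0x12 and the second 0x34.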
  • In accordance with a preferred embodiment of the present invention as it relates to the PACK ASCII instruction
  • The recitation herein of a list of desirable objects which are met by various embodiments of the present invention is not meant to imply or suggest that any or all of these objects are present as essential features, either individually or collectively, in the most general embodiment of the present invention or in any of its more specific embodiments.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of practice, together with the further objects and advantages thereof, may best be understood by reference to the following description taken in connection with the accompanying drawings in which:
  • FIG. 1 is a block diagram illustrating the environment in which the present invention operates and is employed;
  • FIG. 2 is a block diagram illustrating the structure of a 64 byte chunk of Unicode structured data;
  • FIG. 3 is a block diagram more particularly illustrating the structure of a single byte of the rightmost portion of a Unicode character, that is, the rightmost of its two bytes and particularly showing its structure as two nibbles;
  • FIG. 4 is a block diagram illustrating the structure of Unicode originated data after it is packed into a digits-only format together with a sign nibble;
  • FIG. 5 is a top view of a typical computer readable medium containing program code which implements the methods of the present invention, for example, a Compact Disc (CD) as shown;
  • FIG. 6 is a block diagram illustrating the method employed for emulating the PKU instruction; and
  • FIG. 7 is a block diagram illustrating the method employed for emulating the PKA instruction.
  • FIG. 8 is a block diagram illustrating an example of the environment in which the present invention is employed.
  • DETAILED DESCRIPTION
  • The typical emulation environment in which the present invention is employed is illustrated in FIG. 1. Emulators such as 320 accept as input instruction streams 305, representing machine or assembly language instructions which are designed to operate on source machine 300. Emulator 320 employs memory 315 in target machine 310 to produce a stream of instructions which are capable of executing on target machine 310. While FIG. 1 particularly shows operation within an emulation environment, it is also noted that the present invention contemplates a situation in which emulator 320 operates essentially as an interpreter in which the instructions are not only translated to the new architecture but in which they are also executed at essentially the same time.
  • Also, since the present invention is particularly concerned with packing instructions, FIG. 2 is provided so as to illustrate the structure of data which is typically supplied to a pack type of instruction. In particular, FIG. 2 illustrates the structure of 64 bytes of data representing 32 Unicode characters. In the present invention, data of this type would be processed in blocks of 16 bytes each. The “x” that is shown in the rightmost portion of each Unicode character (32 of them in all) represents that portion of a Unicode character which includes the encoding for a decimal digit (and more, which is ultimately discarded). The structure of such a rightmost byte of a Unicode character is more particularly illustrated in FIG. 3, in which it is seen that each byte “x” is divided into two nibbles, an upper portion “u” and a lower or digit portion “d.” These figures are presented this way due to size limitations as to the detail needed. In the PACK Unicode instruction considered herein, the packing process results in a structure such as that shown in FIG. 4, in which the digit nibbles are packed two-to-a-byte in a 16 byte block with the rightmost nibble of this block being the sign nibble.
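  • The division of byte “x” into the nibbles “u” and “d” of FIG. 3 corresponds to the two small C helpers sketched below. The function names are hypothetical and are introduced only for illustration.

```c
#include <stdint.h>

/* Upper nibble "u" of byte "x"; the pack operation discards it. */
uint8_t upper_nibble(uint8_t x)
{
    return (uint8_t)(x >> 4);
}

/* Lower nibble "d" of byte "x"; this carries the decimal digit. */
uint8_t digit_nibble(uint8_t x)
{
    return (uint8_t)(x & 0x0F);
}
```

  • For the Basic Latin digit “7”, whose low-order byte is 0x37, “u” is 0x3 and “d” is 0x7.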
  • As pointed out above, the method of the present invention is embodied in its preferred form in the programming code set forth in Appendices I and II herein. Such programming is typically provided on a computer readable medium such as disk 600 shown in FIG. 5. However, it is also contemplated that such programming may be distributed over any convenient data processing network.
  • The emulation methods for the PKU and PKA instructions are described below. In this description, zMemory refers to memory locations in the target system which are employed for the purposes of the emulation process, especially when the source machine has the zSeries architecture. However, in the methods provided below, zMemory is really just an exemplar, and any suitable memory in the target machine may be employed for this purpose.
  • As a further preliminary matter, in the description below: the symbol “&” represents a logical bitwise AND operation; the symbol “|” represents a logical bitwise OR operation; the symbol “<<” represents a logical bitwise LEFT SHIFT operation with the number to the right of “<<” indicating the number of bit positions to be shifted; “0x” represents any single hexadecimal (four bit) digit (0000 through 1111); and “0F” represents the four bit hexadecimal digit which is all ones (that is, 1111) with “00” thus representing the hexadecimal digit which is all zeroes (that is, 0000).
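  • As a small illustration of these three operators, the sketch below combines the digit nibbles of two character bytes into one packed byte. The function name pack_two_digits is hypothetical and does not appear in the appendices; it shows only the mask, shift and OR pattern on which the block-wise formulas are built.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low (digit) nibble of each byte with "&",
   move the left digit into the upper nibble with "<<", and merge with "|". */
uint8_t pack_two_digits(uint8_t left, uint8_t right)
{
    return (uint8_t)(((left & 0x0F) << 4) | (right & 0x0F));
}
```

  • For example, the bytes 0x33 and 0x35 carry the digits 3 and 5, and the helper combines them into the single byte 0x35.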
  • The PKU instruction is discussed first.
  • PKU Instruction
  • The method below is described in the C programming language and it works on both PowerPC-AIX (big endian) and Intel-LINUX (little endian) architectures. The method is also shown in flow chart form in FIG. 6 wherein the step numbers are labeled as they are below. The block-by-block processing (as opposed to byte-by-byte processing) is carried out as follows:
  • (Step 105) Fetch input from zMemory to local array of size 33 bytes using proper offset; that is, copy L2 bytes from zMemory into local array+(32-L2).
  • (Step 110) Set source pointer to point to (local array+2). The source pointer type is unsigned long long (for 8 byte processing).
  • (Step 115) Set target pointer to point to pack array. The target pointer type is int (for 4 byte processing).
  • Perform Steps 130 and 135 below four times, each pass through the loop operating on the 8-byte input block to get 4 bytes of output (Block 125 in FIG. 6). Block 120 in FIG. 6 represents the return point for this loop, which passes through “counter” 150.
  • (Step 130) The first byte of output is computed as follows:
  • ((((BLOCK & 0x000F000F000F000F)<<12)|((BLOCK & 0x000F000F000F000F)<<24)) & 0xFF00000000000000); the result of this expression is the first byte of output;
  • (Step 135) Get the three remaining bytes of output by shifting appropriately and then OR the byte to the 4-byte output block.
  • (Step 140) Set the last nibble to 0xC for a positive sign.
  • (Step 145) Store the pack array into zMemory.
  • The present method uses byte reversal functions whenever required to support both big-endian and little-endian architectures. The most relevant source architecture, that of the zSeries machines, is big-endian. Sample C code is given below in Appendix I, where the input is assumed to be 64 bytes in length. Modifications to this code to accommodate other lengths are easily discerned.
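  • Steps 110 through 140 above can be sketched as follows, assuming the 66 byte local array used in Appendix I. This is an illustrative restatement rather than the appendix code itself: the helper names load_be64, store_be32, pku_block and pku_pack are hypothetical, and byte-wise big-endian loads and stores are used in place of pointer casts and byte reversal functions so that the sketch does not depend on host byte order or alignment.

```c
#include <stdint.h>

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit value. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: write a 32-bit value as 4 big-endian bytes. */
static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Steps 130 and 135: two 8-byte blocks (eight characters) give 4 output bytes. */
static uint32_t pku_block(uint64_t first, uint64_t second)
{
    const uint64_t mask = 0x000F000F000F000FULL;
    /* After the shifts and OR, byte 0 holds digits 1 and 2 and byte 4 holds digits 3 and 4. */
    uint64_t a = ((first & mask) << 12) | ((first & mask) << 24);
    uint64_t b = ((second & mask) << 12) | ((second & mask) << 24);
    uint64_t r = (a & 0xFF00000000000000ULL)
               | ((a & 0x00000000FF000000ULL) << 24)
               | ((b & 0xFF00000000000000ULL) >> 16)
               | ((b & 0x00000000FF000000ULL) << 8);
    return (uint32_t)(r >> 32);
}

/* Steps 110 through 140 on a 66 byte local array, as assumed from Appendix I. */
void pku_pack(const uint8_t local[66], uint8_t out[16])
{
    const uint8_t *src = local + 2;           /* Step 110: skip the leftmost character */
    int i;
    for (i = 0; i < 4; i++, src += 16)        /* four passes through the loop */
        store_be32(out + 4 * i, pku_block(load_be64(src), load_be64(src + 8)));
    out[15] = (uint8_t)((out[15] & 0xF0) | 0x0C);   /* Step 140: positive sign */
}
```

  • Splitting the per-block computation into its own function keeps the mask-and-shift expression in one place, at the cost of departing from the single-expression form used in Appendix I.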
  • PKA Instruction
  • The method below is described in the C programming language and it works on both PowerPC-AIX (big endian) and Intel-LINUX (little endian) architectures. The method is also shown in flow chart form in FIG. 7 wherein the step numbers are labeled as they are below. The block-by-block processing (as opposed to byte-by-byte processing) is carried out as follows:
  • (Step 205) Copy input from zMemory to local array of size 33 bytes using proper offset; that is, copy L2 bytes from zMemory into local array+(32-L2).
  • (Step 210) Set source pointer to point to (local array+1). The source pointer type is unsigned long long (for 8 byte processing).
  • (Step 215) Set target pointer to point to pack array. The target pointer type is int (for 4 byte processing).
  • Perform Steps 230 and 235 below four times, each pass through the loop operating on the 8-byte input block to get 4 bytes of output (Block 225 in FIG. 7). Block 220 in FIG. 7 represents the return point for this loop, which passes through “counter” 250.
  • (Step 225) Each time in the loop operate on the 8-byte input block to get 4 bytes of output;
  • (Step 230) The first byte of output is calculated as follows:
  • ((((BLOCK & 0x0F0F0F0F0F0F0F0F)|((BLOCK & 0x0F0F0F0F0F0F0F0F)<<4))<<4) & 0xFF00000000000000);
  • (Step 235) Get the 3 remaining bytes of output by shifting appropriately and OR the byte to the 4-byte output block.
  • (Step 240) Set the last nibble to 0xC for positive sign.
  • (Step 245) Store the pack array into zMemory.
  • As above, to accommodate different architectures, byte reversal functions are used when fetching and storing to handle both big-endian and little-endian formats.
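  • Steps 205 through 240 above can be sketched as follows. This is an illustrative restatement rather than the Appendix II code: the helper names load_be64, store_be32, pka_fetch, pka_block and pka_pack are hypothetical, the zero fill of the unused leading bytes in pka_fetch is an assumption, and byte-wise big-endian loads and stores stand in for pointer casts and byte reversal functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit value. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: write a 32-bit value as 4 big-endian bytes. */
static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Step 205: copy L2 bytes into local array+(32-L2). The array is zero filled
   first (assumption), so shorter inputs are right-aligned behind zero digits. */
void pka_fetch(const uint8_t *zmem, size_t l2, uint8_t local[33])
{
    memset(local, 0, 33);
    if (l2 > 32)
        l2 = 32;                              /* guard added for the sketch */
    memcpy(local + (32 - l2), zmem, l2);
}

/* Steps 230 and 235: one 8-byte block (eight characters) gives 4 output bytes. */
static uint32_t pka_block(uint64_t block)
{
    uint64_t x = block & 0x0F0F0F0F0F0F0F0FULL;
    /* After the shifts and OR, bytes 0, 2, 4 and 6 each hold a packed digit pair. */
    uint64_t g = (x | (x << 4)) << 4;
    uint64_t r = (g & 0xFF00000000000000ULL)
               | ((g & 0x0000FF0000000000ULL) << 8)
               | ((g & 0x00000000FF000000ULL) << 16)
               | ((g & 0x000000000000FF00ULL) << 24);
    return (uint32_t)(r >> 32);
}

/* Steps 210 through 240 on the 33 byte local array. */
void pka_pack(const uint8_t local[33], uint8_t out[16])
{
    const uint8_t *src = local + 1;           /* Step 210: skip the leftmost character */
    int i;
    for (i = 0; i < 4; i++)                   /* four passes through the loop */
        store_be32(out + 4 * i, pka_block(load_be64(src + 8 * i)));
    out[15] = (uint8_t)((out[15] & 0xF0) | 0x0C);   /* Step 240: positive sign */
}
```

  • As a usage example, fetching the three ASCII characters “123” with pka_fetch and then calling pka_pack leaves the digits 1, 2 and 3 followed by the sign nibble in the last two bytes of the result, with all preceding bytes zero.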
  • In any event, the environment in which the present invention operates is shown in FIG. 8. The present invention operates in a data processing environment which effectively includes one or more of the computer elements shown in FIG. 8. In particular, computer 500 includes central processing unit (CPU) 520 which accesses programs and data stored within random access memory 510. Memory 510 is typically volatile in nature and accordingly such systems are provided with nonvolatile memory, typically in the form of rotatable magnetic memory 540. While memory 540 is preferably a nonvolatile magnetic device, other media may be employed. CPU 520 communicates with users at consoles such as terminal 550 through Input/Output unit 530. Terminal 550 is typically one of many, if not thousands, of consoles in communication with computer 500 through one or more I/O units 530. In particular, console unit 550 is shown as having included therein a device for reading media of one or more types such as CD-ROM 600 shown in FIG. 5. Media 600 may also comprise any convenient device including, but not limited to, magnetic media, optical storage devices and chips such as flash memory devices or so-called thumb drives. Disk 600 also represents a more generic distribution medium in the form of electrical signals used to transmit data bits which represent codes for the instructions discussed herein. While such transmitted signals may be ephemeral in nature, they nonetheless constitute a physical medium carrying the coded instruction bits and are intended for permanent capture at the signal's destination or destinations.
  • APPENDIX I
    #include <stdio.h>
    #include <time.h>
    #include <arpa/inet.h>   /* htonl(), ntohl() */
    #include <byteswap.h>
    /* Byte reversal as needed on a little-endian host; on a big-endian
       host these two macros would leave the value unchanged. */
    #define ntohll(x) bswap_64(x)
    #define htonll(x) bswap_64(x)
    int main(void)
    {
     char input[66]= {0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,
        0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
        0x3F,0x3E,0x3D,0x3C,0x3B,0x3A,0x39,0x38,
        0x37,0x36,0x35,0x34,0x33,0x32,0x31,0x39,
        0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,
        0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
        0x3F,0x3E,0x3D,0x3C,0x3B,0x3A,0x39,0x38,
        0x37,0x36,0x35,0x34,0x33,0x32,0x31,0x39,
        0x00,0x00};
     char pack_array[16]= { 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
          0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};
     short int i;
     unsigned int *optr=(unsigned int *)pack_array;
     /* Start at input+2 so that the leftmost character is skipped. */
     unsigned long long *iptr=(unsigned long long *)(input+2);
     /* Each pass packs two 8-byte blocks (eight characters) into 4 bytes. */
     for (i=0;i<4;i++,optr++,iptr=iptr+2)
      {
       *optr = htonl((unsigned int)((
         ((((ntohll(*iptr) & 0x000F000F000F000FULL) << 12) |
           ((ntohll(*iptr) & 0x000F000F000F000FULL) << 24)) &
           0xFF00000000000000ULL) |
         (((((ntohll(*iptr) & 0x000F000F000F000FULL) << 12) |
            ((ntohll(*iptr) & 0x000F000F000F000FULL) << 24)) &
            0x00000000FF000000ULL) << 24) |
         (((((ntohll(*(iptr+1)) & 0x000F000F000F000FULL) << 12) |
            ((ntohll(*(iptr+1)) & 0x000F000F000F000FULL) << 24)) &
            0xFF00000000000000ULL) >> 16) |
         (((((ntohll(*(iptr+1)) & 0x000F000F000F000FULL) << 12) |
            ((ntohll(*(iptr+1)) & 0x000F000F000F000FULL) << 24)) &
            0x00000000FF000000ULL) << 8)) >> 32));
      }
      /* Step back to the last output word and set the sign nibble. */
      optr--;
      *optr = htonl((ntohl(*optr) & 0xFFFFFFF0) | 0xC);
      return 0;
    }
  • APPENDIX II
    #include <stdio.h>
    #include <time.h>
    #include <arpa/inet.h>   /* htonl(), ntohl() */
    #include <byteswap.h>
    /* Byte reversal as needed on a little-endian host; on a big-endian
       host these two macros would leave the value unchanged. */
    #define ntohll(x) bswap_64(x)
    #define htonll(x) bswap_64(x)
    int main(void)
    {
     char input[33] = {
         0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,
         0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
         0x3F,0x3E,0x3D,0x3C,0x3B,0x3A,0x39,0x38,
         0x37,0x36,0x35,0x34,0x33,0x32,0x31,0x39,
         0x00};
     char pack_array[16] = {
         0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
         0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};
     short int i;
     /* Start at input+1 so that the leftmost character is skipped. */
     unsigned long long *iptr = (unsigned long long *)(input+1);
     unsigned int *optr = (unsigned int *)pack_array;
     /* Each pass packs one 8-byte block (eight characters) into 4 bytes. */
     for (i=0;i<4;i++,optr++,iptr++)
     {
       *optr = htonl((unsigned int)((
         ((((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) |
           ((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) << 4)) << 4) &
            0xFF00000000000000ULL) |
         (((((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) |
           ((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) << 4)) << 4) &
            0x0000FF0000000000ULL) << 8) |
         (((((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) |
           ((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) << 4)) << 4) &
            0x00000000FF000000ULL) << 16) |
         (((((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) |
           ((ntohll(*iptr) & 0x0F0F0F0F0F0F0F0FULL) << 4)) << 4) &
            0x000000000000FF00ULL) << 24)) >> 32));
      }
      /* Step back to the last output word and set the sign nibble. */
      optr--;
      *optr = htonl((ntohl(*optr) & 0xFFFFFFF0) | 0xC);
      return 0;
    }
  • While the invention has been described in detail herein in accordance with certain preferred embodiments thereof, many modifications and changes therein may be effected by those skilled in the art. Accordingly, it is intended by the appended claims to cover all such modifications and changes as fall within the true spirit and scope of the invention.

Claims (3)

1. A method for packing data from a Unicode field to a field which is approximately half as wide, said method comprising the steps of:
retrieving input data to be packed from a memory and storing it in an offset position in memory locations which represent a local array having a plurality of bytes;
setting a source pointer to point to address “local array +2” in said memory;
setting a target pointer to point to a position in said memory said pointer representing a local array which holds operation results in a plurality of 4-byte output blocks;
computing a first byte of a present block of the output as: ((((BLOCK & 0x000F000F000F000F)<<12)|((BLOCK & 0x000F000F000F000F)<<24)) & 0xFF00000000000000), where “BLOCK” represents a block of said input data;
generating the three remaining bytes of the present block by shifting and then ORing the byte to the 4-byte output block;
incrementing said pointers and repeating the previous two steps three more times, once for each said block of said input data;
setting the last nibble to 0xC; and
storing said pack array in said memory.
2. (canceled)
3. (canceled)
US11/676,568 2007-02-20 2007-02-20 Method of doing PACK unicode zSeries instructions Expired - Fee Related US7408484B1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US11/676,568 US7408484B1 (en) 2007-02-20 2007-02-20 Method of doing PACK unicode zSeries instructions
US12/182,377 US20080301416A1 (en) 2007-02-20 2008-07-30 System and program product of doing pack unicode z series instructions
US12/715,017 US7868791B2 (en) 2007-02-20 2010-03-01 Pack unicode zSeries instructions
US12/767,207 US8077062B2 (en) 2007-02-20 2010-04-26 Pack ASCII zSeries instructions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/676,568 US7408484B1 (en) 2007-02-20 2007-02-20 Method of doing PACK unicode zSeries instructions

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/182,377 Continuation US20080301416A1 (en) 2007-02-20 2008-07-30 System and program product of doing pack unicode z series instructions

Publications (2)

Publication Number Publication Date
US7408484B1 US7408484B1 (en) 2008-08-05
US20080201567A1 true US20080201567A1 (en) 2008-08-21

Family

ID=39670776

Family Applications (4)

Application Number Title Priority Date Filing Date
US11/676,568 Expired - Fee Related US7408484B1 (en) 2007-02-20 2007-02-20 Method of doing PACK unicode zSeries instructions
US12/182,377 Abandoned US20080301416A1 (en) 2007-02-20 2008-07-30 System and program product of doing pack unicode z series instructions
US12/715,017 Active US7868791B2 (en) 2007-02-20 2010-03-01 Pack unicode zSeries instructions
US12/767,207 Active US8077062B2 (en) 2007-02-20 2010-04-26 Pack ASCII zSeries instructions

Country Status (1)

Country Link
US (4) US7408484B1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7408484B1 (en) * 2007-02-20 2008-08-05 International Business Machines Corporation Method of doing PACK unicode zSeries instructions
US20130254896A1 (en) * 2012-03-23 2013-09-26 Infineon Technologies Austria Ag Method to Detect Tampering of Data
CN114531495B (en) * 2022-02-25 2022-10-11 北方工业大学 Data packet, data packet generation method and data packet generation system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5301285A (en) * 1989-03-31 1994-04-05 Hitachi, Ltd. Data processor having two instruction registers connected in cascade and two instruction decoders
US6215992B1 (en) * 1997-07-29 2001-04-10 Dennis S. Howell Universal dictation input apparatus and method
US20030093649A1 (en) * 2001-11-14 2003-05-15 Ronald Hilton Flexible caching of translated code under emulation
US20050238010A1 (en) * 2004-04-26 2005-10-27 Rina Panigrahy Programmable packet parsing processor
US7073264B2 (en) * 2000-06-30 2006-07-11 Earl & Kimberly Votolato Trustees Of Votolato Living Trust Bag slitting apparatus

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6039765A (en) 1997-12-15 2000-03-21 Motorola, Inc. Computer instruction which generates multiple results of different data types to improve software emulation
US6163764A (en) 1998-10-12 2000-12-19 Intel Corporation Emulation of an instruction set on an instruction set architecture transition
JP3917343B2 (en) * 2000-02-25 2007-05-23 株式会社東芝 Computer-readable recording medium having character code conversion method and character code conversion program recorded in multi-platform environment
US7353287B2 (en) * 2001-09-26 2008-04-01 Adobe Systems Incorporated Marked foreign data blocks
US7092869B2 (en) 2001-11-14 2006-08-15 Ronald Hilton Memory address prediction under emulation
US6701320B1 (en) * 2002-04-24 2004-03-02 Bmc Software, Inc. System and method for determining a character encoding scheme
US6972634B2 (en) * 2002-11-26 2005-12-06 Agere Systems Inc. Interconnected multi-stage oscillator
US7167968B2 (en) * 2004-04-29 2007-01-23 International Business Machines Corporation Storage pre-alignment and EBCDIC, ASCII and unicode basic latin conversions for packed decimal data
US7394409B1 (en) 2007-02-20 2008-07-01 International Business Machines Corporation Method of doing pack ASCII zSeries instructions
US7408484B1 (en) 2007-02-20 2008-08-05 International Business Machines Corporation Method of doing PACK unicode zSeries instructions

Also Published As

Publication number Publication date
US7868791B2 (en) 2011-01-11
US8077062B2 (en) 2011-12-13
US20100161947A1 (en) 2010-06-24
US20080301416A1 (en) 2008-12-04
US7408484B1 (en) 2008-08-05
US20100205526A1 (en) 2010-08-12

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RAJENDRAN, ANTONISAMY A.;SOMASUNDARAM, MURUGANANDAM;REEL/FRAME:018905/0161

Effective date: 20070125

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

CC Certificate of correction
REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 4

SULP Surcharge for late payment
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20160805