CN112035170B - Method and system for branch predictor - Google Patents
- Publication number
- CN112035170B (application number CN202010842406.XA)
- Authority
- CN
- China
- Prior art keywords
- branch prediction
- branch
- record information
- added
- processor logic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3844—Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
Abstract
A method and system for a branch predictor shared by different processor logical cores of the same processor physical core are provided. The method comprises: when the flag function is enabled, in response to a processor logical core adding a piece of branch prediction record information, adding to the added branch prediction record information a flag indicating which processor logical core added that piece of information; when the branch predictor performs branch prediction for a program executed on a first processor logical core, determining, according to the flag, the branch prediction record information added by the first processor logical core; and performing branch prediction using only the branch prediction record information that the flag indicates was added by the first processor logical core.
Description
Technical Field
The present application relates to the field of processors and, more particularly, to a method and system for a branch predictor.
Background
Modern processors commonly employ a multi-stage pipeline architecture. A problem arises when a pipelined processor handles a branch instruction: depending on whether the branch condition is true or false, a jump may occur, and this interrupts the processing of instructions in the pipeline because the processor cannot determine which instruction follows the branch until the branch has been resolved. The longer the pipeline, the longer the processor waits, because it must wait for the branch instruction to finish executing before it can determine the next instruction to enter the pipeline.
Branch predictors are used to improve execution efficiency. During speculative execution, the processor predicts the outcome of a condition from the history of code execution and then executes the corresponding branch in advance. If an exception is encountered or a branch misprediction is detected during speculative execution, the CPU discards the results of the speculative execution, restores its state to the correct state that preceded speculative execution, and then continues with the correct instructions.
Branch prediction techniques include static branch prediction, performed at compile time, and dynamic branch prediction, performed by hardware at execution time. Static branch prediction always selects one fixed direction for a branch; it has an average hit rate of 50%, so its accuracy is low and it hardly adapts to program behavior, but it is simple to implement. Dynamic branch prediction is a technique adopted in recent processors: while the program runs, the future behavior of a branch instruction is predicted from its past behavior, and if the branch behavior changes, the prediction changes as well. The simplest dynamic branch prediction strategy is a branch prediction buffer or branch history table.
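By way of illustration only, the following minimal sketch models a branch history table in Python. The one-bit-per-entry design, the table size, the indexing by low address bits, and all names (`BranchHistoryTable`, `count_hits`) are assumptions made for this example and are not taken from the embodiments.

```python
class BranchHistoryTable:
    """One-bit branch history table indexed by the low bits of the branch address.

    Assumption for illustration: each entry stores only the most recent outcome.
    """

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # True means "predict taken"; every entry starts as "not taken".
        self.table = [False] * (1 << index_bits)

    def predict(self, branch_addr):
        return self.table[branch_addr & self.mask]

    def update(self, branch_addr, taken):
        # Remember the most recent outcome of this branch.
        self.table[branch_addr & self.mask] = taken


def count_hits(outcomes, branch_addr=0x40):
    """Replay a sequence of outcomes for one branch and count correct predictions."""
    bht = BranchHistoryTable()
    hits = 0
    for taken in outcomes:
        if bht.predict(branch_addr) == taken:
            hits += 1
        bht.update(branch_addr, taken)
    return hits


# A loop branch that is taken nine times and then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_hits(loop_outcomes), "of", len(loop_outcomes), "predictions correct")
```

In this sketch the predictor misses only the first iteration and the loop exit, which shows how the prediction follows the past behavior of the branch.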
The effectiveness of any branch prediction strategy depends on the accuracy of the strategy itself and the frequency of conditional branches.
Disclosure of Invention
According to one aspect of the present invention, there is provided a method for a branch predictor shared by different processor logical cores of the same processor physical core, the method comprising: when the flag function is enabled, in response to a processor logical core adding a piece of branch prediction record information, adding to the added branch prediction record information a flag indicating which processor logical core added that piece of information; when the branch predictor performs branch prediction for a program executed on a first processor logical core, determining, according to the flag, the branch prediction record information added by the first processor logical core; and performing branch prediction using only the branch prediction record information that the flag indicates was added by the first processor logical core.
According to another aspect of the present invention, there is provided a system for a branch predictor, the system comprising: a branch predictor shared by different processor logical cores of the same processor physical core; and a control register for registering a bit that indicates whether the flag function is enabled; wherein, when the control register registers a bit enabling the flag function, in response to a processor logical core adding a piece of branch prediction record information, a flag indicating which processor logical core added that piece of information is added to the added branch prediction record information; when the branch predictor performs branch prediction for a program executed on a first processor logical core, the branch prediction record information added by the first processor logical core is determined according to the flag; and branch prediction is performed using only the branch prediction record information that the flag indicates was added by the first processor logical core.
Thus, by means of the flag, a running processor logical core performs branch prediction using only the branch predictor information that the same logical core recorded, which prevents interference and data theft by a malicious thread executing on the other logical core and thereby ensures security.
Drawings
FIG. 1A shows an example of an instruction pipeline.
FIG. 1B illustrates the general principles of branch prediction.
FIG. 2 shows a schematic block diagram of a system for a branch predictor according to an embodiment of the present invention.
FIG. 3 illustrates an example of control registers and added branch prediction record information according to an embodiment of the present invention.
FIG. 4 shows a schematic flow diagram of a method for a branch predictor according to an embodiment of the present invention.
FIG. 5 shows a schematic flow diagram of a method for a branch predictor according to another embodiment of the present invention.
FIG. 6 shows a schematic flow diagram of a method for a branch predictor according to another embodiment of the present invention.
FIGS. 7A and 7B respectively illustrate schematic flow diagrams of methods for branch predictors according to further embodiments of the present invention.
Detailed Description
Reference will now be made in detail to the present embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the specific embodiments, it will be understood that they are not intended to limit the invention to the embodiments described. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. It should be noted that the method steps described herein may be implemented by any functional block or functional arrangement, and that any functional block or functional arrangement may be implemented as a physical entity or a logical entity, or a combination of both.
In order that those skilled in the art may better understand the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Note that the example described next is only a specific example and is not intended to limit the embodiments of the present invention; the specific shapes, hardware, connections, steps, numerical values, conditions, data, orders, and the like that are shown and described are illustrative only. Those skilled in the art can, upon reading this specification, use the concepts of the present invention to construct further embodiments not specifically described herein.
FIG. 1A shows an example of an instruction pipeline.
Instruction pipelining divides the processing of an instruction into multiple smaller steps, each performed by dedicated circuitry, in order to increase the efficiency with which the processor executes instructions.
For example, an instruction may pass through three stages: fetch, decode, and execute, each taking one clock cycle. Without pipelining, the next instruction can start only after the previous instruction has finished, so each instruction occupies the processor for 3 clock cycles. With pipelining, once an instruction has been fetched and moves on to decoding, the next instruction can be fetched, which improves instruction execution efficiency.
As shown in FIG. 1A, the instruction Add is divided into three steps: fetching, decoding, and executing Add. The instruction Sub is likewise divided into fetching, decoding, and executing Sub, and the instruction Cmp into fetching, decoding, and executing Cmp. With instruction pipelining, when the Add instruction has been fetched and enters decoding, the next instruction Sub can be fetched, which improves instruction execution efficiency.
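The timing benefit described above can be sketched as follows. The sketch assumes an ideal three-stage pipeline with one clock cycle per stage and no stalls; the function names are hypothetical and chosen only for this example.

```python
STAGES = ["fetch", "decode", "execute"]


def cycles_without_pipeline(num_instructions, num_stages=len(STAGES)):
    # Each instruction must finish all stages before the next one starts.
    return num_instructions * num_stages


def cycles_with_pipeline(num_instructions, num_stages=len(STAGES)):
    # The first instruction fills the pipeline; each later one finishes
    # one cycle after its predecessor (ideal case, no stalls).
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


def pipeline_schedule(instructions):
    """Map each clock cycle (starting at 1) to the (instruction, stage) pairs active in it."""
    schedule = {}
    for i, name in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            schedule.setdefault(i + s + 1, []).append((name, stage))
    return schedule


program = ["Add", "Sub", "Cmp"]
print(cycles_without_pipeline(len(program)))  # 9
print(cycles_with_pipeline(len(program)))     # 5
for cycle, work in sorted(pipeline_schedule(program).items()):
    print(cycle, work)
```

For the three instructions Add, Sub, and Cmp, the unpipelined case takes 9 cycles and the ideal pipelined case takes 5; in cycle 2 the schedule shows Add being decoded while Sub is fetched.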
FIG. 1B illustrates the general principles of branch prediction.
Without a branch predictor, the processor would wait for the branch instruction to pass through the execution stage of the instruction pipeline before feeding the next instruction into the first stage of the pipeline, the fetch stage, or flushing the subsequent pipeline altogether.
The branch predictor predicts (or guesses) which of the two branches is more likely to be taken and then speculatively executes the instructions of that branch (e.g., the speculative execution path from i to i+1 shown as a dashed line in FIG. 1B), avoiding the time wasted by a pipeline stall. If a branch misprediction is later discovered, all intermediate results of the speculative execution in the pipeline are discarded, and the instructions on the correct branch path (e.g., the execution path from i+1 to p+1 of the dashed line in FIG. 1B) are fetched again and executed, which delays program execution.
Dynamic branch prediction predicts the future behavior of a branch instruction from its past behavior while the program is running. If the branch behavior changes, the prediction result also changes. That is, with dynamic branch prediction the correct branch path can be predicted with greater probability, which reduces the delay in program execution. Such dynamic branch predictors include, but are not limited to, Branch Target Buffers (BTBs), Indirect Branch predictors (IND), and Branch History Tables (BHTs).
Hyper-threading, or Simultaneous Multi-Threading (SMT), is also currently in use. With this technology, two processor logical cores are presented on the same processor physical core and share that core's CPU resources, so that two threads can occupy the CPU resources and execute at the same time, that is, run in parallel.
When two processor logical cores share the same CPU resources, the branch predictor is shared by the different logical cores of the same processor physical core. Attack code of a malicious program running on one logical core can therefore tamper with information such as the BTB or IND and thereby influence the speculative execution path of victim code running on the other logical core of the same physical core. The victim code can be induced to speculatively execute a specific code segment, and part of the data of a specific target can then be inferred from that code segment by means of a cache side-channel probing attack, so that confidential information is leaked. Thus, the security of the processor may be compromised.
Embodiments of the present invention provide a method in which an option or flag controls whether branch predictor information such as BTB or IND information is used separately by the two logical cores. For a program with higher security requirements, the option or flag can make the processor logical core on which the program runs use separate BTB and IND information for branch prediction, preventing interference and data theft by a malicious thread executing on the other logical core. Other programs without security requirements can, through the option or flag, share the BTB and IND information with the other processor logical core, achieving optimal resource use efficiency.
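The interference described above can be illustrated with a deliberately simplified software model. The `SharedBTB` class, the dictionary-based storage, and the addresses are hypothetical; the model shows only how an untagged, shared branch target buffer lets one logical core overwrite the prediction seen by the other, and it does not model speculative execution or the cache side channel.

```python
class SharedBTB:
    """Branch target buffer shared by two logical cores, with no per-core flag.

    Hypothetical model for illustration: one predicted target per branch address.
    """

    def __init__(self):
        self.entries = {}  # branch address -> predicted target address

    def add(self, branch_addr, target):
        # Any logical core may add or overwrite an entry.
        self.entries[branch_addr] = target

    def predict(self, branch_addr):
        return self.entries.get(branch_addr)


VICTIM_BRANCH = 0x1000   # address of a branch in the victim code (assumed)
LEGIT_TARGET = 0x2000    # target the victim code actually jumps to (assumed)
GADGET = 0x6660          # code segment chosen by the attacker (assumed)

btb = SharedBTB()
btb.add(VICTIM_BRANCH, LEGIT_TARGET)  # victim on one logical core records its branch
btb.add(VICTIM_BRANCH, GADGET)        # attacker on the other logical core overwrites it

# The victim's next prediction now points at the attacker-chosen segment.
predicted = btb.predict(VICTIM_BRANCH)
print(hex(predicted))
```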
FIG. 2 shows a schematic block diagram of a system 200 for a branch predictor according to an embodiment of the present invention.
As shown in FIG. 2, a system 200 for a branch predictor includes: a branch predictor 201 shared by different processor logical cores of the same processor physical core; and a control register 202 for registering a bit that indicates whether the flag function is enabled.
When the control register 202 registers a bit enabling the flag function, in response to one processor logical core 203 or 204 adding a piece of branch prediction record information, a flag (e.g., a Thread ID, TID) indicating which processor logical core (e.g., 203 or 204) added that piece of information is added to the added branch prediction record information, so that the different processor logical cores can be marked and distinguished.
In one embodiment, the control register is an MSR control register MSR_BP_CTL, and whether the flag function is enabled is controlled by one or more first bits registered in MSR_BP_CTL. Of course, the register name is merely an example and is not limiting.
As shown in Table 1 below, Bit 0 may be used to control whether the flag function is enabled for all added branch predictor information. If the bit is 1, the flag function is enabled for all added branch predictor information; that is, in response to one processor logical core 203 or 204 adding a piece of branch prediction record information, a flag (TID) indicating which processor logical core (203 or 204) added that piece of information is added to the added branch prediction record information. If the bit is 0, the flag function is not enabled for the branch prediction record information. Of course, more than one bit may be used, e.g., 00 for enabled and 11 for not enabled. In this way the flag function can be enabled or disabled in a simple manner.
TABLE 1
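As a minimal sketch of the single-bit control described above, the following Python helpers read and change Bit 0 of a register value. The register is modeled as a plain integer, and the helper names are assumptions made for this example; only the meaning of Bit 0 (1 for enabled, 0 for not enabled) comes from the description.

```python
FLAG_ENABLE_BIT = 0  # Bit 0 controls the flag function (constant name is hypothetical)


def flag_function_enabled(msr_bp_ctl):
    """Return True if Bit 0 of the register value is 1."""
    return ((msr_bp_ctl >> FLAG_ENABLE_BIT) & 1) == 1


def set_flag_function(msr_bp_ctl, enable):
    """Return a new register value with Bit 0 set or cleared; other bits are kept."""
    if enable:
        return msr_bp_ctl | (1 << FLAG_ENABLE_BIT)
    return msr_bp_ctl & ~(1 << FLAG_ENABLE_BIT)


value = set_flag_function(0, True)
print(flag_function_enabled(value))                            # True
print(flag_function_enabled(set_flag_function(value, False)))  # False
```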
One system may employ several different branch predictors, such as a branch target buffer BTB, an indirect branch predictor IND, and a branch history table BHT. Thus, the added branch prediction record information may include branch target buffer BTB information, indirect branch predictor IND information, branch history table BHT information, and the like, depending on the branch predictor type.
For example, the different branch prediction record information includes BTB information and IND information. It is desirable to use different bits to control the flag function for two or more different types of record information, because this allows more flexible control over whether the flag function is enabled for the record information of each branch predictor type than using a single bit to control the flag function for all record information.
Thus, in one embodiment, the control register is an MSR control register MSR_BP_CTL, and whether the flag function is enabled is controlled separately, through one or more second bits of MSR_BP_CTL, for the branch prediction record information of each different branch predictor type.
Specifically, as shown in Table 2 below, Bit 0 controls whether the flag function is enabled for the BTB information of the BTB branch predictor. If the bit is 1, the flag function is enabled for the BTB; that is, in response to one processor logical core 203 or 204 adding a piece of BTB information, a flag (TID) indicating which processor logical core (e.g., 203 or 204) added that piece of BTB information is added to the added BTB information. If the bit is 0, the flag function is not enabled for the BTB information. Bit 1 controls whether the flag function is enabled for the IND information of the IND branch predictor. If the bit is 1, the flag function is enabled for the IND information; that is, in response to one processor logical core 203 or 204 adding a piece of IND information, a flag (TID) indicating which processor logical core (e.g., 203 or 204) added that piece of IND information is added to the added IND information.
TABLE 2
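The per-type control described above might be modeled as follows. The bit positions for BTB (Bit 0) and IND (Bit 1) follow the description; the `PredictorBit` enumeration and the function name are hypothetical and serve only this example.

```python
from enum import IntEnum


class PredictorBit(IntEnum):
    """Bit position in the control register for each branch predictor type."""
    BTB = 0  # Bit 0: flag function for BTB information
    IND = 1  # Bit 1: flag function for IND information


def flag_enabled_for(msr_bp_ctl, predictor):
    """Return True if the flag function is enabled for the given predictor type."""
    return ((msr_bp_ctl >> predictor) & 1) == 1


# Flag function enabled for BTB information only.
value = 1 << PredictorBit.BTB
print(flag_enabled_for(value, PredictorBit.BTB))  # True
print(flag_enabled_for(value, PredictorBit.IND))  # False
```

Using one bit per predictor type lets software isolate, for instance, only the BTB information while the IND information stays shared.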
Of course, the number of bits and the types of branch predictor are not limited to the above examples and may be adjusted by those skilled in the art according to the actual situation; further examples are not enumerated here.
FIG. 3 illustrates an example of control registers and added branch prediction record information according to an embodiment of the present invention.
As shown in FIG. 3, when the bits registered in the MSR control register MSR_BP_CTL indicate that the flag function is enabled for BTB information and IND information, in response to one processor logical core 203 or 204 adding a piece of BTB information, a flag TID indicating which processor logical core (e.g., 203 or 204) added that piece of BTB information is added to the added BTB information. In response to one processor logical core 203 or 204 adding a piece of IND information, a flag TID indicating which processor logical core (e.g., 203 or 204) added that piece of IND information is added to the added IND information. The flag TID marks the processor logical core to which the BTB information or IND information currently belongs.
As shown, the TID flag is inserted in front of each piece of BTB information or IND information, although this position is not limiting as long as the flag identifies which processor logical core added each piece of information.
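One way to picture a TID placed in front of a record is to pack it into the bits above the record payload. The sketch below assumes a 1-bit TID and a 48-bit payload; these widths and the function names are assumptions for illustration and are not specified in the embodiments.

```python
PAYLOAD_BITS = 48  # assumed width of one piece of BTB or IND information


def pack_entry(tid, payload):
    """Place the TID in the bit directly above the payload."""
    if tid not in (0, 1):
        raise ValueError("this sketch supports two logical cores, TID 0 or 1")
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload does not fit in the assumed width")
    return (tid << PAYLOAD_BITS) | payload


def unpack_entry(entry):
    """Split a packed entry back into (tid, payload)."""
    return entry >> PAYLOAD_BITS, entry & ((1 << PAYLOAD_BITS) - 1)


entry = pack_entry(1, 0x2000)
print(unpack_entry(entry))  # (1, 8192)
```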
Returning to FIG. 2, in one embodiment, the bits registered in the MSR control register MSR_BP_CTL can be modified only by instructions of a privileged program 100, and not by instructions of ordinary programs. When an ordinary program needs to modify the configuration of MSR_BP_CTL, it must send a request to the privileged software, which decides whether to make the modification; this prevents illegal modification of the MSR_BP_CTL register by malware. The bits registered in MSR_BP_CTL are therefore protected and cannot easily be tampered with by malicious software. Of course, this is merely an example, and whether BTB or IND branch predictor information is used separately on the two logical cores of a processor may be controlled by other means, including, but not limited to, executing privileged instructions or modifying specific registers.
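The access rule described above can be sketched in software as follows. The `ControlRegister` class, the `privileged` argument, and the `policy` callback standing in for the decision of the privileged software are all hypothetical; real hardware enforces the privilege check itself.

```python
class ControlRegister:
    """Software model of a control register writable only by privileged code."""

    def __init__(self):
        self.value = 0

    def write(self, new_value, privileged):
        if not privileged:
            raise PermissionError("MSR_BP_CTL can only be written by privileged code")
        self.value = new_value


def request_change(reg, new_value, policy):
    """An ordinary program asks privileged software to change the register.

    `policy` stands in for the privileged software's decision (assumed callback).
    """
    if policy(new_value):
        reg.write(new_value, privileged=True)
        return True
    return False


reg = ControlRegister()
# Example policy: only the values 0 and 1 are accepted.
accepted = request_change(reg, 1, lambda v: v in (0, 1))
print(accepted, reg.value)
```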
In the branch prediction process, when the branch predictor performs branch prediction for a program executed on the first processor logical core 203, the branch prediction record information added by the first processor logical core 203 is determined according to the flag, and branch prediction is performed using only the branch prediction record information that the flag indicates was added by the first processor logical core 203.
Likewise, when the branch predictor performs branch prediction for a program executed on the second processor logical core 204, the branch prediction record information added by the second processor logical core 204 is determined according to the flag, and branch prediction is performed using only the branch prediction record information that the flag indicates was added by the second processor logical core 204.
Therefore, a program running on a given processor logical core performs branch prediction using only the information added by that logical core, which prevents interference and data theft by a malicious thread executing on the other processor logical core.
Of course, when the control register 202 registers a bit that does not enable the flag function, no flag is added to the added branch prediction record information in response to a processor logical core adding a piece of branch prediction record information, and the branch predictor performs branch prediction with the added branch prediction record information shared by the different processor logical cores 203 and 204.
For example, other programs without security requirements may, through a bit indicating that the flag function is not enabled, perform branch prediction by sharing branch predictor information with the other processor logical core, thereby achieving optimal resource utilization efficiency.
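The two modes described above, isolated when the flag function is enabled and shared when it is not, can be sketched with a small software model. The `TaggedBTB` class, its list-based storage, and the addresses are hypothetical and chosen only for illustration; the flag setting is fixed per instance to keep the sketch short.

```python
class TaggedBTB:
    """Branch target buffer model whose entries carry an optional TID flag."""

    def __init__(self, flag_enabled):
        self.flag_enabled = flag_enabled
        self.entries = []  # each entry is (tid or None, branch address, target)

    def add(self, core_id, branch_addr, target):
        # With the flag function enabled, record which logical core added the entry.
        tid = core_id if self.flag_enabled else None
        # Replace an existing entry that has the same flag and branch address.
        self.entries = [e for e in self.entries if (e[0], e[1]) != (tid, branch_addr)]
        self.entries.append((tid, branch_addr, target))

    def predict(self, core_id, branch_addr):
        # With the flag function enabled, use only entries added by the same core.
        wanted = core_id if self.flag_enabled else None
        for tid, addr, target in self.entries:
            if addr == branch_addr and tid == wanted:
                return target
        return None


isolated = TaggedBTB(flag_enabled=True)
isolated.add(0, 0x1000, 0x2000)   # logical core 0 records its own branch
isolated.add(1, 0x1000, 0x6660)   # logical core 1 adds an entry for the same address
print(hex(isolated.predict(0, 0x1000)))  # core 0 still sees its own target

shared = TaggedBTB(flag_enabled=False)
shared.add(0, 0x1000, 0x2000)
shared.add(1, 0x1000, 0x6660)
print(hex(shared.predict(0, 0x1000)))    # core 0 sees the entry added by core 1
```

In the isolated instance an entry added by logical core 1 never affects the prediction for logical core 0, whereas in the shared instance both cores use one common set of entries.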
FIG. 4 shows a schematic flow diagram of a method 400 for a branch predictor according to an embodiment of the present invention.
As shown in FIG. 4, a method 400 for a branch predictor shared by different processor logical cores of the same processor physical core comprises: step 401, when the flag function is enabled, in response to a processor logical core adding a piece of branch prediction record information, adding to the added branch prediction record information a flag indicating which processor logical core added that piece of information; step 402, when the branch predictor performs branch prediction for a program executed on a first processor logical core, determining, according to the flag, the branch prediction record information added by the first processor logical core; and step 403, performing branch prediction using only the branch prediction record information that the flag indicates was added by the first processor logical core.
Therefore, a program running on a given processor logical core performs branch prediction using only the information added by that logical core, which prevents interference and data theft by a malicious thread executing on the other processor logical core.
FIG. 5 shows a schematic flow diagram of a method for a branch predictor according to another embodiment of the present invention.
As shown in FIG. 5, the method 400 may further include: step 501, when the branch predictor performs branch prediction for a program executed on a second processor logical core, determining, according to the flag, the branch prediction record information added by the second processor logical core; and step 502, performing branch prediction using only the branch prediction record information that the flag indicates was added by the second processor logical core.
Therefore, a program running on a given processor logical core performs branch prediction using only the information added by that logical core, which prevents interference and data theft by a malicious thread executing on the other processor logical core.
FIG. 6 shows a schematic flow diagram of a method for a branch predictor according to another embodiment of the present invention.
As shown in FIG. 6, the method 400 may further include step 601: when the flag function is not enabled, in response to a processor logical core adding a piece of branch prediction record information, not adding the flag to the added branch prediction record information. In this case, the branch predictor performs branch prediction with the added branch prediction record information shared among the different processor logical cores.
In this way, other programs without security requirements can, through a bit indicating that the flag function is not enabled, perform branch prediction by sharing branch predictor information with the other processor logical core, thereby achieving optimal resource utilization efficiency.
FIGS. 7A and 7B respectively illustrate schematic flow diagrams of methods for branch predictors according to further embodiments of the present invention.
As shown in FIG. 7A, the method 400 may further include step 701: adding an MSR control register MSR_BP_CTL, wherein whether the flag function is enabled is controlled by one or more first bits registered in MSR_BP_CTL.
In this way the flag function can be enabled or disabled in a simple manner.
Alternatively, as shown in FIG. 7B, the method 400 may further include step 701': adding an MSR control register MSR_BP_CTL, wherein whether the flag function is enabled is controlled separately, through one or more second bits of MSR_BP_CTL, for the branch prediction record information of each different branch predictor type. In one embodiment, the branch prediction record information includes at least one of branch target buffer BTB information, indirect branch predictor IND information, and branch history table BHT information.
Controlling the flag function for two or more different types of record information with different bits allows more flexible control over whether the flag function is enabled for the record information of each branch predictor type than controlling the flag function for all record information with a single bit.
In one embodiment, the bits registered in the MSR control register MSR_BP_CTL can be modified only by instructions of privileged programs, and not by instructions of ordinary programs. The bits registered in MSR_BP_CTL are therefore protected and cannot easily be tampered with by malicious software.
Of course, the above embodiments are merely examples and not limitations. Those skilled in the art can, according to the concepts of the present invention, combine steps and apparatuses from the separately described embodiments to achieve the effects of the present invention. Such combined embodiments are also included in the present invention and are not described one by one herein.
It is noted that advantages, effects, and the like, which are mentioned in the present disclosure, are only examples and not limitations, and they are not to be considered essential to various embodiments of the present invention. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the invention is not limited to the specific details described above.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The flowcharts of steps in the present disclosure and the above descriptions of methods are merely illustrative examples and are not intended to require or imply that the steps of the various embodiments must be performed in the order presented. As will be appreciated by those skilled in the art, the steps in the above embodiments may be performed in any order. Words such as "thereafter," "then," "next," etc. are not intended to limit the order of the steps; these words are only used to guide the reader through the description of these methods. Furthermore, any reference to an element in the singular, for example, using the articles "a," "an," or "the" is not to be construed as limiting the element to the singular.
In addition, the steps and devices in the embodiments are not limited to implementation in a particular embodiment. In fact, some steps and devices in the embodiments may be combined according to the concept of the present invention to conceive new embodiments, and these new embodiments are also included in the scope of the present invention.
The individual operations of the methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software components and/or modules including, but not limited to, a hardware circuit, an Application Specific Integrated Circuit (ASIC), or a processor.
The various illustrative logical blocks, modules, and circuits described may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an ASIC, a Field Programmable Gate Array (FPGA) or other Programmable Logic Device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any commercially available processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may reside in any form of tangible storage medium. Some examples of storage media that may be used include Random Access Memory (RAM), Read Only Memory (ROM), flash memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, and the like. A storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. A software module may be a single instruction, or many instructions, and may be distributed over several different code segments, among different programs, and across multiple storage media.
The methods disclosed herein comprise one or more acts for implementing the described methods. The methods and/or acts may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of actions is specified, the order and/or use of specific actions may be modified without departing from the scope of the claims.
The functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as one or more instructions on a tangible computer-readable medium. A storage medium may be any available tangible medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other tangible medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. As used herein, disk and disc include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
Accordingly, a computer program product may perform the operations presented herein. For example, such a computer program product may be a computer-readable tangible medium having instructions stored (and/or encoded) thereon that are executable by one or more processors to perform the operations described herein. The computer program product may include packaged material.
Software or instructions may also be transmitted over a transmission medium. For example, the software may be transmitted from a website, server, or other remote source using a transmission medium such as coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, or microwave.
Further, modules and/or other suitable means for carrying out the methods and techniques described herein may be downloaded and/or otherwise obtained by a user terminal and/or base station as appropriate. For example, such a device may be coupled to a server to facilitate the transfer of means for performing the methods described herein. Alternatively, the various methods described herein can be provided via storage means (e.g., RAM, ROM, a physical storage medium such as a CD or floppy disk) so that the user terminal and/or base station can obtain the various methods when coupled to or providing storage means to the device. Further, any other suitable technique for providing the methods and techniques described herein to a device may be utilized.
Other examples and implementations are within the scope and spirit of the disclosure and the following claims. For example, due to the nature of software, the functions described above may be implemented using software executed by a processor, hardware, firmware, hard-wiring, or any combination of these. Features implementing functions may also be physically located at various locations, including being distributed such that portions of functions are implemented at different physical locations. Also, as used herein, including in the claims, "or" as used in a list of items prefaced by "at least one of" indicates a disjunctive list, such that a list of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
Various changes, substitutions and alterations to the techniques described herein may be made without departing from the techniques of the teachings as defined by the appended claims. Moreover, the scope of the claims of the present disclosure is not limited to the particular aspects of the process, machine, manufacture, composition of matter, means, methods and acts described above. Processes, machines, manufacture, compositions of matter, means, methods, or acts, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the invention to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.
Claims (14)
1. A method for a branch predictor shared by different processor logic cores of the same processor physical core, the method comprising:
in response to a processor logic core adding a piece of branch prediction record information while a flag function is enabled, adding, to the added branch prediction record information, a flag indicating which processor logic core added that piece of branch prediction record information;
when the branch predictor performs branch prediction on a program executed on a first processor logic core, determining, according to the flag, the branch prediction record information added by the first processor logic core; and
performing branch prediction using only the branch prediction record information that the flag indicates was added by the first processor logic core.
2. The method of claim 1, further comprising:
when the branch predictor performs branch prediction on a program executed on a second processor logic core, determining, according to the flag, the branch prediction record information added by the second processor logic core; and
performing branch prediction using only the branch prediction record information that the flag indicates was added by the second processor logic core.
3. The method of claim 1, further comprising:
in response to a processor logic core adding a piece of branch prediction record information while the flag function is not enabled, not adding the flag to the added branch prediction record information,
wherein the branch predictor performs branch prediction with the different processor logic cores sharing the added branch prediction record information.
4. The method of claim 1, further comprising:
adding an MSR control register MSR_BP_CTL, wherein whether the flag function is enabled is controlled by one or more first bits stored in the MSR control register MSR_BP_CTL.
5. The method of claim 1, further comprising:
adding an MSR control register MSR_BP_CTL, wherein whether the flag function is enabled for the branch prediction record information of each of the different branch predictor types is controlled, respectively, by one or more second bits stored in the MSR control register MSR_BP_CTL.
6. The method of claim 4 or 5, wherein the bits stored in the MSR control register MSR_BP_CTL are modified only by instructions of privileged programs and not by instructions of normal programs.
7. The method of claim 1, wherein the branch prediction record information comprises at least one of Branch Target Buffer (BTB) information, indirect branch predictor (IND) information, Branch History Table (BHT) information.
8. A system for a branch predictor, the system comprising:
a branch predictor shared by different processor logic cores of the same processor physical core; and
a control register for holding a bit that indicates whether a flag function is enabled;
wherein, in the case that the control register holds a bit enabling the flag function, in response to a processor logic core adding a piece of branch prediction record information, a flag indicating which processor logic core added that piece of branch prediction record information is added to the added branch prediction record information;
when the branch predictor performs branch prediction on a program executed on a first processor logic core, the branch prediction record information added by the first processor logic core is determined according to the flag; and
branch prediction is performed using only the branch prediction record information that the flag indicates was added by the first processor logic core.
9. The system of claim 8, wherein
when the branch predictor performs branch prediction on a program executed on a second processor logic core, the branch prediction record information added by the second processor logic core is determined according to the flag; and
branch prediction is performed using only the branch prediction record information that the flag indicates was added by the second processor logic core.
10. The system of claim 8, wherein
in response to a processor logic core adding a piece of branch prediction record information in the case where the control register holds a bit that does not enable the flag function, the flag is not added to the added branch prediction record information,
wherein the branch predictor performs branch prediction with the different processor logic cores sharing the added branch prediction record information.
11. The system of claim 8, wherein the control register is an MSR control register MSR_BP_CTL, and whether the flag function is enabled is controlled by one or more first bits stored in the MSR control register MSR_BP_CTL.
12. The system of claim 8, wherein
the control register is an MSR control register MSR_BP_CTL, and whether the flag function is enabled for the branch prediction record information of each of the different branch predictor types is controlled, respectively, by one or more second bits stored in the MSR control register MSR_BP_CTL.
13. The system of claim 11 or 12, wherein the bits stored in the MSR control register MSR_BP_CTL are modified only by instructions of privileged programs and not by instructions of normal programs.
14. The system of claim 8, wherein the branch prediction record information comprises at least one of Branch Target Buffer (BTB) information, indirect branch predictor (IND) information, Branch History Table (BHT) information.
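As an informal illustration of the claimed flagging scheme, and not part of the claims, the following Python sketch models a predictor table shared by the logic cores of one physical core. The bit positions in `MSR_BP_CTL`, the class names `MsrBpCtl` and `SharedPredictorTable`, and the choice to key records by branch address are all assumptions made for this example; the claims name the register and the record types but do not specify a layout or a table organization.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical bit positions: the claims name MSR_BP_CTL but not its layout.
BTB_FLAG_ENABLE = 1 << 0  # Branch Target Buffer records
IND_FLAG_ENABLE = 1 << 1  # indirect branch predictor records
BHT_FLAG_ENABLE = 1 << 2  # Branch History Table records


class PrivilegeError(Exception):
    """Raised when a normal (non-privileged) program writes MSR_BP_CTL."""


@dataclass
class MsrBpCtl:
    """Toy model of the MSR_BP_CTL control register."""
    bits: int = 0

    def write(self, value: int, privileged: bool) -> None:
        # Only privileged programs may modify the bits (claims 6 and 13).
        if not privileged:
            raise PrivilegeError("MSR_BP_CTL is writable only by privileged code")
        self.bits = value

    def enabled(self, mask: int) -> bool:
        return bool(self.bits & mask)


class SharedPredictorTable:
    """Toy predictor table shared by the logic cores of one physical core.

    Records are keyed by (branch address, owner). The owner is the id of the
    logic core that added the record when the flag function is enabled, and
    None (an unflagged, shared record) when it is not. This keying is a
    modeling assumption of the sketch.
    """

    def __init__(self, msr: MsrBpCtl, enable_mask: int) -> None:
        self.msr = msr
        self.enable_mask = enable_mask  # which MSR_BP_CTL bit governs this table
        self.records: Dict[Tuple[int, Optional[int]], int] = {}

    def _owner(self, core_id: int) -> Optional[int]:
        return core_id if self.msr.enabled(self.enable_mask) else None

    def add(self, core_id: int, branch_pc: int, value: int) -> None:
        # With the flag function enabled, the record carries the adding core's id.
        self.records[(branch_pc, self._owner(core_id))] = value

    def predict(self, core_id: int, branch_pc: int) -> Optional[int]:
        # With the flag function enabled, only records flagged with the
        # requesting core's id can match; otherwise unflagged records are shared.
        return self.records.get((branch_pc, self._owner(core_id)))
```

In this sketch, a record added by logic core 0 while the governing bit is clear is visible to logic core 1, whereas a record added after a privileged write sets the bit is returned only to the core that added it. Giving each table its own mask mirrors the per-predictor-type control of claims 5 and 12. How a real design treats records that were added before the bit changed is not stated in the claims; here they simply stop matching.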
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010842406.XA CN112035170B (en) | 2020-08-20 | 2020-08-20 | Method and system for branch predictor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010842406.XA CN112035170B (en) | 2020-08-20 | 2020-08-20 | Method and system for branch predictor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112035170A (en) | 2020-12-04 |
CN112035170B (en) | 2021-06-29 |
Family
ID=73578343
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010842406.XA Active CN112035170B (en) | 2020-08-20 | 2020-08-20 | Method and system for branch predictor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112035170B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112699061B (en) * | 2020-12-07 | 2022-08-26 | 海光信息技术股份有限公司 | Systems, methods, and media for implementing cache coherency for PCIe devices |
CN112596792B (en) * | 2020-12-17 | 2022-10-28 | 海光信息技术股份有限公司 | Branch prediction method, apparatus, medium, and device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105718241A (en) * | 2016-01-18 | 2016-06-29 | 北京时代民芯科技有限公司 | SPARC V8 system structure based classified type mixed branch prediction system |
CN111221575A (en) * | 2019-12-30 | 2020-06-02 | 核芯互联科技(青岛)有限公司 | Register renaming method and system for out-of-order high-performance processor |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN202133998U (en) * | 2011-04-18 | 2012-02-01 | 江苏中科芯核电子科技有限公司 | Branch prediction device |
US8862861B2 (en) * | 2011-05-13 | 2014-10-14 | Oracle International Corporation | Suppressing branch prediction information update by branch instructions in incorrect speculative execution path |
WO2016156955A1 (en) * | 2015-03-31 | 2016-10-06 | Centipede Semi Ltd. | Parallelized execution of instruction sequences based on premonitoring |
Also Published As
Publication number | Publication date |
---|---|
CN112035170A (en) | 2020-12-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||