WO2016127600A1 - Exception handling method and apparatus - Google Patents
- Publication number
- WO2016127600A1 (PCT/CN2015/086164)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- exception
- instruction
- address
- pci
- memory space
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
Definitions
- the present invention relates to the field of communications, and in particular to an exception handling method and apparatus.
- PCI-E Peripheral Component Interconnect Express
- 3GIO Third-generation PCI bus
- PCI-E provides high-speed, serial, point-to-point, dual-channel, high-bandwidth transmission.
- Each connected device is allocated dedicated channel bandwidth and does not share bus bandwidth. PCI-E mainly supports active power management, error reporting, end-to-end reliable transmission, hot plugging, and Quality of Service (QoS).
- PCI-E memory, I/O, configuration, and message space.
- The Central Processing Unit (CPU) issues a read instruction to access the PCI-E memory space.
- The processor first maps the local address in the instruction to the PCI-E controller through the local access windows, then uses the outbound ATMU windows to map the processor address to an address in the PCI-E domain.
- Based on the processor's access type and the mapped PCI-E address, the transaction layer of the PCI-E controller builds one or more TLPs. These TLPs are then sent through the PCI-E link layer and physical layer to the opposite end of the bus, and the controller waits for the completion message from the opposite end to finish the transaction.
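The outbound address mapping described above can be pictured as a simple window lookup. The following is a minimal sketch only: the `outbound_window` structure, its field names, and `outbound_translate` are hypothetical illustrations and do not reflect the register layout of any real controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one outbound translation window. */
struct outbound_window {
    uint64_t cpu_base;  /* window start in the processor address map */
    uint64_t pci_base;  /* address the window start maps to in the PCI-E domain */
    uint64_t size;      /* window length in bytes */
};

/* Translate a processor address into a PCI-E domain address.
 * Returns false when no window covers the address. */
bool outbound_translate(const struct outbound_window *win, size_t count,
                        uint64_t cpu_addr, uint64_t *pci_addr)
{
    assert(win != NULL && pci_addr != NULL);
    for (size_t i = 0; i < count; i++) {
        if (cpu_addr >= win[i].cpu_base &&
            cpu_addr - win[i].cpu_base < win[i].size) {
            /* Keep the offset inside the window, swap the base. */
            *pci_addr = win[i].pci_base + (cpu_addr - win[i].cpu_base);
            return true;
        }
    }
    return false;
}
```

With a single window that maps processor address 0x80000000 to PCI-E address 0 over 256 MiB, processor address 0x80001000 would translate to PCI-E address 0x1000.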
- When the processor sends a read access to an address that has no corresponding bank, the PCI-E controller generates an exception because it times out waiting for the completion message.
- The handler determines whether the instruction that triggered the exception was running in user mode or in kernel mode. In user mode, the exception handler sends a SIGBUS signal to the user-mode process, and the default action on receiving SIGBUS is to terminate the process. In kernel mode, the handler prints the current processor context and exits the exception.
- Because the PCI-E memory access instruction did not complete, the kernel re-executes it after the exception returns. The instruction again accesses a PCI-E address that has no corresponding bank, so the exception is raised again, and the kernel ends up in an infinite loop.
- PCI-E devices may be powered down or unplugged at any time. Suppose the processor is continuously reading the memory space of a PCI-E device and the device suddenly loses power during the access. If the access was initiated by a user-mode process, the process is killed. If it was initiated in kernel mode, the failed access traps the system in an infinite loop and the whole machine goes down.
- An embodiment of the present invention provides an exception handling method and apparatus.
- An exception handling method includes: detecting whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space; if so, setting the return value of the exception instruction that caused the exception to an illegal value; and using the address following the exception instruction as the return address of the current exception.
- Detecting whether an exception occurs while accessing the PCI-E memory space includes detecting both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
- The operation address is obtained in one of the following ways: when the current exception is determined to have been caused by a load-class instruction, the operation address is obtained from that load-class instruction; or the operation address is read from the machine check interrupt status register MCSR.
- Obtaining the operation address from the load-class instruction includes: according to the format type of the load-class instruction, obtaining the operation address from the location in the instruction that corresponds to that format type.
- Setting the return value of the exception instruction that caused the current exception to an illegal value includes: writing the illegal value into the data register corresponding to the exception instruction.
- The exception includes at least one of the following: an exception caused by a load-class instruction, and a data bus exception raised by a read on the bus.
- An exception handling apparatus comprises: a detecting module configured to detect whether an exception occurs while accessing the PCI-E memory space; a setting module configured to set, when an exception occurs, the return value of the exception instruction that caused the exception to an illegal value; and a determining module configured to use the address following the exception instruction as the return address of the current exception.
- The detecting module is configured to detect both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
- The apparatus further includes an acquiring module configured to acquire the operation address. The acquiring module is further configured to obtain the operation address from the load-class instruction when the current exception is determined to have been caused by a load-class instruction, or to read the operation address from the machine check interrupt status register MCSR.
- The acquiring module is further configured to obtain the operation address, according to the format type of the load-class instruction, from the location in the instruction that corresponds to that format type.
- When an exception occurs, the address following the exception instruction is used as the return address of the current exception, and the return value of the exception instruction is set to the illegal value.
- FIG. 2 is a block diagram showing the structure of an exception handling apparatus according to an embodiment of the present invention.
- FIG. 3 is a block diagram showing another structure of an exception processing apparatus according to an embodiment of the present invention.
- FIG. 4 is a flowchart of a processing method for processing a read pcie memory exception using a method of analyzing an instruction according to a preferred embodiment of the present invention
- FIG. 5 illustrates exception handling of pcie memory using registers provided by the powerpc platform architecture in accordance with a preferred embodiment of the present invention.
- FIG. 1 is a flowchart of an exception processing method according to an embodiment of the present invention. As shown in FIG. 1, the flow includes the following steps:
- Step S102: detect whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space.
- Step S104: if so, set the return value of the exception instruction that caused the exception to an illegal value.
- Step S106: use the address following the exception instruction as the return address of the current exception.
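Steps S104 and S106 can be sketched in a few lines of C. This is an illustration under stated assumptions, not the patented implementation: `saved_regs` is a hypothetical stand-in for the saved register frame, the all-F pattern is taken as the illegal value, and the 4-byte step assumes fixed-length PowerPC instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ILLEGAL_VALUE 0xFFFFFFFFu  /* assumed illegal value: all F */
#define INSN_SIZE     4u           /* PowerPC instructions are 4 bytes long */

/* Hypothetical stand-in for the register frame saved on exception entry. */
struct saved_regs {
    uint32_t nip;      /* address of the instruction that faulted */
    uint32_t gpr[32];  /* general-purpose registers */
};

/* S104: poison the load's destination register.
 * S106: resume at the instruction after the faulting one. */
void skip_failed_load(struct saved_regs *regs, unsigned int target_reg)
{
    assert(regs != NULL && target_reg < 32);
    regs->gpr[target_reg] = ILLEGAL_VALUE;
    regs->nip += INSN_SIZE;
}
```

Because the return address moves past the faulting load, the instruction is not re-executed after the handler returns, which is what breaks the infinite loop described in the background.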
- Through these steps, when an exception occurs, the address following the exception instruction is used as the return address of the current exception, and the return value of the exception instruction is set to an illegal value.
- This solves the problem in the related art that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop, thereby improving the robustness and survivability of the system.
- The illegal value may be preset; in the embodiment of the present invention it is preferably all hexadecimal F.
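Code that reads the device would then need to recognise the illegal value. The sketch below assumes a 32-bit read and the all-F pattern; `pcie_read_checked` and `PCIE_READ_POISON` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCIE_READ_POISON 0xFFFFFFFFu  /* assumed illegal value: all F */

/* Accept a raw 32-bit value read from PCI-E memory space.
 * Returns false, leaving *out untouched, when the value is the poison pattern. */
bool pcie_read_checked(uint32_t raw, uint32_t *out)
{
    assert(out != NULL);
    if (raw == PCIE_READ_POISON)
        return false;
    *out = raw;
    return true;
}
```

One caveat of this design is that a healthy register can legitimately read as all F, so a caller may want to confirm a suspected failure by other means before treating the device as gone.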
- Whether an exception occurs while accessing the PCI-E memory space may be detected by checking both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
- The operation address may be obtained in one of the following ways: when the current exception is determined to have been caused by a load-class instruction, the operation address is obtained from that load-class instruction; or the operation address is read from the machine check interrupt status register MCSR.
- The location of the operation address differs with the format type of the load-class instruction. Therefore, in the implementation of the present invention, the operation address is obtained from the location in the load-class instruction that corresponds to its format type.
- The exception includes at least one of the following: an exception caused by a load-class instruction, and a data bus exception raised by a read on the bus.
- The technical solution of the embodiment of the present invention handles the exception raised when a powerpc architecture processor reads a pcie memory address that has no corresponding bank, so that the user-mode process is not killed and the system does not hang in an infinite loop, which improves the robustness and survivability of the system.
- Step 1) Determine the cause of the exception.
- In the exception handling flow, determine whether the exception was caused by the processor accessing a pcie memory address that has no corresponding bank. If not, the exception is handled by the original flow. Otherwise, it is processed further.
- Step 2) Determine whether the faulting address is within the range of the pcie memory addresses.
- Step 3) Fill the return value with all Fs.
- The return value of the load instruction is filled with all Fs. The exception return address is changed to the address of the instruction following the one that caused the exception, the subsequent printing of the kernel exception or sending of the SIGBUS signal to the user process is skipped, and the handler returns directly.
- The above technical solution provided by the embodiment of the present invention can simply and effectively prevent a process from being killed, or the system from hanging, because of an exception on a PCIe read access.
- An exception handling apparatus is also provided in this embodiment to implement the above embodiments and preferred implementations.
- The modules involved in the apparatus are described below.
- As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function.
- Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
- FIG. 2 is a block diagram showing the structure of an exception handling apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus comprises:
- the detecting module 20, configured to detect whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space;
- the setting module 22, connected to the detecting module 20 and configured to set, when an exception occurs, the return value of the exception instruction that caused the exception to an illegal value;
- the determining module 24, connected to the setting module 22 and configured to use the address following the exception instruction as the return address of the current exception.
- This solves the problem in the related art that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop, thereby improving the robustness and survivability of the system.
- FIG. 3 is a block diagram of another structure of an exception handling apparatus according to an embodiment of the present invention, in which the detecting module 20 is configured to detect both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
- The apparatus further includes an obtaining module 26 configured to obtain the operation address. The obtaining module 26 is further configured to obtain the operation address from the load-class instruction when the current exception is determined to have been caused by a load-class instruction, or to read the operation address from the machine check interrupt status register MCSR.
- The obtaining module 26 is further configured to obtain the operation address, according to the format type of the load-class instruction, from the location in the instruction that corresponds to that format type.
- Step S402: exception entry. PowerPC architecture instructions belong to a reduced instruction set and use a uniform encoding: all instructions have the same length, and the op-code is always at the same position in every instruction. Thanks to this property, the cause of the exception can be determined by analyzing the instruction op-code.
- Step S404: the instruction that caused the exception is fetched through regs->nip in the struct pt_regs saved on exception entry. Bits [0:5] of the instruction are examined to determine whether it is a load-class instruction. If so, step S406 is performed; otherwise step S410 is performed.
- The load instruction is then analyzed further.
- The address to be accessed by the instruction can be extracted according to the specific load instruction.
- Each load instruction is encoded in a different format, so the address is extracted differently for each instruction.
- Step S406: the address range covered by pcie is obtained from the linux kernel structure variable struct pci_controller hose_head. The address extracted in step S404 is checked against this range. If it lies within the range, step S408 is performed; otherwise step S410 is performed.
- Step S408: using the struct pt_regs saved on exception entry, the data register regs->gpr[0] of the load instruction that caused the exception is filled with all Fs.
- Step S410: the original exception handling flow is executed.
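The instruction-analysis flow above can be sketched as a self-contained user-space model. Everything here is an assumption made for illustration: `exc_regs` stands in for the kernel's `struct pt_regs`, the PCI-E window is passed in as a plain range rather than read from the kernel's controller list, only a handful of load encodings are recognised (D-form `lwz`, `lbz`, `lhz` and X-form `lwzx`, `lbzx`, `lhzx`), and the destination register is taken from the instruction's RT field rather than fixed at `gpr[0]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct exc_regs { uint32_t nip; uint32_t gpr[32]; };    /* hypothetical frame */
struct addr_range { uint32_t start; uint32_t end; };    /* inclusive bounds */

static uint32_t field(uint32_t insn, unsigned shift, uint32_t mask)
{
    return (insn >> shift) & mask;
}

/* Decode a load instruction into its destination register and effective
 * address. Returns false for anything that is not a recognised load. */
bool decode_load(uint32_t insn, const struct exc_regs *regs,
                 unsigned *rt, uint32_t *ea)
{
    assert(regs != NULL && rt != NULL && ea != NULL);
    uint32_t opcode = field(insn, 26, 0x3F);   /* bits [0:5], IBM numbering */
    uint32_t ra = field(insn, 16, 0x1F);
    uint32_t base = ra ? regs->gpr[ra] : 0;    /* RA = 0 means a literal zero */
    *rt = field(insn, 21, 0x1F);

    switch (opcode) {
    case 32: case 34: case 40: {               /* lwz, lbz, lhz (D-form) */
        uint32_t d = insn & 0xFFFFu;
        if (d & 0x8000u)
            d |= 0xFFFF0000u;                  /* sign-extend displacement */
        *ea = base + d;
        return true;
    }
    case 31: {                                 /* X-form, extended opcode */
        uint32_t xo = field(insn, 1, 0x3FF);
        if (xo != 23 && xo != 87 && xo != 279) /* lwzx, lbzx, lhzx */
            return false;
        *ea = base + regs->gpr[field(insn, 11, 0x1F)];
        return true;
    }
    default:
        return false;
    }
}

/* Returns true when the fault was fixed up, false when the caller should
 * fall back to the original exception flow (S410). */
bool fixup_pcie_load(struct exc_regs *regs, uint32_t insn,
                     struct addr_range pcie)
{
    unsigned rt;
    uint32_t ea;
    if (!decode_load(insn, regs, &rt, &ea))    /* S404: not a load */
        return false;
    if (ea < pcie.start || ea > pcie.end)      /* S406: outside pcie range */
        return false;
    regs->gpr[rt] = 0xFFFFFFFFu;               /* S408: fill with all Fs */
    regs->nip += 4;                            /* return past the load */
    return true;
}
```

For example, `0x80640008` encodes `lwz r3, 8(r4)`; with `r4` pointing into the PCI-E window, the sketch sets `r3` to all Fs and advances `nip` by 4.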
- Another implementation obtains and analyzes the exception information from the powerpc core registers.
- The handling of the read access pcie memory exception using the core registers is described further below with reference to FIG. 5.
- Step S504: in the powerpc architecture, when an exception occurs, the Machine Check Address Register (MCAR) holds the address associated with the exception.
- MCAR Machine Check Address Register
- By reading the MCAR register, the operation address that the faulting instruction was accessing can be obtained, and it is determined whether this address lies within the address range covered by pcie. If yes, step S506 is performed; otherwise step S508 is performed.
- Step S506: using the struct pt_regs saved on exception entry, the data register regs->gpr[0] of the load instruction that caused the exception is filled with all Fs.
- Step S508: the original exception handling flow is executed.
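The register-based variant replaces instruction decoding with a single address comparison. In the sketch below the MCAR value is passed in as a parameter, since reading the real register needs privileged code; `mc_regs`, `pcie_window`, and `fixup_from_mcar` are hypothetical names, and the destination register is left to the caller because this path does not decode the instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mc_regs { uint32_t nip; uint32_t gpr[32]; };    /* hypothetical frame */
struct pcie_window { uint64_t start; uint64_t end; };  /* inclusive bounds */

/* Check whether an address falls inside any known PCI-E memory window. */
static bool addr_in_pcie(const struct pcie_window *wins, size_t count,
                         uint64_t addr)
{
    for (size_t i = 0; i < count; i++)
        if (addr >= wins[i].start && addr <= wins[i].end)
            return true;
    return false;
}

/* Returns true when the fault was fixed up, false when the caller should
 * fall back to the original exception flow. */
bool fixup_from_mcar(struct mc_regs *regs, uint64_t mcar,
                     const struct pcie_window *wins, size_t count,
                     unsigned target_reg)
{
    assert(regs != NULL && wins != NULL && target_reg < 32);
    if (!addr_in_pcie(wins, count, mcar))
        return false;
    regs->gpr[target_reg] = 0xFFFFFFFFu;  /* fill with all Fs */
    regs->nip += 4;                       /* return past the faulting load */
    return true;
}
```

Compared with the instruction-analysis approach, this trades per-format decoding for a dependency on a platform register, so it only applies where the core provides such a register.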
- The embodiment of the present invention achieves the following technical effect: it solves the problem in the related art that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop, thereby improving the robustness and survivability of the system.
- A storage medium is further provided in which the above software is stored, including but not limited to an optical disk, a floppy disk, a hard disk, and an erasable memory.
- The modules or steps of the present invention described above can be implemented by a general-purpose computing device, and can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from the one given here, or they may be fabricated as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module.
- The invention is not limited to any specific combination of hardware and software.
- The above technical solution provided by the present invention can be applied to exception handling: when an exception occurs, the address following the exception instruction is used as the return address of the current exception, and the return value of the exception instruction is set to an illegal value.
- This technical means solves the problem that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop, thereby improving the robustness and survivability of the system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
- Storage Device Security (AREA)
Abstract
An exception handling method and apparatus. The method comprises: detecting whether an exception has occurred while accessing a PCI-E memory space (S102); if so, setting the return value of the exception instruction that caused the exception to an illegal value (S104); and taking the address following the exception instruction as the return address of the current exception (S106). This technical solution solves the problem in the related art that an exception during access to a pcie memory address stops the process or hangs the system in an endless loop, thereby improving the robustness and survivability of the system.
Description
The present invention relates to the field of communications, and in particular to an exception handling method and apparatus.
Peripheral Component Interconnect Express (PCI-E) is a new bus standard and interface: the third-generation PCI bus, known as "3GIO", announced by Intel in 2001. PCI-E provides high-speed, serial, point-to-point, dual-channel, high-bandwidth transmission. Each connected device is allocated dedicated channel bandwidth and does not share bus bandwidth. PCI-E mainly supports active power management, error reporting, end-to-end reliable transmission, hot plugging, and Quality of Service (QoS).
PCI-E defines four address spaces: memory, I/O, configuration, and message. In the PCI-E architecture, accessing a memory address that has no corresponding bank raises an exception. Different processor platforms handle this kind of exception in slightly different ways.
On a processor platform based on the reduced-instruction-set Performance Optimization With Enhanced RISC - Performance Computing (POWERPC) architecture, after the Central Processing Unit (CPU) issues an instruction that reads the PCI-E memory space, the processor first maps the local address in the instruction to the PCI-E controller through the local access windows. It then uses the outbound ATMU windows to map the processor address to an address in the PCI-E domain. Based on the processor's access type and the mapped PCI-E address, the transaction layer of the PCI-E controller builds one or more TLPs. Finally, these TLPs are delivered through the PCI-E link layer and physical layer to the opposite end of the bus, and the controller waits for the completion message from the opposite end to finish the transaction.
When the processor sends a read access to an address that has no corresponding bank, the PCI-E controller generates an exception because it times out waiting for the completion message. During exception handling, the handler determines whether the instruction that triggered the exception was running in user mode or in kernel mode. In user mode, the exception handler sends a SIGBUS signal to the user-mode process, and the default action on receiving SIGBUS is to terminate the process. In kernel mode, the handler prints the current processor context and exits the exception. Because the PCI-E memory access instruction did not complete, the kernel re-executes it after the exception returns. The instruction again accesses a PCI-E address that has no corresponding bank, so the exception is raised again, and the kernel ends up in an infinite loop.
In systems that support PCI-E hot plugging, a PCI-E device may be powered down or unplugged at any time. Suppose the processor is continuously reading the memory space of a PCI-E device and the device suddenly loses power during the access. If the access was initiated by a user-mode process, the process is killed. If it was initiated in kernel mode, the failed access traps the system in an infinite loop and the whole machine goes down.
In the related art, no effective solution has yet been proposed for the problem that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop.
Summary of the invention
To solve the above technical problem, an embodiment of the present invention provides an exception handling method and apparatus.
According to one embodiment of the present invention, an exception handling method is provided, including: detecting whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space; if so, setting the return value of the exception instruction that caused the exception to an illegal value; and using the address following the exception instruction as the return address of the current exception.
In the embodiment of the present invention, detecting whether an exception occurs while accessing the PCI-E memory space includes detecting both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
In the embodiment of the present invention, the operation address is obtained in one of the following ways: when the current exception is determined to have been caused by a load-class instruction, the operation address is obtained from that load-class instruction; or the operation address is read from the machine check interrupt status register MCSR.
In the embodiment of the present invention, obtaining the operation address from the load-class instruction includes: according to the format type of the load-class instruction, obtaining the operation address from the location in the instruction that corresponds to that format type.
In the embodiment of the present invention, setting the return value of the exception instruction that caused the current exception to an illegal value includes: writing the illegal value into the data register corresponding to the exception instruction.
In the embodiment of the present invention, the exception includes at least one of the following: an exception caused by a load-class instruction, and a data bus exception raised by a read on the bus.
According to another embodiment of the present invention, an exception handling apparatus is also provided, comprising: a detecting module configured to detect whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space; a setting module configured to set, when an exception occurs, the return value of the exception instruction that caused the exception to an illegal value; and a determining module configured to use the address following the exception instruction as the return address of the current exception.
In the embodiment of the present invention, the detecting module is configured to detect both of the following: whether the exception arose because the accessed PCI-E memory space has no corresponding storage entity, and whether the operation address that the exception instruction was accessing lies within the address range of the PCI-E memory space. If the detection result is yes, an exception is determined to have occurred while accessing the PCI-E memory space.
In the embodiment of the present invention, the apparatus further includes an acquiring module configured to acquire the operation address. The acquiring module is further configured to obtain the operation address from the load-class instruction when the current exception is determined to have been caused by a load-class instruction, or to read the operation address from the machine check interrupt status register MCSR.
In the embodiment of the present invention, the acquiring module is further configured to obtain the operation address, according to the format type of the load-class instruction, from the location in the instruction that corresponds to that format type.
In the embodiment of the present invention, when an exception occurs, the address following the exception instruction is used as the return address of the current exception and the return value of the exception instruction is set to an illegal value. This solves the problem in the related art that an exception during access to a pcie memory address stops the process or hangs the system in an infinite loop, thereby improving the robustness and survivability of the system.
The drawings described here provide a further understanding of the invention and form a part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of an exception handling method according to an embodiment of the present invention;
FIG. 2 is a block diagram showing the structure of an exception handling apparatus according to an embodiment of the present invention;
FIG. 3 is a block diagram showing another structure of an exception handling apparatus according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for handling a read pcie memory exception by analyzing the instruction, according to a preferred embodiment of the present invention;
FIG. 5 shows the handling of a pcie memory exception using registers provided by the powerpc platform architecture, according to a preferred embodiment of the present invention.
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with each other.
Other features and advantages of the present invention will be set forth in the description that follows, and will in part become apparent from the description or be learned by practicing the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description, the claims, and the drawings.
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
This embodiment provides an exception handling method. FIG. 1 is a flowchart of the exception handling method according to an embodiment of the present invention. As shown in FIG. 1, the flow includes the following steps:
Step S102: detect whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space;
Step S104: if so, set the return value corresponding to the faulting instruction that caused the exception to an illegal value;
Step S106: use the next address after the faulting instruction as the return address of the current exception.
Through the above steps, when an exception occurs, the next address after the faulting instruction is used as the return address of the current exception, and the return value of the faulting instruction is set to an illegal value. This solves the problems in the related art that an exception raised while accessing a PCI-E memory address stops the process and that the system hangs in an infinite loop, thereby improving the robustness and survivability of the system.
Optionally, the illegal value may be preset; in the embodiments of the present invention it is preferably hexadecimal F.
Whether an exception occurs while accessing the PCI-E memory space may be determined by checking both of the following: whether the exception was caused by accessing PCI-E memory space that has no corresponding storage entity, and whether the operation address that the faulting instruction attempted to access lies within the address range of the PCI-E memory space. If both checks succeed, it is determined that an exception occurred while accessing the PCI-E memory space.
The operation address may be obtained in one of the following ways: when the current exception is determined to be caused by a load-class instruction, obtain the operation address from the load-class instruction; or read the operation address from the machine check interrupt status register (MCSR). Because load-class instructions come in several format types, in the embodiments of the present invention the operation address is obtained from the specified position in the load-class instruction that corresponds to its format type.
In the embodiments of the present invention, the exception includes at least one of the following: an exception caused by a load-class instruction, and a bus read data bus error.
In summary, the technical solution of the embodiments of the present invention solves the exception raised when a PowerPC-architecture processor reads a PCI-E memory address that has no backing storage. It does so without killing the user-mode process and without hanging the system in an infinite loop, thereby improving the robustness and survivability of the system.
The exception handling method provided in the above embodiment is described below with an example, which is not intended to limit the embodiments of the present invention:
Step 1) Determine the cause of the exception
First, the exception handling flow must determine the cause of the exception, namely whether it was caused by the processor accessing a PCI-E memory address that has no backing storage. If not, exception handling continues along the original flow. Otherwise, the exception is processed further.
Step 2) Determine whether the faulting address is within the PCI-E memory address range
Once the cause is determined, the operation address of the faulting instruction is obtained and recorded, and the memory address ranges governed by all PCI-E controllers in the system are obtained. Finally, it is determined whether the recorded operation address belongs to the PCI-E memory space. If so, the case meets the conditions handled by the embodiments of the present invention. Otherwise, the original exception handling flow is executed.
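The range test in step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `struct pcie_mem_window` and `addr_in_pcie_memory` are hypothetical names, and the windows are assumed to have been collected beforehand from the PCI-E controllers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one PCI-E memory window: [start, start + size). */
struct pcie_mem_window {
    uint64_t start;
    uint64_t size;
};

/* Returns true if addr lies inside any of the n windows. */
static bool addr_in_pcie_memory(uint64_t addr,
                                const struct pcie_mem_window *win, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Compare via subtraction so that start + size cannot overflow. */
        if (addr >= win[i].start && addr - win[i].start < win[i].size)
            return true;
    }
    return false;
}
```

An address that matches no window falls through to `false`, which corresponds to continuing with the original exception handling flow.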
Step 3) Fill the return value with all F
When both of the above conditions are met, the return value of the load instruction is filled with all F, and the exception return address is changed to the address of the instruction following the one that raised the exception. The subsequent steps of printing the kernel exception record or sending the SIGBUS signal to the user process are skipped, and the handler returns directly.
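Steps 1 to 3 can be summarized in a small sketch. It assumes 32-bit registers and fixed 4-byte instructions, and it uses a hypothetical `struct saved_regs` as a simplified stand-in for the saved register frame; the function name, the enum, and the idea of passing the two check results and the destination register index as parameters are illustrative only.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the saved register frame. */
struct saved_regs {
    uint32_t gpr[32];  /* general-purpose registers */
    uint32_t nip;      /* address of the faulting instruction */
};

enum fault_action { FAULT_RECOVERED, FAULT_DEFAULT_PATH };

/*
 * cause_is_pcie_read: result of step 1 (nonzero if the cause matches).
 * addr_in_pcie:       result of step 2 (nonzero if the address is in range).
 * dest_reg:           index of the register the load would have written.
 */
static enum fault_action handle_pcie_read_fault(struct saved_regs *regs,
                                                int cause_is_pcie_read,
                                                int addr_in_pcie,
                                                unsigned int dest_reg)
{
    if (!cause_is_pcie_read || !addr_in_pcie)
        return FAULT_DEFAULT_PATH;            /* original flow: log or SIGBUS */

    regs->gpr[dest_reg & 31u] = 0xFFFFFFFFu;  /* step 3: all-F return value */
    regs->nip += 4;                           /* resume at the next instruction */
    return FAULT_RECOVERED;                   /* caller skips log and SIGBUS */
}
```

Returning a distinct value for the recovered case lets the caller skip the logging and signal delivery that the default path would perform.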
With the above technical solution provided by the embodiments of the present invention, the problem of a process being killed or the system hanging after a PCI-E read access raises an exception can be prevented simply and effectively.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
This embodiment further provides an exception handling apparatus for implementing the above embodiments and preferred implementations; what has already been described is not repeated. The modules of the apparatus are described below. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. FIG. 2 is a structural block diagram of an exception handling apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes:
a detecting module 20, configured to detect whether an exception occurs while accessing the Peripheral Component Interconnect Express (PCI-E) memory space;
a setting module 22, connected to the detecting module 20 and configured to set, when an exception occurs, the return value corresponding to the faulting instruction that caused the exception to an illegal value;
a determining module 24, connected to the setting module 22 and configured to use the next address after the faulting instruction as the return address of the current exception.
Through the combined use of the above modules, when an exception occurs, the next address after the faulting instruction is used as the return address of the current exception, and the return value corresponding to the faulting instruction is set to an illegal value. This solves the problems in the related art that an exception raised while accessing a PCI-E memory address stops the process and that the system hangs in an infinite loop, thereby improving the robustness and survivability of the system.
FIG. 3 is another structural block diagram of an exception handling apparatus according to an embodiment of the present invention. The detecting module 20 is configured to check both of the following: whether the exception was caused by accessing PCI-E memory space that has no corresponding storage unit, and whether the operation address that the faulting instruction attempted to access lies within the address range of the PCI-E memory space. If both checks succeed, it is determined that an exception occurred while accessing the PCI-E memory space.
Optionally, the apparatus further includes an acquiring module 26 configured to acquire the operation address, wherein the acquiring module 26 is further configured to: when the current exception is determined to be caused by a load-class instruction, acquire the operation address from the load-class instruction; or read the operation address from the machine check interrupt status register (MCSR).
Further, the acquiring module 26 is configured to acquire the operation address from the specified position in the load-class instruction that corresponds to the format type of the load-class instruction.
To make the above exception handling procedure easier to understand, it is described below with reference to preferred embodiments, which are not intended to limit the scope of protection of the embodiments of the present invention.
The method of handling a PCI-E memory read exception by analyzing the instruction is described in further detail below with reference to FIG. 4:
Step S402: exception entry. PowerPC instructions belong to a reduced instruction set with a uniform encoding: all instructions have the same length, and the op-code is always at the same position in every instruction. Based on this property, the cause of the exception can be determined by analyzing the op-code of the instruction.
Step S404: using the struct pt_regs saved on exception entry, fetch the instruction that caused the exception, which is located at regs->nip. Extract bits [0:5] of the instruction and determine whether it is a load-class instruction. If so, proceed to step S406; otherwise, go to step S410.
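The op-code test in step S404 can be sketched as follows. PowerPC numbers bits from the most significant end, so bits [0:5] are the top six bits of the 32-bit instruction word. The function names are hypothetical, and the opcode list is an illustrative subset (D-form integer loads only), not the complete set of load-class instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [0:5] in PowerPC (MSB-0) numbering are the six most significant bits. */
static unsigned int primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Illustrative subset: D-form integer loads, identified by primary opcode. */
static bool is_dform_integer_load(uint32_t insn)
{
    switch (primary_opcode(insn)) {
    case 32: /* lwz  */
    case 33: /* lwzu */
    case 34: /* lbz  */
    case 35: /* lbzu */
    case 40: /* lhz  */
    case 41: /* lhzu */
    case 42: /* lha  */
    case 43: /* lhau */
        return true;
    default:
        return false;
    }
}
```

Indexed loads such as `lwzx` share primary opcode 31 with many non-load instructions, so a fuller classifier would also have to inspect the extended opcode field; that case is left out of this sketch.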
To obtain the operation address of the faulting instruction, the load instruction must be analyzed further. The address that the instruction accesses can be extracted according to the specific load instruction. Each kind of load instruction encodes its address in a different format, so the address must be extracted according to the instruction type.
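As one concrete case of format-dependent address extraction, the sketch below handles only the D-form layout (for example `lwz RT, D(RA)`), where the effective address is the base register value, or 0 when RA is 0, plus the sign-extended 16-bit displacement. The helper names and the `gpr` snapshot parameter are hypothetical, and 32-bit addressing is assumed.

```c
#include <stdint.h>

/* D-form fields in PowerPC (MSB-0) bit numbering:
 * bits 6:10 = RT, bits 11:15 = RA, bits 16:31 = D. */
static unsigned int dform_rt(uint32_t insn) { return (insn >> 21) & 0x1Fu; }
static unsigned int dform_ra(uint32_t insn) { return (insn >> 16) & 0x1Fu; }

/* Sign-extend the 16-bit displacement without relying on narrowing casts. */
static int32_t dform_d(uint32_t insn)
{
    uint32_t d = insn & 0xFFFFu;
    return (d & 0x8000u) ? (int32_t)d - 0x10000 : (int32_t)d;
}

/* EA = (RA|0) + sign-extended D, computed modulo 2^32.
 * gpr[] is a hypothetical snapshot of the 32 general-purpose registers. */
static uint32_t dform_effective_address(uint32_t insn, const uint32_t gpr[32])
{
    unsigned int ra = dform_ra(insn);
    uint32_t base = (ra == 0) ? 0u : gpr[ra];
    return base + (uint32_t)dform_d(insn);
}
```

The RT field decoded by `dform_rt` identifies the register that the load would have written, which is the natural candidate for receiving the illegal value; other instruction formats place their fields differently and would need their own decoders.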
Step S406: obtain the address ranges covered by PCI-E through the Linux kernel structure variable struct pci_controller hose_head, and check whether the address extracted in step S404 falls within these ranges. If so, perform step S408; otherwise, perform step S410.
Step S408: using the struct pt_regs saved at exception time, fill the data register of the faulting load instruction, regs->gpr[0], with all F, and point the exception return address at the instruction following the faulting one: regs->nip=(regs->nip)+4;. When execution returns to the program, the program is unaware that an exception occurred, as if the load instruction had read the illegal value all F from the PCI-E memory space.
Step S410: execute the original exception handling flow.
Another implementation obtains and analyzes the exception information from the PowerPC core registers. The handling of PCI-E memory read exceptions using the core registers is described further below with reference to FIG. 5.
Step S502: at the exception entry, determine the exception type by reading the MCSR. If bit 60 of the Machine Check Syndrome Register (MCSR) is set, that is, MCSR[60]=1, a "Bus read data bus error" exception has occurred.
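The MCSR[60] test can be sketched as below. The sketch assumes that the bit index follows the PowerPC convention of numbering a 64-bit register from the most significant bit (bit 0) to the least significant bit (bit 63); under that assumption bit 60 corresponds to the mask 0x8. Both function names are hypothetical, and the register value is passed in as a plain integer rather than read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask for bit n of a 64-bit register when bit 0 is the most significant bit. */
static uint64_t msb0_mask64(unsigned int n)
{
    return (uint64_t)1 << (63u - n);
}

/* True if MCSR[60] is set, the condition the text associates with a
 * "Bus read data bus error". */
static bool mcsr_bus_read_data_error(uint64_t mcsr)
{
    return (mcsr & msb0_mask64(60)) != 0;
}
```

If a given core documents the MCSR with a different width or numbering, the mask would have to be adjusted accordingly.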
Step S504: in the PowerPC architecture, when an exception occurs, the Machine Check Address Register (MCAR) holds the address operated on by the instruction that caused the exception.
Step S506: read the MCAR to obtain the operation address that the faulting instruction attempted to access, and determine whether this address lies within the address range covered by PCI-E. If so, perform step S508; otherwise, perform step S510.
Step S508: using the struct pt_regs saved at exception time, fill the data register of the faulting load instruction, regs->gpr[0], with all F, and point the exception return address at the instruction following the faulting one: regs->nip=(regs->nip)+4;. When execution returns to the program, the program is unaware that an exception occurred, as if the load instruction had read the illegal value all F from the PCI-E memory space.
Step S510: execute the original exception handling flow.
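The register-based variant can be sketched end to end as follows. This is an illustration under stated assumptions rather than kernel code: the MCSR and MCAR values are passed in as plain integers, `struct mc_frame` is a hypothetical stand-in for the saved register frame, a single PCI-E memory window is assumed, and bit 60 is taken in MSB-0 numbering of a 64-bit register (mask 0x8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical saved frame: 32-bit registers and the faulting address. */
struct mc_frame {
    uint32_t gpr[32];
    uint32_t nip;
};

/* Bit 60 in MSB-0 numbering of a 64-bit register (assumption). */
#define MCSR_BIT60_MASK ((uint64_t)1 << 3)

/* Returns true if the fault was absorbed, false if the caller should
 * continue with the original exception handling flow. */
static bool recover_from_machine_check(struct mc_frame *regs,
                                       uint64_t mcsr, uint64_t mcar,
                                       uint64_t pcie_start, uint64_t pcie_size)
{
    if (!(mcsr & MCSR_BIT60_MASK))
        return false;            /* not a bus read data bus error */
    if (mcar < pcie_start || mcar - pcie_start >= pcie_size)
        return false;            /* address is outside PCI-E memory */

    regs->gpr[0] = 0xFFFFFFFFu;  /* all-F value; register index as in the text */
    regs->nip += 4;              /* resume at the next instruction */
    return true;
}
```

Compared with the instruction-decoding approach, this variant needs no knowledge of instruction formats, because the hardware already reports both the error type and the address.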
In summary, the embodiments of the present invention achieve the following technical effect: they solve the problems in the related art that an exception raised while accessing a PCI-E memory address stops the process and that the system hangs in an infinite loop, thereby improving the robustness and survivability of the system.
Another embodiment further provides software for executing the technical solutions described in the above embodiments and preferred implementations.
Another embodiment further provides a storage medium in which the above software is stored. The storage medium includes, but is not limited to, an optical disc, a floppy disk, a hard disk, and erasable memory.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that objects so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or the modules or steps can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
The above technical solution provided by the present invention can be applied to exception handling. When an exception occurs, the next address after the faulting instruction is used as the return address of the current exception, and the return value of the faulting instruction is set to an illegal value. This solves the problems in the related art that an exception raised while accessing a PCI-E memory address stops the process and that the system hangs in an infinite loop, thereby improving the robustness and survivability of the system.
Claims (10)
- 1. An exception handling method, comprising: detecting whether an exception occurs while accessing a Peripheral Component Interconnect Express (PCI-E) memory space; if so, setting the return value corresponding to the faulting instruction that caused the exception to an illegal value; and using the next address after the faulting instruction as the return address of the current exception.
- 2. The method according to claim 1, wherein detecting whether an exception occurs while accessing the PCI-E memory space comprises: detecting whether the exception was caused, when accessing the PCI-E memory space, by the PCI-E memory space having no corresponding storage entity, and whether the operation address to be accessed by the faulting instruction lies within the address range of the PCI-E memory space; wherein, if the detection result is yes, it is determined that an exception occurred while accessing the PCI-E memory space.
- 3. The method according to claim 2, wherein the operation address is obtained in one of the following ways: when the current exception is determined to be caused by a load-class instruction, obtaining the operation address from the load-class instruction; or reading the operation address from the machine check interrupt status register (MCSR).
- 4. The method according to claim 3, wherein obtaining the operation address from the load-class instruction comprises: obtaining the operation address from the specified position in the load-class instruction that corresponds to the format type of the load-class instruction.
- 5. The method according to claim 1, wherein setting the return value corresponding to the faulting instruction that caused the current exception to an illegal value comprises: writing the illegal value into the data register corresponding to the faulting instruction.
- 6. The method according to any one of claims 1 to 5, wherein the exception includes at least one of the following: an exception caused by a load-class instruction, and a bus read data bus error.
- 7. An exception handling apparatus, comprising: a detecting module, configured to detect whether an exception occurs while accessing a Peripheral Component Interconnect Express (PCI-E) memory space; a setting module, configured to set, when an exception occurs, the return value corresponding to the faulting instruction that caused the exception to an illegal value; and a determining module, configured to use the next address after the faulting instruction as the return address of the current exception.
- 8. The apparatus according to claim 7, wherein the detecting module is configured to detect whether the exception was caused, when accessing the PCI-E memory space, by the PCI-E memory space having no corresponding storage entity, and whether the operation address to be accessed by the faulting instruction lies within the address range of the PCI-E memory space; wherein, if the detection result is yes, it is determined that an exception occurred while accessing the PCI-E memory space.
- 9. The apparatus according to claim 8, wherein the apparatus further comprises an acquiring module configured to acquire the operation address, wherein the acquiring module is further configured to: when the current exception is determined to be caused by a load-class instruction, acquire the operation address from the load-class instruction; or read the operation address from the machine check interrupt status register (MCSR).
- 10. The apparatus according to claim 9, wherein the acquiring module is further configured to acquire the operation address from the specified position in the load-class instruction that corresponds to the format type of the load-class instruction.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510076615.7A CN105988905A (en) | 2015-02-12 | 2015-02-12 | Exception processing method and apparatus |
CN201510076615.7 | 2015-02-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016127600A1 (en) | 2016-08-18 |
Family
ID=56614198
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2015/086164 (WO2016127600A1) | 2015-02-12 | 2015-08-05 | Exception handling method and apparatus |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105988905A (en) |
WO (1) | WO2016127600A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111124726A (en) * | 2019-12-09 | 2020-05-08 | 上海移远通信技术股份有限公司 | Method and device for detecting abnormity of opened modem port |
CN114647546A (en) * | 2022-03-30 | 2022-06-21 | 苏州浪潮智能科技有限公司 | Case abnormity processing method and device, electronic equipment and storage medium |
CN117573418A (en) * | 2024-01-15 | 2024-02-20 | 北京趋动智能科技有限公司 | Processing method, system, medium and equipment for video memory access exception |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825593B (en) * | 2019-11-11 | 2022-08-23 | 腾讯科技(深圳)有限公司 | Method, device and equipment for detecting abnormal state of process and storage medium |
CN113467981A (en) * | 2020-03-31 | 2021-10-01 | 华为技术有限公司 | Exception handling method and device |
CN111682991B (en) * | 2020-05-28 | 2022-08-12 | 杭州迪普科技股份有限公司 | Bus error message processing method and device |
CN116680208B (en) * | 2022-12-16 | 2024-05-28 | 荣耀终端有限公司 | Abnormality recognition method and electronic device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060041863A1 (en) * | 2002-08-05 | 2006-02-23 | Kazunori Saito | Data processing method, date processing device computer program and recording medium |
US20080148016A1 (en) * | 2006-12-13 | 2008-06-19 | Fujitsu Limited | Multiprocessor system for continuing program execution upon detection of abnormality |
CN101625656A (en) * | 2009-07-28 | 2010-01-13 | 杭州华三通信技术有限公司 | Method and device for processing abnormity of PCI system |
CN103309762A (en) * | 2013-06-21 | 2013-09-18 | 杭州华三通信技术有限公司 | Equipment exception handling method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101533370B (en) * | 2009-04-09 | 2011-10-26 | 成都市华为赛门铁克科技有限公司 | Memory abnormal access positioning method and device |
US9430349B2 (en) * | 2013-01-24 | 2016-08-30 | Xcerra Corporation | Scalable test platform in a PCI express environment with direct memory access |
2015
- 2015-02-12: CN application CN201510076615.7A (publication CN105988905A), status: withdrawn
- 2015-08-05: WO application PCT/CN2015/086164 (publication WO2016127600A1), status: application filing
Also Published As
Publication number | Publication date |
---|---|
CN105988905A (en) | 2016-10-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2016127600A1 (en) | | Exception handling method and apparatus |
JP6871957B2 (en) | | Emulated endpoint configuration |
US10853272B2 (en) | | Memory access protection apparatus and methods for memory mapped access between independently operable processors |
CN105095128B (en) | | Interrupt processing method and interrupt controller |
US10678583B2 (en) | | Guest controlled virtual device packet filtering |
US9912474B2 (en) | | Performing telemetry, data gathering, and failure isolation using non-volatile memory |
TWI632462B (en) | | Switching device and method for detecting i2c bus |
US11960350B2 (en) | | System and method for error reporting and handling |
US8286027B2 (en) | | Input/output device including a mechanism for accelerated error handling in multiple processor and multi-function systems |
US10078543B2 (en) | | Correctable error filtering for input/output subsystem |
US8813071B2 (en) | | Storage reclamation systems and methods |
CN105373345B (en) | | Memory device and module |
US9575855B2 (en) | | Storage apparatus and failure location identifying method |
KR101498452B1 (en) | | Debugging complex multi-core and multi-socket systems |
US10157005B2 (en) | | Utilization of non-volatile random access memory for information storage in response to error conditions |
US8402320B2 (en) | | Input/output device including a mechanism for error handling in multiple processor and multi-function systems |
EP3035227A1 (en) | | Method and device for monitoring data integrity in shared memory environment |
EP2951706A1 (en) | | Controlling error propagation due to fault in computing node of a distributed computing system |
US9639076B2 (en) | | Switch device, information processing device, and control method of information processing device |
US8880957B2 (en) | | Facilitating processing in a communications environment using stop signaling |
JP5341198B2 (en) | | Bit inversion in communication interface |
US8589722B2 (en) | | Methods and structure for storing errors for error recovery in a hardware controller |
JP7404223B2 (en) | | System and method for preventing unauthorized memory dump modification |
CN114936135A (en) | | Abnormity detection method and device and readable storage medium |
CN108874579B (en) | | Method for policing and initializing ports |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 15881759; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 15881759; Country of ref document: EP; Kind code of ref document: A1 |