
CN116795195A - Main board system with multiple CPU modules, control method of main board and computing equipment - Google Patents

Main board system with multiple CPU modules, control method of main board and computing equipment

Info

Publication number
CN116795195A
Authority
CN
China
Prior art keywords
cpu
module
cpu module
modules
bmc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310638308.8A
Other languages
Chinese (zh)
Inventor
莘盼龙
李韦霆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XFusion Digital Technologies Co Ltd
Original Assignee
XFusion Digital Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XFusion Digital Technologies Co Ltd
Priority to CN202310638308.8A
Publication of CN116795195A

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The embodiment of the application provides a motherboard system with multiple CPU modules, a motherboard control method and computing equipment, which relate to the field of computers and can improve the stability of the motherboard. The motherboard system comprises: N power supply modules, N CPU modules, a starting control module, a high-speed switch and a BMC. One end of the high-speed switch is connected with any one of the N CPU modules, the control end of the high-speed switch is connected with the BMC, and the other end of the high-speed switch is connected with the starting control module. The N power supply modules respectively supply power to the N CPU modules, and the control ends of the N power supply modules are connected with the BMC. When the CPU module connected with the high-speed switch has an alarm fault, the BMC switches that end of the high-speed switch to any one of the N CPU modules that has no alarm fault. When a CPU module with an alarm fault exists among the N CPU modules, the BMC isolates that CPU module from its corresponding power supply module, so that the CPU module with the alarm fault is powered down.

Description

Main board system with multiple CPU modules, control method of main board and computing equipment
Technical Field
The embodiment of the application relates to the field of computers, in particular to a motherboard system with multiple CPU modules, a motherboard control method and computing equipment.
Background
With the advent of the digital age, computing power has become its core productivity. A common way to increase the computing power of a computing device is to place a plurality of central processing unit (CPU) modules on the motherboard of the computing device, where the plurality of CPU modules comprises one main CPU module and at least one slave CPU module. However, during startup, a motherboard with multiple CPU modules must power on all of the CPU modules on the motherboard at the same time, so when one of the CPU modules fails, the motherboard cannot start normally, which reduces the stability of the motherboard.
Disclosure of Invention
The embodiment of the application provides a motherboard system with multiple CPU modules, a motherboard control method and computing equipment, which can improve the stability of the motherboard.
In order to achieve the above purpose, the embodiment of the application adopts the following technical scheme:
in a first aspect, an embodiment of the present application provides a motherboard system with multiple CPU modules, where the motherboard system includes: n power supply modules, N CPU modules, a starting control module, a high-speed switch and a baseboard management controller BMC, wherein N is an integer greater than or equal to 2; one end of the high-speed switch is connected with any one of the N CPU modules, and the control end of the high-speed switch is connected with the BMC; the other end of the high-speed switch is connected with the starting control module; the N power supply modules are respectively used for supplying power to the N CPU modules, and the control ends of the N power supply modules are connected with the BMC; the BMC is used for: under the condition that the CPU module connected with one end of the high-speed switch has alarm faults, controlling one end of the high-speed switch to be switched to be connected with any one of the N CPU modules without alarm faults; wherein, the CPU module connected with the high-speed switch is a main CPU module; BMC is also used to: under the condition that the CPU module with the alarm fault exists in the N CPU modules, the CPU module with the alarm fault is isolated from the corresponding power module, so that the CPU module with the alarm fault is powered down.
In the motherboard system with multiple CPU modules provided by the embodiment of the application, when the main CPU module connected with the high-speed switch has an alarm fault, the BMC controls the high-speed switch to connect with any one of the N CPU modules that has no alarm fault; and when a CPU module with an alarm fault exists among the N CPU modules, the BMC isolates that CPU module from its corresponding power module so that it is powered down. Therefore, when the motherboard system is restarted, it can start normally as long as the CPU modules with alarm faults are kept powered down and the CPU modules without alarm faults are powered up, which improves the stability of the motherboard with multiple CPU modules.
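The decision described above can be sketched as a small function. This is only an illustration under stated assumptions: CPU modules are indexed 0 to N-1, the function name `plan_recovery` is hypothetical, and choosing the lowest-indexed healthy module is one arbitrary reading of "any" CPU module without an alarm fault.

```python
from typing import List, Optional, Set, Tuple


def plan_recovery(n: int, main: int, faulty: Set[int]) -> Tuple[Optional[int], List[int]]:
    """Return (new_main, modules_to_isolate) for CPU modules indexed 0..n-1.

    new_main is None when every CPU module has an alarm fault.
    """
    healthy = [i for i in range(n) if i not in faulty]
    if main not in faulty:
        new_main = main          # main module is healthy: leave the switch alone
    elif healthy:
        new_main = healthy[0]    # assumption: pick the lowest-indexed healthy module
    else:
        new_main = None          # no healthy module is left to act as main
    # Every module with an alarm fault is isolated from its power module.
    return new_main, sorted(faulty)
```

For example, with four modules, module 0 as main, and faults on modules 0 and 2, the sketch moves the main role to module 1 and isolates modules 0 and 2.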
In a possible implementation manner, the main board system further includes a complex programmable logic device CPLD, and the BMC is connected with the control end of the high-speed switch and the control ends of the N power supply modules through the CPLD; BMC is used for: under the condition that the CPU module connected with one end of the high-speed switch has alarm faults, controlling one end of the high-speed switch to be switched to be connected with any one of the N CPU modules without alarm faults through the CPLD; wherein, the CPU module connected with the high-speed switch is a main CPU module; BMC is also used to: under the condition that the CPU module with the alarm fault exists in the N CPU modules, the CPLD is used for isolating the CPU module with the alarm fault from the corresponding power supply module so as to enable the CPU module with the alarm fault to be powered down.
In a possible implementation manner, the starting control module includes an integrated south bridge PCH and a basic input output system BIOS chip; the PCH is connected with the main CPU module of the N CPU modules through the high-speed switch, a control end of the PCH is connected with the BMC, and the PCH is connected with the BIOS chip. The BMC is further configured to, when the CPU module connected with the high-speed switch has no alarm fault and a CPU module with an alarm fault exists among the remaining CPU modules, control the PCH and the N CPU modules to be powered down, and then control the PCH and the M CPU modules without alarm faults among the N CPU modules to be powered up, where M is an integer greater than or equal to 1 and less than N. The PCH is configured to run the boot program of the BIOS chip after power-on, so as to start the motherboard system based on the M CPU modules without alarm faults.
In this embodiment, when the CPU module connected with the high-speed switch has no alarm fault and a CPU module with an alarm fault exists among the remaining CPU modules, the PCH and the N CPU modules are powered down, and then the PCH and the M CPU modules without alarm faults among the N CPU modules are powered up. The PCH therefore starts the motherboard system based on the M CPU modules without alarm faults, which improves the stability of the motherboard with multiple CPU modules.
In a possible implementation manner, the starting control module further comprises an integrated south bridge PCH and a basic input output system BIOS chip; the PCH is connected with the main CPU module of the N CPU modules through the high-speed switch, and a control end of the PCH is connected with the BMC. The BMC is specifically configured to, when the CPU modules with alarm faults include the CPU module connected with the high-speed switch, control the high-speed switch to switch from being connected with the main CPU module to being connected with a target CPU module, where the target CPU module is one of the N CPU modules that has no alarm fault. The BMC is further specifically configured to control the PCH and the N CPU modules to be powered down, and to control the PCH and the M CPU modules without alarm faults among the N CPU modules to be powered up, where M is an integer greater than or equal to 1 and less than N. The PCH is further configured to run the boot program of the BIOS chip after power-on, so as to start the motherboard system based on the M CPU modules without alarm faults.
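The two implementations above (main CPU module healthy, or main CPU module faulty) differ only in whether the high-speed switch is moved first. A minimal sketch of the resulting order of operations follows; the function name `recovery_actions`, the 0-based module indices, and the action strings are illustrative assumptions, not part of the application.

```python
from typing import List, Set


def recovery_actions(n: int, main: int, faulty: Set[int]) -> List[str]:
    """List the BMC actions, in order, for CPU modules indexed 0..n-1."""
    if not faulty:
        return []                                  # nothing to recover from
    healthy = [i for i in range(n) if i not in faulty]
    if not healthy:
        return ["alarm: no CPU module without an alarm fault"]
    actions = []
    if main in faulty:
        target = healthy[0]                        # assumption: lowest-indexed healthy module
        actions.append(f"switch main {main}->{target}")
    actions.append("power down PCH and all CPU modules")
    actions.append("power up PCH and CPU modules " + ",".join(map(str, healthy)))
    actions.append("PCH runs BIOS boot program")
    return actions
```

Here the healthy list plays the role of the M CPU modules without alarm faults.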
In a possible implementation manner, when $N = 2^{X}$, where X is an integer greater than or equal to 2, the BMC is specifically configured to control the CPU modules in a first combination to be powered on, where the first combination consists of the current main CPU module of the motherboard system and any $2^{X-1} - 1$ slave CPU modules among the N CPU modules.
In a possible implementation manner, the BMC is further configured to, when it determines that the motherboard system cannot be started normally with the CPU modules in the first combination powered on, and the first combination includes at least two CPU modules, control the PCH and the CPU modules in a second combination to be powered on, where the second combination consists of the current main CPU module of the motherboard system and any $2^{X-2} - 1$ slave CPU modules among the N CPU modules.
In a possible implementation manner, the BMC is configured to output first alarm information when the motherboard system is started normally, where the first alarm information indicates that a faulty CPU module exists among those of the N CPU modules that are not in the second combination. The BMC is further configured to output second alarm information when it determines that the motherboard system cannot be started normally with the CPU modules in the second combination powered on and the second combination includes only one CPU module, where the second alarm information indicates that the fault on the motherboard system is not a CPU module fault, or that all N CPU modules are faulty.
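The combination search above can be sketched as follows. This is a hedged illustration: `combos_at_level`, `find_bootable`, and the `try_boot` callback (which stands in for "power on this combination and see whether the motherboard system starts normally") are hypothetical names, modules are indexed from 0, and continuing to halve beyond the second combination for X greater than 2 is an extrapolation that the text does not spell out.

```python
from itertools import combinations
from typing import Callable, Iterator, Optional, Tuple

Combo = Tuple[int, ...]


def combos_at_level(n: int, main: int, level: int) -> Iterator[Combo]:
    """Yield the main module plus any 2**(X - level) - 1 slaves, where n == 2**X.

    level 1 gives the first combination, level 2 the second combination.
    """
    slaves = [i for i in range(n) if i != main]
    for chosen in combinations(slaves, (n >> level) - 1):
        yield (main,) + chosen


def find_bootable(n: int, main: int, try_boot: Callable[[Combo], bool]) -> Optional[Combo]:
    """Return the first combination that boots, or None if none does."""
    if n < 4 or n & (n - 1):
        raise ValueError("n must be 2**X with X >= 2")
    level = 1
    while (n >> level) >= 1:                 # combination size is n >> level
        for combo in combos_at_level(n, main, level):
            if try_boot(combo):
                return combo
        level += 1
    return None
```

A returned combination corresponds to the case where the faulty CPU modules lie outside the powered-on combination; a `None` result corresponds to the case covered by the second alarm information (a non-CPU fault, or every CPU module faulty).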
In one possible implementation manner, one power module of the N power modules is specifically connected to a pin of a memory power supply, a pin of a clock power supply, a pin of an input/output IO power supply, and a pin of a core power supply in a corresponding CPU module of the N CPU modules.
In one possible implementation, the high-speed switch is a one-out-of-many switch.
In a possible implementation manner, the BMC is further configured to obtain, from the PCH, an identification of a CPU module that alarms from among the N CPU modules.
In a second aspect, an embodiment of the present application provides a method for controlling a motherboard with multiple CPU modules, where the method is applied to a motherboard system with multiple CPU modules, and the motherboard system includes: N power supply modules, N CPU modules, a high-speed switch, a starting control module and a baseboard management controller BMC, wherein N is an integer greater than or equal to 2; one end of the high-speed switch is connected with any one of the N CPU modules; the control end of the high-speed switch is connected with the BMC, and the other end of the high-speed switch is connected with the starting control module; the N power supply modules are respectively used for supplying power to the N CPU modules, and the control ends of the N power supply modules are connected with the BMC. The method comprises the following steps: when the CPU modules with alarm faults include the CPU module connected with one end of the high-speed switch, the BMC controls that end of the high-speed switch to switch to any one of the N CPU modules that has no alarm fault, wherein the CPU module connected with the high-speed switch is the main CPU module; the BMC controls the starting control module and the N CPU modules to be powered down, wherein $N = 2^{X}$ and X is an integer greater than or equal to 2; the BMC controls the starting control module and the CPU modules in a first combination to be powered on, wherein the first combination consists of the main CPU module and any $2^{X-1} - 1$ slave CPU modules among the N CPU modules.
In a possible implementation manner, when it is determined that the motherboard system cannot be started normally with the CPU modules in the first combination powered on, the method further includes: the BMC controls the starting control module and the N CPU modules to be powered down; the BMC controls the starting control module and the CPU modules in a second combination to be powered on, wherein the second combination consists of the current main CPU module of the motherboard system and any $2^{X-2} - 1$ slave CPU modules among the N CPU modules.
In a possible implementation manner, when the CPU modules with alarm faults do not include the CPU module connected with one end of the high-speed switch, the method further includes: the BMC controls the starting control module and the N CPU modules to be powered down, wherein $N = 2^{X}$ and X is an integer greater than or equal to 2; the BMC controls the starting control module and the CPU modules in the first combination to be powered on, wherein the first combination consists of the target CPU module and any $2^{X-1} - 1$ slave CPU modules among the N CPU modules.
In a third aspect, an embodiment of the present application provides a BMC, where the BMC includes a control unit, where the control unit is configured to, when a CPU module having an alarm fault includes a CPU module connected to one end of a high-speed switch, control one end of the high-speed switch to be switched to be connected to any one of N CPU modules having no alarm fault; the control unit is also used for controlling the starting control module and the N CPU modules to be powered down and controlling the starting control module and the CPU modules in the first combination to be powered up.
In one possible implementation manner, the control unit is configured to control the start control module and the N CPU modules to be powered down when the motherboard system cannot be started normally under the condition that all the CPU modules in the first combination are powered up; and controlling the starting control module and the CPU module in the second combination to be electrified.
In a possible implementation manner, the control unit is further configured to control the start control module and the N CPU modules to be powered down and control the start control module and the CPU modules in the first combination to be powered up when the CPU module connected to one end of the high-speed switch is not included in the CPU module with the alarm fault.
In a possible implementation manner, the BMC further includes an output unit; the control unit is used for controlling the high-speed switch to be switched from being connected with the current main CPU module to being connected with the target CPU module when the CPU module for alarming faults comprises the main CPU module; the control unit is also used for controlling the PCH and the N CPU modules to be powered down and controlling M CPU modules without alarm faults in the PCH and the N CPU modules to be powered up; the output unit is used for outputting the first alarm information after the main board system is started normally.
In a fourth aspect, an embodiment of the present application provides a BMC for performing the method according to any of the second aspect and its possible implementation manners.
In a fifth aspect, embodiments of the present application provide a computing device comprising the motherboard system of any one of the first aspect and its possible implementation forms.
Drawings
FIG. 1 is a schematic diagram of a motherboard system with multiple CPU modules according to an embodiment of the present application;
FIG. 2 is a second schematic diagram of a motherboard system with multiple CPU modules according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a control method of a motherboard with multiple CPU modules according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a motherboard system with multiple CPU modules according to an embodiment of the present application;
FIG. 5 is a second schematic flowchart of a control method of a motherboard with multiple CPU modules according to an embodiment of the present application;
FIG. 6 is a flowchart of a control method of a motherboard with multiple CPU modules according to an embodiment of the present application;
FIG. 7 is a flowchart of a control method of a motherboard with multiple CPU modules according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a motherboard system with multiple CPU modules according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a BMC according to an embodiment of the present application.
Detailed Description
The term "and/or" herein merely describes an association relationship between associated objects, and means that three relationships may exist. For example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone.
The terms first and second and the like in the description and in the claims of embodiments of the application, are used for distinguishing between different objects and not necessarily for describing a particular sequential order of objects. For example, the first CPU module and the second CPU module, etc. are used to distinguish between different CPU modules, and are not used to describe a particular order of CPU modules.
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the description of the embodiments of the present application, unless otherwise indicated, the meaning of "a plurality" means two or more. For example, the plurality of CPU modules refers to two or more CPU modules.
As is well known, a common way to increase the computing power of a computing device is to set up multiple CPU modules on a motherboard in the computing device; wherein the plurality of CPU modules comprises a main CPU module and at least one slave CPU module, and a south bridge chip (platform controller hub, PCH) in a main board of the plurality of CPU modules is connected with the main CPU module through a direct media interface (direct media interface, DMI) bus.
During startup of the system on a motherboard with multiple CPU modules (referred to as the motherboard system of the multi-CPU module for short), the PCH obtains the BIOS boot program from a basic input output system (BIOS) chip and runs it. While running the boot program, the PCH can complete the self-check of the plurality of CPU modules through the DMI bus between the PCH and the main CPU module, and then boots the operating system to complete startup of the motherboard system of the multi-CPU module.
However, the motherboard powers the plurality of CPU modules up or down simultaneously. When a faulty CPU module exists among the CPU modules on the motherboard during startup, the motherboard therefore powers down all of the CPU modules at once, and startup of the motherboard system of the multi-CPU module fails. The motherboard system can start normally only after the faulty CPU module has been replaced. As a result, a fault in any one CPU module prevents the service system on the motherboard from running, which reduces the stability of the motherboard with multiple CPU modules.
Based on this, the embodiment of the application provides a motherboard system with multiple CPU modules, in which a baseboard management controller (BMC) controls the high-speed switch to connect with any one of the N CPU modules that has no alarm fault when the main CPU module connected with the high-speed switch has an alarm fault; and when a CPU module with an alarm fault exists among the N CPU modules, the BMC isolates that CPU module from its corresponding power module so that it is powered down. Therefore, when the motherboard system of the multi-CPU module is restarted, it can start normally as long as the CPU modules with alarm faults are kept powered down and the CPU modules without alarm faults are powered up, which improves the stability of the motherboard with multiple CPU modules to a certain extent.
The embodiment of the application provides a mainboard system of a plurality of CPU modules, which comprises: the system comprises N power supply modules, N CPU modules, a high-speed switch, a BMC and a starting control module, wherein the starting control module comprises an integrated south bridge chip PCH and a basic input output system BIOS chip; wherein N is an integer of 2 or more. The embodiment of the present application is illustrated by taking a value of N of 4 as an example, and the motherboard system is specifically shown in fig. 1.
The power supply unit outside the main board system supplies power to the 4 CPU modules (such as a first CPU module (main CPU module), a second CPU module, a third CPU module and a fourth CPU module) through the 4 power modules (such as a first power module, a second power module, a third power module and a fourth power module) on the main board, that is, the 4 power modules in the main board system are respectively used for supplying power to the 4 CPU modules; the method specifically comprises the following steps:
the input end of each power module in the 4 power modules is connected with a power supply unit outside the main board system, and the output end of each power module is connected with a corresponding CPU module. That is, the input end of the first power module is connected with the power supply unit, and the output end of the first power module is connected with the first CPU module, so that the power supply unit supplies power to the first CPU module through the first power module. The input end of the second power supply module is connected with the power supply unit, and the output end of the second power supply module is connected with the second CPU module, so that the power supply unit supplies power to the second CPU module through the second power supply module; the input end of the third power supply module is connected with the power supply unit, and the output end of the third power supply module is connected with the third CPU module, so that the power supply unit supplies power for the third CPU module through the third power supply module. The input end of the fourth power supply module is connected with the power supply unit, and the output end of the fourth power supply module is connected with the fourth CPU module, so that the power supply unit supplies power for the fourth CPU module through the fourth power supply module. The power supply unit may be a power supply (power supply unit, PSU).
It should be understood that each CPU module in FIG. 1 comprises a CPU chip and its corresponding peripheral circuits, such as a memory and an input/output circuit.
It should be noted that any one of the 4 CPU modules includes: pins of a memory (i.e., cache) power supply, pins of a clock power supply, pins of an Input Output (IO) power supply and pins of a core power supply in the CPU module. The pins of the memory power supply provide working voltage for the memories in the CPU module; the pins of the clock power supply are pins for providing working voltage for the clock modules in the CPU module; the pins of the IO power supply are pins for providing working voltage for the IO modules in the CPU module; the core power supply pin is a pin for providing working voltage for the processing module in the CPU module.
It should be understood that, in the process of powering on the above CPU module (e.g., the first CPU module), the pins of the 4 power supplies in the first CPU module need to be powered on sequentially according to a specified time sequence, where the specified time sequence is a pin of the memory power supply, a pin of the clock power supply, a pin of the IO power supply, and a pin of the core power supply sequentially.
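The specified power-on order can be written down as a short sketch. The rail names and the `enable_rail` callback are illustrative assumptions standing in for whatever mechanism actually enables each supply pin.

```python
from typing import Callable, List

# Order stated in the text: memory, then clock, then IO, then core.
POWER_UP_ORDER = ("memory", "clock", "io", "core")


def power_up(enable_rail: Callable[[str], None]) -> List[str]:
    """Enable the four supplies of one CPU module in the specified order."""
    enabled = []
    for rail in POWER_UP_ORDER:
        enable_rail(rail)        # hypothetical hook that turns on one supply
        enabled.append(rail)
    return enabled
```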
Of the 4 power supply modules, a first power supply module is taken as an example for explanation; the first power module includes: sub power supply module 1 to sub power supply module 4; the sub power module 1 is connected with a pin of a memory power supply in the first CPU module, and is configured to output a working voltage applicable to a memory in the first CPU module by performing a step-down process or a step-up process on a voltage input by the power supply unit. The sub power module 2 is connected with a pin of a clock power supply in the first CPU module and is used for outputting working voltage applicable to the clock module in the first CPU module by means of step-down processing or step-up processing on voltage input by the power supply unit. The sub power module 3 is connected with a pin of an IO power supply in the first CPU module and is used for outputting working voltage applicable to the IO module in the first CPU module by means of step-down processing or step-up processing on voltage input by the power supply unit. The sub power module 4 is connected with a core power pin in the first CPU module and is used for outputting working voltage applicable to the processing module in the first CPU module by means of step-down processing or step-up processing on voltage input by the power supply unit. Namely: the sub power supply module 1 in the first power supply module is a memory power supply in the first CPU module, the sub power supply module 2 in the first power supply module is a clock power supply in the first CPU module, the sub power supply module 3 in the first power supply module is an IO power supply in the first CPU module, and the sub power supply module 4 in the first power supply module is a core power supply in the first CPU module. The sub power modules 1 to 4 may be voltage regulation modules (voltage regulator module, VRM).
It should be noted that, each of the above-mentioned sub power modules has a built-in switch, and the built-in switch is used for controlling the on and off of the sub power module where the built-in switch is located; when the built-in switch in the sub power supply module is closed, the sub power supply module is in a conducting state and is used for outputting working voltage to the corresponding module; when the built-in switch in the sub power supply module is disconnected, the sub power supply module is in a disconnected state, and at the moment, the sub power supply module does not output working voltage to the corresponding module. Then, the first power module controls whether the first CPU module is electrified or not by controlling the on state and the off state of a plurality of sub power modules included in the first power module; that is, when the first power module is in an on state (i.e., a plurality of sub power modules included in the first power module are in an on state), the first CPU module is powered on; when the first power module is in an isolated state (namely, at least one sub power module included in the first power module is in an off state), the first CPU module is powered down.
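The relationship between the built-in switches and the state of a power module can be modeled as below. This is a sketch under assumptions: the class name `PowerModule` and the rail names are hypothetical, and a closed switch is represented by `True`.

```python
from dataclasses import dataclass, field
from typing import Dict

RAILS = ("memory", "clock", "io", "core")


@dataclass
class PowerModule:
    # One built-in switch per sub power module; True means closed (conducting).
    switches: Dict[str, bool] = field(
        default_factory=lambda: {rail: False for rail in RAILS}
    )

    def is_on(self) -> bool:
        """On state: every sub power module is conducting."""
        return all(self.switches.values())

    def is_isolated(self) -> bool:
        """Isolated state: at least one sub power module is off."""
        return not self.is_on()
```

Under this model the CPU module is powered on exactly when `is_on()` holds, and opening any single switch is enough to power it down.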
The control ends of the 4 power supply modules are connected with the BMC; namely: the control end of the first power supply module, the control end of the second power supply module, the control end of the third power supply module and the control end of the fourth power supply module are all connected with the BMC, so that the BMC is used for controlling one or more power supply modules in the 4 power supply modules to be in an on state or an isolated state, and specifically, the BMC can respectively control the 4 power supply modules to be in the on state or the isolated state through 4 independent signals.
The BMC is used for controlling any one or more power modules in the 4 power modules to be in an on state or an isolation state so as to enable any one or more CPU modules to be electrified or powered down; when one power module is in an isolated state, the CPU module corresponding to the power module (connected with the power module) is in a power-down state; when the power supply module is in an on state, the CPU module corresponding to the power supply module is in a power-on state. The specific implementation process comprises the following steps: when the BMC sends 0 to the first power supply module, the first power supply module adjusts the state to be in an isolated state, so that the CPU module connected with the first power supply module is powered down; when the BMC sends 1 to the first power module, the first power module adjusts the state to be in an on state, so that the CPU module connected with the first power module is electrified.
Then, when a CPU module with an alarm fault exists among the 4 CPU modules, the BMC isolates that CPU module from its corresponding power module (that is, the BMC sets the power module connected with that CPU module to the isolated state, so that it stops supplying power to the CPU module), and the CPU module with the alarm fault is powered down. Here, a CPU module with an alarm fault refers to a CPU module for which a fault alarm has been raised.
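The independent control signals described above (0 for the isolated state, 1 for the on state) can be sketched as a simple mapping from the set of faulty CPU modules to one signal per power module. The function name `enable_signals` and the 0-based indexing (index 0 for the first power module) are assumptions made for illustration.

```python
from typing import List, Set

ISOLATE, ON = 0, 1   # signal values taken from the text


def enable_signals(n: int, faulty: Set[int]) -> List[int]:
    """Return one control signal per power module, in module order."""
    return [ISOLATE if i in faulty else ON for i in range(n)]
```

With four modules and an alarm fault on the third CPU module (index 2), the signals are 1, 1, 0, 1.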
One end of the high-speed switch is connected with any one of the 4 CPU modules, and the CPU module connected with the high-speed switch is the main CPU module of the motherboard system. The high-speed switch is a one-out-of-many switch.
It should be noted that the high-speed switch only establishes a connection relationship with one of the 4 CPU modules, where the CPU module connected by the high-speed switch is a master CPU module of the 4 CPU modules, and the remaining 3 CPU modules are slave CPU modules.
Illustratively, when the high-speed switch is connected with the first CPU module, the first CPU module is the main CPU module of the 4 CPU modules; when the high-speed switch is connected with the second CPU module, the second CPU module is the main CPU module in the 4 CPU modules.
The BMC is connected with the control end of the high-speed switch and is used for controlling the high-speed switch to be connected with a specific one of the 4 CPU modules.
The BMC is used for switching the high-speed switch from the CPU module it is currently connected with to one of the other CPU modules among the 4 CPU modules; that is, the BMC can switch the main CPU module among the 4 CPU modules by controlling the high-speed switch. For example, by controlling the connection relation between the high-speed switch and the 4 CPU modules, the BMC can switch the current main CPU module from the first CPU module to any one of the second CPU module, the third CPU module and the fourth CPU module.
A specific implementation of the BMC controlling the connection relation between the high-speed switch and the 4 CPU modules is as follows: when the BMC sends 00 to the high-speed switch, the high-speed switch selects the connection with the first CPU module; when the BMC sends 01 to the high-speed switch, the high-speed switch selects the connection with the second CPU module; when the BMC sends 10 to the high-speed switch, the high-speed switch selects the connection with the third CPU module; when the BMC sends 11 to the high-speed switch, the high-speed switch selects the connection with the fourth CPU module.
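The 2-bit selection codes listed above can be captured in a small lookup, shown here as an illustrative sketch only. The codes 00, 01, 10 and 11 come from the text; the function names and the idea of deriving a code from the module number are assumptions made for illustration.

```python
# Select codes as listed in the text: code -> CPU module number.
SELECT_CODES = {"00": 1, "01": 2, "10": 3, "11": 4}


def selected_cpu_module(code):
    """Return the CPU module the high-speed switch connects to for a code."""
    try:
        return SELECT_CODES[code]
    except KeyError:
        raise ValueError(f"unknown select code: {code!r}") from None


def code_for_cpu_module(module):
    """Return the 2-bit code that selects the given CPU module (1 to 4)."""
    if module not in SELECT_CODES.values():
        raise ValueError(f"CPU module number must be 1 to 4, got {module!r}")
    return format(module - 1, "02b")


print(selected_cpu_module("10"))  # 3
print(code_for_cpu_module(4))     # 11
```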
The other end of the high-speed switch is connected with the starting control module, so that the starting control module is connected to the main CPU module through the high-speed switch, and after the CPU module is electrified, a boot program of a BIOS chip in the starting control module is operated, and starting of the main board system is realized.
It should be noted that the 4 CPU modules comprise one main CPU module (e.g., the first CPU module) and 3 slave CPU modules (e.g., the second CPU module, the third CPU module and the fourth CPU module). In the starting process of the main board system of the multi-CPU module, the starting control module starts the main CPU module, and the main CPU module guides the 3 slave CPU modules to complete self-checking and guides the 3 slave CPU modules and the main CPU module itself to run an operating system. Therefore, when the BMC controls the power module between the main CPU module and the power supply unit to be in an isolated state, the main CPU module cannot guide the 3 slave CPU modules to self-check and run the operating system, which causes the main board system to fail to start.
Then, under the condition that the CPU module connected with one end of the high-speed switch has alarm faults, the BMC controls one end of the high-speed switch to be switched to be connected with any one of the 4 CPU modules without alarm faults; so that the main CPU module in the main board system is a CPU module without alarm fault.
In one example, the BMC is connected to the control end of the high-speed switch and the control ends of the 4 power modules through a complex programmable logic device (CPLD). In this case, the BMC controls the power modules to be in an on state or an isolated state through the CPLD, and the BMC controls the connection relationship between the high-speed switch and the 4 CPU modules through the CPLD. The embodiment of the application does not specifically limit how the BMC controls the power modules and the high-speed switch through the CPLD.
The embodiment of the application provides a mainboard system with multiple CPU modules, which controls a high-speed switch to be connected with any CPU module without alarm fault in N CPU modules under the condition that the CPU modules (main CPU modules) connected with the high-speed switch generate alarm fault through a BMC; and under the condition that the CPU module with the alarm fault exists in the N CPU modules, the BMC isolates the CPU module with the alarm fault from the corresponding power module so as to enable the CPU module with the alarm fault to be powered down. Therefore, in the restarting process of the main board system of the multi-CPU module, the main board system can be started normally only by controlling the CPU module with the alarm fault to be in a power-down state and controlling the CPU module without the alarm fault to be in a power-up state, so that the stability of the main board of the multi-CPU module is improved to a certain extent.
In one embodiment, based on the motherboard system shown in fig. 1 and as shown in fig. 2, the start control module includes a PCH and a BIOS chip.
The PCH is connected with a main CPU module in the 4 CPU modules through the high-speed switch, and is connected with the BIOS chip and used for acquiring and executing a boot program (simply called BIOS boot program) in the BIOS chip so as to realize the starting of the main board system; the control end of the PCH is connected with the BMC; the BMC is used for controlling the power-on and power-off of the PCH, for example, when the BMC sends a '1' to the PCH, the PCH performs power-on operation; when the BMC sends a "0" to the PCH, the PCH performs a power down action.
The BMC may specifically be coupled to the PCH based on an enhanced serial peripheral interface (enhanced serial peripheral interface, ESPI) bus; the BMC may also be coupled to the PCH based on a Low Pin Count (LPC) bus.
It should be understood that the PCH runs the boot program of the BIOS after the PCH is powered on. While running the boot program, the PCH first guides the main CPU module connected with it to complete self-checking, and the main CPU module guides the other three CPU modules to complete self-checking; then, through the main CPU module, the PCH guides the main CPU module itself and the other three CPU modules to start and run the operating system, thus completing the start of the main board system of the multi-CPU module. The PCH is connected with the main CPU module based on a direct media interface (DMI).
The BMC is also used for acquiring information of the CPU module for alarming faults from the PCH.
It should be noted that, in the running or starting process of the above motherboard system, if an uncorrectable error (UCE) occurs in one or more of the 4 CPU modules in the motherboard system, the one or more CPU modules report the UCE fault information to the PCH through the main CPU module in the form of an alarm, and the BMC then acquires the UCE fault information from the PCH. The UCE fault information includes the identifier of the CPU module reporting the fault (namely, the CPU module with the alarm fault).
In the mainboard system of the multiple CPU modules provided by the embodiment of the application, the PCH running the BIOS boot program is connected with one CPU module of the N CPU modules through the high-speed switch, wherein the CPU module connected with the PCH is the main CPU module of the N CPU modules; the control end of the high-speed switch is connected with the BMC, the BMC controls the high-speed switch to be connected with a specific CPU module among the N CPU modules, namely, the BMC determines a main CPU module among the N CPU modules; therefore, when the CPU module with alarm fault exists in the N CPU modules and the CPU module with alarm fault comprises a main CPU module, the BMC can switch the main CPU module from the CPU module with alarm fault (such as a first CPU module) to the CPU module without alarm fault (such as a second CPU module) by controlling the high-speed switch; then, the BMC can normally start the main board system by controlling the CPU module with the alarm fault to be in a power-down state and controlling other CPU modules except the CPU module with the alarm fault to be in a power-up state, so that the stability of the main board with the multiple CPU modules is improved.
Based on the above motherboard system, the embodiment of the application provides two control methods of the motherboard with multiple CPU modules, which are described below as the first scheme and the second scheme.
Scheme one
The embodiment of the application provides a control method of a motherboard of a multi-CPU module, which is shown in FIG. 3 and comprises the following steps: S110-S150.
S110, the BMC determines whether the CPU module with the alarm fault comprises a main CPU module.
When the BMC determines that the main CPU module is included in the CPU module for alarming failure, S120 is executed.
And when the BMC determines that the main CPU module is not included in the CPU module with the alarm fault, executing S130.
It should be understood that, in the running or starting process of the above-mentioned motherboard system, if an uncorrectable fault (i.e., UCE) occurs in one or more CPU modules in the N CPU modules in the motherboard system, the faulty CPU module will report the fault information to the PCH through the main CPU module on the motherboard system in an alarm manner; the BMC obtains the fault information from the PCH. Similarly, in the running or starting process of the main board system, if one or more CPU modules in the N CPU modules in the main board system have circuit faults (for example, the voltage of the core power supply pin of a certain CPU module is lower than a threshold value), the faulty CPU module can send fault information to the BMC in an alarm mode; the fault information obtained by the BMC in the two modes includes an identifier of a CPU module sending the fault information (abbreviated as a fault-alarming CPU module), that is, the BMC may determine the fault-alarming CPU module in the N CPU modules.
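The two reporting paths described above both yield the identifier of the reporting CPU module, so the BMC can merge them into one set. The following is a minimal sketch of that merge; the FaultReport record, its field names and the path labels are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultReport:
    cpu_id: int  # identifier of the CPU module sending the fault information
    kind: str    # e.g. "UCE" or "circuit" (labels chosen for this sketch)
    path: str    # e.g. "via PCH" or "direct to BMC" (labels chosen for this sketch)


def alarming_cpu_modules(reports):
    """Collect the identifiers of all CPU modules that reported a fault."""
    return {report.cpu_id for report in reports}


reports = [
    FaultReport(2, "UCE", "via PCH"),
    FaultReport(2, "circuit", "direct to BMC"),
    FaultReport(4, "circuit", "direct to BMC"),
]
print(sorted(alarming_cpu_modules(reports)))  # [2, 4]
```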
The main CPU module is a CPU module connected with one end of the high-speed switch; that is, the main CPU module is the main CPU module of the N CPU modules in the current motherboard system.
It should be noted that, the identifier of the main CPU module in the N CPU modules is obtained by the BMC module from the PCH, and a specific obtaining manner refers to the related art and is not described herein.
The specific implementation of S110 includes: the BMC judges whether the set of identifiers of the CPU modules with alarm faults includes the identifier of the main CPU module. When the set includes the identifier of the main CPU module, the BMC determines that the CPU modules with alarm faults include the main CPU module. When the set does not include the identifier of the main CPU module, the BMC determines that the CPU modules with alarm faults do not include the main CPU module, that is, the CPU modules with alarm faults are some or all of the slave CPU modules on the main board.
S120, the BMC controls the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module, wherein the target CPU module is one CPU module which does not have alarm faults in the N CPU modules.
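The S110 decision reduces to a set-membership test. A minimal sketch follows, assuming CPU modules are identified by integers; the function name and the returned step labels are illustrative.

```python
def next_step_after_s110(alarm_ids, main_id):
    """Return "S120" if the main CPU module has an alarm fault, else "S130"."""
    return "S120" if main_id in set(alarm_ids) else "S130"


print(next_step_after_s110({1, 2}, main_id=1))  # S120
print(next_step_after_s110({2, 4}, main_id=1))  # S130
```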
It should be noted that, the current main CPU module is the CPU module connected to the high-speed switch before the S120 switch, that is, the current main CPU module is the main CPU module in the motherboard system before the S120 switch.
The target CPU module is any one of the N CPU modules without alarm faults. For example, in the motherboard system shown in fig. 2, the first CPU module is a main CPU module; the CPU module for alarming fault obtained by the BMC is assumed to comprise: a first CPU module and a second CPU module; at this time, the target CPU module may be any one of the third CPU module and the fourth CPU module.
A specific implementation of the BMC controlling the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module is as follows, taking the motherboard system shown in fig. 2 as an example. When the CPU modules with alarm faults include the first CPU module and the second CPU module, and the first CPU module is the main CPU module, the BMC sends a main-CPU-module switching command to the high-speed switch, and the switching command includes the identifier of the target CPU module, for example, the identifier 10 of the third CPU module. In response to the switching command, the high-speed switch disconnects from the first CPU module and connects with the third CPU module, so that the third CPU module becomes the main CPU module in the main board system.
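The switching step of S120 can be sketched as follows. The text allows any CPU module without an alarm fault to be the target; picking the lowest-numbered one is an assumption of this sketch, and the HighSpeedSwitch class is a hypothetical stand-in for the hardware.

```python
class HighSpeedSwitch:
    """Hypothetical model of the switch: remembers the connected CPU module."""

    def __init__(self, connected):
        self.connected = connected

    def handle_switch_command(self, target_code):
        # The command carries the target identifier as a bit string, e.g. "10".
        self.connected = int(target_code, 2) + 1


def switch_main_module(switch, alarm_ids, n=4):
    """Point the switch at a CPU module without an alarm fault and return it."""
    candidates = [m for m in range(1, n + 1) if m not in alarm_ids]
    if not candidates:
        raise RuntimeError("no CPU module without an alarm fault is available")
    target = candidates[0]  # assumption: take the lowest-numbered candidate
    switch.handle_switch_command(format(target - 1, "02b"))
    return target


switch = HighSpeedSwitch(connected=1)
print(switch_main_module(switch, alarm_ids={1, 2}))  # 3
print(switch.connected)                              # 3
```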
S130, the BMC controls PCH and N CPU modules to be powered down.
The specific implementation of S130 includes: the BMC issues a power-down instruction to the PCH and the N power modules, so that the PCH is powered down through the control end of the PCH, and the N power modules (such as the first to fourth power supply modules in fig. 1 or fig. 2) between the N CPU modules and the power supply unit are isolated, so that the N CPU modules (such as the first to fourth CPU modules in fig. 1 or fig. 2) are powered down.
After the execution of S130, the BMC downgrades the motherboard system, so that the CPU module in the power-on state in the downgraded motherboard system does not include the CPU module with the alarm fault, which is specifically referred to S140 below.
S140, the BMC controls the PCH and M CPU modules without alarm faults in the N CPU modules to be powered on.
M is an integer greater than or equal to 1 and less than N.
The M CPU modules include the current main CPU module (such as a third CPU module) of the main board system, and the M CPU modules do not include the CPU module with alarm fault, that is, any one of the M CPU modules is the CPU module without alarm fault.
It can be understood that, when the BMC does not execute the S120, the current main CPU module of the motherboard system is the main CPU module (e.g., the first CPU module) in the motherboard system before the switch in the S120; when the BMC executes the step S120, the current main CPU module of the main board system is the target CPU module (e.g. the third CPU module).
In the process of powering on M CPU modules without alarm faults in PCH and N CPU modules, other (namely N-M) CPU modules except the M CPU modules without alarm faults are in a power-down state, and the other CPU modules comprise the CPU modules with alarm faults; that is, the BMC isolates N-M CPU modules with alarm faults included in the motherboard system, so that the motherboard system is degraded to a motherboard system of M paths of CPU modules.
It should be noted that, after the PCH is powered on, the boot program in the BIOS chip is automatically run to start the motherboard system based on the M CPU modules without alarm faults.
For example, when the multi-CPU-module motherboard system is the motherboard system shown in fig. 2, assume that the first CPU module and the second CPU module are the CPU modules with alarm faults and that the third CPU module is the main CPU module among the 4 CPU modules. The BMC then controls the PCH and the 4 CPU modules in the main board system to be powered down. When the value of M is 2, the BMC controls the PCH, the third CPU module and the fourth CPU module to be powered on, and the PCH then starts the motherboard system based on the third CPU module and the fourth CPU module. When the value of M is 1, the BMC controls the PCH and the third CPU module to be powered on, and the PCH then starts the motherboard system based on the third CPU module.
S150, after the mainboard system is started normally, the BMC outputs first alarm information.
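The choice of which M CPU modules to power on in S140 can be sketched as below. The constraints (M at least 1 and less than N, the current main CPU module included, no CPU module with an alarm fault included) come from the text; the order in which the remaining modules are picked is an assumption of this sketch.

```python
def degraded_power_up(n, alarm_ids, main_id, m):
    """Return the M CPU modules to power on: the main module first, then others."""
    healthy = [i for i in range(1, n + 1) if i not in alarm_ids]
    if main_id not in healthy:
        raise ValueError("the current main CPU module must not have an alarm fault")
    if not 1 <= m < n or m > len(healthy):
        raise ValueError("M must satisfy 1 <= M < N and not exceed the healthy count")
    others = [i for i in healthy if i != main_id]
    return [main_id] + others[: m - 1]


# Example from the text: modules 1 and 2 have alarm faults, module 3 is main.
print(degraded_power_up(4, {1, 2}, main_id=3, m=2))  # [3, 4]
print(degraded_power_up(4, {1, 2}, main_id=3, m=1))  # [3]
```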
The first alarm information is used for indicating that the CPU module with actual faults exists in the CPU modules except the M CPU modules which are electrified in the N CPU modules.
It should be noted that after S150 is performed, the present control method ends.
In the multi-CPU module main board system provided by the embodiment of the application, when the CPU modules with alarm faults include the main CPU module, the BMC is specifically used for controlling the high-speed switch to switch the current main CPU module (such as the first CPU module) of the main board system to one of the CPU modules without alarm faults, namely the target CPU module; then, the BMC restarts the PCH and M of the CPU modules without alarm faults, and the remaining N-M CPU modules are in a power-down state during the restart; finally, the PCH starts the main board system based on the M powered-on CPU modules without alarm faults. When the CPU modules with alarm faults do not include the main CPU module, the BMC is specifically used for restarting the PCH and M of the CPU modules without alarm faults, with the remaining N-M CPU modules in a power-down state during the restart; finally, the PCH starts the main board system based on the M powered-on CPU modules without alarm faults. This solves the problem that the mainboard system cannot be started normally because a CPU module with an alarm fault exists among the N CPU modules, and therefore improves the stability of the mainboard.
It should be noted that the N CPU modules are strongly correlated. For example, the first CPU module may invoke the computing power of the second CPU module to execute a task; when the second CPU module fails, the CPU modules with alarm faults obtained by the BMC may include not only the second CPU module but also the first CPU module, which is then erroneously identified as a failed CPU module. It can be seen that treating every CPU module that has reported an alarm fault as an actually failed CPU module is not necessarily accurate; that is, the CPU modules with alarm faults obtained by the BMC are only the predicted actually failed CPU modules in the main board system. As a result, the main board system may still fail to start normally when the CPU modules with alarm faults are controlled to be in a power-down state and the other CPU modules are controlled to be in a power-on state. Therefore, the embodiment of the application also provides another control method of the main board of the multi-CPU module, which is described in the second scheme below.
It should be noted that, in the second scheme, for convenience of description, the CPU module actually failed is simply referred to as a failed CPU module, and will not be described in detail later.
Scheme two
The embodiment of the application provides a control method of a main board of a multi-CPU module, which is illustrated by 3 examples as follows:
In a first example, the above-mentioned motherboard system includes a first CPU module and a second CPU module as shown in fig. 4, that is, the motherboard system is a motherboard system of a 2-way CPU module. Based on this motherboard system, the embodiment of the present application provides a method for controlling a motherboard of a 2-way CPU module which, as shown in fig. 5, includes S210-S250.
S210, the BMC determines whether the CPU module with the alarm fault comprises a main CPU module.
When the main CPU module is included in the malfunction alerting CPU module, the BMC performs S220.
When the main CPU module is not included in the malfunction alerting CPU module, the BMC performs S230.
It should be noted that, the implementation manner of S210 is consistent with that of S110, and specific descriptions of S210 may refer to the related descriptions of S110, which are not repeated herein.
S220, the BMC controls the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module, wherein the target CPU module is the CPU module which does not have an alarm fault among the 2 CPU modules.
It should be noted that, the CPU module connected to the high-speed switch is a main CPU module in the motherboard system.
The target CPU module is one of the CPU modules without alarm fault. For example, in the motherboard system shown in fig. 4, assume that the first CPU module is the main CPU module and that the CPU modules with alarm faults obtained by the BMC include the first CPU module; the target CPU module is then the second CPU module.
It should be noted that, the implementation of S220 is similar to the implementation of S120, and specific descriptions of S220 may refer to the related descriptions of S120, which are not repeated herein.
S230, the BMC controls the PCH, the first CPU module and the second CPU module to be powered down.
It should be noted that, the implementation of S230 is similar to the implementation of S130, and specific descriptions of S230 may refer to the related descriptions of S130, which are not repeated herein.
S240, the BMC controls PCH and the CPU module without alarm fault to power on.
It should be noted that, when the BMC has executed S220, the CPU module without alarm fault is the former slave CPU module, for example, the second CPU module; when the BMC has not executed S220, the CPU module without alarm fault is the main CPU module, for example, the first CPU module. In both cases, the CPU module without alarm fault is the current main CPU module of the main board system.
It should be noted that, after the PCH is powered on, the boot program in the BIOS chip is automatically run to start the motherboard system based on the current powered-on CPU module without alarm fault in the motherboard system.
For example, assume that the motherboard system includes 2 CPU modules as shown in fig. 4, namely a first CPU module and a second CPU module, where the first CPU module is the main CPU module and the second CPU module is the slave CPU module. When the CPU modules with alarm faults include the first CPU module, the BMC switches the main CPU module of the main board system from the first CPU module to the target CPU module, namely the second CPU module; then, with the first CPU module powered down, the BMC powers the PCH and the second CPU module back on, so that the main board system is started based on the second CPU module. When the CPU modules with alarm faults do not include the first CPU module, the BMC powers the PCH and the first CPU module back on with the second CPU module powered down, so that the main board system is started based on the first CPU module.
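The 2-way case of S210-S240 can be summarized in a short sketch. It assumes the two CPU modules are numbered 1 and 2; the function name and the error raised when both modules report alarm faults are illustrative choices, not part of the application.

```python
def two_way_recovery(alarm_ids, main_id=1):
    """Return (current main module, modules powered on in S240) for a 2-way board."""
    if main_id not in (1, 2):
        raise ValueError("main_id must be 1 or 2 on a 2-way board")
    alarm_ids = set(alarm_ids)
    if main_id in alarm_ids:                    # S210 leads to S220
        target = 2 if main_id == 1 else 1
        if target in alarm_ids:
            raise RuntimeError("no CPU module without an alarm fault is available")
        main_id = target                        # the switch now points at the target
    # S230 powers everything down; S240 powers on only the current main module.
    return main_id, [main_id]


print(two_way_recovery({1}))  # (2, [2])
print(two_way_recovery({2}))  # (1, [1])
```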
S250, the BMC judges whether the mainboard system is started normally or not.
When the PCH is based on the CPU module without alarm fault to normally start the main board system, the BMC outputs the first alarm information and ends the current control method.
It should be noted that, the first alarm information is used to indicate that a CPU module with an actual fault (referred to as a fault CPU module for short) exists in other CPU modules except a powered CPU module (i.e., a fault-free CPU module) in the plurality of CPU modules in the motherboard system, so that a user determines the fault CPU module from the other CPU modules and performs fault processing on the fault CPU module; the user is not required to determine the fault CPU module from a plurality of CPU modules in the main board system, so that the efficiency of determining the fault CPU module is improved, and the efficiency of carrying out fault processing on the fault CPU module is further improved.
When the PCH cannot normally start the mainboard system based on the CPU module without the alarm fault, the BMC outputs the second alarm information and ends the current control method.
The second alarm information is used for indicating that the fault occurring on the main board system is a fault of a component other than the CPU modules (a non-CPU-module fault for short), or that all of the CPU modules included in the main board system are fault CPU modules.
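The outcome of S250 can be sketched as a function of the boot result. The returned strings and the list of suspect modules are an illustrative encoding; the application does not specify the format of the alarm information.

```python
def s250_alarm(boot_ok, powered, n=2):
    """Return the alarm to output and the CPU modules it points at."""
    if boot_ok:
        # First alarm: a fault CPU module exists among the modules left powered down.
        suspects = sorted(set(range(1, n + 1)) - set(powered))
        return "first alarm information", suspects
    # Second alarm: non-CPU-module fault, or every CPU module is a fault CPU module.
    return "second alarm information", []


print(s250_alarm(True, powered=[2]))   # ('first alarm information', [1])
print(s250_alarm(False, powered=[2]))  # ('second alarm information', [])
```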
In a second example, as shown in fig. 2, the above-mentioned motherboard system includes 4 CPU modules, which are a first CPU module, a second CPU module, a third CPU module, and a fourth CPU module, respectively, that is, the motherboard system is a motherboard system of the 4-way CPU module; based on the motherboard system, the embodiment of the application provides a control method of a motherboard of a 4-way CPU module, which comprises S310-S390 as shown in FIG. 6.
S310, the BMC determines whether the CPU module with the alarm fault comprises a main CPU module.
When the main CPU module is included in the malfunction alerting CPU module, the BMC performs S320.
When the main CPU module is not included in the malfunction alerting CPU module, the BMC performs S330.
It should be noted that, the implementation manner of S310 is consistent with that of S110, and specific descriptions of S310 may refer to the related descriptions of S110, which are not repeated herein.
S320, the BMC controls the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module, wherein the target CPU module is one CPU module which does not have an alarm fault among the 4 CPU modules.
It should be noted that, the CPU module connected with the high-speed switch is a main CPU module in the motherboard system; the target CPU module is one of the CPU modules without alarm fault.
It should be noted that, the implementation of S320 is similar to the implementation of S120, and specific descriptions of S320 may refer to the related descriptions of S120, which are not repeated herein.
S330, the BMC controls the PCH, the first CPU module, the second CPU module, the third CPU module and the fourth CPU module to be powered down.
It should be noted that, the implementation of S330 is similar to the implementation of S130, and specific descriptions of S330 may refer to the related descriptions of S130, which are not repeated herein.
S340, the BMC controls the PCH and the CPU modules in the first combination to be powered on.
The first combination is a combination of the current main CPU module and any one of the 3 slave CPU modules in the main board system. When the BMC has executed S320, the main CPU module is the target CPU module (e.g., the third CPU module), i.e., the main CPU module on the motherboard system after the high-speed switch is switched. When the BMC has not executed S320, the main CPU module is the main CPU module on the motherboard system for which the high-speed switch has not been switched, such as the first CPU module.
For example, when there are 4 CPU modules (as shown in fig. 2) in the motherboard system, the first combination is shown in table 1 below, and when the main CPU module is the first CPU module, the first combination is a combination of the first CPU module and any one of the three slave CPU modules, that is, the second CPU module, the third CPU module and the fourth CPU module; that is, the first combination includes three combinations, specifically: a combination of the first CPU module and the second CPU module, a combination of the first CPU module and the third CPU module, and a combination of the first CPU module and the fourth CPU module. When the main CPU module is the second CPU module, the first combination comprises three combinations, specifically: a combination of the second CPU module and the first CPU module, a combination of the second CPU module and the third CPU module, and a combination of the second CPU module and the fourth CPU module. When the main CPU module is the third CPU module, the first combination includes three combinations, specifically: a combination of the third CPU module and the first CPU module, a combination of the third CPU module and the second CPU module, and a combination of the third CPU module and the fourth CPU module. When the main CPU module is the fourth CPU module, the first combination includes three combinations, specifically: a combination of the fourth CPU module and the first CPU module, a combination of the fourth CPU module and the second CPU module, and a combination of the fourth CPU module and the third CPU module.
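TABLE 1
Main CPU module | First combinations (main CPU module and one slave CPU module)
First CPU module | first and second; first and third; first and fourth
Second CPU module | second and first; second and third; second and fourth
Third CPU module | third and first; third and second; third and fourth
Fourth CPU module | fourth and first; fourth and second; fourth and third
Based on the above example, assuming that the third CPU module is the current main CPU module of the motherboard system, table 1 contains three first combinations: the first of them includes the third CPU module and the first CPU module; the second includes the third CPU module and the second CPU module; the third includes the third CPU module and the fourth CPU module. At this time, the BMC controls the PCH and the CPU modules in the first of these first combinations (i.e., the third CPU module and the first CPU module) to power up, so that the PCH starts the main board system based on the third CPU module and the first CPU module.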
It should be noted that the first combination of power-up is one of a plurality of first combinations.
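The set of first combinations for a given main CPU module can be enumerated directly. A minimal sketch follows, assuming CPU modules are numbered 1 to N; the function name is illustrative.

```python
def first_combinations(main_id, n=4):
    """Pair the main CPU module with each slave CPU module in turn."""
    if not 1 <= main_id <= n:
        raise ValueError("main_id must be between 1 and n")
    return [(main_id, slave) for slave in range(1, n + 1) if slave != main_id]


print(first_combinations(3))  # [(3, 1), (3, 2), (3, 4)]
```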
S350, the BMC judges whether the mainboard system is started normally or not.
When the PCH normally starts the main board system based on the CPU module in the first combination, the BMC outputs the first alarm information and ends the current control method. The first alarm information is used for indicating that the fault CPU module exists in other CPU modules except the CPU module in the first combination which is powered on in the 4 CPU modules.
When the PCH cannot normally start the motherboard system based on the CPU module in the first combination, the BMC performs S360.
It should be understood that, after the above-mentioned main board system is started normally, the main CPU module feeds back the indication information for normal start of the main board system to the PCH, and then the BMC obtains the indication information from the PCH. In addition, when the main board system cannot be started normally, the BMC can acquire information of the CPU module for alarming faults. Based on the information, the BMC can determine whether the mainboard system is started normally by judging whether the indication information of normal starting is obtained or not; whether the main board system is started normally can also be determined by judging whether the indication information of the CPU module with the alarm fault is obtained, and the specific implementation mode of the S350 is not limited in the embodiment of the application.
S360, the BMC judges whether all the first combinations are traversed completely.
When not all of the first combinations have been traversed, the BMC performs S330 described above again, using one first combination that has not yet been traversed.
When all the first combinations are traversed, the BMC performs S370 described below.
All the first combinations mentioned above refer to all different combinations of the main CPU module and any one of the 3 slave CPU modules. For example, in the motherboard system shown in fig. 2, the first combinations are shown in table 1 above, and when the main CPU module is the third CPU module, there are three first combinations: the first of them includes the third CPU module and the first CPU module; the second includes the third CPU module and the second CPU module; the third includes the third CPU module and the fourth CPU module. At this time, all the first combinations are the three combinations in which the main CPU module is the third CPU module.
It should be noted that each first combination includes the main CPU module and at least one slave CPU module. When the main board system cannot be started normally based on the CPU modules in one first combination (e.g., the third CPU module and the first CPU module), the main board system is started based on the CPU modules in another first combination (e.g., the third CPU module and the second CPU module); namely, the BMC executes S330-S340 again, and the CPU modules powered up in S340 are now those in the other first combination. If the main board system still cannot be started normally, the PCH starts the main board system based on the CPU modules in the next first combination, and so on, until the main board system is started normally or all the first combinations have been traversed. That is, the BMC performs S330-S360 in sequence for the different first combinations until the motherboard system is started normally or all of the first combinations have been traversed.
It should be understood that, in the case of normal start of the above-mentioned motherboard system, there is no faulty CPU module among the CPU modules currently powered on in the motherboard system.
Step S360 is used to determine whether the motherboard system can be started normally after the BMC executes S330-S340 based on any one of the first combinations, where each first combination includes the main CPU module and at least one slave CPU module.
It should be understood that, when the number of faulty CPU modules in the motherboard system shown in fig. 2 (i.e., the motherboard system including 4 CPU modules) is greater than 2, the PCH cannot start the motherboard system normally based on the CPU modules in any first combination. For example, in the motherboard system shown in fig. 2, assume that the main CPU module is the third CPU module, so that the first combinations are as shown in table 1. When the number of faulty CPU modules is 3 (i.e., the first CPU module, the second CPU module, and the fourth CPU module are faulty), every first combination, whether it pairs the third CPU module with the first CPU module, with the second CPU module, or with the fourth CPU module, includes a faulty CPU module, so the motherboard system cannot be started normally.
Based on this, when all the first combinations are traversed (i.e., when it is determined that the motherboard system cannot be started normally in the case of powering up the CPU modules in all the first combinations), the BMC downgrades the motherboard system again, specifically, the BMC executes S370 to S380 described below.
S370, the BMC controls the PCH and the CPU modules in the first combination to be powered down.
It should be understood that, after S340, the only CPU modules in the power-on state in the motherboard system are the CPU modules in the first combination. Therefore, after the BMC controls the CPU modules in the first combination to power down, all the CPU modules on the motherboard are in the power-down state.
It should be noted that, the implementation manner of S370 is consistent with the implementation manner of S130, and specific descriptions of S370 may refer to the related descriptions of S130, which are not repeated herein.
S380, the BMC controls the PCH and the CPU module in the second combination to be powered on.
The second combination includes only the current main CPU module among the 4 CPU modules of the motherboard system shown in fig. 2; for example, when the current main CPU module is the third CPU module, the second combination is the third CPU module.
It should be noted that, the specific implementation of S380 is similar to S340, and the specific description of S380 may refer to the related description of S340, which is not repeated herein.
S390, the BMC judges whether the mainboard system is started normally.
When the PCH starts the main board system normally based on the CPU module in the second combination, the BMC outputs the first alarm information and ends the current control method.
When the PCH cannot normally start the mainboard system based on the CPU module in the second combination, the BMC outputs the second alarm information and ends the current control method. The second alarm information is used for indicating that the fault occurring on the main board system is a non-CPU module fault or that the plurality of CPU modules are all faulty.
It should be noted that, to be started normally, a motherboard system must include at least one CPU module, so that the PCH can start the motherboard system based on that CPU module. Therefore, once the motherboard system has been downgraded to a 1-way motherboard system, it cannot be downgraded again.
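The 4-way flow of S330-S390 can be summarized in a small simulation. This is a sketch under stated assumptions, not the BMC firmware: the function `control_4way`, the `faulty` set, and the `board_ok` flag are hypothetical, and a boot is assumed to succeed only when the board itself is healthy and no powered-on CPU module is faulty.

```python
def control_4way(main, modules, faulty, board_ok=True):
    """Simulate S330-S390 and return (alarm, powered_modules).

    Assumption: `faulty` is the hidden set of defective modules and
    `board_ok` models non-CPU faults; neither name comes from the source.
    """

    def boots(powered):
        # Simulated boot check (S350 / S390).
        return board_ok and not (set(powered) & set(faulty))

    # S330-S360: try every first combination (main + one slave).
    for slave in modules:
        if slave == main:
            continue
        if boots([main, slave]):
            return "first alarm", [main, slave]

    # S370-S390: degrade again to the second combination (main only).
    if boots([main]):
        return "first alarm", [main]
    return "second alarm", []


mods = ["CPU1", "CPU2", "CPU3", "CPU4"]
print(control_4way("CPU3", mods, {"CPU1", "CPU2", "CPU4"}))
```

In the scenario from the text (main CPU module is the third CPU module, the other three faulty), every first combination fails in the simulation and the board falls back to a 1-way system.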
In a third example, based on the control method for the motherboard of the 2-way CPU module and the control method for the motherboard of the 4-way CPU module, the embodiment of the present application further provides a control method for a motherboard of an N-way CPU module, where the motherboard system includes N CPU modules, $N = 2^X$, and X is an integer greater than or equal to 1; as shown in fig. 7, the control method of the motherboard includes S410 to S480.
S410, the BMC determines whether the CPU module with the alarm fault comprises a main CPU module.
When the main CPU module is included in the malfunction alerting CPU module, the BMC performs S420.
When the main CPU module is not included in the malfunction alerting CPU module, the BMC performs S430.
It should be noted that, the implementation manner of S410 is consistent with that of S110, and specific descriptions of S410 may refer to the related descriptions of S110, which are not repeated herein.
S420, the BMC controls the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module, where the target CPU module is one of the N CPU modules that has no alarm fault.
It should be noted that, the implementation of S420 is similar to the implementation of S320, and specific descriptions of S420 may refer to the related descriptions of S320, which are not repeated herein.
S430, the BMC controls the starting control module and the N CPU modules to be powered down.
The starting control module comprises a PCH and a BIOS chip; the connection manner of the PCH and the BIOS chip refers to the related description of fig. 2, and will not be described herein.
The role or function of the PCH and BIOS chips in the motherboard system is specifically described with reference to S310-S390, and will not be described in detail herein.
It should be noted that, the implementation manner of S430 is consistent with that of S130, and specific descriptions of S430 may refer to the related descriptions of S130, which are not repeated herein.
S440, the BMC controls the starting control module and the CPU module in the ith combination to be electrified.
The above i is a variable of integer type, and the initial value of the variable is 1.
It should be noted that, after the power-on, the start control module automatically runs the boot program of the BIOS to start the motherboard system based on the current power-on CPU module in the motherboard system.
The i-th combination is a combination of the current main CPU module of the motherboard system and any $2^{X-i}-1$ of the N-1 slave CPU modules, where i is an integer less than or equal to X. That is, the i-th combination contains $2^{X-i}$ CPU modules in total; for i = 1, S440 powers up half of the N CPU modules, including the main CPU module.
It should be noted that, when the BMC has executed S420, the main CPU module is the target CPU module (e.g., the third CPU module), that is, the main CPU module on the motherboard system after the high-speed switch is switched. When the BMC has not executed S420, the main CPU module is the main CPU module on the motherboard system for which the high-speed switch has not been switched, for example, the first CPU module.
For example, when the value of X is 1, the motherboard system includes 2 CPU modules, i.e., a first CPU module and a second CPU module, as shown in fig. 4, where the first CPU module is the main CPU module. In this case, since the value of $2^{X-i}-1$ is 0, the i-th combination includes only the first CPU module; the BMC controls the start control module and the first CPU module to be powered on, so that the start control module starts the motherboard system based on the first CPU module.
As another example, when X is greater than 1, for instance when the value of X is 2, there are 4 CPU modules in the motherboard system (as shown in fig. 2), and the i-th combinations are shown in table 1. Assuming that the main CPU module of the motherboard system is the third CPU module, there are three i-th combinations: the third CPU module with the first CPU module, the third CPU module with the second CPU module, and the third CPU module with the fourth CPU module. In this case, the BMC controls the start control module and the CPU modules in the first of these i-th combinations (namely, the third CPU module and the first CPU module) to be powered on, so that the start control module starts the motherboard system based on the third CPU module and the first CPU module.
The i-th combination that is powered on is one of the plurality of i-th combinations.
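The definition of the i-th combination can be made concrete with a short Python sketch. The helper `ith_combinations` and the module labels are hypothetical names introduced for illustration; the enumeration order produced by `itertools.combinations` is likewise an assumption, since the application does not prescribe a traversal order.

```python
from itertools import combinations


def ith_combinations(modules, main, i):
    """List every i-th combination: main plus 2**(X-i) - 1 slave modules.

    Assumes len(modules) == 2**X with X >= 1 and 1 <= i <= X.
    """
    n = len(modules)
    x = n.bit_length() - 1
    if n < 2 or 2 ** x != n or not 1 <= i <= x:
        raise ValueError("need N = 2**X (X >= 1) and 1 <= i <= X")
    slaves = [m for m in modules if m != main]
    k = 2 ** (x - i) - 1  # number of slave modules in the i-th combination
    return [(main,) + chosen for chosen in combinations(slaves, k)]


four_way = ["CPU1", "CPU2", "CPU3", "CPU4"]
print(ith_combinations(four_way, "CPU3", 1))
print(ith_combinations(four_way, "CPU3", 2))
```

For X = 1 the only combination is the main CPU module alone, which matches the 2-way example in the text.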
S450, the BMC judges whether the mainboard system is started normally or not.
When the starting control module normally starts the main board system based on the CPU module in the ith combination, the BMC outputs the first alarm information and ends the current control method.
When the start control module cannot normally start the motherboard system based on the CPU module in the i-th combination, the BMC executes S460.
It should be noted that, the implementation of S450 is similar to the implementation of S350, and specific descriptions of S450 may refer to the related descriptions of S350, which are not repeated herein.
S460, the BMC judges whether all the ith combinations are traversed completely.
When all the ith combinations are not all traversed, the BMC performs S430 described above based on one of the ith combinations that are not traversed.
When all the ith combinations are traversed, the BMC performs S470 described below.
All the i-th combinations are all the different combinations of the main CPU module and any $2^{X-i}-1$ slave CPU modules among the N CPU modules.
It should be noted that, the implementation of S460 is similar to the implementation of S360, and specific descriptions of S460 may refer to the related descriptions of S360, which are not repeated herein.
S470, the BMC judges whether the ith combination only comprises a main CPU module.
When the ith combination only comprises the main CPU module, the BMC outputs the second alarm information and ends the current control method.
When the i-th combination includes a master CPU module and at least one slave CPU module, the BMC performs S480 described below.
It should be noted that, since the i-th combination is a combination of the main CPU module and any $2^{X-i}-1$ of the N-1 slave CPU modules, when the i-th combination includes only the main CPU module, the value of $2^{X-i}-1$ is 0, i.e., X = i. That is, when the i-th combination includes only the main CPU module, the motherboard system has already been downgraded to a minimal motherboard system, which does not support further downgrading. The BMC then outputs the second alarm information, which indicates that the fault occurring on the motherboard system is a non-CPU-module fault or that all the N CPU modules are faulty CPU modules.
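The relation between i and the size of the i-th combination can be written out explicitly. The symbol $|C_i|$ is notation introduced here (not used in the application) for the number of CPU modules in the i-th combination.

```latex
\[
|C_i| \;=\; 1 + \left(2^{X-i} - 1\right) \;=\; 2^{X-i} \;=\; \frac{N}{2^{i}},
\qquad 1 \le i \le X,
\]
\[
2^{X-i} - 1 = 0 \;\iff\; 2^{X-i} = 1 \;\iff\; i = X .
\]
```

So each downgrade halves the number of powered-on CPU modules, and the combination that contains only the main CPU module is reached exactly at i = X.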
It should be noted that, when the ith combination includes a master CPU module and at least one slave CPU module, at this time, the motherboard system includes at least 2 CPU modules, so the BMC may downgrade the motherboard system again; the specific degradation is described in S480 below.
S480, BMC sets the value of i to i=i+1.
Note that, when the motherboard system is the motherboard system including 4 CPU modules shown in fig. 2, N is 4 and X is 2, i.e., $N = 2^X$. Then, when the motherboard system is downgraded through S410-S460, the i-th combination (i.e., the first combination) includes the main CPU module and $2^{2-1}-1$ slave CPU modules (i.e., the first combination includes the main CPU module and one slave CPU module). That is, after the first downgrade, the motherboard system is a motherboard system of a 2-way CPU module, which may then need to be downgraded again to a motherboard system of a 1-way CPU module.
It should be understood that, when the motherboard system cannot be started normally with the CPU modules in any of the i-th combinations (i.e., the first combinations) powered on, the start control module starts the motherboard system based on the CPU modules in the (i+1)-th combination (i.e., the second combination). The (i+1)-th combination is a combination of the main CPU module and any $2^{X-(i+1)}-1$ slave CPU modules among the N CPU modules. This is repeated until the motherboard system is started normally, or until the start control module cannot start the motherboard system normally based on the CPU module in the X-th combination, in which case the BMC outputs the second alarm information. The X-th combination includes only the current main CPU module among the N CPU modules.
For example, when the motherboard system is the motherboard system including 4 CPU modules shown in fig. 2, N is 4 and X is 2, i.e., $N = 2^2$. The i-th combination (the first combination) includes the main CPU module and $2^{2-i}-1$ slave CPU modules; since the initial value of i is 1, the first combination includes the main CPU module and 1 slave CPU module. When the PCH cannot start the motherboard system normally based on any of the different first combinations, the BMC downgrades again, reducing the motherboard system of the 2-way CPU module to a motherboard system of a 1-way CPU module, and sets the value of i to i = i+1, i.e., i = 2. The second combination then includes the main CPU module (e.g., the first CPU module) and $2^{2-2}-1$ slave CPU modules, i.e., the second combination includes only the current main CPU module, and the motherboard system is started based on the main CPU module. When the motherboard system is started normally, the BMC outputs the first alarm information, such as: "A faulty CPU exists among the second CPU module, the third CPU module, and the fourth CPU module; please handle it in time." When the motherboard system cannot be started normally, the BMC outputs the second alarm information, such as: "The current motherboard fault is not caused by a CPU fault; please check in time", or: "All 4 CPU modules on the motherboard have failed; please handle it in time."
As another example, when the motherboard system is a motherboard system including 8 CPU modules as shown in fig. 8, N is 8, X is 3, and the initial value of i is 1. The BMC starts the motherboard system based on a first combination, where the first combination includes the main CPU module and $2^{3-1}-1$ slave CPU modules (i.e., 3 slave CPU modules). When the PCH cannot start the motherboard system normally based on any of the different first combinations, the BMC sets the value of i to i+1, i.e., i = 2, and starts the motherboard system based on a second combination. The second combination includes the main CPU module and $2^{3-2}-1$ slave CPU modules (i.e., 1 slave CPU module), and so on, until the motherboard system is started normally, or until the PCH cannot start the motherboard system normally even based on the CPU module in the third combination, which includes only the main CPU module, in which case the BMC outputs the second alarm information.
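As a worked example of the arithmetic above, the following Python sketch tabulates, for each downgrade level i, how many CPU modules are powered on and how many distinct i-th combinations exist. The count of combinations follows from the definition (choosing $2^{X-i}-1$ slaves out of N-1) and is not stated in the application; `combination_counts` is a hypothetical helper name.

```python
from math import comb


def combination_counts(x):
    """Return (i, powered_modules, number_of_ith_combinations) for i = 1..X."""
    n = 2 ** x
    rows = []
    for i in range(1, x + 1):
        slaves_needed = 2 ** (x - i) - 1
        # Choose the slave modules from the N-1 modules other than the main one.
        rows.append((i, slaves_needed + 1, comb(n - 1, slaves_needed)))
    return rows


for row in combination_counts(3):  # the 8-way system of fig. 8
    print(row)
```

For the 8-way system this gives 4, 2, and 1 powered-on CPU modules at i = 1, 2, 3, with 35, 7, and 1 candidate combinations respectively.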
It should be noted that after the BMC completes S480, a new round of degradation is started, i.e., the BMC executes S430.
In the multi-CPU module motherboard system provided by the embodiment of the application, the BMC is specifically configured to, when the CPU modules with the alarm fault include the main CPU module, switch the current main CPU module of the motherboard system (such as the first CPU module) to one of the CPU modules without the alarm fault (such as the third CPU module). Then, the BMC downgrades the motherboard system, which specifically includes: the BMC controls the start control module and the N CPU modules to be powered down, and then controls the start control module and the CPU modules in a first combination to be powered up, so that the start control module starts the motherboard system based on the CPU modules in the first combination, where the first combination is a combination of the current main CPU module of the motherboard system and any $2^{X-1}-1$ of the N-1 slave CPU modules. When the motherboard system is started normally, the BMC outputs the first alarm information and the control method ends; when the motherboard system cannot be started normally, the BMC controls the start control module to start the motherboard system based on another first combination. Accordingly, when the motherboard system cannot be started normally with the CPU modules in any of the first combinations powered up, the BMC downgrades the motherboard system again, so that the start control module starts the motherboard system based on the CPU modules in a second combination, where the second combination is a combination of the main CPU module and any $2^{X-2}-1$ slave CPU modules among the N CPU modules. This is repeated until the motherboard system is started normally.
The BMC is also specifically configured to downgrade the motherboard system when the CPU modules with the alarm fault do not include the main CPU module, so that the motherboard system is started normally. This solves the problem that the motherboard system cannot be started normally because of a large number of faulty CPU modules, and thus improves the stability of the motherboard.
In addition, the number N of the CPU modules is $2^X$; the number of CPU modules in the first combination is half of N, the number of CPU modules in the second combination is half of the number in the first combination, and so on. Therefore, no matter how many times the BMC downgrades the motherboard system, the number of powered-on CPU modules is never an odd number greater than 1, which avoids unbalanced distribution of computing power among the CPU modules and improves the computing power utilization of the currently powered-on CPU modules in the motherboard system.
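The overall N-way flow of S410-S480 summarized above can be sketched as a simulation. This is an illustrative model under stated assumptions rather than the actual BMC logic: `control_n_way`, `faulty`, and `board_ok` are hypothetical names, a single `faulty` set stands in both for the alarm information seen by the BMC and for the modules that prevent a boot, and the traversal order is whatever `itertools.combinations` produces.

```python
from itertools import combinations


def control_n_way(modules, main, faulty, board_ok=True):
    """Simulate S410-S480; return (alarm, powered_modules)."""
    n = len(modules)
    x = n.bit_length() - 1
    if n < 2 or 2 ** x != n:
        raise ValueError("N must equal 2**X with X >= 1")

    # S410/S420: if the main module is faulty, switch to a healthy target.
    if main in faulty:
        main = next((m for m in modules if m not in faulty), main)
    slaves = [m for m in modules if m != main]

    def boots(powered):
        # Assumed boot model (S450): healthy board, no faulty module powered.
        return board_ok and not (set(powered) & set(faulty))

    i = 1
    while True:
        k = 2 ** (x - i) - 1              # slaves in the i-th combination
        for chosen in combinations(slaves, k):  # S430-S460
            powered = (main,) + chosen
            if boots(powered):
                return "first alarm", powered
        if k == 0:                        # S470: only the main module left
            return "second alarm", ()
        i += 1                            # S480: downgrade again


eight_way = [f"CPU{k}" for k in range(1, 9)]
print(control_n_way(eight_way, "CPU1", {"CPU1"}))
```

In this model the number of powered-on modules halves at each downgrade (N/2, N/4, and so on down to 1), which mirrors the balance argument made in the text.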
The foregoing description of the solution provided by the embodiments of the present application has been mainly presented in terms of a method. To achieve the above functions, the BMC includes a hardware structure and/or a software module that perform respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The embodiment of the application can divide the functional modules of the BMC according to the method, for example, the BMC may include each functional module corresponding to each functional division, or may integrate two or more functions into one processing module. The integrated modules may be implemented in hardware or in software functional modules. It should be noted that, in the embodiment of the present application, the division of the modules is schematic, which is merely a logic function division, and other division manners may be implemented in actual implementation.
FIG. 9 shows a schematic diagram of a BMC; the BMC includes: a control unit 101.
The control unit 101 is configured to, in a case where the CPU module with the alarm fault includes a CPU module connected to one end of the high-speed switch, control the one end of the high-speed switch to be switched to be connected to any one of the N CPU modules without the alarm fault, for example, execute step S420 in the above method embodiment.
The control unit 101 is further configured to control the start control module and the N CPU modules to power down, and control the start control module and the CPU modules in the first combination to power up, for example, to perform steps S430-S440 in the above method embodiment.
Optionally, the control unit 101 is configured to control the start control module and the N CPU modules to power down when the motherboard system cannot be started normally under the condition that the CPU modules in all the first combinations are powered up; and controlling the starting control module and the CPU module in the second combination to be electrified; for example, steps S370-S380 in the method embodiments described above are performed.
Optionally, the BMC further includes a transceiver unit 102; the control unit 101 is configured to control the high-speed switch to switch from being connected with the current main CPU module to being connected with the target CPU module when the main CPU module is included in the CPU module for alarming the failure; for example, step S120 in the above-described method embodiment is performed.
The control unit 101 is further configured to control the PCH and N CPU modules to be powered down, and control M CPU modules without alarm faults among the PCH and N CPU modules to be powered up; for example, steps S130-S140 in the method embodiment described above are performed.
The transceiver unit 102 is configured to output a first alarm message after the motherboard system is normally started; for example, step S150 in the above-described method embodiment is performed.
The units of the BMC may also be used to perform other actions in the method embodiments, and all relevant content of each step related to the method embodiments may be cited to a functional description of a corresponding functional unit, which is not described herein.
The embodiment of the application also provides a BMC for executing any one of the methods shown in the above figures 3 and 5-7.
The embodiment of the application also provides a computing device, which comprises any one of the main board systems shown in the above figures 1, 2, 4 and 8.
Embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform a method performed by any of the computer devices provided above.
For the explanation of the relevant content and the description of the beneficial effects in any of the above-mentioned computer-readable storage media, reference may be made to the above-mentioned corresponding embodiments, and the description thereof will not be repeated here.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using a software program, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, magnetic tape), an optical medium (e.g., digital video disc (DVD)), or a semiconductor medium (e.g., solid state drive (SSD)), or the like.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in whole or in part in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic or optical disk, and the like.
The foregoing is merely illustrative of specific embodiments of the present application, and the scope of the present application is not limited thereto, but any changes or substitutions within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A motherboard system of a multi-CPU module, the motherboard system comprising: n power supply modules, N CPU modules, a starting control module, a high-speed switch and a baseboard management controller BMC, wherein N is an integer greater than or equal to 2;
one end of the high-speed switch is connected with any one of the N CPU modules, and the control end of the high-speed switch is connected with the BMC; the other end of the high-speed switch is connected with the starting control module;
the N power supply modules are respectively used for supplying power to the N CPU modules, and the control ends of the N power supply modules are connected with the BMC;
the BMC is used for: under the condition that the CPU module connected with one end of the high-speed switch has alarm faults, controlling one end of the high-speed switch to be switched to be connected with any one of the N CPU modules without alarm faults; wherein, the CPU module connected with the high-speed switch is a main CPU module;
The BMC is also configured to: and under the condition that the CPU module with the alarm fault exists in the N CPU modules, isolating the CPU module with the alarm fault from the corresponding power module so as to enable the CPU module with the alarm fault to be powered down.
2. The motherboard system of claim 1, wherein the start control module comprises an integrated south bridge PCH and a basic input output system BIOS chip, the PCH is connected with the main CPU module of the N CPU modules through the high-speed switch, a control end of the PCH is connected with the BMC, and the PCH is connected with the BIOS chip;
the BMC is further used for controlling the PCH and the N CPU modules to be powered down and controlling M CPU modules without alarm faults in the PCH and the N CPU modules to be powered up under the condition that the CPU modules connected with the high-speed switch do not have alarm faults and the CPU modules with alarm faults exist in the rest CPU modules; m is an integer greater than or equal to 1 and less than N;
and the PCH is used for running the boot program of the BIOS chip after power-on so as to start the main board system based on the M CPU modules without alarm faults.
3. The motherboard system according to claim 1 or 2, wherein the start control module further comprises an integrated south bridge PCH and a basic input output system BIOS chip, the PCH is connected to the main CPU module of the N CPU modules through the high-speed switch, a control end of the PCH is connected to the BMC, and the PCH is connected to the BIOS chip;
the BMC is specifically configured to: when the CPU modules with alarm faults include the CPU module connected to the high-speed switch, control the high-speed switch to switch from the main CPU module to a target CPU module among the N CPU modules, wherein the target CPU module is one of the N CPU modules that has no alarm fault;
the BMC is further specifically configured to control the PCH and the N CPU modules to power down, and to control the PCH and M CPU modules without alarm faults among the N CPU modules to power up, where M is an integer greater than or equal to 1 and less than N;
and the PCH is further configured to run the boot program of the BIOS chip after power-up, so as to start the motherboard system based on the M CPU modules without alarm faults.
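The restart sequence in claims 2 and 3 can be written out as an ordered list of BMC actions. The sketch below is illustrative only: `restart_plan`, the action tuples, the relative order of PCH and CPU power transitions, and the choice to power up every healthy module (M equal to the number of healthy modules) are assumptions, not details taken from the claims.

```python
def restart_plan(n, main, faulty):
    """List the BMC actions, in order, for restarting on healthy CPU modules."""
    healthy = [i for i in range(n) if i not in faulty]
    if not faulty or not healthy:
        return []  # nothing to do, or nothing left to boot from
    plan = []
    if main in faulty:
        # Claim 3 case: move the high-speed switch to a healthy target module.
        main = healthy[0]  # arbitrary pick; the claim allows any healthy module
        plan.append(("switch_to", main))
    # Power down the PCH and all N CPU modules.
    plan.append(("power_down", "PCH"))
    plan.extend(("power_down", i) for i in range(n))
    # Power up the PCH and the M healthy modules only.
    plan.append(("power_up", "PCH"))
    plan.extend(("power_up", i) for i in healthy)
    # The PCH then runs the BIOS boot program against the main CPU module.
    plan.append(("boot", main))
    return plan
```

With two modules and a faulty main (`restart_plan(2, 0, {0})`), the plan starts with `("switch_to", 1)`; with a healthy main and a faulty slave (the claim 2 case), the switch step is skipped and the faulty slave is simply never powered up again.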
4. The motherboard system according to claim 2 or 3, wherein,
when $N = 2^{X}$ and X is an integer greater than or equal to 2, the BMC is specifically configured to control the CPU modules in a first combination to power up, wherein the first combination is a combination of the current main CPU module of the motherboard system and any $2^{X-1}-1$ slave CPU modules among the N CPU modules.
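To make the size of a first combination concrete, the sketch below enumerates every candidate: the main CPU module plus any 2^(X-1) - 1 of the remaining modules. The function name `first_combinations` and the 0-based indices are hypothetical; the claim does not prescribe an enumeration order.

```python
from itertools import combinations


def first_combinations(n, x, main):
    """Yield every first combination: the main module plus 2**(x-1) - 1 slaves."""
    if x < 2 or n != 2 ** x:
        raise ValueError("claim 4 assumes N = 2**X with X >= 2")
    slaves = [i for i in range(n) if i != main]
    for picked in combinations(slaves, 2 ** (x - 1) - 1):
        # Each combination holds 2**(x-1) modules in total, i.e. half of N.
        yield (main, *picked)
```

For N = 4 (X = 2) with module 0 as the main CPU module, `list(first_combinations(4, 2, 0))` gives `[(0, 1), (0, 2), (0, 3)]`. For N = 8 (X = 3) there are 35 candidates, one for each way of choosing 3 slaves out of 7.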
5. The motherboard system according to claim 4, wherein,
the BMC is further configured to: when it is determined that the motherboard system cannot start normally after the CPU modules in each first combination have been powered up, and the first combination includes at least two CPU modules, control the PCH and the CPU modules in a second combination to power up, wherein the second combination is a combination of the current main CPU module of the motherboard system and any $2^{X-2}-1$ slave CPU modules among the N CPU modules;
and the PCH is further configured to run the boot program of the BIOS chip after power-up, so as to start the motherboard system based on the CPU modules in the second combination.
6. The motherboard system according to claim 5, wherein,
the BMC is configured to output first alarm information when the motherboard system starts normally, wherein the first alarm information indicates that a faulty CPU module exists among those of the N CPU modules that are not in the second combination;
and the BMC is further configured to output second alarm information when it is determined that the motherboard system cannot start normally after the CPU modules in each second combination have been powered up, and the second combination includes one CPU module, wherein the second alarm information indicates that the fault on the motherboard system is a non-CPU-module fault or that all N CPU modules are faulty.
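Claims 4 to 6 together describe a narrowing search: try the first combinations, then the smaller second combinations, and report an alarm according to the outcome. The following is a rough sketch under stated assumptions: `diagnose`, `combos`, the alarm strings, and the `boots` callback (which stands in for an actual power-up and boot attempt) are hypothetical, and the cases the claims do not cover simply return no alarm.

```python
from itertools import combinations

FIRST_ALARM = "faulty CPU module outside the booted second combination"
SECOND_ALARM = "non-CPU-module fault, or all N CPU modules are faulty"


def combos(n, main, size):
    """All combinations of `size` modules that contain the main module."""
    slaves = [i for i in range(n) if i != main]
    return [(main, *c) for c in combinations(slaves, size - 1)]


def diagnose(n, main, boots):
    """Try first, then second combinations; return (booted_combo, alarm).

    boots -- hypothetical callback: takes a tuple of module indices and
             returns True if the motherboard starts normally on them.
    """
    half = n // 2                      # first combination size, 2**(X-1)
    for combo in combos(n, main, half):
        if boots(combo):
            return combo, None         # claims 4-6 name no alarm for this case
    quarter = half // 2                # second combination size, 2**(X-2)
    for combo in combos(n, main, quarter):
        if boots(combo):
            return combo, FIRST_ALARM
    if quarter == 1:
        return None, SECOND_ALARM      # claim 6: single-module combination failed
    return None, None                  # further narrowing is outside claims 5-6
```

With N = 4 and module 0 as the main CPU module, a board on which only module 0 works boots on the second combination `(0,)` and raises the first alarm; a board that never boots ends with the second alarm.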
7. A motherboard control method, wherein the method is applied to a motherboard system with multiple CPU modules, the motherboard system comprising: N power supply modules, N CPU modules, a high-speed switch, a starting control module and a baseboard management controller (BMC), where N is an integer greater than or equal to 2; one end of the high-speed switch is connected with any one of the N CPU modules; the control end of the high-speed switch is connected with the BMC, and the other end of the high-speed switch is connected with the starting control module; the N power supply modules are respectively used for supplying power to the N CPU modules, and the control ends of the N power supply modules are connected with the BMC; the method comprises the following steps:
when the CPU modules with alarm faults include the CPU module connected to one end of the high-speed switch, the BMC controls that end of the high-speed switch to switch to any one of the N CPU modules that has no alarm fault, wherein the CPU module connected to the high-speed switch is the main CPU module;
the BMC controls the starting control module and the N CPU modules to power down, where $N = 2^{X}$ and X is an integer greater than or equal to 2;
the BMC controls the starting control module and the CPU modules in a first combination to power up, wherein the first combination is a combination of the main CPU module and any $2^{X-1}-1$ slave CPU modules among the N CPU modules.
8. The method of claim 7, wherein, when it is determined that the motherboard system cannot start normally after the CPU modules in each first combination have been powered up, the method further comprises:
the BMC controls the starting control module and the N CPU modules to power down;
the BMC controls the starting control module and the CPU modules in a second combination to power up, wherein the second combination is a combination of the current main CPU module of the motherboard system and any $2^{X-2}-1$ slave CPU modules among the N CPU modules.
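Because the method fixes N as a power of two, the sizes of the first and second combinations follow directly from N. The helper below is a small illustration; `combination_sizes` is a hypothetical name, and each returned size counts the main CPU module together with the slave modules named in the claims.

```python
def combination_sizes(n):
    """Return (first_size, second_size) for N = 2**X CPU modules, X >= 2."""
    x = n.bit_length() - 1
    if n < 4 or 2 ** x != n:
        raise ValueError("the method assumes N = 2**X with X >= 2")
    # First: main + (2**(X-1) - 1) slaves; second: main + (2**(X-2) - 1) slaves.
    return 2 ** (x - 1), 2 ** (x - 2)
```

For N = 4 this gives sizes 2 and 1, so the second combination is the main CPU module alone; for N = 8 the sizes are 4 and 2.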
9. The method according to claim 7 or 8, wherein, when the CPU modules with alarm faults do not include the CPU module connected to one end of the high-speed switch, the method further comprises:
the BMC controls the starting control module and the N CPU modules to power down, where $N = 2^{X}$ and X is an integer greater than or equal to 2;
the BMC controls the starting control module and the CPU modules in a first combination to power up, wherein the first combination is a combination of the main CPU module and any $2^{X-1}-1$ slave CPU modules among the N CPU modules.
10. A computing device comprising the motherboard system of any of claims 1-6.
CN202310638308.8A 2023-05-31 2023-05-31 Main board system with multiple CPU modules, control method of main board and computing equipment Pending CN116795195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310638308.8A CN116795195A (en) 2023-05-31 2023-05-31 Main board system with multiple CPU modules, control method of main board and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310638308.8A CN116795195A (en) 2023-05-31 2023-05-31 Main board system with multiple CPU modules, control method of main board and computing equipment

Publications (1)

Publication Number Publication Date
CN116795195A true CN116795195A (en) 2023-09-22

Family

ID=88037711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310638308.8A Pending CN116795195A (en) 2023-05-31 2023-05-31 Main board system with multiple CPU modules, control method of main board and computing equipment

Country Status (1)

Country Link
CN (1) CN116795195A (en)

Similar Documents

Publication Publication Date Title
US8898517B2 (en) Handling a failed processor of a multiprocessor information handling system
CN110083494B (en) Method and apparatus for managing hardware errors in a multi-core environment
CN111767244B (en) Dual-redundancy computer equipment based on domestic Loongson platform
CN105700969A (en) Server system
US20100017630A1 (en) Power control system of a high density server and method thereof
US11340684B2 (en) System and method for predictive battery power management
US20140136866A1 (en) Rack and power control method thereof
US10754408B2 (en) Power supply unit mismatch detection system
CN104050061A (en) Multi-main-control-panel redundant backup system based on PCIe bus
CN111367392B (en) Dynamic power supply management system
US10101799B2 (en) System and method for smart power clamping of a redundant power supply
US10893626B2 (en) Information handling system having synchronized power loss detection armed state
US11099961B2 (en) Systems and methods for prevention of data loss in a power-compromised persistent memory equipped host information handling system during a power loss event
CN114442787A (en) Method and system for realizing whole machine power consumption callback after server enters power consumption capping
CN111984471B (en) Cabinet power BMC redundancy management system and method
US20180267870A1 (en) Management node failover for high reliability systems
CN212541329U (en) Dual-redundancy computer equipment based on domestic Loongson platform
WO2023029375A1 (en) Power source consumption management apparatus for four-way server
CN116795195A (en) Main board system with multiple CPU modules, control method of main board and computing equipment
CN115309340A (en) Memory control method, memory controller and electronic equipment
CN113535472A (en) Cluster server
CN118708418B (en) System and method for diagnosing software and hardware information of server
US20240057240A1 (en) Light control device, light control method and server thereof
CN111381659A (en) Computer system and power management method
US20230161598A1 (en) System booting method and related computer system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination