
CN102929747B - Method for treating crash dump of Linux operation system based on loongson server - Google Patents

Info

Publication number
CN102929747B
Authority
CN
China
Prior art keywords
kernel
linux
collapse
dump
suse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210437092.0A
Other languages
Chinese (zh)
Other versions
CN102929747A (en
Inventor
张路波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Standard Software Co Ltd
Original Assignee
China Standard Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Standard Software Co Ltd filed Critical China Standard Software Co Ltd
Priority to CN201210437092.0A priority Critical patent/CN102929747B/en
Publication of CN102929747A publication Critical patent/CN102929747A/en
Application granted granted Critical
Publication of CN102929747B publication Critical patent/CN102929747B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method for handling crash dumps of a Linux operating system running on a Loongson server. The method comprises: the kernel boot program of the Loongson server's input/output system passes parameters and starts the Linux system; the crash dump service of the Linux system is started, and the image file and boot parameters of the capture kernel are loaded into a reserved storage space; when the system kernel crashes, the processor executing the system hands control to the capture kernel and boots it, and the memory image of the crashed kernel is saved to a dump file; and the Linux system is rebooted so that the dump file can be analyzed. When the kernels of the Linux operating system running on the Loongson server are compiled, support for the crash dump service and for the MIPS architecture of the Loongson server is added, so that a Linux system running on the domestically developed MIPS architecture can produce a kernel crash dump when it crashes, which improves the efficiency and practicality of kernel debugging for developers.

Description

Method for handling crash dumps of a Linux operating system based on a Loongson server
Technical field
The present invention relates to the field of operating system technology, and in particular to a method for handling crash dumps of a Linux operating system based on a Loongson server.
Background
As the state places ever greater emphasis on independent and controllable technology, Loongson processors based on the domestically developed MIPS architecture are being adopted more and more widely. Software support for these processors is therefore increasingly important, in particular support for running the Linux operating system on them.
Although the Linux operating system is currently supported, the kernel crash dump function for a Linux system running on these processors has not yet been implemented. Without it, when a Linux system running on a MIPS-architecture Loongson processor crashes, kernel developers cannot use analysis tools to examine the in-memory state at the time of the crash and locate its cause. Without that state, finding and fixing the cause of a system crash is very difficult.
A Linux system running on architectures such as x86, PowerPC, ARM and Alpha currently supports a crash dump service (the kdump service for short). However, because the MIPS architecture differs from these architectures in many respects, a Linux system running on MIPS cannot perform a crash dump when it crashes.
Summary of the invention
One of the technical problems to be solved by the present invention is to provide a method for handling crash dumps of a Linux operating system based on a Loongson server.
To solve the above technical problem, the invention provides a method for handling crash dumps of a Linux operating system based on a Loongson server, the method comprising:
When the Loongson server is powered on, the kernel boot program of the input/output system of the Loongson server passes parameters and starts the system kernel of the Linux operating system, wherein the parameters include a parameter that reserves storage space for the capture kernel of the Linux operating system;
Starting the crash dump service of the Linux operating system, the crash dump service loading the image file and boot parameters of the capture kernel into the storage space;
When the system kernel of the Linux operating system crashes, the processor executing the Linux operating system hands control to the capture kernel and starts it, and the memory image of the crashed system kernel is saved as a dump file;
Restarting the system kernel of the Linux operating system on the Loongson server and analyzing the dump file to find the reason why the system kernel of the Linux operating system crashed the previous time, wherein
the system kernel of the Linux operating system running on the Loongson server supports the crash dump service, and the capture kernel of the Linux operating system supports the MIPS architecture of the Loongson server.
In the method according to a further aspect of the invention, when the system kernel of the Linux operating system is compiled, the CONFIG_KEXEC_CRASH, CONFIG_KEXEC, CONFIG_SYSFS and CONFIG_DEBUG_INFO options are selected in the kernel configuration, so that the system kernel supports the crash dump service.
In the method according to a further aspect of the invention, when the capture kernel of the Linux operating system is compiled, the CONFIG_NUMA and CONFIG_SMP options are removed from the kernel configuration and the CONFIG_CRASH_DUMP option is added, so that the capture kernel supports the MIPS architecture of the Loongson server.
In the method according to a further aspect of the invention, the step in which the crash dump service loads the image file and boot parameters of the capture kernel into the storage space further comprises the following steps:
The crash dump service program executes the /etc/init.d/kdump script;
The /etc/init.d/kdump script executes the kexec command of the kexec tool, which loads the image file and boot parameters of the capture kernel into the storage space,
wherein the kexec command builds four kexec segments: the image file segment of the crash kernel; the command-line segment passed to the crash kernel; the segment holding the standard kernel's memory information, the vmcoreinfo file and the per-processor information; and the backup region segment that stores backup data;
wherein the part of the kexec tool that obtains the memory region list of the currently running operating system is modified, and the start addresses of the four kexec segments are adjusted according to the physical memory address layout of the MIPS architecture of the Loongson server.
In the method according to a further aspect of the invention, the step in which, when the system kernel of the Linux operating system crashes, the processor executing the Linux operating system hands control to the capture kernel and starts it further comprises the following steps:
The Linux operating system enters the crash handler of the system kernel, and the processor stops the other processors on the Loongson server;
The processor prepares the command-line parameters and environment variable parameters needed to start the capture kernel;
Based on the command-line parameters and the environment variable parameters, the processor jumps to the entry address of the capture kernel to start the capture kernel.
In the method according to a further aspect of the invention, the processor executing the crash handler stops the other processors on the Loongson server by sending them inter-processor interrupts.
In the method according to a further aspect of the invention, in the step in which the processor prepares the command-line parameters and environment variable parameters for starting the capture kernel,
the processor copies the environment variable parameters, together with the command-line parameters passed when the kexec command loaded the image file of the capture kernel, into the memory space reserved in head.s, and then sets the status flag that indicates the capture kernel is ready to be started.
In the method according to a further aspect of the invention, the step of saving the memory image of the crashed system kernel as a dump file further comprises the following steps:
After the capture kernel starts, the vmcore_init initialization function is executed;
The vmcore_init initialization function checks whether the value of the elfcorehdr parameter in the command-line parameters is ELFCORE_ADDR_ERR or ELFCORE_ADDR_MAX;
If the elfcorehdr parameter value is neither ELFCORE_ADDR_ERR nor ELFCORE_ADDR_MAX, the /proc/vmcore file is created, and the memory image of the crashed system kernel is stored in the /proc/vmcore file in ELF format;
The crash dump service program is started and calls the makedumpfile command of the makedumpfile tool, which generates the dump file from the /proc/vmcore file that holds the memory image of the crashed system kernel.
In the method according to a further aspect of the invention, interface functions for the MIPS architecture of the Loongson server are added to the makedumpfile tool. There are three interfaces: one obtains the physical base address; one obtains the address of the first element of the discontiguous memory region list in the crashed kernel and the start address of each discontiguous memory region; and one translates virtual addresses to physical addresses when information is read from the crashed kernel.
In the method according to a further aspect of the invention, the dump file is analyzed with the crash tool to find the reason why the system kernel of the Linux operating system crashed the previous time,
wherein interface functions for the MIPS architecture of the Loongson server are added to the crash tool. There are three interfaces: one translates virtual addresses of the crashed kernel into physical addresses; one performs initialization settings for all hardware of a MIPS-architecture machine; and one obtains the page directory entry pointer of a specified process.
Compared with the prior art, one or more embodiments of the present invention can have the following advantages:
When the system kernel and the capture kernel of the Linux operating system running on the Loongson server are recompiled, the method adds support for the crash dump service and for the MIPS architecture of the Loongson server, so that a Linux operating system running on the domestically developed MIPS architecture can produce a kernel crash dump when the system crashes. Analyzing the dumped state with tools then lets kernel developers locate the cause of the crash quickly and accurately, which improves the efficiency and practicality of kernel debugging.
Other features and advantages of the invention are set forth in the description that follows; in part they will be apparent from the description, or may be learned by practicing the invention. The objects and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the description, the claims and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of the description. Together with the embodiments, they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a flow diagram of the method for handling crash dumps of a Linux operating system based on a Loongson server according to an embodiment of the invention;
Fig. 2(a) and Fig. 2(b) are, respectively, the operational flowchart for starting the crash dump service in the kernel and the operational flowchart for performing a crash dump when the system crashes, according to an embodiment of the invention;
Fig. 3 is a schematic diagram of the workflow of the kdump service according to an embodiment of the invention;
Fig. 4 is a schematic diagram of the file format of the vmcore memory crash dump file according to an embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below with reference to the drawings and examples, so that how the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and put into practice. Note that, provided they do not conflict, the embodiments of the invention and the features in each embodiment can be combined with one another, and the resulting technical solutions all fall within the scope of protection of the invention.
In addition, the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
The embodiment takes a Linux operating system running on a Loongson server of the domestically developed MIPS architecture as an example and describes how a kernel crash dump is produced when the Linux operating system crashes.
Fig. 1 is a flow diagram of the method for handling crash dumps of a Linux operating system based on a Loongson server according to an embodiment of the invention. Fig. 2(a) and Fig. 2(b) are, respectively, the operational flowchart for starting the kdump crash dump service in the kernel and the operational flowchart for performing a crash dump when the system crashes. The steps of the invention are described in detail below with reference to Fig. 1 and Fig. 2.
Step S110: when the Loongson server is powered on, the kernel boot program of the Loongson server's input/output system (PMON for short) passes parameters and starts the system kernel of the Linux operating system (the standard kernel, also called the first kernel), wherein the parameters include a parameter that reserves storage space for the capture kernel of the Linux operating system (the crash kernel for short; it is the second kernel, which runs when the system crashes).
Specifically, the Loongson server of the domestically developed MIPS architecture is first powered on. The PMON kernel boot program passes parameters and starts the first kernel of the Linux system. One of the parameters passed is "crashkernel=XXX@YYY", where "XXX" is the preset memory size and "YYY" is the offset, that is, the start address of the memory region that the system reserves for the second kernel. The purpose of this parameter is to reserve memory of size "XXX", starting at the address given by "YYY", for the capture kernel (the crash kernel, also called the second kernel).
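The "crashkernel=XXX@YYY" parameter can be illustrated with a short parsing sketch. This is not the kernel's own parser; the function names and the accepted suffixes (K, M, G) are assumptions made for this example only.

```python
import re

# Size suffixes accepted in this sketch (binary multiples); an assumption.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '128M' or '0x2000000' into a byte count."""
    match = re.fullmatch(r"(0[xX][0-9a-fA-F]+|\d+)([KMG]?)", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    return int(match.group(1), 0) * _UNITS[match.group(2).upper()]


def parse_crashkernel(cmdline):
    """Return (size, start) in bytes from a 'crashkernel=SIZE@START' argument.

    Returns None when the command line carries no crashkernel argument.
    """
    prefix = "crashkernel="
    for arg in cmdline.split():
        if arg.startswith(prefix):
            size_text, sep, start_text = arg[len(prefix):].partition("@")
            if not sep:
                raise ValueError("this sketch requires an explicit @START offset")
            return parse_size(size_text), parse_size(start_text)
    return None


if __name__ == "__main__":
    size, start = parse_crashkernel("console=ttyS0 crashkernel=128M@32M")
    print(f"reserve {size:#x} bytes starting at {start:#x}")
```

The returned pair corresponds to "XXX" (size) and "YYY" (start address) in the text.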
Note that the system kernel of the Linux operating system running on the Loongson server supports the crash dump service.
So that the system kernel running on the Loongson server of the domestically developed MIPS architecture can support the crash dump service (the kdump service for short), options such as CONFIG_KEXEC_CRASH, CONFIG_KEXEC, CONFIG_SYSFS and CONFIG_DEBUG_INFO are selected in the kernel configuration when the system kernel of the Linux operating system is compiled.
In addition, the capture kernel of the Linux operating system in this embodiment must support the MIPS architecture of the Loongson server. The crash kernel is an image in vmlinux format, that is, an uncompressed ELF image file. To support the MIPS architecture, when the capture kernel is compiled for the domestic Loongson server platform of the MIPS64 architecture, kernel options such as CONFIG_NUMA and CONFIG_SMP must be removed and the CONFIG_CRASH_DUMP option added.
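The two sets of build options above can be checked mechanically. The following is a minimal sketch that scans the text of a kernel .config file; the option names come from the description, while the helper names and the treatment of "=m" as enabled are assumptions of this example.

```python
# Options named in the description for each kernel.
STANDARD_REQUIRED = ("CONFIG_KEXEC_CRASH", "CONFIG_KEXEC", "CONFIG_SYSFS", "CONFIG_DEBUG_INFO")
CAPTURE_REQUIRED = ("CONFIG_CRASH_DUMP",)
CAPTURE_FORBIDDEN = ("CONFIG_NUMA", "CONFIG_SMP")


def enabled_options(config_text):
    """Collect option names set to 'y' or 'm' in .config-style text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skips blanks and '# CONFIG_X is not set' comments
        name, sep, value = line.partition("=")
        if sep and value in ("y", "m"):
            enabled.add(name)
    return enabled


def check_config(config_text, required, forbidden=()):
    """Return a list of human-readable problems; an empty list means OK."""
    enabled = enabled_options(config_text)
    problems = [f"{opt} is not enabled" for opt in required if opt not in enabled]
    problems += [f"{opt} must be disabled" for opt in forbidden if opt in enabled]
    return problems


if __name__ == "__main__":
    capture_config = "CONFIG_CRASH_DUMP=y\n# CONFIG_SMP is not set\nCONFIG_NUMA=y\n"
    for problem in check_config(capture_config, CAPTURE_REQUIRED, CAPTURE_FORBIDDEN):
        print(problem)
```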
Step S120: start the crash dump service of the Linux operating system so that the image file and boot parameters of the capture kernel are loaded into the storage space.
Specifically, the kdump service in the Linux operating system running on the Loongson server of the domestically developed MIPS architecture is started. Starting the service executes the /etc/init.d/kdump script, which runs the /sbin/kexec command to load the image file of the crash kernel, and the boot parameters passed to the crash kernel, into the storage space reserved in step S110.
More specifically, the /etc/init.d/kdump script executes the kexec command of the kexec-tools package. The kexec command builds four kexec segments: (1) the image file segment of the crash kernel; (2) the command-line segment passed to the crash kernel; (3) the segment holding the standard kernel's memory information, the vmcoreinfo file and the per-processor information; (4) the backup region segment that stores backup data.
The kexec command packs the information about these four segments, together with the memory ranges of the standard kernel, into a kexec_info structure and copies it from user space to kernel space through the sys_kexec_load system call. Using the physical start address and size recorded in this structure for each of the four segments (the code and data of the crash kernel image file; the crash kernel command line; the standard kernel memory information, vmcoreinfo and per-processor information; and the backup region that stores backup data), the kernel copies the content of each segment from user space to its correct location in physical memory.
Note that, to make kexec support the MIPS architecture, the part of the kexec tool that obtains the memory region list of the currently running system was modified, as were the start addresses used when the tool generates the four kexec segments described above: the second-kernel segment; the command-line segment; the segment made up of the ELF program header information of the first kernel, the vmcoreinfo information and the current state of each processor; and the backup data segment. The addresses of these four segments are computed from the physical memory address layout of the MIPS architecture.
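As an illustration of computing segment start addresses inside the reserved region, here is a minimal sketch. It simply packs four segments back to back on page boundaries; the real kexec tool follows the MIPS physical memory layout, which the description does not spell out, so the packing rule, the 16 KiB page size and all names here are assumptions.

```python
from dataclasses import dataclass

PAGE_SIZE = 16 * 1024  # assumed page size for this sketch


@dataclass
class Segment:
    name: str
    start: int  # physical start address
    size: int   # size rounded up to whole pages


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def layout_segments(region_start, region_size, sizes):
    """Place (name, size) pairs consecutively inside the reserved region."""
    segments = []
    cursor = align_up(region_start, PAGE_SIZE)
    for name, size in sizes:
        padded = align_up(size, PAGE_SIZE)
        segments.append(Segment(name, cursor, padded))
        cursor += padded
    if cursor > region_start + region_size:
        raise ValueError("segments do not fit in the reserved region")
    return segments


if __name__ == "__main__":
    # Hypothetical sizes for the four kexec segments named in the text.
    wanted = [
        ("crash kernel image", 10_000_000),
        ("command line", 512),
        ("elf headers, vmcoreinfo, per-cpu notes", 20_000),
        ("backup region", 640 * 1024),
    ]
    for seg in layout_segments(0x2000000, 0x8000000, wanted):
        print(f"{seg.name:40s} start={seg.start:#x} size={seg.size:#x}")
```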
Note that kexec, from the kexec-tools package, is a kernel-to-kernel boot loader: from within a running kernel it can start another, new kernel without going through the BIOS. In essence, kexec preloads the new kernel and keeps it in memory. The memory that holds the new kernel image need not be contiguous; kexec keeps track of the pages in which the new kernel is stored. On restart, kexec copies the new kernel image to the location from which it will run, executes some setup code, and then hands control to the new kernel.
Kexec consists of two parts: a kernel-space part and a user-space part. The kernel-space part implements a new system call, kexec_load(), which helps preload the new kernel. The user-space part, also known as the kexec tool, parses the kernel image, prepares the appropriate parameter segment and setup code segment, and passes this data to the running kernel through the newly implemented system call.
Step S130: when the system kernel of the Linux operating system crashes, the processor executing the Linux operating system hands control to the capture kernel and starts it, and the memory image of the crashed system kernel is saved as a dump file.
Specifically, when the Linux system running on the Loongson server of the domestically developed MIPS architecture crashes, the system enters the kernel's crash handler. The processor executing the Linux operating system, that is, the processor that executes the crash handler of the crash dump service, stops the other processors on the Loongson server. For example, a Loongson 3A server has 8 processors in total, so the other 7 processors must be stopped.
The processor then prepares the command-line parameters and environment variable parameters for starting the crash kernel (the command line can include the elfcorehdr parameter) and jumps to the entry address of the crash kernel to start it. Finally, the other processors are initialized so that they run normally.
In this step, a Linux system running on the Loongson server of the domestically developed MIPS architecture can crash in the following situations:
(1) A hard lockup occurs and the NMI watchdog is configured; the kernel then triggers die_nmi(), which crashes the system.
(2) die() is called in interrupt context, or die() is called while panic_on_oops is set; the kernel then crashes.
(3) While the system is running, the Alt, SysRq and c keys are pressed simultaneously on the keyboard, or echo c > /proc/sysrq-trigger is entered at a system terminal.
(4) The BUG(), BUG_ON() or panic() function is called in the kernel.
(5) Kernel-mode code dereferences a null pointer or divides by zero.
When any of the above crashes the system, the system enters the crash handler. The processor executing the crash handler sends inter-processor interrupts to the other processors. On receiving the interrupt, each of the other processors pushes the values of its registers onto the stack, then spins, repeatedly checking the status flag that indicates the capture kernel is ready to be started (also called the restart-ready flag).
When the kernel crashes, the system as a whole does not stop: each processor keeps running, and interrupts continue to occur frequently. Both can destroy the state at the time of the crash. If that state cannot be preserved when the system crashes, a memory dump is meaningless.
An interrupt suspends the program the processor is executing so that a special event can be handled. When a processor receives an interrupt signal, it pushes the values of its current registers onto the stack and then jumps to the interrupt handler. After the handler finishes, the saved register values are restored and execution continues. Pushing the register values onto the stack is called saving the interrupt context. As with an interrupt context, the processor registers are an important part of reconstructing the crash state. Therefore, when the system crashes, the register values of each processor must be saved so that the program's execution position and other information can be examined. For an interrupt, the register values are kept on the in-memory stack; for a memory dump, they must be written to a file so that they are not lost.
In addition, when PMON passes parameters to the standard kernel, the parameter information is placed immediately after the end address of the standard kernel. Once the standard kernel runs, it overwrites the memory after its end address, so the environment variables and command-line parameters are lost. As a result, after a crash the system would enter the crash kernel without command-line parameters or environment variable parameters and could not start normally. To solve this problem, one page of memory is reserved in head.s, and after it starts the standard kernel saves the environment variables and command-line parameters in a static array.
The processor executing the crash handler copies the environment variable parameters from the static array, together with the command-line parameters passed when the kexec command loaded the crash kernel, into the page reserved in head.s. Once everything is ready, it sets the restart-ready status flag.
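The copy into the reserved page can be pictured with a small sketch. The description does not give the in-memory layout, so the format below (NUL-terminated strings, command line first, an empty string as terminator), the 16 KiB page size and the function names are all assumptions chosen for illustration.

```python
RESERVED_PAGE_SIZE = 16 * 1024  # assumed size of the page reserved in head.s


def pack_boot_params(cmdline, env, page_size=RESERVED_PAGE_SIZE):
    """Pack the command line and environment into one zero-padded page."""
    parts = [cmdline.encode("ascii") + b"\0"]
    for key, value in env.items():
        parts.append(f"{key}={value}".encode("ascii") + b"\0")
    payload = b"".join(parts) + b"\0"  # an empty string ends the list
    if len(payload) > page_size:
        raise ValueError("parameters do not fit in the reserved page")
    return payload.ljust(page_size, b"\0")


def unpack_boot_params(page):
    """Recover (cmdline, env) from a page written by pack_boot_params."""
    body = page.split(b"\0\0", 1)[0]
    cmdline, *env_items = body.decode("ascii").split("\0")
    env = dict(item.split("=", 1) for item in env_items)
    return cmdline, env


if __name__ == "__main__":
    page = pack_boot_params("console=ttyS0 elfcorehdr=0x3f00000", {"memsize": "256"})
    print(len(page), unpack_boot_params(page))
```

Checking the size before the copy matters here because the crash kernel has only the single reserved page to read from.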
The other processors now jump to the kexec_smp_wait assembly code, and the processor executing the crash handler jumps to the relocate_new_kernel assembly code. In kexec_smp_wait, the other processors wait until the processor executing the crash handler has copied the crash kernel to its correct location; they then jump to the entry point of the crash kernel together with that processor.
When a Loongson processor of the domestically developed MIPS architecture is powered on, all processors start running, and the first instruction executed after power-on is at 0xffffffffbfc00000. At this point only processor 0 loads the kernel and jumps to the kernel entry point; the other processors spin, waiting for an inter-processor interrupt from processor 0 before they complete initialization and start executing kernel code. If the processors do not start executing kernel code in this order, the kernel runs abnormally. Because of this, when all processors jump to the entry point of the crash kernel after a crash, each must first check whether it is processor 0. If it is not, it spins and waits. If it is, processor 0 continues and, after its initialization completes, sends inter-processor interrupts to the other processors so that they also start executing kernel code.
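The "processor 0 goes first" rule can be modelled with a tiny simulation. This is only a toy model of the ordering, not kernel code; the function name is hypothetical, and waking the waiting processors in ascending ID order is an assumption.

```python
def crash_kernel_boot_order(arrival_order):
    """Return the order in which CPUs start executing crash-kernel code.

    arrival_order lists CPU IDs in the order they reach the entry point.
    CPU 0 proceeds at once; every other CPU spins until CPU 0 has finished
    initialization and sent it an inter-processor interrupt.
    """
    if 0 not in arrival_order:
        raise ValueError("processor 0 never reached the entry point")
    started = []
    spinning = []
    for cpu in arrival_order:
        if cpu == 0:
            started.append(cpu)   # only CPU 0 may run kernel code first
        else:
            spinning.append(cpu)  # all others wait for an IPI
    # After its own initialization, CPU 0 wakes the spinning CPUs.
    started.extend(sorted(spinning))
    return started


if __name__ == "__main__":
    # The CPU that handled the crash (here CPU 5) arrives first,
    # yet CPU 0 still boots the crash kernel.
    print(crash_kernel_boot_order([5, 2, 0, 7]))
```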
A kernel can be restarted in crash mode or in normal mode. In crash mode, the restart is triggered because an error caused the running system to crash. In normal mode, the running system has not encountered any error; instead, the kernel to be restarted is loaded into memory with the kexec command, using the following syntax:
kexec -l <kernel-image> --append="<command-line-option>"
Once the kernel has been loaded successfully, the kexec -e command restarts into the newly loaded kernel. This embodiment uses crash mode.
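The two modes map onto different kexec invocations. The sketch below only builds the argument lists and does not run anything. The -l, -e and --append options appear in the text above; -p (load a kernel to be started on panic) and --initrd are options of the upstream kexec-tools command, and whether the patent's modified tool is invoked the same way is an assumption. The file paths are placeholders.

```python
import shlex


def kexec_load_command(kernel_image, append, crash=False, initrd=None):
    """Build the argv that loads a kernel for a later kexec start.

    crash=False -> 'kexec -l' (normal mode, started later with 'kexec -e')
    crash=True  -> 'kexec -p' (crash mode, started automatically on panic)
    """
    argv = ["kexec", "-p" if crash else "-l", kernel_image, f"--append={append}"]
    if initrd is not None:
        argv.append(f"--initrd={initrd}")
    return argv


def kexec_exec_command():
    """Build the argv that boots into a kernel loaded with 'kexec -l'."""
    return ["kexec", "-e"]


if __name__ == "__main__":
    # Placeholder paths and parameters, for illustration only.
    print(shlex.join(kexec_load_command("/boot/vmlinux", "root=/dev/sda1")))
    print(shlex.join(kexec_exec_command()))
    print(shlex.join(kexec_load_command("/boot/vmlinux-kdump", "root=/dev/sda1 maxcpus=1", crash=True)))
```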
The step of saving the memory image of the crashed system kernel as a dump file is described in detail below.
Specifically, after the crash kernel starts normally, it executes the vmcore_init initialization function, which checks whether the value of the elfcorehdr parameter on the command line is ELFCORE_ADDR_ERR or ELFCORE_ADDR_MAX. If it is neither, the function creates the /proc/vmcore file, which wraps the run-time state and memory space of the first kernel (the standard kernel) in ELF format. The /etc/init.d/kdump script is then executed automatically to start the kdump service. The script checks whether /proc/vmcore exists; if it does, the script calls the makedumpfile command provided by the makedumpfile tool to generate the dump file. After the dump file has been generated successfully, the system restarts.
Note that the makedumpfile tool analyzes the run-time state and memory space of the standard kernel and stores this information in a file on disk, so that kernel developers can analyze the cause of the crash from the stored memory information. Interface functions specific to the MIPS architecture, corresponding to the tool's common interfaces, were added to makedumpfile. There are three: one obtains the physical base address (the absolute address of the physical memory location where code or data resides; here, the physical start address of the memory where the crashed kernel is located); one obtains the address of the first element of the discontiguous memory region list in the crashed kernel and the start address of each discontiguous memory region; and one translates virtual addresses to physical addresses when information is read from the crashed kernel.
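To give a feel for the virtual-to-physical translation interface, here is a sketch that handles only the unmapped MIPS64 kernel segments (the compatibility kseg0/kseg1 windows and xkphys), where the physical address is obtained by masking. It is not the code added to makedumpfile; mapped addresses need a page-table walk, which is omitted, and the 59-bit xkphys mask is an assumption that varies between processors.

```python
# MIPS64 address-space boundaries for the unmapped kernel segments.
CKSEG0_START = 0xFFFFFFFF80000000  # cached, unmapped (32-bit compat kseg0)
CKSSEG_START = 0xFFFFFFFFC0000000  # first address after kseg1
XKPHYS_START = 0x8000000000000000
XKPHYS_END = 0xC000000000000000
KSEG_PHYS_MASK = 0x1FFFFFFF        # kseg0/kseg1 cover 512 MiB of physical memory
XKPHYS_PHYS_MASK = (1 << 59) - 1   # assumption: strips region and cache-attribute bits


def unmapped_virt_to_phys(vaddr):
    """Translate an unmapped kernel virtual address to a physical address.

    Returns None for addresses that would need a page-table walk.
    """
    if CKSEG0_START <= vaddr < CKSSEG_START:
        return vaddr & KSEG_PHYS_MASK
    if XKPHYS_START <= vaddr < XKPHYS_END:
        return vaddr & XKPHYS_PHYS_MASK
    return None


if __name__ == "__main__":
    # The power-on address mentioned in the description lies in kseg1.
    print(hex(unmapped_virt_to_phys(0xFFFFFFFFBFC00000)))
```

For the power-on address 0xffffffffbfc00000 quoted earlier, this masking yields physical address 0x1fc00000.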
If the elfcorehdr parameter value is ELFCORE_ADDR_ERR or ELFCORE_ADDR_MAX, the vmcore_init function returns immediately without creating the /proc/vmcore file. Because /proc/vmcore does not exist in this case, the /etc/init.d/kdump script that starts the kdump service after boot does not call the makedumpfile tool to generate a dump file.
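The decision made by vmcore_init can be summarized in a few lines. This sketch is a simplification: the two sentinel values mirror how mainline Linux defines them (all-ones and all-ones minus one as 64-bit values), which is an assumption with respect to the kernel used in the patent, and the parsing of "elfcorehdr=" is reduced to a plain integer.

```python
U64_MASK = (1 << 64) - 1
ELFCORE_ADDR_MAX = -1 & U64_MASK  # assumed: "parameter was never set"
ELFCORE_ADDR_ERR = -2 & U64_MASK  # assumed: "parameter could not be parsed"


def parse_elfcorehdr(cmdline):
    """Return the elfcorehdr address from a command line, or a sentinel."""
    prefix = "elfcorehdr="
    for arg in cmdline.split():
        if arg.startswith(prefix):
            try:
                return int(arg[len(prefix):], 0)
            except ValueError:
                return ELFCORE_ADDR_ERR
    return ELFCORE_ADDR_MAX


def should_create_vmcore(cmdline):
    """True when /proc/vmcore would be created under the rule in the text."""
    return parse_elfcorehdr(cmdline) not in (ELFCORE_ADDR_ERR, ELFCORE_ADDR_MAX)


if __name__ == "__main__":
    print(should_create_vmcore("console=ttyS0 elfcorehdr=0x3f00000"))
    print(should_create_vmcore("console=ttyS0"))
```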
Fig. 3 is a schematic diagram of the workflow of the kdump service according to an embodiment of the invention. As shown in Fig. 3, when a system crash (panic) occurs, kdump calls kexec to quickly boot the crash kernel that was prepared in advance. This boot path resembles a fast-boot mechanism: it does not go through the BIOS and counts as a warm boot. After the crash kernel starts, the memory image of the previous kernel is available as /proc/vmcore, and the vmcore file can be copied to a local or remote disk with a file copy command such as cp or scp.
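The save step at the end of this workflow can be sketched as a function that chooses the command to run once the crash kernel is up. Nothing is executed here. The destination path is a placeholder, the function name is hypothetical, and the makedumpfile options shown (-c for compression, -d 31 to filter out pages that are usually not needed) are options of the upstream tool, not values given in the description.

```python
import os


def save_vmcore_command(dest, vmcore="/proc/vmcore", filtered=True, exists=os.path.exists):
    """Return the argv that saves the crash image, or None if there is none.

    filtered=True  -> makedumpfile writes a compressed, filtered dump file
    filtered=False -> plain copy with cp (scp if dest looks like host:path)
    """
    if not exists(vmcore):
        return None  # vmcore_init did not create /proc/vmcore
    if filtered:
        return ["makedumpfile", "-c", "-d", "31", vmcore, dest]
    tool = "scp" if ":" in dest else "cp"
    return [tool, vmcore, dest]


if __name__ == "__main__":
    # 'exists' is injected so the sketch can be tried on any machine.
    print(save_vmcore_command("/var/crash/vmcore", exists=lambda path: True))
    print(save_vmcore_command("backup@host:/dumps/vmcore", filtered=False, exists=lambda path: True))
```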
Fig. 4 is a schematic diagram of the file format of the vmcore memory crash dump file according to an embodiment of the invention. As shown in Fig. 4, a memory dump file in vmcore format consists of a dump header and a series of data pages that contain system memory; its layout is very similar to that of ELF.
The file header has two parts. The first is the generic dump header (Generic Dump Header), which is independent of the system architecture. The second is the architecture dump header (Architecture Dump Header), which is closely tied to the architecture. This two-part header allows each architecture to dump its own CPU-specific data structures. The header is followed by a series of entries, each made up of a dump page header (Dump Page Header) and dump page data (Dump Page Data). The page header contains the size of the page, flags related to the page (such as whether it is compressed), the address of the page, and so on. The page data follows the page header and can be compressed or uncompressed. Data pages are stored one after another until the final end marker (PAGE END). This mechanism allows for different compression methods, different types of data architecture, different page orderings, and so on.
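The page-header and page-data sequence can be illustrated with a round-trip sketch. The description names the page header fields (size, flags, address) but not their widths or order, so the 16-byte little-endian layout, the flag values and the use of zlib below are assumptions. The generic and architecture dump headers are left out.

```python
import struct
import zlib

# Hypothetical page header: 64-bit address, 32-bit flags, 32-bit stored size.
PAGE_HEADER = struct.Struct("<QII")
FLAG_COMPRESSED = 0x1
FLAG_END = 0x2  # stands in for the PAGE END marker


def write_pages(pages, compress=True):
    """Serialize (address, data) pairs as page header + page data records."""
    out = bytearray()
    for address, data in pages:
        flags, payload = 0, data
        if compress:
            packed = zlib.compress(data)
            if len(packed) < len(data):  # keep raw data if compression does not help
                flags, payload = FLAG_COMPRESSED, packed
        out += PAGE_HEADER.pack(address, flags, len(payload)) + payload
    out += PAGE_HEADER.pack(0, FLAG_END, 0)
    return bytes(out)


def read_pages(blob):
    """Parse records written by write_pages until the end marker."""
    pages, offset = [], 0
    while True:
        address, flags, size = PAGE_HEADER.unpack_from(blob, offset)
        offset += PAGE_HEADER.size
        if flags & FLAG_END:
            return pages
        payload = blob[offset:offset + size]
        offset += size
        if flags & FLAG_COMPRESSED:
            payload = zlib.decompress(payload)
        pages.append((address, payload))


if __name__ == "__main__":
    sample = [(0x1000, b"\0" * 4096), (0x2000, bytes(range(256)) * 16)]
    blob = write_pages(sample)
    print(len(blob), read_pages(blob) == sample)
```

Because each record carries its own flags and size, compressed and uncompressed pages can be mixed freely, which is the flexibility the paragraph above describes.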
Step S140: restart the system kernel of the Linux operating system on the Loongson server and analyze the dump file to find the reason the system kernel of the Linux operating system crashed the previous time.
After the system has restarted, an analysis tool is used to analyze the vmcore file that was just saved and to find the cause of the panic.
Specifically, the crash tool can be used to analyze the kernel crash dump file vmcore that is finally generated. The crash tool is designed to be independent of any particular kernel version, and it can be updated to support new kernel code that affects its functionality. The commands that crash provides for kernel core analysis include kernel stack traces for all processors, source code disassembly, formatted display of kernel data structures and variables, virtual memory data, linked-list display, and so on, along with several commands specific to particular kernel subsystems. In addition, crash supports gdb commands through a gdb module extension.
To support the MIPS architecture, the crash tool was modified as follows: implementations of the interface functions specific to the MIPS architecture were added to the crash tool. There are three interfaces: an interface that translates virtual addresses of the crashed kernel into physical addresses, an interface that performs the initialization settings for all necessary hardware of a MIPS-architecture machine, and an interface that obtains the page directory entry pointer of a specified process.
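To give a feel for the first of these interfaces, here is a minimal Python sketch of virtual-to-physical translation for the two unmapped kernel segments of the 32-bit MIPS address map, kseg0 and kseg1, where the physical address is simply the offset from the segment base. This is an illustration of the idea only: it is not the code added to crash, the function name `kseg_to_phys` is hypothetical, and a 64-bit Loongson kernel also uses other segments and page-table-mapped addresses that this sketch does not handle.

```python
KSEG0_BASE = 0x80000000  # unmapped, cached segment
KSEG1_BASE = 0xA0000000  # unmapped, uncached segment
KSEG2_BASE = 0xC0000000  # mapped segment; needs a page-table walk


def kseg_to_phys(vaddr):
    """Translate a 32-bit MIPS kseg0/kseg1 virtual address to physical.

    Raises ValueError for addresses outside these two segments, since
    those require page tables that this sketch does not model.
    """
    if KSEG0_BASE <= vaddr < KSEG1_BASE:
        return vaddr - KSEG0_BASE
    if KSEG1_BASE <= vaddr < KSEG2_BASE:
        return vaddr - KSEG1_BASE
    raise ValueError("address 0x%x is not in kseg0 or kseg1" % vaddr)
```

Both segments are windows onto the same low 512 MB of physical memory, so `kseg_to_phys(0x80100000)` and `kseg_to_phys(0xA0100000)` refer to the same physical location.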
By adding MIPS architecture support to the kexec tool, the makedumpfile tool, and the crash tool, and at the same time adding MIPS architecture support to the kdump mechanism of the existing kernel, the present invention enables a Linux operating system running on a Loongson server of the domestic MIPS architecture to perform a kernel crash dump when the system crashes. Analyzing the field data dumped at the time of the crash with these tools lets kernel developers locate the cause of the crash quickly and accurately, which improves the efficiency and practicality of kernel debugging.
Those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; or they can each be made into a separate integrated circuit module; or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
Although the embodiments disclosed by the present invention are as described above, they are adopted only to aid understanding of the invention and are not intended to limit it. Any person skilled in the technical field of the invention may make any modification or change in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention remains as defined by the appended claims.

Claims (8)

1. A method for handling a crash dump of a Linux operating system based on a Loongson server, characterized by comprising:
When the Loongson server is powered on, the kernel boot program of the input/output system of the Loongson server passes parameters and starts the system kernel of the Linux operating system, wherein the parameters include a parameter for reserving storage space for the capture kernel of the Linux operating system;
Starting the crash dump service of the Linux operating system, the crash dump service loading the image file and the boot parameters of the capture kernel into the storage space;
When the system kernel of the Linux operating system crashes, the processor executing the Linux operating system controls the capture kernel to start, and the memory image of the crashed system kernel is stored as a dump file;
Restarting the system kernel of the Linux operating system on the Loongson server and analyzing the dump file to find the reason the system kernel of the Linux operating system crashed the previous time, wherein,
When the system kernel of the Linux operating system is compiled, the CONFIG_KEXEC_CRASH, CONFIG_KEXEC, CONFIG_SYSFS and CONFIG_DEBUG_INFO options are selected in the kernel options so that the system kernel of the Linux operating system running on the Loongson server supports the crash dump service; when the capture kernel of the Linux operating system is compiled, the CONFIG_NUMA and CONFIG_SMP options are removed from the kernel options and the CONFIG_CRASH_DUMP option is added so that the capture kernel of the Linux operating system supports the MIPS architecture of the Loongson server.
2. The method according to claim 1, characterized in that the step in which the crash dump service loads the image file and the boot parameters of the capture kernel into the storage space further comprises the following steps:
The crash dump service program executes the /etc/init.d/kdump script;
The /etc/init.d/kdump script executes the kexec command of the kexec tool to load the image file and the boot parameters of the capture kernel into the storage space,
Wherein the kexec command comprises four kexec segments: the image file segment of the crash kernel; the segment of command-line information passed to the crash kernel; the segment holding the memory information of the standard kernel, the vmcoreinfo file, and the per-processor information; and the backup region segment that stores backup data;
Wherein the part of the kexec tool that obtains the memory region list of the currently running operating system is modified, and the start addresses of the four kexec segments are modified according to the physical memory address layout of the MIPS architecture of the Loongson server.
3. The method according to claim 1, characterized in that the step in which, when the system kernel of the Linux operating system crashes, the processor executing the Linux operating system controls the capture kernel to start further comprises the following steps:
The Linux operating system enters the crash handler of the system kernel, and the processor causes the other processors on the Loongson server to stop operating;
The processor prepares the command-line parameters and the environment variable parameters for starting the capture kernel;
Based on the command-line parameters and the environment variable parameters, the processor jumps to the first address of the capture kernel to start the capture kernel.
4. The method according to claim 3, characterized in that
The processor executing the crash handler sends inter-processor interrupt signals to the other processors on the Loongson server so that the other processors on the Loongson server stop operating.
5. The method according to claim 3, characterized in that, in the step in which the processor prepares the command-line parameters and the environment variable parameters for starting the capture kernel,
The processor copies the environment variable parameters, together with the command-line parameters passed when the kexec command loaded the image file of the capture kernel, into the memory space reserved in head.s, and then sets the status flag indicating that the capture kernel is ready to be started.
6. The method according to claim 5, characterized in that the step of storing the memory image of the crashed system kernel as a dump file further comprises the following steps:
After the capture kernel has started, the vmcore_init initialization function is executed;
The vmcore_init initialization function determines whether the value of the elfcorehdr parameter in the command-line parameters is ELFCORE_ADDR_ERR or ELFCORE_ADDR_MAX,
If the value of the elfcorehdr parameter is neither ELFCORE_ADDR_ERR nor ELFCORE_ADDR_MAX, the /proc/vmcore file is created and the memory image of the crashed system kernel is stored in the /proc/vmcore file in ELF file format;
The crash dump service program is started and calls the makedumpfile command of the makedumpfile tool, which generates the dump file from the /proc/vmcore file holding the memory image of the crashed system kernel.
7. The method according to any one of claims 1 to 6, characterized in that
Interface functions for the MIPS architecture of the Loongson server are added to the makedumpfile tool, comprising three interfaces: an interface for obtaining the physical base address; an interface for obtaining the address of the first element of the discontiguous memory region linked list in the crashed kernel and the start address of the discontiguous memory region; and an interface for translating virtual addresses into physical addresses when reading information from the crashed kernel.
8. The method according to any one of claims 1 to 6, characterized in that
The dump file is analyzed with the crash tool to find the reason the system kernel of the Linux operating system crashed the previous time,
Wherein interface functions for the MIPS architecture of the Loongson server are added to the crash tool, comprising three interfaces: an interface that translates virtual addresses of the crashed kernel into physical addresses, an interface that performs the initialization settings for all hardware of a MIPS-architecture machine, and an interface that obtains the page directory entry pointer of a specified process.
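The kernel build options named in claim 1 lend themselves to a mechanical check. The following Python sketch parses the text of a kernel `.config` file and reports which of those options are missing from a system kernel configuration, and which are missing or unexpectedly present in a capture kernel configuration. The option names are taken from claim 1 as written; the function names and the idea of running such a check are assumptions made for illustration and are not part of the claimed method.

```python
# Option names copied from claim 1.
SYSTEM_KERNEL_REQUIRED = {
    "CONFIG_KEXEC_CRASH",
    "CONFIG_KEXEC",
    "CONFIG_SYSFS",
    "CONFIG_DEBUG_INFO",
}
CAPTURE_KERNEL_REQUIRED = {"CONFIG_CRASH_DUMP"}
CAPTURE_KERNEL_FORBIDDEN = {"CONFIG_NUMA", "CONFIG_SMP"}


def enabled_options(config_text):
    """Return the set of option names enabled in a .config file's text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_X is not set" comments.
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value and value != "n":
            enabled.add(name)
    return enabled


def check_system_kernel(config_text):
    """Return the sorted list of required options that are not enabled."""
    return sorted(SYSTEM_KERNEL_REQUIRED - enabled_options(config_text))


def check_capture_kernel(config_text):
    """Return (missing, unexpected) option lists for a capture kernel."""
    enabled = enabled_options(config_text)
    missing = sorted(CAPTURE_KERNEL_REQUIRED - enabled)
    unexpected = sorted(CAPTURE_KERNEL_FORBIDDEN & enabled)
    return missing, unexpected
```

An empty result from either function means the configuration matches what claim 1 describes for that kernel.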

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210437092.0A CN102929747B (en) 2012-11-05 2012-11-05 Method for treating crash dump of Linux operation system based on loongson server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210437092.0A CN102929747B (en) 2012-11-05 2012-11-05 Method for treating crash dump of Linux operation system based on loongson server

Publications (2)

Publication Number Publication Date
CN102929747A CN102929747A (en) 2013-02-13
CN102929747B true CN102929747B (en) 2015-07-01

Family

ID=47644553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210437092.0A Active CN102929747B (en) 2012-11-05 2012-11-05 Method for treating crash dump of Linux operation system based on loongson server

Country Status (1)

Country Link
CN (1) CN102929747B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant