CN108153587B - Slow task reason detection method for big data platform - Google Patents
Slow task reason detection method for big data platform Download PDFInfo
- Publication number
- CN108153587B CN108153587B CN201711436008.2A CN201711436008A CN108153587B CN 108153587 B CN108153587 B CN 108153587B CN 201711436008 A CN201711436008 A CN 201711436008A CN 108153587 B CN108153587 B CN 108153587B
- Authority
- CN
- China
- Prior art keywords
- task
- time
- slow
- tasks
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G06F11/3433—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment for load management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Debugging And Monitoring (AREA)
Abstract
The processing process of the application program of the big data is generally divided into a plurality of stages, each stage divides a plurality of tasks to be executed on a plurality of nodes in parallel, the tasks generally execute the same code, and the next stage can be executed only when all the tasks of one stage are completely finished. During the processing process, the execution time of some tasks is too long due to a plurality of factors, the tasks greatly slow the execution time of the program, and the detection of the reasons (fault diagnosis) causing the slow tasks helps a big data application developer to improve the performance of the big data application. The slow task cause detection method for the big data platform provided by the invention obtains relevant characteristics by a periodic hardware information sampling and running log analysis method and obtains the cause of the slow task by applying a statistical method.
Description
Technical Field
The invention relates to big data application performance analysis, resource monitoring, performance bottleneck diagnosis and visualization.
Background
The development of the internet has made data exponentially accumulated over the last decade, and large data has become more and more widely used in various fields. The big data processing problem can be basically simplified into dividing the data into small data blocks, then each machine processes one small data block, which is called a task, and when some machine processes the task too slowly, the execution speed of the whole program is slowed down, and these tasks are called slow tasks (diaggler tasks). Under the cloud computing environment required by microsoft, the execution time length of 10% of slow tasks is 10 times of the median of the execution time of the tasks. Research into google cloud computing environments also found that the slowest 5% of tasks resulted in 99% delay. There are also similar log analyses in production environments, all consistently verifying that slow tasks cause significant delays.
The existing methods mainly focus on how to eliminate the influence of slow tasks by speculative execution, detect slow tasks during program running, and then distribute the tasks to idle machines for execution. The disadvantage of this approach is that the slow task generation is caused by a number of factors, and if the execution time is too long due to network congestion, moving the task to another machine causes greater congestion; if the execution time is too long due to data skew (too much data is processed by the task), migrating the task to another machine also does not allow the program to execute faster. Furthermore, speculative execution can take up additional resources, which can result in the entire cluster being in a highly loaded state, making it inconvenient for multiple users to share computing resources. The existing slow task cause positioning mainly comprises the following methods:
1. correlation analysis
The method mainly judges whether the appearance of the slow task is accompanied by the appearance of certain characteristics, and has great disadvantages that the characteristics accompanied with the appearance of the slow task are not necessarily the reasons for the appearance of the slow task, for example, the high resource occupancy rate sometimes can be caused by the task itself, and in this case, the high resource occupancy rate causes the slow task;
2. pile inserting
The method mainly comprises the steps of inserting piles into a big data platform, obtaining detailed scheduling information, and calculating the time for executing each step by a task so as to find out the reason of the slow task. The method has the disadvantages that the required information can be obtained only by pile insertion, and the method is inconvenient to deploy in a production environment; moreover, many features which may cause slow tasks to be generated cannot measure time, and the slow tasks obtained by the method are not comprehensive enough in reason;
3. top down analysis
The method mainly comprises the steps of appointing a characteristic sequence arranged according to priority, then sequentially checking whether the characteristics appear in the slow task in the execution process, and stopping searching once the characteristics are found to meet the conditions, wherein the method has the defects that a plurality of characteristics cannot be positioned for the slow task, and the determination of the priority has strong human factors and is not objective; the method also does not compare different tasks in the same stage, and can not accurately position the reason of the slow task.
Disclosure of Invention
The invention provides a slow task reason analysis method for a big data platform, which is characterized by comprising the following steps of obtaining characteristics based on a big data platform offline log analysis and sampling log analysis mode, and then comparing the characteristics of a slow task with the characteristics of different tasks at the same stage to obtain a slow task reason; the method has the advantages that the internal cause and the external cause of the task granularity can be judged, so that a user can conveniently position the bottleneck of the application program and improve the execution time of the application program. The method comprises the following steps (1) to (9):
step (1) obtaining original log information from a cluster scheduler;
the cluster scheduler is responsible for scheduling the user application program, and integrates the log information to form original log information after the user application program is finished, and the original log information is sent to the fault analyzer;
step (2) the fault analyzer analyzes original log information, a resource occupation sequence, a load generation time period sequence and a task object sequence are obtained; the fault analyzer analyzes original log information of different sources, analyzes resource occupation logs into resource occupation sequences which are separately stored according to computing nodes and are arranged according to a time sequence, analyzes load occupation logs into load generation time period sequences which are separately stored according to the computing nodes and are arranged according to the time sequence, and analyzes big data log information into a task object sequence which is arranged according to a task sequence number and contains original characteristics;
step (3) fusing resource occupation information and load generation information into a task object sequence;
traversing the task object sequence, finding out the node where the task is located and the time span information, finding out the corresponding resource occupation information from the resource occupation sequence, averaging and storing the information into the task object; traversing a load generation time period sequence, if the load generation time period is overlapped with the task span, storing the load information into a task object to show that the task runs under the influence of the load;
step (4) obtaining the execution time of each task and the execution times of all tasks at the stage of the task;
traversing all tasks, finding all tasks of the stage where each task is located, and recording the running time of the tasks into the stage object;
step (5) slow task information is obtained by comparing the execution time of the task with the median of the execution times of all tasks at the stage of the task, the median of the execution times of all tasks of the object at the stage of the task is found, if the execution time of a certain task is more than 1.5 times of the median, the task is considered to be a slow task, and the task is added into a slow task index;
step (6), cleaning and normalizing data from the task object, and extracting required features;
data cleansing refers to removing useless features; encoding discrete features extracted from an original task (the discrete features in a log are generally uncoded character strings, and the encoding of the discrete features follows the principle that the more the task performance is influenced, the larger the encoding result value is); the characteristics causing the slow task comprise discrete characteristics and numerical characteristics, the discrete characteristics comprise data locality characteristics, the numerical characteristics comprise time characteristics and non-time characteristics, and the non-time characteristics comprise resource occupation characteristics and common numerical characteristics; dividing the time characteristic by the time of the task execution to obtain a normalized time characteristic, and dividing the non-time characteristic by the average value of the characteristics of all the tasks in the stage to obtain a normalized non-time characteristic;
step (7) acquiring a feature set of all task objects of the application program, and counting global quantile information of each feature; the method comprises the following steps:
establishing a global index for each feature, traversing each task object, adding the features of the task into an array type data structure corresponding to the global feature index, and then counting quantiles for all the features according to a threshold value specified by a configuration file;
step (8) traversing each slow task; the following decision logic is performed for all slow tasks and each of their features, including the steps of:
(8-1) if the feature is a numerical feature, judging (8-3), otherwise, judging (8-2);
(8-2) if the feature is an abnormal feature and the feature of other tasks in the same stage is a non-abnormal feature, wherein the abnormal feature is that the value of the feature is not 0 and is more than a plurality of times of the average value of all the features in the same stage, and the times are specified by a configuration file, judging that the feature is a slow task reason, otherwise, judging that the feature is not the slow task reason, and ending the judgment;
(8-3) if the feature is a time feature, judging (8-4), otherwise, judging (8-5);
(8-4) judging whether the characteristic is larger than a preset threshold value, wherein the condition is that the duration represented by the characteristic has great influence on the task execution time, and the preset threshold value is set according to the configuration of a user. If yes, judging (8-5), otherwise, if not, ending the judgment, wherein the characteristic is not the reason of the slow task;
(8-5) whether the feature is larger than the global quantile of the feature or not, if so, judging (8-6), if not, judging that the feature is not a slow task reason, and ending the judgment; the condition is to ensure that the feature is not only larger than other features in the same stage, but also large enough in the global scope;
(8-6) whether the feature is a resource occupation feature, if so, executing an edge detection algorithm, and judging (8-7), otherwise, judging (8-8).
(8-7) whether the feature is raised at the beginning of the task and lowered at the end of the task, this step being mainly to filter the cases of too high resource occupancy caused by the task itself. If so, the feature is not considered as the reason of the slow task, and the judgment is finished, otherwise, the judgment is continued (8-8);
(8-8) whether the characteristic is more than a plurality of times of the median of the characteristics of other tasks in the stage, wherein the multiple is specified by a configuration file, if so, the characteristic is a slow task reason, otherwise, the characteristic is not the slow task reason, and the judgment is finished.
And (9) displaying the slow task reason to the user. And displaying the resource occupation sequence, the load generation sequence, the slow task sequence and the slow task reason analysis to a user through a graphical interface.
Further, in the above method for detecting slow task reasons for a big data platform, the cluster scheduler in step (1) may schedule a resource load generator of a computing node in addition to a user application, where the load generator includes: CPU occupation generator, disk occupation generator and network resource occupation generator.
Further, in the slow task cause detection method for the big data platform,
the data locality characteristics comprise:
(3-1-1) location of data deposit for task processing
Flocality=0,PROCESS_LOCAL
=1,NODE_LOCAL
=2,otherwise
Wherein, processing _ LOCAL represents the address space of the data needed by the task in the PROCESS, NODE _ LOCAL represents the data needed by the task on the NODE, and otherwise represents the data needed by the task at other positions;
the time characteristics comprise:
(3-2-1) time F taken for the task to perform serializationserializeThe calculation method is as follows:
Fserialize=T/Tavg
wherein T is the time of the task serialization, TavgAveraging serialization time for all tasks at the stage of the task;
(3-2-2) time F taken for task to perform deserializationdeserializeThe calculation method is as follows:
Fdeserialize=T/Tavg
wherein T is the time of the task deserialization, TavgAveraging deserialization time for all tasks at the stage of the task;
(3-2-3) time F occupied by garbage collection of task in JVMJVMThe calculation method is as follows:
FJVM=T/Tavg
wherein T is the time for the task to run JVM garbage collection, TavgAveraging the JVM garbage collection time for all tasks at the stage of the task, wherein the larger the index value is, the more time the task spends on JVM garbage collection is, the larger the influence on the performance is;
the resource occupation characteristics are specifically as follows:
(3-3-1) CPU resource occupancy rate FCPUNamely the system CPU resource occupancy rate obtained by sampling during the running period of the application program, the CPU characteristic during the running period of one task is calculated as follows:
wherein, user _ timetIs the user CPU time, total _ time, sampled each time during the task executiontIs the total CPU time, t, obtained during the execution of the task0Is the start time of the task, t1The task end time is, for a multi-core CPU, the above formula needs to average the number of cores, and the index reflects the busy degree of the CPU;
(3-3-2) occupancy rate of disk resources FdiskThat is, the disk I/O resource occupancy rate obtained by sampling during the running of the application program, the calculation formula is as follows:
I/O_timetis the system waiting I/O time, total _ time, in the last sampling interval obtained by each samplingtIs a sampling interval, the index reflects the busy degree of the system disk I/O;
(3-3-3) the occupancy rate of network resources, namely the sending and receiving network traffic (byte) obtained by sampling during the running period of the application program, and the calculation formula is as follows:
Bytes_senttis the data traffic (byte) sent by the monitored network card during sampling, Bytes _ receivedtThe network card is the data traffic (byte) received by the network card, and the index reflects the busy degree of the network equipment of the system;
the common numerical characteristics include:
(3-4-1) amount of data processed by task Fread_bytesThe calculation method is as follows:
Fread_bytes=R/Ravg
wherein, R is the byte number read by the task, RavgThe number of bytes read for all tasks in this stage (i.e. a group of tasks executing the same code) on average, which index reflects the severity of the task data skew (more data processed than other tasks);
(3-4-2) number of bytes read for task shuffle Fshuffle_read_bytesThe calculation method is as follows:
Fshuffle_read_bytes=B/Bavg
b is the number of bytes read by the task shuffle, BavgThe number of bytes read by the average shuffle of all the tasks at the stage is shown, and the index reflects the data inclination degree of the task processing shuffle read task;
(3-4-3) number of bytes read for task shuffle Fshuffle_write_bytesThe calculation method is as follows:
Fshuffle_write_bytes=B/Bavg
b is the number of bytes written in the task shuffle, BavgThe number of bytes written in all the tasks in the stage is averagely shuffled, and the index reflects the data inclination degree of the task processing the shuffled writing task;
(3-4-4) byte number F of overflow of task temporary storage data file into memorymemory_bytes_spilledThe calculation method is as follows:
Fmemory_bytes_spilled=B/Bavg
b is the number of bytes overflowing to the memory by the task, BavgThe number of bytes overflowing to the memory is the average number of bytes of all tasks at the stage of the task, and the index reflects the size of the data cached by the task in the memory;
(3-4-5) byte number F of overflow of task temporary storage data file to diskdisk_bytes_spilledThe calculation method is as follows:
Fdisk_bytes_spilled=B/Bavg
b is the number of bytes overflowing to the disk by the task, BavgAnd the index reflects the size of the data cached by the task in the disk for the number of bytes which are overflowed to the disk by all the tasks in the stage of the task on average.
Further, in the slow task cause detection method for the big data platform, different limiting conditions are applied for different types of features, specifically as follows:
for the data locality characteristics, if the data locality characteristics of a certain task are 2, and the sum of the characteristics of other tasks at the stage of the task is less than 1/2 of the number of tasks;
for a common numerical feature, if the feature of a task is greater than 0.75 quantile of the feature during the execution of the whole program and the feature is greater than 2 times the average of other features at the stage of the task;
for the time characteristic, in addition to the limit condition of the common numerical characteristic, an additional condition needs to be met, namely the time occupied by the characteristic is more than 0.2 time of the execution time of the whole task;
for the resource occupation characteristics, in addition to meeting the limiting conditions of the common numerical characteristics, an additional condition needs to be met, namely whether the average value of the resource occupation within 5 seconds before the start of the task is less than 0.5 time of the resource occupation characteristics of the task or not; whether the resource occupancy rate is reduced when the task is finished, namely whether the average value of the resource occupancy within 5 seconds after the task is started is less than 0.5 time of the resource occupancy characteristics of the task; this feature is not responsible for slow tasks if both of the above conditions are met.
The slow tasks are displayed on the time axis due to the fact that the difference of the execution time of the tasks at different stages is large, so that a user can conveniently position the slow tasks with long execution time and analyze the reason.
The scheduler needs to ask the user whether to apply resource load and how to schedule, and specifically includes the following information:
(1-1) what resource load (CPU, disk, network) is applied, the user can select zero or more resource loads;
(1-2) the length and interval of each resource load application, namely the resource load application to the nodes can be intermittent, so that a real environment can be simulated conveniently, and a user can select periodic load simulation with fixed length and fixed interval or periodic load simulation with random length and random interval. For random load application, the user can also select a distribution region of random numbers (the system randomly generates numbers in a uniform distribution in the lower and upper limits);
(1-3) the intensity of each resource load, for CPU load, the user can select the number of random arithmetic operations and the array length, for disk load, the user can select the length of a character string written into a disk, and for network load, the user can select the length of a character string sent to a server and the length of a character string returned by the server;
(1-4) the number of processes opened by each resource load, the number of processes started by a user when the resource load is applied each time can be selected, and the index also controls the intensity of the resource load application;
(1-5) each node to which the resource load is applied, the user may select one or more nodes to apply the resource load, so that the performance of the application when the load is not uniform can be tested.
The specific operation process of the scheduler is as follows:
(2-1) sending the load scheduling information to each computing node, starting scheduling, and writing the scheduling information into a log file;
and (2-2) starting a hardware sampling process and writing sampling information into a log file.
(2-3) starting the application configured by the user;
and (2-4) waiting for the user program to end, and collecting a load scheduling log and a hardware sampling log from the computing node to the main node.
(2-5) the master node sends a message to the compute node killing the load scheduling process and the hardware sampling process.
The specific analytic method in step (2) is as follows:
(3-1) acquiring a task object, wherein the object comprises a task ID, a stage ID, a node ID, resource information, environment information, time information, input and output information, data source information, inference execution information, serialization and deserialization information, data persistence information, Java virtual machine information, data block information and the like;
(3-2) acquiring a stage object, wherein the object comprises a stage ID, the number of tasks, data information of stage processing, time information and the like;
and (3-3) establishing mutual indexes of the phase objects and the task objects, wherein the slave task objects can index the corresponding phase objects, and the slave phase objects can index the task objects contained in the slave phase objects.
In the step (5), slow task information is obtained by comparing the task execution time with the median of the task execution time at the stage of the task, and the detailed process is as follows:
calculating a median according to the execution time of each task in the corresponding phase object calculation phase, then finding a slow task object, and establishing an index, wherein the slow task object is screened in the following mode:
Ttask>1.5*Median(Tstage)
i.e. the execution time of the task is greater than 1.5 times the median of the execution times of all tasks at this stage.
Further, in the slow task cause detection method for the big data platform, the detailed process of the data cleaning and normalization in the step (6) is as follows:
and calculating the characteristic information of the stage object, wherein the characteristic information comprises the average read data volume, the average shuffle write data volume, the average persisted data volume to the memory and the average persisted data volume to the disk of the stage object. Then all the characteristics of all the tasks are traversed, and for discrete characteristics, the original characteristics need to be converted into numerical numbers from character strings, and in the process, the criterion that the larger the value of the characteristics needs to be observed is caused to be slower in the execution of the application program. For the time characteristic, the task execution time is divided by the time characteristic, and for other numerical characteristics, the time characteristic is divided by the average characteristic of the stage where the task is located.
Further, in the slow task cause detection method for the big data platform, the slow task and cause analysis thereof are visualized in the step (9), and slow task causes are displayed to a user, and the method includes:
(5-1) marking the slow task on a time axis by using a black bold line, wherein the height (Straggler Scale) of the slow task represents that the execution time of the slow task is a multiple of the average value of the execution time of the stage of the slow task;
(5-2) marking the reason of the slow task in the vicinity of the slow task line;
(5-3) marking different resource occupancy rate change curves on a time axis, wherein the height (Feature Scale) of the different resource occupancy rate change curves represents the relative size of the resource occupancy rates;
and 5-4, marking the time span of the resource occupancy rate generator (AG). The invention provides a slow task cause detection method for a big data platform, which realizes slow task cause detection by a feature comparison method and overcomes the defects of a correlation statistical method in the prior art; meanwhile, the technical scheme provided by the invention can analyze slow task reasons without pile insertion, influence on the existing application program and change of the existing big data frame, and overcomes the performance problem caused by a pile insertion method; in addition, the invention further utilizes the multi-feature flat level analysis to overcome the defect of top-down analysis of hierarchical analysis.
Drawings
FIG. 1 is a schematic diagram of a system architecture for implementing the slow task cause detection method for a big data platform according to the present invention;
FIG. 2 is a flow chart of a slow task cause detection method for a big data platform according to the present invention;
FIG. 3 is a schematic diagram of application performance bottleneck diagnostics;
FIG. 4 is an inheritance relationship of a feature object.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The basic idea of the invention is to obtain features during task execution and then to perform slow task cause localization by comparison and filtering between features.
Fig. 1 is a schematic diagram of a system architecture for implementing the slow task cause detection method for a big data platform according to the present invention. The visualization node is responsible for displaying performance analysis results (system resource tracks, distribution of application program performance bottlenecks, performance bottleneck reasons, related optimization suggestions and the like) of the application programs to users, and the scheduler is responsible for scheduling the load generation programs, distributing application program running information to the big data framework, monitoring the states of the application programs and acquiring log information from the computing nodes when the application programs are finished.
FIG. 2 is a flowchart of the slow task cause detection method for a big data platform according to the present invention, and the detailed flow includes steps (1) to (9):
step (1) obtaining original log information from a cluster scheduler;
the cluster scheduler is responsible for scheduling the user application program, and integrates the log information to form original log information after the user application program is finished, and the original log information is sent to the fault analyzer;
step (2) the fault analyzer analyzes original log information, a resource occupation sequence, a load generation time period sequence and a task object sequence are obtained;
the fault analyzer analyzes original log information of different sources, analyzes resource occupation logs into resource occupation sequences which are separately stored according to computing nodes and are arranged according to a time sequence, analyzes load occupation logs into load generation time period sequences which are separately stored according to the computing nodes and are arranged according to the time sequence, and analyzes big data log information into a task object sequence which is arranged according to a task sequence number and contains original characteristics;
step (3) fusing resource occupation information and load generation information into a task object sequence;
traversing the task object sequence, finding out the node where the task is located and the time span information, finding out the corresponding resource occupation information from the resource occupation sequence, averaging and storing the information into the task object; traversing a load generation time period sequence, if the load generation time period is overlapped with the task span, storing the load information into a task object to show that the task runs under the influence of the load;
step (4) obtaining the execution time of each task and the execution times of all tasks at the stage of the task;
traversing all tasks, finding all tasks of the stage where each task is located, and recording the running time of the tasks into the stage object;
step (5) slow task information is obtained by comparing the execution time of the task with the median of the execution times of all tasks at the stage of the task, the median of the execution times of all tasks of the object at the stage of the task is found, if the execution time of a certain task is more than 1.5 times of the median, the task is considered to be a slow task, and the task is added into a slow task index;
step (6), cleaning and normalizing data from the task object, and extracting required features;
data cleansing refers to removing useless features; encoding discrete features extracted from an original task; the characteristics causing the slow task comprise discrete characteristics and numerical characteristics, the discrete characteristics comprise data locality characteristics, the numerical characteristics comprise time characteristics and non-time characteristics, and the non-time characteristics comprise resource occupation characteristics and common numerical characteristics; dividing the time characteristic by the time of the task execution to obtain a normalized time characteristic, and dividing the non-time characteristic by the average value of the characteristics of all the tasks in the stage to obtain a normalized non-time characteristic;
step (7) acquiring a feature set of all task objects of the application program, and counting global quantile information of each feature; the method comprises the following steps:
establishing a global index for each feature, traversing each task object, adding the features of the task into an array type data structure corresponding to the global feature index, and then counting quantiles for all the features according to a threshold value specified by a configuration file;
step (8) traversing each slow task; the following decision logic is performed for all slow tasks and each of their features, including the steps of:
(8-1) if the feature is a numerical feature, judging (8-3), otherwise, judging (8-2);
(8-2) if the characteristic is an abnormal characteristic (the characteristic numerical value is not 0) and the value is more than a plurality of times of the average value of the stage characteristic (the parameter is specified by a configuration file, and the condition is to ensure that other characteristics of the stage do not generate abnormality), judging that the characteristic is a slow task reason, and if not, finishing the judgment;
(8-3) if the feature is a time feature, judging (8-4), otherwise, judging (8-5);
(8-4) judging whether the characteristic is larger than a preset threshold value, wherein the preset threshold value is set according to the configuration of a user). If yes, judging (8-5), otherwise, if not, ending the judgment, wherein the characteristic is not the reason of the slow task;
(8-5) whether the feature is larger than the global quantile of the feature or not, if so, judging (8-6), if not, judging that the feature is not a slow task reason, and ending the judgment;
(8-6) whether the characteristic is a characteristic related to system resources (CPU, I/O, network occupancy rate) or not, if so, executing an edge detection algorithm, and judging (8-7), otherwise, judging (8-8);
(8-7) whether the characteristic rises at the beginning of the task and falls at the end of the task, if so, the characteristic is not considered as a slow task reason, and the judgment is finished, otherwise, the judgment is continued (8-8);
(8-8) whether the characteristic is more than a plurality of times of the median of the characteristics of other tasks in the stage (specified by the configuration file), if so, the characteristic is the reason of slow task, otherwise, the characteristic is not the reason of slow task, and the judgment is finished.
And (9) visualizing the slow task and reason analysis thereof to the user.
The load generation program in the step (2) is specifically implemented as follows:
(1-1) a CPU footprint generator that, within a specified time, initiates the following procedure to generate a higher CPU footprint:
(1-1-1) generating a plurality of processes to execute the following procedures;
(1-1-2) generating an array a of length 106, each of which is assigned a random number, a [ i ] ═ random ();
(1-1-3) performing a random arithmetic operation (addition, subtraction, multiplication, and random number) on each numerical value in the array a;
(1-1-4) returning to (1-1-2) until the time exceeds the limit;
(1-2) a disk occupancy generator that, within a specified time, initiates the following procedures to generate a higher disk occupancy:
(1-2-1) generating a plurality of processes to execute the following procedures;
(1-2-2) generating a character string (filled with "0" character) s of length 107;
(1-2-3) writing s into the random disk file f;
(1-2-4) pointing a disk file pointer to the beginning of the file;
(1-2-5) refreshing the buffer;
(1-2-6) back to (1-2-2) until the time exceeds the limit;
(1-3) a network resource occupation generator, which starts the following procedures to generate higher network occupation within a specified time:
(1-3-1) generating a plurality of processes to execute the following procedures;
(1-3-2) acquiring sockets of the server;
(1-3-3) transmitting a character string of length 1024 to the server through s;
(1-3-3) receiving a character string with the length of 1024 returned by the server;
(1-3-4) go back to (1-3-3) until the time exceeds the limit.
The resource occupation characteristics in the step (2) are specifically obtained in the following manner:
(2-1) for the CPU occupancy rate characteristic, the invention adopts an mpstat (multiprocesser statistics), samples once per second after starting, and writes into a log file; the invention adopts the proportion of the CPU time occupied by the user mode as the index of the CPU occupancy rate.
(2-2) for the occupancy rate characteristic of the disk, iostat (input/output statistics) is adopted, sampling is carried out once per second after startup, and the sampling is written into a log file. In order to shield the relevant details of the memory, all IO processing time in the statistical time is divided by the total statistical time to be used as an index of the disk occupancy rate;
(2-3) for the network resource occupancy rate, sampling is carried out once per second after starting by adopting sar (system activity report) and the sampling is written into a log file. The invention adopts the sum of the flows sent and received by the network card as the index of the occupancy rate of the network resources.
In step (2), we process the data as follows:
and (3-1) analyzing the big data log to obtain an original task object and a phase object and establish an index. The data structure adopted by the invention is a Python dictionary data structure, wherein the task object takes the task ID as a key, the information contained in the big data log is a value (a multilayer nested dictionary), and typical information is as follows:
TABLE 1 original task feature information for big data Log
The phase object takes the phase ID as a key, and the information contained in the big data log is a value (a multi-layer nested dictionary), and typical information is as follows:
Completion Time | time of stage completion |
Stage ID | Phase ID |
Stage Name | Phase name (information containing the source code being executed) |
Number of Tasks | Number of phase-containing tasks |
Submission Time | Time of task submission |
Accumulables | Accumulator information |
TABLE 2 Primary stage feature information for big data Log
And (3-2) initializing characteristic information of the task object and the phase object. Establishing a task index in a stage object, calculating the locality characteristics of all task data in the whole stage, taking the sum of common numerical value characteristics and resource occupation characteristics as stage characteristics, and normalizing the characteristics of tasks contained in the stage;
and (3) fusing resource occupation and load generation information into a task sequence, wherein the specific implementation details are as follows:
for the hardware sampling log, firstly, recording information corresponding to a time axis into an array (the length of the array is equal to the length of a sampling sequence), then traversing the whole task object array, finding out the hardware information sampled in a corresponding time span for each task, averaging and writing the hardware information into a task object. For load generation log information, if the time span of a certain load and the time span of a certain task coincide, the load information is written into the task object.
In the step (8), traversing each slow task and each feature of the slow task, and when judging the features, executing the business logic by using the feature object, wherein the specific process is as follows:
(4-1) initializing a feature object, wherein the inheritance relationship of the feature object is shown in FIG. 4, and a Root object is initialized, and the Root object is a dictionary data structure with a task id as a key and a slow task factor as a value;
(4-2) traversing each feature of the slow task;
(4-3) establishing a set of the characteristics contained in other tasks in the stage of the slow task;
(4-4) calling the anomally function of the feature object to obtain whether the feature is the reason causing the slow task, and if so, adding the reason into the Root object.
In the step (9), the slow task and reason analysis thereof are visualized, and the slow task reason is shown to the user, as shown in fig. 3, including:
(5-1) marking the slow task on a time axis by using a black bold line, wherein the height (Straggler Scale) of the slow task represents that the execution time of the slow task is a multiple of the average value of the execution time of the stage of the slow task;
(5-2) marking the reason of the slow task in the vicinity of the slow task line;
(5-3) marking different resource occupancy rate change curves on a time axis, wherein the height (Feature Scale) of the different resource occupancy rate change curves represents the relative size of the resource occupancy rates;
and 5-4, marking the time span of the resource occupancy rate generator (AG).
The step of presenting the reason for the slow task to the user comprises:
firstly, generating a resource change curve and a slow task generation curve of each computing node, and displaying the resource change curves and the slow task generation curves to a user;
then, generating the distribution of slow task reasons of the user program, namely the quantity of various characteristics in all slow task reasons, wherein the data is convenient for a user to analyze which characteristics account for the dominant factor in influencing the execution of the application program and the distribution of slow tasks in different computing nodes, so that the user can conveniently check whether a fault (such as hardware fault and application program abnormity) occurs in a specific node, sort the nodes according to the influence duration of the slow tasks and analyze the reason of each slow task after the sorting;
and finally, acquiring the optimization and solution corresponding to each feature from the database, and giving a user optimization suggestion. The method is convenient for a user to detect the change condition of the big data application performance along with the amount of available resources and analyze the sensitive condition of the program to various resources.
The invention has not been described in detail and is within the skill of the art.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.
Claims (4)
1. A slow task reason detection method for a big data platform is characterized in that resource occupation characteristics and big data platform log characteristics are extracted firstly, and then characteristics of a slow task are compared with characteristics of different tasks at the same stage to obtain slow task reasons; the method comprises the following steps:
step (1) obtaining original log information from a cluster scheduler;
the cluster scheduler is responsible for scheduling the user application program, and integrates the log information to form original log information after the user application program is finished, and the original log information is sent to the fault analyzer;
step (2) the fault analyzer analyzes original log information, a resource occupation sequence, a load generation time period sequence and a task object sequence are obtained;
the fault analyzer analyzes original log information of different sources, analyzes resource occupation logs into resource occupation sequences which are separately stored according to computing nodes and are arranged according to a time sequence, analyzes load occupation logs into load generation time period sequences which are separately stored according to the computing nodes and are arranged according to the time sequence, and analyzes big data log information into a task object sequence which is arranged according to a task sequence number and contains original characteristics;
step (3) fusing resource occupation information and load generation information into a task object sequence;
traversing the task object sequence, finding out the node where the task is located and the time span information, finding out the corresponding resource occupation information from the resource occupation sequence, averaging and storing the information into the task object; traversing a load generation time period sequence, if the load generation time period is overlapped with the task span, storing the load information into a task object to show that the task runs under the influence of the load;
step (4) obtaining the execution time of each task and the execution times of all tasks at the stage of the task;
traversing all tasks, finding all tasks of the stage where each task is located, and recording the running time of the tasks into the stage object;
step (5) slow task information is obtained by comparing the execution time of the task with the median of the execution times of all tasks at the stage of the task, the median of the execution times of all tasks of the object at the stage of the task is found, if the execution time of a certain task is more than 1.5 times of the median, the task is considered to be a slow task, and the task is added into a slow task index;
step (6), cleaning and normalizing data from the task object, and extracting required features;
data cleansing refers to removing useless features; encoding discrete features extracted from an original task; the characteristics causing the slow task comprise discrete characteristics and numerical characteristics, the discrete characteristics comprise data locality characteristics, the numerical characteristics comprise time characteristics and non-time characteristics, and the non-time characteristics comprise resource occupation characteristics and common numerical characteristics; dividing the time characteristic by the time of the task execution to obtain a normalized time characteristic, and dividing the non-time characteristic by the average value of the characteristics of all the tasks in the stage to obtain a normalized non-time characteristic;
step (7) acquiring a feature set of all task objects of the application program, and counting global quantile information of each feature; the method comprises the following steps:
establishing a global index for each feature, traversing each task object, adding the features of the task into an array type data structure corresponding to the global feature index, and then counting quantiles for all the features according to a threshold value specified by a configuration file;
step (8) traversing each slow task; the following decision logic is performed for all slow tasks and each of their features, including the steps of:
(8-1) if the feature is a numerical feature, judging (8-3), otherwise, judging (8-2);
(8-2) if the feature is an abnormal feature and the feature of other tasks in the same stage is a non-abnormal feature, wherein the abnormal feature is that the value of the feature is not 0 and is more than a plurality of times of the average value of all the features in the same stage, and the times are specified by a configuration file), judging that the feature is a slow task reason, otherwise, judging that the feature is not the slow task reason, and ending the judgment;
(8-3) if the feature is a time feature, judging (8-4), otherwise, judging (8-5);
(8-4) judging whether the characteristic is larger than a preset threshold value, wherein the preset threshold value is set according to the configuration of a user; if yes, judging (8-5), otherwise, if not, ending the judgment, wherein the characteristic is not the reason of the slow task;
(8-5) whether the feature is larger than the global quantile of the feature or not, if so, judging (8-6), if not, judging that the feature is not a slow task reason, and ending the judgment;
(8-6) whether the feature is a resource occupation feature or not, if so, executing an edge detection algorithm, and judging (8-7), otherwise, judging (8-8);
(8-7) whether the characteristic rises at the beginning of the task and falls at the end of the task, if so, the characteristic is not considered as a slow task reason, and the judgment is finished, otherwise, the judgment is continued (8-8);
(8-8) whether the characteristic is more than a plurality of times of the median of the characteristics of other tasks in the stage, wherein the multiple is specified by a configuration file, if so, the characteristic is a slow task reason, otherwise, the characteristic is not the slow task reason, and the judgment is finished;
step (9) visualizing the slow task and the reason analysis thereof;
the data locality characteristics comprise:
(3-1-1) location of data deposit for task processing
Flocality=0,PROCESS_LOCAL
=1,NODE_LOCAL
=2,otherwise
Wherein, processing _ LOCAL represents the address space of the data needed by the task in the PROCESS, NODE _ LOCAL represents the data needed by the task on the NODE, and otherwise represents the data needed by the task at other positions;
the time characteristics comprise:
(3-2-1) time F taken for the task to perform serializationserializeThe calculation method is as follows:
Fserialize=T/Tavg
wherein T is the time of the task serialization, TavgAveraging serialization time for all tasks at the stage of the task;
(3-2-2) time F taken for task to perform deserializationdeserializeThe calculation method is as follows:
Fdeserialize=T/Tavg
wherein T is the time of the task deserialization, TavgAveraging deserialization time for all tasks at the stage of the task;
(3-2-3) time F occupied by garbage collection of task in JVMJVMThe calculation method is as follows:
FJVM=T/Tavg
wherein T is the time for the task to run JVM garbage collection, TavgAveraging the JVM garbage collection time for all tasks at the stage of the task, wherein the larger the index value is, the more time the task spends on JVM garbage collection is, the larger the influence on the performance is;
the resource occupation characteristics are specifically as follows:
(3-3-1) CPU resource occupancy rate FCPUNamely the system CPU resource occupancy rate obtained by sampling during the running period of the application program, the CPU characteristic during the running period of one task is calculated as follows:
wherein, user _ timetIs the user CPU time, total _ time, sampled each time during the task executiontIs the total CPU time, t, obtained during the execution of the task0Is the start time of the task, t1The end time of the task is, for the multi-core CPU, the above formula needs to average the number of cores;
(3-3-2) occupancy rate of disk resources FdiskThat is, the disk I/O resource occupancy rate obtained by sampling during the running of the application program, the calculation formula is as follows:
I/O_timetis the system waiting I/O time, total _ time, in the last sampling interval obtained by each samplingtIs the sampling interval;
(3-3-3) the occupancy rate of network resources, namely the sending and receiving network traffic (byte) obtained by sampling during the running period of the application program, and the calculation formula is as follows:
Bytes_senttis the data traffic (byte) sent by the monitored network card during sampling, Bytes _ receivedtIs the data traffic (byte) that the network card accepts;
the common numerical characteristics include:
(3-4-1) amount of data processed by task Fread_bytesThe calculation method is as follows:
Fread_bytes=R/Ravg
wherein, R is the byte number read by the task, RavgThe number of bytes read for all tasks at this stage is averaged;
(3-4-2) number of bytes read for task shuffle Fshuffle_read_bytesThe calculation method is as follows:
Fshuffle_read_bytes=B/Bavg
b is the number of bytes read by the task shuffle, BavgThe average number of bytes read by shuffling is calculated for all tasks at the current stage;
(3-4-3) number of bytes read for task shuffle Fshuffle_write_bytesThe calculation method is as follows:
Fshuffle_write_bytes=B/Bavg
b is the number of bytes written in the task shuffle, BavgThe number of bytes written in is averagely shuffled for all tasks in the stage;
(3-4-4) byte number F of overflow of task temporary storage data file into memorymemory_bytes_spilledThe calculation method is as follows:
Fmemory_bytes_spilled=B/Bavg
b is the number of bytes overflowing to the memory by the task, BavgThe number of bytes overflowing to the memory for all tasks in the stage of the task is averaged;
(3-4-5) byte number F of overflow of task temporary storage data file to diskdisk_bytes_spilledThe calculation method is as follows:
Fdisk_bytes_spilled=B/Bavg
b is the number of bytes overflowing to the disk by the task, BavgAnd averaging the number of bytes overflowing to the disk for all the tasks at the stage of the task.
2. The method for detecting slow task cause for big data platform according to claim 1, wherein the cluster scheduler in step (1) is capable of scheduling resource load generators of compute nodes in addition to user applications, and the cluster scheduler comprises: CPU occupation generator, disk occupation generator and network resource occupation generator.
3. The method for detecting slow task reasons for a big data platform according to claim 1, wherein different constraints are applied to different types of features, specifically as follows:
for the data locality characteristics, if the data locality characteristics of a certain task are 2, and the sum of the characteristics of other tasks at the stage of the task is less than 1/2 of the number of tasks;
for a common numerical feature, if the feature of a task is greater than 0.75 quantile of the feature during the execution of the whole program and the feature is greater than 2 times the average of other features at the stage of the task;
for the time characteristic, in addition to the limit condition of the common numerical characteristic, an additional condition needs to be met, namely the time occupied by the characteristic is more than 0.2 time of the execution time of the whole task;
for the resource occupation characteristics, in addition to meeting the limiting conditions of the common numerical characteristics, an additional condition needs to be met, namely whether the average value of the resource occupation within 5 seconds before the start of the task is less than 0.5 time of the resource occupation characteristics of the task or not; whether the resource occupancy rate is reduced when the task is finished, namely whether the average value of the resource occupancy within 5 seconds after the task is started is less than 0.5 time of the resource occupancy characteristics of the task; this feature is not responsible for slow tasks if both of the above conditions are met.
4. The slow task cause detection method for large data platforms according to claim 1, characterized in that: the visual slow task and reason analysis in the step (9) comprises the following steps:
(5-1) marking the slow task on a time axis by using a black bold line, wherein the height (Straggler Scale) of the slow task represents that the execution time of the slow task is a multiple of the average value of the execution time of the stage of the slow task;
(5-2) marking the reason of the slow task in the vicinity of the slow task line;
(5-3) marking different resource occupancy rate change curves on a time axis, wherein the height (Feature Scale) of the different resource occupancy rate change curves represents the relative size of the resource occupancy rates;
and 5-4, marking the time span of the resource occupancy rate generator (AG).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711436008.2A CN108153587B (en) | 2017-12-26 | 2017-12-26 | Slow task reason detection method for big data platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711436008.2A CN108153587B (en) | 2017-12-26 | 2017-12-26 | Slow task reason detection method for big data platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108153587A CN108153587A (en) | 2018-06-12 |
CN108153587B true CN108153587B (en) | 2021-05-04 |
Family
ID=62463209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711436008.2A Active CN108153587B (en) | 2017-12-26 | 2017-12-26 | Slow task reason detection method for big data platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108153587B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110928739B (en) * | 2018-09-19 | 2024-03-26 | 阿里巴巴集团控股有限公司 | Process monitoring method and device and computing equipment |
CN109240890A (en) * | 2018-09-25 | 2019-01-18 | 江苏润和软件股份有限公司 | A kind of Spark delay task diagnosis method based on statistical analysis |
CN111124627B (en) * | 2018-11-01 | 2023-08-01 | 百度在线网络技术(北京)有限公司 | Method and device for determining call initiator of application program, terminal and storage medium |
CN111045950B (en) * | 2019-12-16 | 2023-06-30 | 上海钧正网络科技有限公司 | Performance problem point determining method, device, data analysis system and storage medium |
CN111309712A (en) * | 2020-03-16 | 2020-06-19 | 北京三快在线科技有限公司 | Optimized task scheduling method, device, equipment and medium based on data warehouse |
CN113746883B (en) * | 2020-05-29 | 2023-05-19 | 华为技术有限公司 | Link tracking method and system |
CN111835854B (en) * | 2020-07-17 | 2021-08-10 | 北京航空航天大学 | Slow task prediction method based on grey prediction algorithm |
CN112463334B (en) * | 2020-12-04 | 2023-08-18 | 苏州浪潮智能科技有限公司 | Training task queuing reason analysis method, system, equipment and medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105824737A (en) * | 2016-03-31 | 2016-08-03 | 华中科技大学 | Memory data set replacing system and replacing method for big data processing system |
-
2017
- 2017-12-26 CN CN201711436008.2A patent/CN108153587B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105824737A (en) * | 2016-03-31 | 2016-08-03 | 华中科技大学 | Memory data set replacing system and replacing method for big data processing system |
Non-Patent Citations (2)
Title |
---|
"Data Analysis and Synchronization on Inter-continent Data Placement Laboratory";Kun Qian 等;《2015 International Conference on Cloud Computing and Big Data (CCBD)》;20151106;全文 * |
"Data Mining Based Root-Cause Analysis of Performance Bottleneck for Big Data Workload";Weichen Qi 等;《2017 IEEE 19th International Conference on High Performance Computing and Communications》;20171220;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN108153587A (en) | 2018-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108153587B (en) | Slow task reason detection method for big data platform | |
US9594754B2 (en) | Purity analysis using white list/black list analysis | |
US9864676B2 (en) | Bottleneck detector application programming interface | |
US8627150B2 (en) | System and method for using dependency in a dynamic model to relate performance problems in a complex middleware environment | |
US20130074056A1 (en) | Memoizing with Read Only Side Effects | |
US20130074057A1 (en) | Selecting Functions for Memoization Analysis | |
US20130067445A1 (en) | Determination of Function Purity for Memoization | |
US20160283345A1 (en) | Application Execution Path Tracing With Configurable Origin Definition | |
EP2831740B1 (en) | Logical grouping of profile data | |
WO2014074163A1 (en) | Input vector analysis for memoization estimation | |
CN107562532B (en) | Method and device for predicting hardware resource utilization rate of equipment cluster | |
US9959197B2 (en) | Automated bug detection with virtual machine forking | |
US20110276949A1 (en) | Memory leak detection | |
CN108509324B (en) | System and method for selecting computing platform | |
CN107102922B (en) | Memory detection method and device and electronic equipment | |
CN111563014A (en) | Interface service performance test method, device, equipment and storage medium | |
US9442817B2 (en) | Diagnosis of application server performance problems via thread level pattern analysis | |
US10754744B2 (en) | Method of estimating program speed-up in highly parallel architectures using static analysis | |
Ghanbari et al. | Stage-aware anomaly detection through tracking log points | |
Bei et al. | MEST: A model-driven efficient searching approach for MapReduce self-tuning | |
Yan et al. | A practice guide of software aging prediction in a web server based on machine learning | |
US20180137036A1 (en) | Determining potential test actions | |
Kehrer et al. | Serverless skeletons for elastic parallel processing | |
CN117724980A (en) | Method and device for testing software framework performance, electronic equipment and storage medium | |
Willnecker et al. | Model-based prediction of automatic memory management and garbage collection behavior |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |