Disclosure of Invention
The invention aims to provide a troubleshooting method and a troubleshooting system that enable rapid troubleshooting of faults occurring when an HLS-based user terminal accesses a CDN live broadcast system.
In order to achieve the above object, the technical solution of the present invention provides a troubleshooting method, including:
step S1: acquiring information of the CDN device to be checked in a CDN live broadcast system, and generating a troubleshooting task for the CDN device to be checked according to fault report information submitted by a user, wherein the fault report information comprises at least one of an access source IP, an access URL and an access request time;
step S2: associating the information of the CDN device to be checked with the ID of the troubleshooting task, writing the troubleshooting task, the ID of the troubleshooting task and the information of the CDN device to be checked into a database, and pushing the ID of the troubleshooting task into a task message queue to wait for processing;
step S3: acquiring the ID of the troubleshooting task from the task message queue, acquiring the troubleshooting task and the information of the CDN device to be checked from the database according to the ID of the troubleshooting task, and sending the troubleshooting task to the CDN device to be checked according to the information of the CDN device to be checked, so that the CDN device to be checked executes a troubleshooting operation according to the received troubleshooting task;
step S4: receiving a troubleshooting task result sent by the CDN device to be checked, and obtaining a user access fault troubleshooting result according to the troubleshooting task result.
Further, the database is a database based on distributed file storage.
Further, in step S3, the troubleshooting task is forwarded to the CDN device to be checked through an infrastructure tool and a bastion machine protocol.
Further, after step S4, the method further includes:
generating a troubleshooting result page to display the user access fault troubleshooting result to the user.
In order to achieve the above object, the present invention further provides a troubleshooting system, including:
a task generation module, configured to acquire information of the CDN device to be checked in a CDN live broadcast system, and to generate a troubleshooting task for the CDN device to be checked according to fault report information submitted by a user, wherein the fault report information comprises at least one of an access source IP, an access URL and an access request time;
a writing module, configured to associate the information of the CDN device to be checked with the ID of the troubleshooting task, write the troubleshooting task, the ID of the troubleshooting task and the information of the CDN device to be checked into a database, and push the ID of the troubleshooting task into a task message queue to wait for processing;
a task execution module, configured to acquire the ID of the troubleshooting task from the task message queue, acquire the troubleshooting task and the information of the CDN device to be checked from the database according to the ID of the troubleshooting task, and send the troubleshooting task to the CDN device to be checked according to the information of the CDN device to be checked, so that the CDN device to be checked executes a troubleshooting operation according to the received troubleshooting task;
and a processing module, configured to receive the troubleshooting task result sent by the CDN device to be checked and obtain a user access fault troubleshooting result according to the troubleshooting task result.
Further, the database is a database based on distributed file storage.
Further, the task execution module forwards the troubleshooting task to the CDN device to be checked through an infrastructure tool and a bastion machine protocol.
Further, the system further includes:
a page generation module, configured to generate a troubleshooting result page so as to display the user access fault troubleshooting result to the user.
The troubleshooting method provided by the invention can realize quick troubleshooting of faults when the HLS-based user terminal accesses the CDN live broadcast system.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of a troubleshooting method according to an embodiment of the present invention, where the troubleshooting method includes:
step S1: acquiring information of the CDN device to be checked in a CDN live broadcast system, and generating a troubleshooting task for the CDN device to be checked according to fault report information submitted by a user, wherein the fault report information comprises at least one of an access source IP, an access URL and an access request time;
step S2: associating the information of the CDN device to be checked with the ID of the troubleshooting task, writing the troubleshooting task, the ID of the troubleshooting task and the information of the CDN device to be checked into a database, and pushing the ID of the troubleshooting task into a task message queue to wait for processing;
step S3: acquiring the ID of the troubleshooting task from the task message queue, acquiring the troubleshooting task and the information of the CDN device to be checked from the database according to the ID of the troubleshooting task, and sending the troubleshooting task to the CDN device to be checked according to the information of the CDN device to be checked, so that the CDN device to be checked executes a troubleshooting operation according to the received troubleshooting task;
step S4: receiving a troubleshooting task result sent by the CDN device to be checked, and obtaining a user access fault troubleshooting result according to the troubleshooting task result.
The troubleshooting method provided by the embodiment of the invention can realize quick troubleshooting of the fault when the HLS-based user terminal accesses the CDN live broadcast system.
The CDN device to be checked that is obtained in step S1 may be a CDN device specified by the user, or a CDN device automatically selected from the CDN live broadcast system according to the troubleshooting task type selected by the user. For example, a plurality of troubleshooting task types may be preset according to currently common faults, and a CDN device selection rule may be set for each troubleshooting task type. After the troubleshooting task type selected by the user is received, the corresponding CDN device selection rule is obtained from pre-stored information, and CDN devices are then selected from the CDN live broadcast system as the CDN devices to be checked according to that rule.
For example, if the troubleshooting task type selected by the user is source station availability verification, the corresponding CDN device selection rule may be to randomly select 3 multi-line back-to-source devices from the CDN live broadcast system as the CDN devices to be checked.
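The selection logic described above can be sketched as follows. This is only an illustrative sketch and not the implementation of the embodiment: the `Device` record, the role label `multi_line_back_to_source`, the task type key `source_station_availability` and the function names are all hypothetical names introduced for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device record; the field names are assumptions."""
    name: str
    role: str      # e.g. "edge" or "multi_line_back_to_source"
    region: str
    operator: str


def select_for_source_availability(devices, rng):
    """Randomly pick up to 3 multi-line back-to-source devices."""
    candidates = [d for d in devices if d.role == "multi_line_back_to_source"]
    return rng.sample(candidates, min(3, len(candidates)))


# Pre-stored mapping: troubleshooting task type -> device selection rule.
SELECTION_RULES = {
    "source_station_availability": select_for_source_availability,
}


def select_devices(task_type, devices, user_specified=None, rng=None):
    """Return the devices to be checked for one troubleshooting task."""
    if user_specified:
        # Devices named explicitly by the user take precedence.
        return list(user_specified)
    rule = SELECTION_RULES.get(task_type)
    if rule is None:
        raise ValueError(f"no selection rule for task type {task_type!r}")
    return rule(devices, rng or random.Random())
```

Keeping the rules in a mapping keyed by task type means a new troubleshooting task type only needs a new entry, which matches the idea of presetting one rule per type.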
In an embodiment of the present invention, the database may be a database based on distributed file storage (MongoDB).
For example, in step S3, the troubleshooting task is forwarded to the CDN device to be checked through an infrastructure tool and a bastion machine protocol.
In the embodiment of the present invention, after step S4, the method further includes:
generating a troubleshooting result page to display the user access fault troubleshooting result to the user.
For example, referring to fig. 2, a troubleshooting method provided by an embodiment of the present invention may include:
(1) after logging in, the user selects a troubleshooting task type and submits fault report information, wherein the fault report information includes but is not limited to an access source IP, an access URL and an access request time;
(2) CDN devices are selected from the CDN live broadcast system as the CDN devices to be checked according to the troubleshooting task type selected by the user, the fault report information submitted by the user is parsed in depth, and a detailed troubleshooting task is generated and associated with the CDN devices to be checked;
in order to keep the information of each device in the CDN live broadcast system up to date, the information of each device may be acquired through a third-party API system (such as RCMS-API), and the devices may then be classified and cached according to device role (common edge node or multi-line back-to-source device), region and operator;
the troubleshooting task, the ID of the troubleshooting task and the information of the CDN devices to be checked are then written into the MongoDB database, and the ID of the troubleshooting task is pushed into a task message queue to wait for processing;
(3) the consumer task program takes the ID of the troubleshooting task to be processed from the task message queue and pushes it to a temporary queue for task processing, acquires the detailed troubleshooting task from MongoDB according to that ID, and then forwards the troubleshooting task through an infrastructure tool and a bastion machine protocol to the CDN devices to be checked, which execute the troubleshooting operation;
after receiving the troubleshooting task, each CDN device to be checked executes the troubleshooting operation, searches and filters its logs, compiles statistics, and returns a troubleshooting task result;
(4) after receiving the troubleshooting task results sent by the CDN devices to be checked, the consumer task program performs a preliminary consolidation of the results and writes them into MongoDB, which completes the task processing;
if the processing succeeds, the ID of the troubleshooting task is popped from the temporary queue; if the processing fails, the ID of the troubleshooting task is pushed back into the task message queue to wait for the next round of processing;
(5) after the troubleshooting task results sent by the CDN devices to be checked are received, the results reported by the individual CDN devices are correlated and analyzed together to reconstruct information such as the complete behavior of the user accessing the CDN live broadcast system; finally, the results are analyzed to obtain a user access fault troubleshooting result, including the fault point and the cause of the user access fault, and a troubleshooting result page is generated and displayed to the user.
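The queue handling in items (2) to (4) above can be illustrated with a minimal in-memory sketch. It is written under stated assumptions: a plain dictionary stands in for MongoDB, two `deque` objects stand in for the task message queue and the temporary queue, and the `dispatch` callable is a hypothetical placeholder for forwarding a task to one device. None of these names come from the embodiment.

```python
import uuid
from collections import deque

database = {}          # stand-in for MongoDB: task ID -> stored record
task_queue = deque()   # task message queue; holds task IDs only
temp_queue = deque()   # temporary queue of IDs currently being processed


def submit_task(task, devices):
    """Store the task with its devices and enqueue only its ID."""
    task_id = uuid.uuid4().hex
    database[task_id] = {"task": task, "devices": list(devices), "results": None}
    task_queue.append(task_id)
    return task_id


def process_next(dispatch):
    """Process one queued ID.

    Returns True on success, False on failure (the ID is re-queued),
    or None when the task message queue is empty.
    """
    if not task_queue:
        return None
    task_id = task_queue.popleft()
    temp_queue.append(task_id)            # mark the ID as in progress
    record = database[task_id]
    try:
        results = [dispatch(dev, record["task"]) for dev in record["devices"]]
    except Exception:
        temp_queue.remove(task_id)
        task_queue.append(task_id)        # push back for the next attempt
        return False
    record["results"] = results           # write consolidated results back
    temp_queue.remove(task_id)            # success: pop from temporary queue
    return True
```

Because only the ID travels through the queues while the full task stays in the database, a failed attempt can be retried without rebuilding the task, and the temporary queue shows at any moment which tasks are in flight.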
All behaviors and attributes of a user accessing the CDN live broadcast system can be recorded in the access log of the CDN live broadcast system, including the access source IP, the URL, the response time, the response status and the like.
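As an illustration of how a device might search and filter such log records for one fault report, consider the sketch below. The `LogEntry` fields, the five-minute window and the exact-match rule on the URL are assumptions made for the example; the actual log format of the CDN live broadcast system is not specified here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical parsed access log record."""
    source_ip: str
    url: str
    request_time: datetime
    response_ms: int
    status: int


def filter_entries(entries, source_ip, url, around, window=timedelta(minutes=5)):
    """Keep entries matching the reported IP and URL near the reported time."""
    return [
        e for e in entries
        if e.source_ip == source_ip
        and e.url == url
        and abs(e.request_time - around) <= window
    ]


def summarize(entries):
    """Compile simple statistics to return as a troubleshooting task result."""
    return {
        "count": len(entries),
        "status_counts": dict(Counter(e.status for e in entries)),
        "max_response_ms": max((e.response_ms for e in entries), default=None),
    }
```

A time window is used rather than an exact timestamp because the user typically knows only the approximate request time.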
For current HLS live broadcast and other HTTP-based access scenarios, the troubleshooting method provided by the embodiment of the invention can, given the viewer's egress IP, the access URL and the approximate request time, quickly locate the cause of the corresponding playback stalling or access failure within a large-scale CDN live broadcast system. The method can be operated from a web browser and is also compatible with operation on mobile phones.
In addition, an embodiment of the present invention further provides a troubleshooting system, including:
a task generation module, configured to acquire information of the CDN device to be checked in a CDN live broadcast system, and to generate a troubleshooting task for the CDN device to be checked according to fault report information submitted by a user, wherein the fault report information comprises at least one of an access source IP, an access URL and an access request time;
a writing module, configured to associate the information of the CDN device to be checked with the ID of the troubleshooting task, write the troubleshooting task, the ID of the troubleshooting task and the information of the CDN device to be checked into a database, and push the ID of the troubleshooting task into a task message queue to wait for processing;
a task execution module, configured to acquire the ID of the troubleshooting task from the task message queue, acquire the troubleshooting task and the information of the CDN device to be checked from the database according to the ID of the troubleshooting task, and send the troubleshooting task to the CDN device to be checked according to the information of the CDN device to be checked, so that the CDN device to be checked executes a troubleshooting operation according to the received troubleshooting task;
and a processing module, configured to receive the troubleshooting task result sent by the CDN device to be checked and obtain a user access fault troubleshooting result according to the troubleshooting task result.
In an embodiment of the present invention, the database is a database based on distributed file storage.
In the embodiment of the invention, the task execution module forwards the troubleshooting task to the CDN device to be checked through an infrastructure tool and a bastion machine protocol.
In the embodiment of the present invention, the troubleshooting system further includes:
a page generation module, configured to generate a troubleshooting result page so as to display the user access fault troubleshooting result to the user.
Although the invention has been described in detail above with reference to a general description and specific examples, it will be apparent to one skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.