
CN114205220B - Multi-machine-room failover method and system based on client anomaly counting

Info

Publication number
CN114205220B
Authority
CN
China
Prior art keywords
parameter
obtaining
result
client
failover
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111508502.1A
Other languages
Chinese (zh)
Other versions
CN114205220A (en)
Inventor
胡宝银
林刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bairong Ruibo Technology Co ltd
Original Assignee
Beijing Bairong Ruibo Technology Co ltd
Application filed by Beijing Bairong Ruibo Technology Co ltd
Priority to CN202111508502.1A
Publication of CN114205220A
Application granted
Publication of CN114205220B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0806 Configuration setting for initial configuration or provisioning, e.g. plug-and-play

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides a multi-machine-room failover method and system based on client anomaly counting. The method comprises: obtaining a first initial parameter and initializing the configuration of a client to obtain a first initial parameter configuration result; obtaining a first acquisition result; judging whether the first acquisition result meets a first preset condition; when it does, updating the first acquisition result to the client's local storage; obtaining a first call exception and completing the failover of the thread; judging whether the switching identifier of the first parameter adjustment result meets a first condition; and, when the first condition is met, monitoring recovery of the original primary domain name, performing a first switch-back operation, and obtaining a second parameter adjustment result. The method addresses the technical problems of the prior art that, after the IP address to which the domain name of an external service resolves is modified manually, the response time is long, fast failover cannot be achieved, a large number of requests fail under high concurrency, and the service availability of the cloud service provider is reduced.

Description

Multi-machine-room failover method and system based on client anomaly counting
Technical Field
The invention relates to the field of machine-room failover, and in particular to a multi-machine-room failover method and system based on client anomaly counting.
Background
To provide clients with a highly available service, a cloud service provider serves external traffic through domain names and several machine rooms. When one machine room fails, the provider achieves failover by changing the IP address to which the domain name resolves, which moves the traffic of the failed machine room to another machine room. The effect of DNS-based machine-room failover, however, depends on the DNS cache time. A cloud service provider is generally accessed by a large number of clients, and each client may set its own DNS cache time, usually anywhere from a few seconds to a few minutes, so the cache times are difficult to unify. When the concurrent request volume across all clients is large, many client requests fail in the interval between the moment the domain name is re-pointed after a machine-room fault and the moment every client has refreshed the IP address resolved for the server domain name, which causes heavy losses to the cloud service provider.
In the course of implementing the technical solution of the embodiments of the present application, the inventors found that the above technology has at least the following technical problem:
When a machine room fails and the IP address to which the domain name of the external service resolves is modified manually, the response time is long, fast failover cannot be achieved, a large number of requests fail under high concurrency, and the service availability of the cloud service provider is reduced.
Disclosure of Invention
The embodiments of the present application provide a multi-machine-room failover method and system based on client anomaly counting, which solve the technical problems of the prior art that, when a machine room fails and the IP address to which the domain name of the external service resolves is modified manually, the response time is long, fast failover cannot be achieved, a large number of requests fail under high concurrency, and the service availability of the cloud service provider is reduced. When one of the machine rooms fails, the client detects the failure autonomously and fails over at millisecond level. When a machine room of the cloud service provider, or the network link from the client to the server, fails, fast failover is achieved and the impact of the fault on cloud service availability is greatly reduced. After the fault is cleared, the original traffic distribution ratio among the machine rooms is restored automatically.
In view of the above problems, embodiments of the present application provide a multi-machine room failover method and system based on client anomaly counting.
In a first aspect, an embodiment of the present application provides a multi-machine-room failover method based on client anomaly counting, the method including: obtaining a first initial parameter through the client, and initializing the configuration of the client according to it to obtain a first initial parameter configuration result; obtaining the first initial parameter configuration result through the server to obtain a first acquisition result; judging whether the first acquisition result meets a first preset condition; when the first acquisition result meets the first preset condition, updating the first acquisition result to the client's local storage; obtaining a first call exception of the client, starting a failover thread by the client based on an SDK package, and obtaining a first parameter adjustment result of the failover thread, thereby completing the failover of the thread; judging whether the switching identifier of the first parameter adjustment result meets a first condition; and when the switching identifier meets the first condition, monitoring recovery of the original primary domain name through the failover thread, and when the original primary domain name meets a second preset condition, performing a first switch-back operation through the client and obtaining a second parameter adjustment result according to the first parameter adjustment result.
In a second aspect, an embodiment of the present application provides a multi-machine-room failover system based on client anomaly counting, the system comprising: a first obtaining unit, configured to obtain a first initial parameter through the client and initialize the configuration of the client according to it to obtain a first initial parameter configuration result; a second obtaining unit, configured to obtain the first initial parameter configuration result through a server to obtain a first acquisition result; a first judging unit, configured to judge whether the first acquisition result meets a first preset condition; a first execution unit, configured to update the first acquisition result to the client's local storage when the first acquisition result meets the first preset condition; a third obtaining unit, configured to obtain a first call exception of the client, start a failover thread through the client based on an SDK package, and obtain a first parameter adjustment result of the failover thread, thereby completing the failover of the thread; a second judging unit, configured to judge whether the switching identifier of the first parameter adjustment result meets a first condition; and a fourth obtaining unit, configured to monitor recovery of the original primary domain name through the failover thread when the switching identifier meets the first condition, to perform a first switch-back operation through the client when the original primary domain name meets a second preset condition, and to obtain a second parameter adjustment result according to the first parameter adjustment result.
In a third aspect, the present invention provides a multi-machine-room failover system based on client anomaly counting, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method of the first aspect when executing the program.
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
The method obtains a first initial parameter through the client and initializes the configuration of the client according to it to obtain a first initial parameter configuration result; obtains the first initial parameter configuration result through the server to obtain a first acquisition result; judges whether the first acquisition result meets a first preset condition; when the first acquisition result meets the first preset condition, updates the first acquisition result to the client's local storage; obtains a first call exception of the client, whereupon the client starts a failover thread based on an SDK package and a first parameter adjustment result of the failover thread is obtained, completing the failover of the thread; judges whether the switching identifier of the first parameter adjustment result meets a first condition; and, when the switching identifier meets the first condition, monitors recovery of the original primary domain name through the failover thread, and when the original primary domain name meets a second preset condition, the client performs a first switch-back operation and a second parameter adjustment result is obtained according to the first parameter adjustment result.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the present application are more apparent, specific embodiments of the present application are set out below.
Drawings
Fig. 1 is a schematic flow chart of a multi-machine-room failover method based on client anomaly counting according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of obtaining the first parameter adjustment result in the multi-machine-room failover method based on client anomaly counting according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of obtaining the second parameter adjustment result in the multi-machine-room failover method based on client anomaly counting according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a multi-machine-room failover system based on client anomaly counting according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an exemplary electronic device according to an embodiment of the present application.
Description of reference numerals: first obtaining unit 11, second obtaining unit 12, first judging unit 13, first execution unit 14, third obtaining unit 15, second judging unit 16, fourth obtaining unit 17, electronic device 300, memory 301, processor 302, communication interface 303, bus architecture 304.
Detailed Description
The embodiments of the present application solve the technical problems of the prior art that, when a machine room fails and the IP address to which the domain name of the external service resolves is modified manually, the response time is long, fast failover cannot be achieved, a large number of requests fail under high concurrency, and the service availability of the cloud service provider is reduced. When a machine room of the cloud service provider, or the network link from the client to the server, fails, fast failover is achieved, the impact of the fault on cloud service availability is greatly reduced, and the original traffic distribution ratio among the machine rooms is restored automatically after the fault is cleared.
Summary of the application
In the DNS-based failover scheme, the cloud service provider registers several domain names and builds several machine rooms that provide identical cloud services. Each machine room has a public IP address, and each IP address has a corresponding domain name. Under normal conditions, each domain name serves a specific client group according to the provider's request-balancing policy. When one of the machine rooms fails, operations staff re-point the domain name of the failed machine room to the IP address of another available machine room to achieve failover. However, the time at which failover actually takes effect depends on the expiry time of each client's domain name resolution cache: the longer the expiry time set by the client, the longer failover takes to become effective. Consequently, in the prior art, when a machine room fails and the IP address to which the domain name of the external service resolves is modified manually, the response time is long, fast failover cannot be achieved, a large number of requests fail under high concurrency, and the service availability of the cloud service provider is reduced.
In view of the above technical problems, the overall idea of the technical solution provided by the present application is as follows:
An embodiment of the present application provides a multi-machine-room failover method based on client anomaly counting, the method including: obtaining a first initial parameter through the client, and initializing the configuration of the client according to it to obtain a first initial parameter configuration result; obtaining the first initial parameter configuration result through the server to obtain a first acquisition result; judging whether the first acquisition result meets a first preset condition; when the first acquisition result meets the first preset condition, updating the first acquisition result to the client's local storage; obtaining a first call exception of the client, starting a failover thread by the client based on an SDK package, and obtaining a first parameter adjustment result of the failover thread, thereby completing the failover of the thread; judging whether the switching identifier of the first parameter adjustment result meets a first condition; and when the switching identifier meets the first condition, monitoring recovery of the original primary domain name through the failover thread, and when the original primary domain name meets a second preset condition, performing a first switch-back operation through the client and obtaining a second parameter adjustment result according to the first parameter adjustment result.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Example 1
As shown in Fig. 1, an embodiment of the present application provides a multi-machine-room failover method based on client anomaly counting. The method is applied to a failover system that is communicatively connected to a server and a client, and the method includes:
S100: obtaining a first initial parameter through the client, and initializing the configuration of the client according to it to obtain a first initial parameter configuration result;
Further, step S100 includes:
S110: obtaining a local domain name list;
S120: obtaining a failover configuration, and taking the local domain name list and the failover configuration as the first initial parameter;
S130: initializing the configuration of the client with the first initial parameter to obtain the first initial parameter configuration result.
Specifically, the client, also called the user end, is a program that provides local services to the user. The server is the counterpart of the client and serves it, for example by providing resources to the client and storing client data. The server provides its service externally through several machine rooms; each machine room has a public IP address, each IP address has a corresponding domain name, and the server exposes an interface for obtaining the domain name list and the failover configuration. The local domain name list is the list of domain names corresponding to the machine rooms used by the server. Failover is the process of switching machine rooms when the network link from the client to a server machine room, or the server machine room itself, fails. The failover configuration specifies exception-count thresholds and time thresholds, that is, how many exceptions within what period trigger a machine-room switch; several pairs of exception-count and time thresholds may be configured. The local domain name list and the failover configuration together form the first initial parameter.
The client calls the cloud service through an SDK package. An SDK is a software development kit, that is, a collection of development tools that software engineers use to build application software for a particular software package, software framework, hardware platform, operating system, and so on. When the client program starts and uses the SDK package to call the cloud service, it obtains the local domain name list and the failover configuration, namely the first initial parameter, and uses them for configuration initialization, which yields the first initial parameter configuration result. As a non-limiting example: when the client program starts, the domain name list is configured and a default failover exception-count threshold and a default failover time threshold are obtained; configuration initialization can then be implemented in Java code to obtain the first initial parameter configuration result.
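As a non-limiting illustration, the configuration initialization of steps S110 to S130 might be sketched in Java as follows. The class `FailoverConfig`, its fields, the example domain names, and the default threshold values are assumptions made for this sketch and do not appear in the application.

```java
import java.time.Duration;
import java.util.List;

/** Hypothetical holder for the "first initial parameter": domain list plus failover thresholds. */
final class FailoverConfig {
    /** One switching rule: switch when more than maxExceptions occur within window. */
    static final class Threshold {
        final Duration window;
        final int maxExceptions;

        Threshold(Duration window, int maxExceptions) {
            this.window = window;
            this.maxExceptions = maxExceptions;
        }
    }

    final List<String> domains;       // ordered; index 0 is the primary domain name
    final List<Threshold> thresholds; // several count/time pairs may be configured

    FailoverConfig(List<String> domains, List<Threshold> thresholds) {
        if (domains.isEmpty()) {
            throw new IllegalArgumentException("at least one domain is required");
        }
        this.domains = List.copyOf(domains);
        this.thresholds = List.copyOf(thresholds);
    }

    /** Local defaults used at client start-up (placeholder values, assumed for this sketch). */
    static FailoverConfig localDefaults() {
        return new FailoverConfig(
            List.of("room-a.example.com", "room-b.example.com"),
            List.of(new Threshold(Duration.ofSeconds(5), 5)));
    }

    /** The currently used domain at start-up is the first domain of the list. */
    String primaryDomain() {
        return domains.get(0);
    }
}
```

Keeping the domain list ordered lets the first entry act as the primary domain name and the following entries act as switch targets.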
S200: obtaining the first initial parameter configuration result through the server to obtain a first acquisition result;
S300: judging whether the first acquisition result meets a first preset condition;
Specifically, when the client program starts, it obtains the local domain name list and the failover configuration for configuration initialization and, at the same time, starts a server configuration parameter initialization thread; the currently used domain name is set to the first available domain name in the domain name list. Once the server configuration initialization thread has started, the first initial parameter configuration result is obtained and taken as the first acquisition result. The first preset condition is set in advance and is that the server configuration has been obtained successfully. Judging whether the first acquisition result meets the first preset condition therefore means judging whether the server configuration has been obtained successfully.
S400: when the first acquisition result meets the first preset condition, updating the first acquisition result to the client's local storage;
S500: obtaining a first call exception of the client, starting a failover thread by the client based on an SDK package, and obtaining a first parameter adjustment result of the failover thread, thereby completing the failover of the thread;
Further, as shown in Fig. 2, step S500 includes:
S510: obtaining a start identifier parameter of the failover thread, and obtaining a first start identifier parameter adjustment result according to the start identifier parameter;
S520: obtaining a first exception time parameter, and obtaining a first exception time recording result according to the first exception time parameter;
S530: obtaining an exception count parameter, and obtaining a count adjustment result according to the exception count parameter;
S540: obtaining the first parameter adjustment result according to the first start identifier parameter adjustment result, the first exception time recording result and the count adjustment result.
Specifically, when the first acquisition result meets the first preset condition, the first acquisition result is updated to the client's local storage, which comprises memory and a file. The first call exception is the first exception that occurs when the client calls the cloud service; its causes include a server machine-room fault, a host fault, a fault in the network from the client to the server, and the like. After the first call exception of the client is obtained, the client starts a failover thread based on the SDK package and sets the failover thread start identifier to the on state. The SDK package is a software development kit, that is, a collection of development tools that software engineers use to build application software for a particular software package, software framework, hardware platform, operating system, and so on. The start identifier parameter of the failover thread is obtained, and the first start identifier parameter adjustment result is obtained according to it. The time of the first exception is recorded to obtain the first exception time parameter, from which the first exception time recording result is obtained. Exceptions are counted to obtain the exception count parameter, from which the count adjustment result is obtained. The first parameter adjustment result thus comprises the first start identifier parameter adjustment result, the first exception time recording result and the count adjustment result, which completes the failover of the thread.
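As a non-limiting illustration, the bookkeeping of steps S510 to S530 (start identifier, first exception time, exception count) might be sketched in Java as follows. The class `FailoverState` and its member names are assumptions made for this sketch. The use of atomic variables is likewise an assumption, chosen so that concurrent request threads can record exceptions without locking.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical state updated whenever a call to the cloud service throws. */
final class FailoverState {
    final AtomicBoolean threadStarted = new AtomicBoolean(false); // failover thread start identifier
    final AtomicLong firstExceptionMillis = new AtomicLong(-1);   // time of the first exception
    final AtomicInteger exceptionCount = new AtomicInteger(0);    // exception count

    /**
     * Records one call exception.
     * Returns true only for the call that flips the start identifier to "on",
     * which is the caller that should start the failover thread.
     */
    boolean recordException(long nowMillis) {
        exceptionCount.incrementAndGet();
        boolean first = threadStarted.compareAndSet(false, true);
        if (first) {
            firstExceptionMillis.set(nowMillis);
        }
        return first;
    }
}
```

Because `compareAndSet` succeeds for exactly one caller, only one failover thread would be started even when several requests fail at the same moment.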
S600: judging whether the switching identifier of the first parameter adjustment result meets a first condition;
S700: when the switching identifier meets the first condition, monitoring recovery of the original primary domain name through the failover thread, and when the original primary domain name meets a second preset condition, performing a first switch-back operation through the client and obtaining a second parameter adjustment result according to the first parameter adjustment result.
Specifically, the first condition is that the switching identifier equals 1. Whether the switching identifier of the first parameter adjustment result meets the first condition is judged, and when it does, recovery of the original primary domain name is monitored through the failover thread. The original primary domain name is the first domain name in the domain name list. The second preset condition is that a preset number of successful requests is reached within a preset time; it is used to judge whether the primary domain name is available again. Once this recovery condition is reached, the client performs the first switch-back operation, namely the domain name switch-back operation, so that subsequent requests are sent to the original primary domain name. A second parameter adjustment result is then obtained according to the first parameter adjustment result: the failover thread start identifier and the domain name switching identifier are set to off and the exception count is cleared, after which the failover thread runs to completion. In this way traffic is switched back automatically after the fault is cleared, and the original traffic allocation of the machine rooms is restored automatically. When a machine room of the cloud service provider, or the network link from the client to the server, fails, fast failover is achieved, the impact of the fault on cloud service availability is greatly reduced, and the original traffic distribution ratio among the machine rooms is restored automatically after the fault is cleared.
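As a non-limiting illustration, the second preset condition (a preset number of successful requests reached within a preset time) might be checked as in the following Java sketch. The class `PrimaryRecoveryMonitor` and its parameters are assumptions made for this sketch. How the failover thread probes the original primary domain name is not specified in the text and is left to the caller here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical check of the recovery condition for the original primary domain name. */
final class PrimaryRecoveryMonitor {
    private final int requiredSuccesses; // preset number of successful requests
    private final long windowMillis;     // preset time
    private final Deque<Long> successTimes = new ArrayDeque<>();

    PrimaryRecoveryMonitor(int requiredSuccesses, long windowMillis) {
        this.requiredSuccesses = requiredSuccesses;
        this.windowMillis = windowMillis;
    }

    /**
     * Records the outcome of one probe request to the primary domain.
     * Returns true once enough successes fall inside the time window,
     * at which point the client may perform the switch-back operation.
     */
    boolean onProbe(boolean success, long nowMillis) {
        if (success) {
            successTimes.addLast(nowMillis);
        }
        // Forget successes that are older than the preset time.
        while (!successTimes.isEmpty() && nowMillis - successTimes.peekFirst() > windowMillis) {
            successTimes.removeFirst();
        }
        return successTimes.size() >= requiredSuccesses;
    }
}
```

In this sketch a failed probe does not reset the success history; a stricter variant could clear it on failure, which would be a further assumption beyond the text.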
Further, the embodiment of the present application also includes:
S550: obtaining a first exception-count switching threshold, a second exception-count switching threshold and a third exception-count switching threshold;
S560: when the count adjustment result meets any one of the first, second and third exception-count switching thresholds, performing domain name switching according to the first initial parameter configuration result, and setting the domain name switching identifier parameter to on.
Specifically, after the first parameter adjustment result is obtained, the first, second and third exception-count switching thresholds are obtained. These thresholds are the conditions for switching on exceptions. Preferably, the first threshold triggers a switch when more than 5 exceptions occur within 5 seconds, the second when more than 7 exceptions occur within 10 seconds, and the third when more than 11 exceptions occur within 30 seconds. When the count adjustment result meets any one of the three thresholds, domain name switching is performed according to the first initial parameter configuration result: the client switches machine rooms by changing the currently used domain name to the next available domain name in the domain name list, and subsequent requests are sent to the new domain name, that is, to the other machine room corresponding to it. The domain name switching identifier parameter identifies the domain name switching state, and the client sets it to on.
With this method, abnormal calls are counted in real time without affecting the performance of the client's calls to the cloud service, and the client can switch autonomously at millisecond level according to the thresholds set in advance, which effectively safeguards cloud service availability.
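As a non-limiting illustration, steps S550 and S560 might be sketched in Java as a sliding window over exception timestamps, using the preferred threshold values given above. The class `ExceptionWindowSwitcher` and its members are assumptions made for this sketch. Treating "exceeding" as strictly greater than, wrapping around to the start of the list, and clearing the recorded timestamps after a switch are also assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical client-side switcher driven by exception counts in time windows. */
final class ExceptionWindowSwitcher {
    // {windowMillis, limit}: switch when more than `limit` exceptions fall inside the window.
    private static final long[][] RULES = {{5_000, 5}, {10_000, 7}, {30_000, 11}};
    private static final long LONGEST_WINDOW = 30_000;

    private final List<String> domains;
    private final Deque<Long> exceptionTimes = new ArrayDeque<>();
    private int current = 0;          // index of the currently used domain
    private boolean switched = false; // domain name switching identifier

    ExceptionWindowSwitcher(List<String> domains) {
        this.domains = List.copyOf(domains);
    }

    /** Records one call exception and switches domain if any threshold is exceeded. */
    synchronized void onException(long nowMillis) {
        exceptionTimes.addLast(nowMillis);
        // Drop timestamps that can no longer matter to any rule.
        while (!exceptionTimes.isEmpty()
                && nowMillis - exceptionTimes.peekFirst() > LONGEST_WINDOW) {
            exceptionTimes.removeFirst();
        }
        for (long[] rule : RULES) {
            long windowStart = nowMillis - rule[0];
            long inWindow = exceptionTimes.stream().filter(t -> t >= windowStart).count();
            if (inWindow > rule[1]) {
                current = (current + 1) % domains.size(); // next domain in the list
                switched = true;
                exceptionTimes.clear();
                return;
            }
        }
    }

    synchronized String currentDomain() {
        return domains.get(current);
    }

    synchronized boolean isSwitched() {
        return switched;
    }
}
```

The check runs inside the request path and touches only in-memory state, which is consistent with the stated aim of switching without waiting for a DNS cache to expire.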
Further, as shown in Fig. 3, step S700 includes:
S710: obtaining a second start identifier parameter adjustment result according to the start identifier parameter;
S720: obtaining a count-cleared adjustment result according to the exception count parameter;
S730: obtaining a domain name switching-off adjustment result according to the domain name switching identifier parameter;
S740: obtaining the second parameter adjustment result according to the second start identifier parameter adjustment result, the count-cleared adjustment result and the domain name switching-off adjustment result.
Specifically, the start identifier parameter identifies whether the client has started the failover thread; the failover thread start identifier is set to off according to this parameter, which gives the second start identifier parameter adjustment result. The exception count parameter is the number of exceptions counted from the first exception that occurs when the client calls the cloud service; the exception count is cleared according to this parameter, which gives the count-cleared adjustment result. The domain name switching identifier parameter identifies the domain name switching state; the domain name switching identifier is set to off according to this parameter, which gives the domain name switching-off adjustment result. The second start identifier parameter adjustment result, the count-cleared adjustment result and the domain name switching-off adjustment result together form the second parameter adjustment result.
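As a non-limiting illustration, the second parameter adjustment of steps S710 to S740 amounts to resetting three pieces of state after the switch-back, as in the following Java sketch. The class `SwitchBackState` and its fields are assumptions made for this sketch. A single-threaded view is assumed for brevity; a real client would need to guard these fields against concurrent access.

```java
/** Hypothetical client state before and after the switch-back operation. */
final class SwitchBackState {
    boolean failoverThreadStarted; // failover thread start identifier
    boolean domainSwitched;        // domain name switching identifier
    int exceptionCount;            // exception count
    int currentDomainIndex;        // 0 is the original primary domain name

    /** Builds the state a client would be in after it has failed over. */
    static SwitchBackState afterFailover(int exceptionCount, int domainIndex) {
        SwitchBackState s = new SwitchBackState();
        s.failoverThreadStarted = true;
        s.domainSwitched = true;
        s.exceptionCount = exceptionCount;
        s.currentDomainIndex = domainIndex;
        return s;
    }

    /** Switch-back: send requests to the primary domain again and reset all flags. */
    void switchBack() {
        currentDomainIndex = 0;        // subsequent requests go to the original primary domain
        failoverThreadStarted = false; // S710: start identifier set to off
        exceptionCount = 0;            // S720: exception count cleared
        domainSwitched = false;        // S730: switching identifier set to off
    }
}
```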
Further, step S300 includes:
S310: when the first acquisition result does not meet the first preset condition, putting the acquisition thread to sleep and obtaining a first sleep waiting time;
S320: when the first sleep waiting time reaches a first preset time, continuing to obtain the first initial parameter configuration result.
Specifically, the first preset condition is that the server configuration has been obtained successfully. When the first acquisition result does not meet the first preset condition, that is, when obtaining the configuration fails, the thread sleeps and waits, and the first sleep waiting time is obtained. This time is configurable; for example, the sleep time may be set to 2 minutes in Java code. The first preset time is the estimated time after which configuration acquisition can be expected to succeed; when the sleep time reaches the first preset time, acquisition of the first initial parameter configuration result continues.
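As a non-limiting illustration, the sleep-and-retry behaviour of steps S310 and S320 might be sketched in Java as follows. The class `ConfigFetcher`, the method `fetchWithRetry`, and the `maxAttempts` bound are assumptions made for this sketch. The text does not state whether the number of retries is limited; the bound is added only so that the example terminates.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical retry loop for the server configuration acquisition thread. */
final class ConfigFetcher {
    /**
     * Calls fetch until it yields a value, sleeping sleepMillis after each failure.
     * Returns an empty Optional if every attempt fails or the thread is interrupted.
     */
    static <T> Optional<T> fetchWithRetry(Supplier<Optional<T>> fetch,
                                          long sleepMillis,
                                          int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = fetch.get();
            if (result.isPresent()) {
                return result; // first preset condition met: caller updates the local copy
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis); // sleep-wait before the next attempt
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

With the 2-minute example from the text, a caller would pass `120_000` as `sleepMillis`.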
Further, the embodiment of the application further includes:
S810: when the switching identifier does not meet the first condition, judging whether the survival time of the failover thread meets a second condition according to the first parameter adjustment result;
S820: ending the failover thread when the survival time of the failover thread meets the second condition;
S830: when the survival time of the failover thread does not meet the second condition, putting the failover thread to sleep;
S840: when the switching identifier meets the first condition, obtaining a first time constraint condition and a first request success count constraint condition;
S850: and when the original main domain name simultaneously meets the first time constraint condition and the first request success count constraint condition, judging that the original main domain name meets the second preset condition, and performing the first switch-back operation.
Specifically, the first condition is that the switching identifier equals 1. Whether the switching identifier of the first parameter adjustment result meets the first condition is judged; when it does not, whether the survival time of the failover thread meets the second condition is judged according to the first parameter adjustment result. The second condition, checked through survival time monitoring, is that the survival time exceeds a preset threshold. When the survival time of the failover thread meets the second condition, the client performs the first switch-back operation, that is, the domain name switch-back, and the failover thread is ended. When the survival time does not meet the second condition, that is, it does not exceed the preset threshold, the failover thread sleeps. When the switching identifier meets the first condition, a first time constraint condition and a first request success count constraint condition, both set manually, are obtained. When the original main domain name meets both constraint conditions at the same time, it is judged to meet the second preset condition, and the client performs the first switch-back operation.
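As a non-limiting illustration, the decision made on each pass of the failover thread in steps S810 to S850 can be sketched in Java as a pure function. All names are hypothetical. The sketch assumes that the first time constraint condition is a minimum duration for which the original main domain name has been reachable and that the request success count constraint is a minimum number of successful requests; the application only states that both are set manually.

```java
// Hypothetical sketch: names and the meaning of the two constraints are assumptions.
final class FailoverMonitor {
    enum Action { SWITCH_BACK, KEEP_MONITORING, END_THREAD, SLEEP }

    static Action nextAction(int switchFlag,
                             long threadAliveMillis, long maxAliveMillis,
                             long recoveredMillis, long minRecoveredMillis,
                             int successCount, int minSuccessCount) {
        if (switchFlag == 1) { // first condition: the domain name has been switched
            boolean recovered = recoveredMillis >= minRecoveredMillis
                    && successCount >= minSuccessCount;
            // Both constraints must hold at the same time before switching back (S850).
            return recovered ? Action.SWITCH_BACK : Action.KEEP_MONITORING;
        }
        // No switch has happened: end the thread once its survival time exceeds the threshold (S820),
        // otherwise let it sleep (S830).
        return threadAliveMillis > maxAliveMillis ? Action.END_THREAD : Action.SLEEP;
    }
}
```

Keeping the decision free of side effects makes each branch easy to check in isolation; the caller would carry out the returned action.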
In summary, the multi-machine room fault transfer method and system based on the abnormal count of the client provided by the embodiment of the application have the following technical effects:
1. A first initial parameter is acquired through the client, and the configuration of the client is initialized according to the acquisition result to obtain a first initial parameter configuration result; the first initial parameter configuration result is acquired through the server to obtain a first acquisition result; whether the first acquisition result meets a first preset condition is judged; when the first acquisition result meets the first preset condition, the first acquisition result is updated to the local client; a first call exception of the client is acquired, the client starts a failover thread based on an SDK packet, and a first parameter adjustment result of the failover thread is acquired to complete the failover of the thread; whether the switching identifier of the first parameter adjustment result meets a first condition is judged; when the switching identifier meets the first condition, recovery of the original main domain name is monitored through the failover thread, and when the original main domain name meets the second preset condition, the client performs a first switch-back operation and a second parameter adjustment result is obtained according to the first parameter adjustment result.
2. By adopting the method of multistage configuration of the fault transfer strategy, network link faults between the client and the server machine room and faults of the server machine room can be effectively monitored and perceived, and the technical effect of fault transfer is realized.
Example two
Based on the same inventive concept as the multi-machine room failover method based on the client anomaly count in the foregoing embodiment, as shown in fig. 4, an embodiment of the present application provides a multi-machine room failover system based on the client anomaly count, where the system includes:
a first obtaining unit 11, where the first obtaining unit 11 is configured to obtain a first initial parameter through a client, and perform configuration initialization of the client according to an obtaining result, to obtain a first initial parameter configuration result;
a second obtaining unit 12, where the second obtaining unit 12 is configured to obtain, through a server, the first initial parameter configuration result, and obtain a first obtaining result;
a first judging unit 13, where the first judging unit 13 is configured to judge whether the first obtaining result meets a first preset condition;
a first execution unit 14, where the first execution unit 14 is configured to update the first acquisition result to the local client when the first acquisition result meets the first preset condition;
a third obtaining unit 15, where the third obtaining unit 15 is configured to obtain a first call exception of the client, start a failover thread by using the client based on an SDK packet, obtain a first parameter adjustment result of the failover thread, and complete failover of the thread;
a second judging unit 16, where the second judging unit 16 is configured to judge whether the switching identifier of the first parameter adjustment result meets a first condition;
and a fourth obtaining unit 17, where the fourth obtaining unit 17 is configured to perform primary domain name recovery monitoring through the failover thread when the switching identifier meets the first condition, perform a first back-switching operation through the client when the primary domain name meets a second preset condition, and obtain a second parameter adjustment result according to the first parameter adjustment result.
Further, the system includes:
a fifth obtaining unit, configured to obtain a local domain name list;
the second execution unit is used for obtaining a failover configuration, and the local domain name list and the failover configuration are used as the first initial parameters;
and the sixth obtaining unit is used for initializing the configuration of the client through the first initial parameters and obtaining the configuration result of the first initial parameters.
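As a non-limiting illustration, the first initial parameters (a local domain name list plus a failover configuration) can be sketched in Java as an immutable configuration object. The names are hypothetical, the single `switchThreshold` field is only a stand-in for the failover configuration, and listing the original main domain name first is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names and fields are illustrative assumptions.
final class ClientConfig {
    final List<String> domains;   // local domain name list, original main domain name first (assumed)
    final int switchThreshold;    // stand-in for the failover configuration

    ClientConfig(List<String> domains, int switchThreshold) {
        if (domains == null || domains.isEmpty()) {
            throw new IllegalArgumentException("at least one domain name is required");
        }
        // Defensive copy so later changes to the caller's list do not affect the client.
        this.domains = Collections.unmodifiableList(new ArrayList<>(domains));
        this.switchThreshold = switchThreshold;
    }

    /** Returns the original main domain name, used until a switch occurs. */
    String mainDomain() {
        return domains.get(0);
    }
}
```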
Further, the system includes:
a seventh obtaining unit, configured to obtain a start identifier parameter of the failover thread, and obtain a first start identifier parameter adjustment result according to the start identifier parameter;
an eighth obtaining unit, configured to obtain a first abnormal time parameter, and obtain a first abnormal parameter time recording result according to the first abnormal time parameter;
a ninth obtaining unit, configured to obtain an abnormal count parameter, and obtain a count adjustment result according to the abnormal count parameter;
a tenth obtaining unit, configured to obtain the first parameter adjustment result according to the first start identification parameter adjustment result, the first abnormal parameter time recording result, and the count adjustment result.
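As a non-limiting illustration, the first parameter adjustment result (start identification, first abnormal time and abnormal count) can be sketched in Java as the state updated on each call exception. The names are hypothetical, and using -1 to mean "no anomaly recorded yet" is an assumption.

```java
// Hypothetical sketch: names and the -1 sentinel are illustrative assumptions.
final class AnomalyRecorder {
    boolean failoverThreadStarted;  // start identification parameter
    long firstAbnormalMillis = -1;  // first abnormal time parameter; -1 means none recorded
    int abnormalCount;              // abnormal count parameter

    /**
     * Records one call exception.
     * Returns true only for the exception that should start the failover thread.
     */
    synchronized boolean recordException(long nowMillis) {
        abnormalCount++;
        if (failoverThreadStarted) {
            return false; // thread already running: only the count changes
        }
        failoverThreadStarted = true;
        firstAbnormalMillis = nowMillis;
        return true;
    }
}
```

A caller would typically pass `System.currentTimeMillis()` as `nowMillis` and start the failover thread when the method returns true.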
Further, the system includes:
an eleventh obtaining unit, configured to obtain a first abnormal-number switching threshold, a second abnormal-number switching threshold, and a third abnormal-number switching threshold;
and the third execution unit is used for carrying out domain name switching according to the first initial parameter configuration result and adjusting the domain name switching identification parameter to be on when the counting adjustment result meets any one of the first abnormal frequency switching threshold, the second abnormal frequency switching threshold and the third abnormal frequency switching threshold.
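As a non-limiting illustration, threshold-driven domain name switching can be sketched in Java as follows. The names are hypothetical. The sketch assumes that "meets" a threshold means the count reaches it exactly, so that each of the three thresholds can trigger one switch to the next domain name in the list; the application does not define this, and other readings are possible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and the "count equals threshold" reading are assumptions.
final class DomainSwitcher {
    private final List<String> domains;
    private final int[] thresholds;
    private int activeIndex = 0;
    boolean domainSwitched = false; // domain name switching identification parameter

    DomainSwitcher(List<String> domains, int... thresholds) {
        if (domains.isEmpty()) {
            throw new IllegalArgumentException("at least one domain name is required");
        }
        this.domains = new ArrayList<>(domains);
        this.thresholds = thresholds.clone();
    }

    /** Returns the domain name to use after the abnormal count reaches the given value. */
    synchronized String onAbnormalCount(int abnormalCount) {
        for (int threshold : thresholds) {
            if (abnormalCount == threshold) {
                activeIndex = (activeIndex + 1) % domains.size(); // move to the next machine room
                domainSwitched = true;                            // switching identification set to on
                break;
            }
        }
        return domains.get(activeIndex);
    }
}
```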
Further, the system includes:
a twelfth obtaining unit, configured to obtain a second startup identification parameter adjustment result according to the startup identification parameter;
a thirteenth obtaining unit for obtaining a count clear adjustment result according to the anomaly count parameter;
a fourteenth obtaining unit, configured to obtain a domain name switching closing adjustment result according to the domain name switching identifier parameter;
a fifteenth obtaining unit, configured to obtain the second parameter adjustment result according to the second start identifier parameter adjustment result, the count zero adjustment result, and the domain name switching off adjustment result.
Further, the system includes:
a sixteenth obtaining unit, configured to, when the first obtaining result does not meet the first preset condition, perform sleep waiting on an obtaining thread, and obtain a first sleep waiting time;
and the fourth execution unit is used for continuing to acquire the first initial parameter configuration result when the first dormancy waiting time meets a first preset time.
Further, the system includes:
the third judging unit judges whether the survival time of the transfer thread meets a second condition according to the first parameter adjustment result when the switching identification does not meet the first condition;
a fifth execution unit, configured to end the failover thread when the survival time of the failover thread meets a second condition;
the sixth execution unit is used for dormancy of the transfer thread when the survival time of the transfer thread does not meet a second condition;
a seventeenth obtaining unit, configured to obtain a first time constraint condition and a first request success count constraint condition when the switching identifier meets the first condition;
and the seventh execution unit is used for judging that the original main domain name meets the second preset condition and performing the first switch-back operation when the original main domain name simultaneously meets the first time constraint condition and the first request success count constraint condition.
Exemplary electronic device
An electronic device of an embodiment of the present application is described below with reference to fig. 5.
Based on the same inventive concept as the multi-machine room failover method based on the abnormal count of the client in the foregoing embodiment, the embodiments of the present application further provide a multi-machine room failover system based on the abnormal count of the client, including: a processor coupled to a memory for storing a program that, when executed by the processor, causes the system to perform the method of any of the first aspects.
The electronic device 300 includes: a processor 302, a communication interface 303, a memory 301. Optionally, the electronic device 300 may also include a bus architecture 304. The communication interface 303, the processor 302 and the memory 301 may be interconnected by the bus architecture 304; the bus architecture 304 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus architecture 304 may be divided into address buses, data buses, control buses, and the like. For ease of illustration, only one thick line is shown in fig. 5, but this does not mean that there is only one bus or one type of bus.
Processor 302 may be a CPU, microprocessor, ASIC, or one or more integrated circuits for controlling the execution of the programs of the present application.
The communication interface 303 uses any transceiver-like means for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), a wireless local area network (WLAN), a wired access network, etc.
The memory 301 may be, but is not limited to, ROM or another type of static storage device that may store static information and instructions, RAM or another type of dynamic storage device that may store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be self-contained and coupled to the processor through the bus architecture 304. The memory may also be integrated with the processor.
The memory 301 is used for storing computer-executable instructions for executing the embodiments of the present application, and is controlled by the processor 302 to execute the instructions. The processor 302 is configured to execute computer-executable instructions stored in the memory 301, thereby implementing a multiple-room failover method based on the client anomaly count provided in the above-described embodiments of the present application.
Alternatively, the computer-executable instructions in the embodiments of the present application may be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
The embodiment of the application provides a multi-machine room fault transfer method based on client anomaly counting, wherein the method comprises the following steps: acquiring a first initial parameter through the client, and initializing the configuration of the client according to the acquired result to acquire a first initial parameter configuration result; acquiring the first initial parameter configuration result through the server to acquire a first acquisition result; judging whether the first acquisition result meets a first preset condition or not; when the first acquisition result meets the first preset condition, updating the first acquisition result to the local client; acquiring a first call exception of the client, starting a failover thread by the client based on an SDK packet, acquiring a first parameter adjustment result of the failover thread, and completing the failover of the thread; judging whether the switching identifier of the first parameter adjustment result meets a first condition or not; and when the switching identifier meets the first condition, carrying out original main domain name recovery monitoring through the failover thread, and when the original main domain name meets the second preset condition, carrying out a first switch-back operation through the client, and obtaining a second parameter adjustment result according to the first parameter adjustment result.
Those of ordinary skill in the art will appreciate that the ordinal terms "first", "second", and the like used in this application are merely for convenience of description; they are not intended to limit the scope of the embodiments of the present application, nor to indicate a sequence. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate that A exists alone, that A and B exist together, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one" means one or more, and "at least two" means two or more. "At least one", "any one", or a similar expression refers to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or plural.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)).
The various illustrative logical blocks and circuits described in the embodiments of the present application may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the general purpose processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in the embodiments of the present application may be embodied directly in hardware, in a software element executed by a processor, or in a combination of the two. The software elements may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. In an example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may reside in a terminal. In the alternative, the processor and the storage medium may reside in different components in a terminal. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the present application has been described in connection with specific features and embodiments thereof, it will be apparent that various modifications and combinations can be made without departing from the spirit and scope of the application. Accordingly, the specification and drawings are merely exemplary illustrations of the present application as defined in the appended claims and are considered to cover any and all modifications, variations, combinations, or equivalents that fall within the scope of the present application. It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to include such modifications and variations.

Claims (6)

1. A multi-machine room failover method based on client anomaly counting, the method being applied to a failover system communicatively connected to a server and a client, the method comprising:
acquiring a first initial parameter through the client, and initializing the configuration of the client according to the acquired result to acquire a first initial parameter configuration result;
acquiring the first initial parameter configuration result through the server to acquire a first acquisition result;
judging whether the first acquisition result meets a first preset condition or not, wherein the first preset condition is that the server side is successfully configured;
when the first acquisition result meets the first preset condition, updating the first acquisition result to the local client;
acquiring a first call exception of the client, starting a failover thread by the client based on an SDK packet, acquiring a first parameter adjustment result of the failover thread, and completing the failover of the thread;
judging whether the switching mark of the first parameter adjustment result meets a first condition or not;
when the switching identifier meets the first condition, carrying out primary domain name recovery monitoring through the failover thread, and when the primary domain name meets a second preset condition, carrying out first switching back operation through the client, and obtaining a second parameter adjustment result according to the first parameter adjustment result;
the step of obtaining the first initial parameter through the client, the step of initializing the configuration of the client according to the obtained result, and the step of obtaining the first initial parameter configuration result further comprises the following steps:
obtaining a local domain name list;
obtaining a failover configuration, taking the local domain name list and the failover configuration as the first initial parameter;
initializing the configuration of the client through the first initial parameters to obtain a configuration result of the first initial parameters;
the obtaining the first parameter adjustment result of the failover thread further includes:
acquiring a starting identification parameter of the fault transfer thread, and acquiring a first starting identification parameter adjustment result according to the starting identification parameter;
obtaining a first abnormal time parameter, and obtaining a first abnormal parameter time recording result according to the first abnormal time parameter;
obtaining an abnormal counting parameter, and obtaining a counting adjustment result according to the abnormal counting parameter;
obtaining a first parameter adjustment result according to the first starting identification parameter adjustment result, the first abnormal parameter time recording result and the counting adjustment result;
the method further comprises the steps of:
obtaining a first abnormal frequency switching threshold, a second abnormal frequency switching threshold and a third abnormal frequency switching threshold;
and when the counting adjustment result meets any one threshold value of the first abnormal times switching threshold value, the second abnormal times switching threshold value and the third abnormal times switching threshold value, domain name switching is carried out according to the first initial parameter configuration result, and the domain name switching identification parameter is adjusted to be on.
2. The method of claim 1, wherein the performing, by the client, a first switch-back operation and obtaining a second parameter adjustment result according to the first parameter adjustment result, further comprises:
obtaining a second starting identification parameter adjustment result according to the starting identification parameter;
obtaining a count zero clearing adjustment result according to the abnormal count parameter;
obtaining a domain name switching closing adjustment result according to the domain name switching identification parameter;
and obtaining the second parameter adjustment result according to the second starting identification parameter adjustment result, the count zero clearing adjustment result and the domain name switching closing adjustment result.
3. The method of claim 1, wherein the determining whether the first acquisition result meets a first preset condition further comprises:
when the first obtaining result does not meet the first preset condition, carrying out dormancy waiting on the obtaining thread to obtain a first dormancy waiting time;
and when the first dormancy waiting time meets a first preset time, continuing to acquire the first initial parameter configuration result.
4. The method of claim 1, wherein the method further comprises:
when the switching identifier does not meet the first condition, judging whether the survival time of the failover thread meets a second condition according to the first parameter adjustment result;
ending the failover thread when the failover thread survival time meets a second condition;
when the survival time of the fault transfer thread does not meet a second condition, dormancy is performed on the fault transfer thread;
when the switching identifier meets the first condition, a first time constraint condition and a first request success count constraint condition are obtained;
and when the original main domain name simultaneously meets the first time constraint condition and the first request success count constraint condition, judging that the original main domain name meets the second preset condition, and performing the first switch-back operation.
5. A multiple-room failover system based on client anomaly counting, the system comprising:
the first obtaining unit is used for obtaining a first initial parameter through the client, carrying out configuration initialization of the client according to the obtained result and obtaining a first initial parameter configuration result;
the second obtaining unit is used for obtaining the first initial parameter configuration result through a server to obtain a first obtaining result;
the first judging unit is used for judging whether the first acquisition result meets a first preset condition or not, wherein the first preset condition is that the server side is successfully configured;
the first execution unit is used for updating the first acquisition result to the local client when the first acquisition result meets the first preset condition;
the third obtaining unit is used for obtaining a first call exception of the client, starting a failover thread based on an SDK packet through the client, obtaining a first parameter adjustment result of the failover thread, and completing the failover of the thread;
the second judging unit is used for judging whether the switching mark of the first parameter adjustment result meets a first condition or not;
the fourth obtaining unit is used for carrying out original main domain name recovery monitoring through the failover thread when the switching identifier meets the first condition, carrying out first back-cut operation through the client when the original main domain name meets the second preset condition, and obtaining a second parameter adjustment result according to the first parameter adjustment result;
the system further comprises:
a fifth obtaining unit, configured to obtain a local domain name list;
the second execution unit is used for obtaining a failover configuration, and the local domain name list and the failover configuration are used as the first initial parameters;
a sixth obtaining unit, configured to initialize the configuration of the client through the first initial parameter, to obtain a configuration result of the first initial parameter;
the system further comprises:
a seventh obtaining unit, configured to obtain a start identifier parameter of the failover thread, and obtain a first start identifier parameter adjustment result according to the start identifier parameter;
an eighth obtaining unit, configured to obtain a first abnormal time parameter, and obtain a first abnormal parameter time recording result according to the first abnormal time parameter;
a ninth obtaining unit, configured to obtain an abnormal count parameter, and obtain a count adjustment result according to the abnormal count parameter;
a tenth obtaining unit, configured to obtain the first parameter adjustment result according to the first start identification parameter adjustment result, the first abnormal parameter time recording result, and the count adjustment result;
the system further comprises:
an eleventh obtaining unit, configured to obtain a first abnormal-number switching threshold, a second abnormal-number switching threshold, and a third abnormal-number switching threshold;
and the third execution unit is used for carrying out domain name switching according to the first initial parameter configuration result and adjusting the domain name switching identification parameter to be on when the counting adjustment result meets any one of the first abnormal frequency switching threshold, the second abnormal frequency switching threshold and the third abnormal frequency switching threshold.
6. A multi-machine room failover system based on client anomaly counting, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1-4 when executing the program.
CN202111508502.1A 2021-12-10 2021-12-10 Multi-machine room fault transfer method and system based on abnormal counting of clients Active CN114205220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111508502.1A CN114205220B (en) 2021-12-10 2021-12-10 Multi-machine room fault transfer method and system based on abnormal counting of clients

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111508502.1A CN114205220B (en) 2021-12-10 2021-12-10 Multi-machine room fault transfer method and system based on abnormal counting of clients

Publications (2)

Publication Number Publication Date
CN114205220A CN114205220A (en) 2022-03-18
CN114205220B true CN114205220B (en) 2024-04-05

Family

ID=80652207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111508502.1A Active CN114205220B (en) 2021-12-10 2021-12-10 Multi-machine room fault transfer method and system based on abnormal counting of clients

Country Status (1)

Country Link
CN (1) CN114205220B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107204873A (en) * 2017-05-04 2017-09-26 网宿科技股份有限公司 A kind of method and relevant device for switching target domain name resolution server
CN113301177A (en) * 2021-04-27 2021-08-24 百果园技术(新加坡)有限公司 Domain name anti-blocking method and device
CN113347020A (en) * 2021-04-29 2021-09-03 上海淇玥信息技术有限公司 Domain name service disaster recovery method, system, device and medium

Also Published As

Publication number Publication date
CN114205220A (en) 2022-03-18

Similar Documents

Publication Publication Date Title
US20210176143A1 (en) Monitoring wireless access point events
CN106789153B (en) Multi-channel self-adaptive log recording and outputting method and system for terminal equipment of Internet of things system
US20180217852A1 (en) System service reloading method and apparatus
CN106341270B (en) A kind of fault handling method and device
CN105589712A (en) BMC module updating method and apparatus
CN111641733A (en) Network bridge equipment management method and device and readable storage medium
CN112261133A (en) CDN node control method, device, server and storage medium
CN114884840A (en) Application health state checking method and electronic equipment
US10931569B2 (en) Internet reachability detection and internet high availability for multi-homed network devices
US10505787B2 (en) Automatic recovery in remote management services
CN114745413B (en) Access control method and device for server, computer equipment and storage medium
CN110099015B (en) Method executed by network switching equipment, network switching equipment and medium
CN114205220B (en) Multi-machine room fault transfer method and system based on abnormal counting of clients
CN106911508B (en) DNS configuration recovery method and device
CN112889305B (en) Short-term lease allocation for network address conflict reduction in DHCP failover deployments
CN111817953A (en) Method and device for electing master equipment based on Virtual Router Redundancy Protocol (VRRP)
CN116886286A (en) Big data authentication service self-adaption method, device and equipment
CN118743203A (en) Network controller, fault injection communication protocol and fault injection module for a production network environment
CN114945177A (en) Dual-cloud-card communication method, electronic equipment and machine-readable storage medium
CN114006935B (en) Private network terminal network access method, device and equipment
CN117835099B (en) FTTR-based fault self-diagnosis and self-repair method, FTTR-based fault self-diagnosis and self-repair device, FTTR-based fault self-diagnosis and self-repair equipment and medium
US20240372891A1 (en) Automated remediation of denial-of-service attacks in cloud-based 5g networks
US11038836B2 (en) Computer server and method of obtaining information on network connection of computer server
CN113839815A (en) Server network port configuration method and device, server and readable storage medium
CN117560744A (en) Control method, device, equipment and medium for cold start of client

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240226

Address after: No. 96, 8th Floor, Building 7, No. 30 Shixing Street, Shijingshan District, Beijing, 100041

Applicant after: Beijing Bairong Ruibo Technology Co.,Ltd.

Country or region after: China

Address before: Floors 1-3, Block A, Global Creative Plaza, No. 10 Furong Street, Chaoyang District, Beijing 100102

Applicant before: Beijing Rongda Tianxia Information Technology Co.,Ltd.

Country or region before: China

GR01 Patent grant