CN105335362A - Real-time data processing method and system, and instant processing system - Google Patents

Info

Publication number
CN105335362A
Authority
CN
China
Prior art keywords
node
real
processing system
time
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410229319.1A
Other languages
Chinese (zh)
Other versions
CN105335362B (en)
Inventor
王永伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Network Technology Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201410229319.1A
Publication of CN105335362A
Application granted
Publication of CN105335362B
Status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Processing Of Solid Wastes (AREA)

Abstract

The present application provides a real-time data processing method and system, and an instant processing system. According to the embodiments of the present application, a merge node acquires real-time data sent by a real-time processing system and then, according to a distribution policy, determines one node among at least one local node, or among the merge node and the at least one local node, as a processing node. The processing node writes the real-time data into the instant processing system, so that the instant processing system can perform flexible computation and query on real-time data.

Description

Real-time data processing method and system, and instant processing system
[ technical field ]
The present application relates to data processing technologies, and in particular, to a method and a system for processing real-time data, and an instant processing system.
[ background of the invention ]
The instant processing system can perform flexible calculation or query on offline data, that is, calculation whose rules cannot be predicted in advance. Examples include calculating the male-to-female ratio of the honest and non-honest members among the members, calculating the proportion of transaction medals of the honest and non-honest members among the members, or calculating the proportion of field authentication of the honest and non-honest members among the members.
However, existing instant processing systems cannot perform flexible calculation or query on real-time data.
[ summary of the invention ]
Various aspects of the present application provide a method and a system for processing real-time data, and an instant processing system, so that the instant processing system can flexibly calculate or query real-time data. The solution is particularly suitable for calculating or querying mass real-time data.
In one aspect of the present application, a method for processing real-time data is provided, where the method is applied to an instant processing system, where the instant processing system includes a merge node and at least one local node, and the method includes:
the merging node obtains real-time data sent by a real-time processing system;
the merging node determines a node as a processing node according to a distribution strategy, and the node is used for writing the real-time data into the instant processing system; wherein,
the processing node comprises the merge node or one of the at least one local node.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the merging node determines a node according to the distribution policy, the method further includes:
and the processing node writes the real-time data into the instant processing system.
The above aspect and any possible implementation manner further provide an implementation manner in which the writing, by the processing node, of the real-time data into the instant processing system includes:
the processing node creates a full-text index file of the real-time data;
and the processing node writes the full-text index file into the instant processing system.
The above aspect and any possible implementation manner further provide an implementation manner in which the writing, by the processing node, of the full-text index file into the instant processing system includes:
the processing node monitors the state condition of the real-time data;
and if the state condition meets a first writing condition, the processing node writes the full-text index file into a fast storage device of the instant processing system.
The foregoing aspects and any possible implementations further provide an implementation, where if the status condition satisfies a first writing condition, the writing, by the processing node, the full-text index file into a fast storage device of the instant processing system includes:
and if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into a fast storage device of the instant processing system.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where the processing node writes the full-text index file into the instant processing system, and further includes:
and if the state condition meets a second writing condition, the processing node writes the full-text index file into a slow storage device of the instant processing system.
The foregoing aspects and any possible implementations further provide an implementation, where if the status condition satisfies a second writing condition, the writing, by the processing node, the full-text index file into a slow storage device of the instant processing system includes:
and if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, the processing node writes the full-text index file into a slow storage device of the instant processing system.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the processing node monitors the status condition of the real-time data, the method further includes:
and if the state condition meets at least one of the first writing condition and the second writing condition, starting a new query engine by the processing node for carrying out instant query on the real-time data.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the processing node starts a new query engine to perform an instant query on the real-time data, the method further includes:
the merging node receives a data query request, wherein the data query request comprises query conditions;
the merging node distributes a data query request to the at least one local node;
each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result, and return the query result to the merge node;
and the merging node merges the query results to obtain a final query result.
In another aspect of the present application, there is provided an instant processing system comprising a merge node and at least one local node, wherein,
the merging node is used for acquiring real-time data sent by the real-time processing system; determining a processing node according to the distribution strategy; wherein the processing node comprises one of the merge node or the at least one local node;
and the processing node is used for writing the real-time data into the instant processing system.
The above-described aspects and any possible implementations further provide an implementation in which the merge node and the at least one local node form a distributed cloud architecture.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is specifically configured to:
create a full-text index file of the real-time data; and
write the full-text index file into the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is specifically configured to:
monitor the state condition of the real-time data; and
if the state condition meets a first writing condition, write the full-text index file into a fast storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is specifically configured to:
if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, write the full-text index file into a fast storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is specifically configured to:
if the state condition meets a second writing condition, write the full-text index file into a slow storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is specifically configured to:
if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, write the full-text index file into a slow storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is further configured to:
if the state condition meets at least one of the first writing condition and the second writing condition, start a new query engine for performing instant query on the real-time data.
The above-mentioned aspects and any possible implementation further provide an implementation in which the merge node is further configured to:
receive a data query request, wherein the data query request comprises query conditions;
distribute the data query request to the at least one local node, so that each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively executes the calculation operation corresponding to the query condition to obtain a query result;
obtain the query results; and
merge the query results to obtain a final query result.
In another aspect of the present application, a real-time data processing system is provided, which includes a real-time processing system and the instant processing system provided in the above aspect; wherein,
and the real-time processing system is used for sending the real-time data to the instant processing system.
According to the above technical solution, the merge node obtains the real-time data sent by the real-time processing system and then, according to the distribution policy, determines one node among the at least one local node, or among the merge node and the at least one local node, as the processing node, so that the processing node can write the real-time data into the instant processing system and the instant processing system can flexibly calculate or query the real-time data.
In addition, by adopting the technical solution provided by the present application, the instant processing system can flexibly calculate or query the real-time data, so that the flexibility of instant queries on real-time data can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and those skilled in the art can also obtain other drawings according to the drawings without inventive labor.
Fig. 1 is a schematic flowchart of a method for processing real-time data according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an instant processing system according to another embodiment of the present application;
Fig. 3 is a schematic structural diagram of a real-time data processing system according to another embodiment of the present application.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the term "and/or" herein merely describes an association relationship between associated objects and means that three kinds of relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
Fig. 1 is a schematic flowchart of a method for processing real-time data according to an embodiment of the present application. As shown in Fig. 1, the method is applied to an instant processing system which, as shown in Fig. 2, includes a merge node and at least one local node.
101. The merge node acquires real-time data sent by the real-time processing system.
Optionally, in a possible implementation manner of this embodiment, the real-time data received by the merge node may be stream data (StreamingData) generated by a real-time data source directly received by a real-time processing system, for example, stream data generated by an application system such as a network monitoring system, a financial analysis system, a traffic flow prediction system, and the world wide Web (Web for short), or may also be a calculation result obtained by the real-time processing system performing calculation, such as summation, according to a fixed rule on the stream data generated by the received real-time data source, which is not particularly limited in this embodiment.
Specifically, the real-time processing system may send a data write request by calling an Application Programming Interface (API) of the instant processing system, and the data write request, which includes the real-time data, is received by a request processing component of the instant processing system. The request processing component sends the data write request to the merge node, and the merge node is responsible for processing the data write request.
The merge node may be a pre-configured fixed node, or may also be a local node randomly selected by the request processing component, or may also be a local node selected by the request processing component according to an election policy, which is not particularly limited in this embodiment.
102. The merge node determines a node as a processing node according to a distribution policy, the processing node being used for writing the real-time data into the instant processing system.
Wherein the processing node comprises one of the merge node or the at least one local node.
Optionally, in a possible implementation manner of this embodiment, if the merge node is a pre-configured fixed node, in 102, the merge node may determine a node from the at least one local node according to a distribution policy, so as to serve as the processing node. In this case, the processing node may be one of the at least one local node.
Optionally, in a possible implementation manner of this embodiment, if the merge node is a local node randomly selected by the request processing component, or a local node selected by the request processing component according to an election policy, in 102, the merge node may determine a node from the merge node and the at least one local node according to a distribution policy, so as to serve as the processing node. In this case, the processing node may be the merge node or one of the at least one local node.
Specifically, the distribution policy may include, but is not limited to, a hash operation policy and a polling policy, which is not particularly limited in this embodiment.
For example, the merge node may perform a hash operation on identification Information (ID) of the real-time data to determine a corresponding node as a processing node; or, for another example, the merge node may further adopt a polling policy to sequentially select one node as a processing node; this embodiment is not particularly limited.
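The two distribution policies named above can be illustrated with a minimal sketch. The source does not specify a hash function or a node naming scheme, so the CRC32-modulo mapping, the function name `pick_by_hash`, the class name `PollingPicker`, and the node labels are all assumptions made for illustration only.

```python
import itertools
import zlib
from typing import List


def pick_by_hash(record_id: str, nodes: List[str]) -> str:
    """Map a record ID to a node with a stable hash (assumed: CRC32 modulo node count)."""
    # zlib.crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return nodes[zlib.crc32(record_id.encode("utf-8")) % len(nodes)]


class PollingPicker:
    """Polling policy: select nodes one after another in a fixed cyclic order."""

    def __init__(self, nodes: List[str]) -> None:
        self._cycle = itertools.cycle(nodes)

    def pick(self) -> str:
        return next(self._cycle)
```

With the hash policy, the same ID always maps to the same node as long as the node list is unchanged; with the polling policy, successive calls to `pick()` walk through the node list and wrap around at the end.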
Optionally, in a possible implementation manner of this embodiment, after 102, the processing node writes the real-time data into the instant processing system.
In particular, the merge node may deliver a data write request to a processing node. It is understood that, if the merge node is a pre-configured fixed node, the merge node may specifically forward the data write request to the processing node; if the merge node is a local node randomly selected by the request processing component or a local node selected by the request processing component according to the election policy, the merge node may specifically not forward any more, and directly perform subsequent operations on the data write request.
After receiving the data write request, the processing node may specifically create a full-text index file of the real-time data. The processing node may then write the full-text index file to the instant processing system. Specifically, the detailed description of the method for creating the full-text index file can refer to the related contents in the prior art, and is not repeated here.
Therefore, the processing node creates the full-text index file of the real-time data and writes the full-text index file into the instant processing system, so that the full-text index file of the real-time data can be used for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
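The source defers the creation of the full-text index file to the prior art. As a rough illustration of why an index avoids scanning all the real-time data, the following sketch builds a simple in-memory inverted index. The tokenizer, the function names, and the AND semantics of `search` are assumptions for illustration, not the method of the application.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index: term -> set of IDs of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the IDs of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A query then touches only the posting sets of its own terms instead of every stored record.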
The term "real-time processing" as used herein, also referred to as stream processing, i.e., in-process query and calculation, refers to processing performed continuously while the application is running. For example, during the Double 11 promotion of a large e-commerce website, the current transaction amount is calculated at any time. At Alibaba, this application is called the trade live room. Data streams are generated continuously over time, and a calculation is performed once per time interval of the data stream (which can be set on the order of seconds, minutes, and so on, according to the application).
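The per-interval calculation described for real-time processing can be sketched as a fixed-window sum. The event layout `(timestamp, amount)` and the function name `sum_per_window` are hypothetical, and a real stream processor would emit each window as it closes rather than iterate over a finished collection.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sum_per_window(events: Iterable[Tuple[float, float]], window_seconds: float) -> Dict[int, float]:
    """Sum transaction amounts per fixed-length time window.

    Each event is (timestamp_in_seconds, amount). The result is keyed by the
    window number, i.e. floor(timestamp / window_seconds).
    """
    totals: Dict[int, float] = defaultdict(float)
    for timestamp, amount in events:
        totals[int(timestamp // window_seconds)] += amount
    return dict(totals)
```

The rule (summation per interval) is fixed in advance, which is what distinguishes this kind of processing from the flexible queries handled by the instant processing system.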
The "immediate processing" referred to herein means processing performed in a short time after the application is run. For example, the user makes a request, completes the internal computation of the application in a short time, and then returns the result.
After the processing node creates the full-text index file of the real-time data, the full-text index file can be immediately written into the instant processing system, or the full-text index file can not be immediately written into the instant processing system, but the full-text index file is written into the instant processing system after the real-time data meets certain conditions. Therefore, the system overhead of the instant processing system can be effectively reduced.
For example, the processing node may selectively execute the operation of writing the full-text index file into the instant processing system by monitoring a status condition of the real-time data.
For example, if the status condition satisfies a first writing condition, for example, the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or, for example, the amount of the real-time data reaches a first maximum document number, maxDoc1, etc., the processing node may write the full-text index file into a fast storage device, for example, a memory, of the instant processing system. The memory of the instant processing system may be a memory of a computer, or may also be a running memory of a mobile phone, i.e., a system memory, such as a Random Access Memory (RAM), which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a soft commit operation.
For another example, if the status condition satisfies a second writing condition, for example, the receiving time of the real-time data reaches a second maximum visible time, i.e., maxTime2, or the amount of the real-time data reaches a second maximum document number, i.e., maxDoc2, the processing node may write the full-text index file into a slow storage device of the instant processing system, for example, a hard disk. The slow storage device may also be the non-running storage of a mobile phone, i.e., its physical storage, such as a Read-Only Memory (ROM) or a memory card, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the second writing condition may also be referred to as a hard commit operation.
Or, for another example, if the status condition satisfies at least one of the first writing condition and the second writing condition, the processing node may further start a new query engine to perform instant queries on the real-time data. Here, the first writing condition may be, for example, that the receiving time of the real-time data reaches the first maximum visible time, maxTime1, or that the amount of the real-time data reaches the first maximum document number, maxDoc1; the second writing condition may be, for example, that the receiving time of the real-time data reaches the second maximum visible time, maxTime2, or that the amount of the real-time data reaches the second maximum document number, maxDoc2.
It is to be understood that the first writing condition and the second writing condition are independent of each other, and this embodiment is not particularly limited thereto.
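The two writing conditions above can be sketched as two independent threshold checks. The sketch assumes that "the receiving time reaches the maximum visible time" means the oldest not-yet-committed record has waited at least that long. The class names `WriteCondition` and `CommitTracker` are hypothetical, and the time is passed in explicitly to keep the example deterministic.

```python
from typing import List


class WriteCondition:
    """Met when the oldest pending record is old enough or enough records are pending."""

    def __init__(self, max_time: float, max_docs: int) -> None:
        self.max_time = max_time  # seconds, e.g. maxTime1 or maxTime2
        self.max_docs = max_docs  # record count, e.g. maxDoc1 or maxDoc2

    def is_met(self, pending: List[float], now: float) -> bool:
        if not pending:
            return False
        return now - pending[0] >= self.max_time or len(pending) >= self.max_docs


class CommitTracker:
    """Tracks pending records separately for the soft and the hard commit."""

    def __init__(self, soft: WriteCondition, hard: WriteCondition) -> None:
        self.soft, self.hard = soft, hard
        self._soft_pending: List[float] = []  # receive times awaiting soft commit
        self._hard_pending: List[float] = []  # receive times awaiting hard commit

    def receive(self, now: float) -> None:
        self._soft_pending.append(now)
        self._hard_pending.append(now)

    def due_commits(self, now: float) -> List[str]:
        """Return the commits due at time `now` and reset their pending lists."""
        due = []
        if self.soft.is_met(self._soft_pending, now):
            due.append("soft")  # index would go to fast storage (memory)
            self._soft_pending.clear()
        if self.hard.is_met(self._hard_pending, now):
            due.append("hard")  # index would go to slow storage (disk)
            self._hard_pending.clear()
        return due
```

In this sketch a non-empty result from `due_commits` is also the point at which a new query engine would be started, since that step is triggered when at least one of the two conditions is met.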
Therefore, the instant processing system can write the real-time data into the instant processing system in a full-text index file form, so that the instant query of the real-time data is ensured.
In this way, an application system can send a data query request by calling the query API of the instant processing system, and the data query request, which includes the query condition, is received by the request processing component of the instant processing system. The request processing component sends the data query request to the merge node, and the merge node is responsible for processing the data query request.
In particular, the merge node may distribute the data query request to the at least one local node. It can be understood that, if the merge node is a pre-configured fixed node, the merge node may specifically distribute the data query request to the at least one local node, and the merge node does not perform subsequent operations on the data query request any more; if the merge node is a local node randomly selected by the request processing component or a local node selected by the request processing component according to the election policy, the merge node may specifically distribute the data query request to the at least one local node, and at the same time, the merge node continues to perform subsequent operations on the data query request.
After receiving the data query request, each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively perform query, that is, execute a calculation operation corresponding to a query condition, to obtain a query result, and return the query result to the merge node, and then merge the query results by the merge node, to obtain a final query result, and return the final query result to the application system.
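The distribute-then-merge flow described above can be sketched as follows. Threads stand in for the network calls to the local nodes, each shard is a plain list of records, and the grouped count is only one possible calculation operation; the record layout and the function names `local_query` and `merged_query` are assumptions for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]


def local_query(shard: List[Record], condition: Callable[[Record], bool], group_by: str) -> Counter:
    """Runs on one node: count the matching records of its shard, grouped by a field."""
    return Counter(rec[group_by] for rec in shard if condition(rec))


def merged_query(shards: List[List[Record]], condition: Callable[[Record], bool], group_by: str) -> Counter:
    """Runs on the merge node: distribute the query, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: local_query(s, condition, group_by), shards))
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total
```

Because counts are additive, the merge step is a plain sum; other calculation operations (averages or ratios, for instance) would need the nodes to return enough partial state for the merge node to combine them correctly.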
At this point, the instant processing system finishes one flexible calculation or query aiming at the real-time data.
It should be noted that, in this embodiment, the merge node and the at least one local node form a distributed cloud architecture. Specifically, centralized information configuration and management can be performed through a configuration management center, such as ZooKeeper, and automatic fault tolerance and automatic load balancing of requests can be realized. For example, each node, i.e., the merge node and each of the at least one local node, may include an active node and at least one standby node. If the active node fails and becomes unavailable, one standby node can take over as the active node and continue providing service, thereby achieving automatic fault tolerance. Each standby node can also act as an active node according to a balancing policy and provide service, thereby achieving automatic load balancing of requests.
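The active/standby fault tolerance described above can be sketched without any coordination service. The sketch does not use the ZooKeeper API; failure detection is reduced to explicit `mark_failed` calls, and the class name `ReplicatedNode` and the priority-order promotion rule are assumptions for illustration.

```python
from typing import List, Set


class ReplicatedNode:
    """A logical node backed by one active replica and standby replicas."""

    def __init__(self, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas = list(replicas)  # list order gives promotion priority
        self._failed: Set[str] = set()

    @property
    def active(self) -> str:
        """The first replica in priority order that is not marked as failed."""
        for replica in self._replicas:
            if replica not in self._failed:
                return replica
        raise RuntimeError("no replica available")

    def mark_failed(self, replica: str) -> None:
        self._failed.add(replica)

    def mark_recovered(self, replica: str) -> None:
        self._failed.discard(replica)
```

One consequence of the priority rule chosen here is that a recovered replica with higher priority becomes active again; a deployment that prefers to avoid such switch-backs would keep the promoted standby as the active node instead.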
With the development of the information society, more and more information is digitalized, especially with the development of the internet, data is explosively increased, and a large amount of real-time data appears, which can be called as mass real-time data. Due to the adoption of the distributed cloud architecture formed by the merging nodes and the at least one local node and the combination of the technical scheme provided by the invention, the massive real-time data can be well subjected to instant query or calculation.
In this embodiment, the merge node obtains the real-time data sent by the real-time processing system and then, according to the distribution policy, determines one node among the at least one local node, or among the merge node and the at least one local node, as the processing node, so that the processing node can write the real-time data into the instant processing system and the instant processing system can flexibly calculate or query the real-time data.
In addition, by adopting the technical solution provided by the present application, the instant processing system can flexibly calculate or query the real-time data, so that the flexibility of instant queries on real-time data can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Fig. 2 is a schematic structural diagram of an instant processing system according to another embodiment of the present application. As shown in Fig. 2, the instant processing system of the present embodiment may include a merge node 21 and at least one local node 22_1, 22_2, …, 22_n, where n is an integer greater than or equal to 1. The merge node 21 and the local nodes 22_1, 22_2, …, 22_n are connected through a wired or wireless network. The merge node 21 and the at least one local node 22_1, 22_2, …, 22_n form a cluster, which may consist of one or more servers. Each node occupies a part of the processing resources of the cluster; that is, each node corresponds to one server, and a plurality of nodes may be deployed on each server.
Wherein,
the merge node 21 is configured to obtain real-time data sent by a real-time processing system, and to determine a processing node according to the distribution policy; wherein the processing node comprises the merge node 21 or one local node 22_i of the at least one local node 22_1, 22_2, …, 22_n, where i is an integer greater than 0 and less than or equal to n.
And the processing node is used for writing the real-time data into the instant processing system.
Optionally, in a possible implementation manner of this embodiment, the real-time data received by the merge node 21 may be stream data (StreamingData) generated by a real-time data source directly received by a real-time processing system, for example, stream data generated by an application system such as a network monitoring system, a financial analysis system, a traffic flow prediction system, a Web, and the like, or may also be a calculation result obtained by the real-time processing system by performing calculation, such as summation and the like, according to a fixed rule on the stream data generated by the received real-time data source, which is not particularly limited in this embodiment.
Specifically, the real-time processing system 25 may, by calling a write Application Programming Interface (API) 23 of the instant processing system, send a data write request to a request processing component of the instant processing system, where the data write request includes the real-time data. The request processing component sends the data write request to the merge node 21, and the merge node 21 is responsible for processing the data write request.
The merge node 21 may be a pre-configured fixed node; or it may be a local node 22_j randomly selected by the request processing component; or it may be a local node 22_j selected by the request processing component according to an election policy, where j is an integer greater than 0 and less than or equal to n. This embodiment does not particularly limit this.
Optionally, in a possible implementation manner of this embodiment, if the merge node 21 is a pre-configured fixed node, the merge node 21 may determine, according to a distribution policy, one node from among the at least one local node 22_1, 22_2, ..., 22_n as the processing node. In this case, the processing node is one local node 22_i of the at least one local node 22_1, 22_2, ..., 22_n, where i is an integer greater than 0 and less than or equal to n.
Optionally, in a possible implementation manner of this embodiment, if the merge node 21 is a local node 22_j randomly selected by the request processing component, or a local node 22_j selected by the request processing component according to the election policy, the merge node 21 may determine, according to the distribution policy, one node from among the merge node 21 and the at least one local node 22_1, 22_2, ..., 22_n as the processing node. In this case, the processing node may be the merge node 21 or one local node 22_i of the at least one local node 22_1, 22_2, ..., 22_n.
Specifically, the distribution policy may include, but is not limited to, a hash operation policy and a polling policy, which is not particularly limited in this embodiment.
For example, the merge node 21 may perform a hash operation on identification Information (ID) of the real-time data to determine a corresponding node as a processing node; or, for another example, the merge node 21 may further adopt a polling policy to sequentially select one node as a processing node; this embodiment is not particularly limited.
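The two distribution policies named above can be illustrated with a minimal Python sketch. This is not the patented implementation: the node names, the use of MD5, and the helper names `pick_by_hash` and `RoundRobin` are assumptions made only for illustration.

```python
import hashlib
from itertools import count
from typing import Sequence


def pick_by_hash(data_id: str, nodes: Sequence[str]) -> str:
    """Hash policy: the same data ID always maps to the same node."""
    digest = hashlib.md5(data_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class RoundRobin:
    """Polling policy: candidate nodes are selected one after another in turn."""

    def __init__(self, nodes: Sequence[str]):
        self._nodes = list(nodes)
        self._counter = count()

    def pick(self) -> str:
        return self._nodes[next(self._counter) % len(self._nodes)]


# Hypothetical candidate list; with an elected merge node, the merge node
# itself would also appear in this list.
nodes = ["local-1", "local-2", "local-3"]
```

With the hash policy, repeated writes for the same ID land on the same node, while the polling policy spreads consecutive requests evenly regardless of their content.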
In particular, the merge node 21 may deliver the data write request to the processing node. It is understood that, if the merge node 21 is a pre-configured fixed node, the merge node 21 may forward the data write request to the processing node; if the merge node 21 is a local node 22_j randomly selected by the request processing component, or a local node 22_j selected by the request processing component according to the election policy, the merge node 21 need not forward the data write request and may directly perform subsequent operations on it.
After receiving the data write request, the processing node may be specifically configured to create a full-text index file of the real-time data; and writing the full-text index file into the instant processing system. Specifically, the detailed description of the method for creating the full-text index file can refer to the related contents in the prior art, and is not repeated here.
Therefore, the processing node creates the full-text index file of the real-time data and writes the full-text index file into the instant processing system, so that the full-text index file of the real-time data can be used for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
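The description leaves the construction of the full-text index file to the prior art. As one common approach, an inverted index maps each term to the documents that contain it, so a query touches only the entries for its terms instead of scanning all the data. The following Python sketch is an in-memory illustration under that assumption; the function names and sample records are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term of the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical real-time records keyed by document ID.
docs = {1: "order paid in Hangzhou", 2: "order cancelled", 3: "refund paid"}
index = build_index(docs)
```

Here `search(index, "order paid")` only intersects the posting sets for "order" and "paid"; it never reads the text of the documents themselves.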
The term "real-time processing" as used herein, also referred to as streaming processing, that is, process-oriented query and calculation, refers to processing performed continuously while the application is running. For example, during the Double 11 campaign of a large e-commerce website, the current transaction amount is calculated at any time. At Alibaba, this application is called the trade live room. Data streams are generated continuously over time, and a calculation is performed once per time interval (which can be set to the order of seconds, minutes, and so on, according to the application).
The term "instant processing" as used herein refers to processing performed within a short time after a request is made to the application. For example, the user makes a request, the application completes its internal computation in a short time, and then returns the result.
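The per-interval calculation described for real-time processing can be sketched as a fixed-window sum. This Python example is only an illustration: the `(timestamp, amount)` event layout and the name `windowed_totals` are assumptions, not part of the application.

```python
from collections import defaultdict


def windowed_totals(events, interval_seconds):
    """Sum amounts per fixed time window, keyed by the window start time."""
    totals = defaultdict(float)
    for timestamp, amount in events:
        window_start = int(timestamp // interval_seconds) * interval_seconds
        totals[window_start] += amount
    return dict(sorted(totals.items()))


# Hypothetical transaction events: (seconds since start, amount).
events = [(0.5, 10.0), (3.2, 5.0), (5.1, 7.5), (9.9, 2.5), (10.0, 1.0)]
totals = windowed_totals(events, 5)
# totals == {0: 15.0, 5: 10.0, 10: 1.0}
```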
After the processing node creates the full-text index file of the real-time data, the full-text index file can be immediately written into the instant processing system, or the full-text index file can not be immediately written into the instant processing system, but the full-text index file is written into the instant processing system after the real-time data meets certain conditions. Therefore, the system overhead of the instant processing system can be effectively reduced.
For example, the processing node may selectively execute the operation of writing the full-text index file into the instant processing system by monitoring a state condition of the real-time data.
For example, if the status condition satisfies a first writing condition, for example, the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or, for example, the amount of the real-time data reaches a first maximum document number, maxDoc1, etc., the processing node may write the full-text index file into a fast storage device, for example, a memory, of the instant processing system. The memory of the instant processing system may be a memory of a computer, or may also be a running memory of a mobile phone, i.e., a system memory, such as a Random Access Memory (RAM), which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a soft commit operation.
For another example, if the state condition satisfies a second writing condition, for example, the receiving time of the real-time data reaches a second maximum visible time, maxTime2, or the amount of the real-time data reaches a second maximum document number, maxDoc2, the processing node may write the full-text index file into a slow storage device of the instant processing system, for example, a hard disk, or the non-operating memory of a mobile phone, that is, physical storage such as a Read-Only Memory (ROM) or a memory card, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the state condition of the real-time data satisfies the second writing condition may also be referred to as a hard commit operation.
Or, for another example, if the state condition satisfies at least one of the first writing condition and the second writing condition, the processing node may further start a new query engine to perform instant query on the real-time data. Here, the first writing condition may be, for example, that the receiving time of the real-time data reaches the first maximum visible time, maxTime1, or that the amount of the real-time data reaches the first maximum document number, maxDoc1; the second writing condition may be, for example, that the receiving time of the real-time data reaches the second maximum visible time, maxTime2, or that the amount of the real-time data reaches the second maximum document number, maxDoc2.
It is to be understood that the first writing condition and the second writing condition are independent of each other, and this embodiment is not particularly limited thereto.
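The write conditions above can be summarized in a small decision function. The Python sketch below assumes that "receiving time reaches the maximum visible time" means the time elapsed since the data was received; the class name `CommitPolicy`, the action labels, and the threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CommitPolicy:
    max_time1: float  # maxTime1: seconds before a soft commit (fast storage)
    max_doc1: int     # maxDoc1: pending documents before a soft commit
    max_time2: float  # maxTime2: seconds before a hard commit (slow storage)
    max_doc2: int     # maxDoc2: pending documents before a hard commit

    def decide(self, age_seconds: float, pending_docs: int) -> set:
        """Return the actions triggered by the current state condition."""
        actions = set()
        # First writing condition: write the index to fast storage.
        if age_seconds >= self.max_time1 or pending_docs >= self.max_doc1:
            actions.add("soft_commit")
        # Second writing condition, evaluated independently of the first.
        if age_seconds >= self.max_time2 or pending_docs >= self.max_doc2:
            actions.add("hard_commit")
        # Either condition also starts a new query engine.
        if actions:
            actions.add("start_query_engine")
        return actions


# Assumed thresholds, chosen only for the example.
policy = CommitPolicy(max_time1=1.0, max_doc1=1000, max_time2=60.0, max_doc2=100000)
```

For instance, `policy.decide(1.5, 10)` yields a soft commit plus a new query engine, while `policy.decide(0.2, 10)` yields no action, which is how deferring the write reduces overhead.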
Therefore, the instant processing system can write the real-time data into the instant processing system in a full-text index file form, so that the instant query of the real-time data is ensured.
Thus, the application system program 26 may, by calling the query API 24 of the instant processing system, send a data query request to a request processing component of the instant processing system, the data query request including a query condition. The request processing component sends the data query request to the merge node 21, and the merge node 21 is responsible for processing the data query request.
In particular, the merge node 21 may distribute the data query request to the at least one local node 22_1, 22_2, ..., 22_n. It is understood that, if the merge node 21 is a pre-configured fixed node, the merge node 21 may distribute the data query request to the at least one local node 22_1, 22_2, ..., 22_n, and the merge node 21 itself does not execute the query; if the merge node 21 is a local node 22_j randomly selected by the request processing component, or a local node 22_j selected by the request processing component according to the election policy, the merge node 21 may distribute the data query request to the at least one local node 22_1, 22_2, ..., 22_n and, at the same time, continue to perform subsequent operations on the data query request itself.
After receiving the data query request, each local node of the at least one local node 22_1, 22_2, ..., 22_n, or each such local node together with the merge node 21, performs the query, that is, executes the calculation operation corresponding to the query condition to obtain a query result, and returns the query result to the merge node 21. The merge node 21 then merges the query results to obtain a final query result and returns the final query result to the application system.
At this point, the instant processing system finishes one flexible calculation or query aiming at the real-time data.
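The query flow described above (distribute the request, compute partial results per node, merge them at the merge node) follows a scatter-gather pattern. The Python sketch below simulates it in a single process with threads standing in for nodes; the row layout, the count-and-sum aggregation, and all function names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def local_query(shard, predicate):
    """Partial result computed by one node over its own data."""
    matched = [row["amount"] for row in shard if predicate(row)]
    return {"count": len(matched), "total": sum(matched)}


def merge_results(partials):
    """Merge step: combine the partial results into the final result."""
    return {
        "count": sum(p["count"] for p in partials),
        "total": sum(p["total"] for p in partials),
    }


def distributed_query(shards, predicate):
    """Scatter the query to every shard, then gather and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: local_query(s, predicate), shards))
    return merge_results(partials)


# Hypothetical data held by three nodes.
shards = [
    [{"city": "Hangzhou", "amount": 30}, {"city": "Beijing", "amount": 5}],
    [{"city": "Hangzhou", "amount": 12}],
    [],
]
result = distributed_query(shards, lambda row: row["city"] == "Hangzhou")
# result == {"count": 2, "total": 42}
```

Counts and sums merge by simple addition; other aggregations (for example averages) would need the partial results to carry enough information to be combined correctly.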
It should be noted that, in this embodiment, the merge node and the at least one local node form a distributed cloud architecture. Specifically, centralized information configuration and management can be performed through a configuration management center, such as ZooKeeper, and automatic fault tolerance and automatic load balancing of requests can be realized. For example, each node, that is, the merge node 21 and each of the at least one local node 22_1, 22_2, ..., 22_n, may comprise an active node and at least one standby node. If the active node fails and becomes unavailable, one standby node can take over as the active node and continue providing service, realizing automatic fault tolerance. Each standby node can also act as an active node according to a balancing policy and provide service, realizing automatic load balancing of requests.
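The active/standby takeover can be sketched as follows. This Python example is an in-memory stand-in only: it makes no ZooKeeper calls, and the class `NodeGroup`, its method, and the replica names are hypothetical. In a real deployment the configuration management center would detect the failure and coordinate the promotion.

```python
class NodeGroup:
    """One logical node backed by an active replica and standby replicas."""

    def __init__(self, active, standbys):
        self.active = active
        self.standbys = list(standbys)

    def report_failure(self, replica):
        """Handle a failed replica and return the replica now serving requests."""
        if replica == self.active:
            if not self.standbys:
                raise RuntimeError("no standby available")
            # Promote the first standby to active (automatic fault tolerance).
            self.active = self.standbys.pop(0)
        elif replica in self.standbys:
            self.standbys.remove(replica)
        return self.active


group = NodeGroup("local-1-a", ["local-1-b", "local-1-c"])
```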
With the development of the information society, more and more information is digitalized, especially with the development of the internet, data is explosively increased, and a large amount of real-time data appears, which can be called as mass real-time data. Due to the adoption of the distributed cloud architecture formed by the merging nodes and at least one local node and the combination of the technical scheme provided by the invention, the massive real-time data can be well processed.
In this embodiment, the real-time data sent by the real-time processing system is obtained through the merge node, and then, according to the distribution policy, a node is determined as the processing node from among the at least one local node, or from among the merge node and the at least one local node, so that the processing node can write the real-time data into the instant processing system, thereby enabling the instant processing system to perform flexible calculation or query on the real-time data.
In addition, by adopting the technical scheme provided by the application, the instant processing system can flexibly calculate or query the real-time data, so that the flexibility of instant query of the real-time data can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
Fig. 3 is a schematic structural diagram of a real-time data processing system according to another embodiment of the present application, as shown in fig. 3. The real-time data processing system of the present embodiment may include a real-time processing system 31 and an instant processing system 32 provided in the embodiment corresponding to fig. 2. Wherein,
the real-time processing system 31 is configured to send the real-time data to the instant processing system 32.
For a detailed description of the instant processing system 32, reference may be made to relevant contents in the embodiment corresponding to fig. 2, which is not described herein again.
In this embodiment, the real-time data sent by the real-time processing system is obtained through the merge node, and then, according to the distribution policy, a node is determined as the processing node from among the at least one local node, or from among the merge node and the at least one local node, so that the processing node can write the real-time data into the instant processing system, thereby enabling the instant processing system to perform flexible calculation or query on the real-time data.
In addition, by adopting the technical scheme provided by the application, the instant processing system can flexibly calculate or query the real-time data, so that the flexibility of instant query of the real-time data can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or page components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (19)

1. A method for processing real-time data, which is applied to an instant processing system, wherein the instant processing system comprises a merge node and at least one local node, and the method comprises the following steps:
the merging node obtains real-time data sent by a real-time processing system;
the merging node determines a node as a processing node according to a distribution strategy, and the node is used for writing the real-time data into the instant processing system; wherein,
the processing node comprises the merge node or one of the at least one local node.
2. The method of claim 1, wherein after the merge node determines a node as the processing node according to the distribution policy, the method further comprises:
the processing node writes the real-time data into the instant processing system.
3. The method of claim 2, wherein the processing node writing the real-time data into the instant processing system comprises:
the processing node creates a full-text index file of the real-time data;
and the processing node writes the full-text index file into the instant processing system.
4. The method of claim 3, wherein the processing node writing the full-text index file into the instant processing system comprises:
the processing node monitors the state condition of the real-time data;
and if the state condition meets a first writing condition, the processing node writes the full-text index file into a fast storage device of the instant processing system.
5. The method of claim 4, wherein if the state condition meets the first writing condition, the processing node writing the full-text index file into the fast storage device of the instant processing system comprises:
if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into the fast storage device of the instant processing system.
6. The method of claim 3, wherein the processing node writing the full-text index file into the instant processing system further comprises:
if the state condition meets a second writing condition, the processing node writes the full-text index file into a slow storage device of the instant processing system.
7. The method of claim 6, wherein if the state condition meets the second writing condition, the processing node writing the full-text index file into the slow storage device of the instant processing system comprises:
if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document number, the processing node writes the full-text index file into the slow storage device of the instant processing system.
8. The method of any of claims 4 to 7, wherein after the processing node monitors the status condition of the real-time data, the method further comprises:
and if the state condition meets at least one of the first writing condition and the second writing condition, starting a new query engine by the processing node for carrying out instant query on the real-time data.
9. The method of claim 8, wherein the processing node, after initiating a new query engine for performing the instant query of the real-time data, further comprises:
the merging node receives a data query request, wherein the data query request comprises query conditions;
the merging node distributes a data query request to the at least one local node;
each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result, and return the query result to the merge node;
and the merging node merges the query results to obtain a final query result.
10. An instant processing system comprising a merge node and at least one local node; wherein,
the merging node is used for acquiring real-time data sent by the real-time processing system; determining a processing node according to the distribution strategy; wherein the processing node comprises one of the merge node or the at least one local node;
and the processing node is used for writing the real-time data into the instant processing system.
11. The instant processing system of claim 10, wherein the merge node and the at least one local node form a distributed cloud architecture.
12. The instant processing system of claim 10, wherein the processing node is specifically configured to:
create a full-text index file of the real-time data;
and write the full-text index file into the instant processing system.
13. The instant processing system of claim 12, wherein the processing node is specifically configured to:
monitor the state condition of the real-time data;
and if the state condition meets a first writing condition, write the full-text index file into a fast storage device of the instant processing system.
14. The instant processing system of claim 13, wherein the processing node is specifically configured to:
if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, write the full-text index file into the fast storage device of the instant processing system.
15. The instant processing system of claim 12, wherein the processing node is specifically configured to:
if the state condition meets a second writing condition, write the full-text index file into a slow storage device of the instant processing system.
16. The instant processing system of claim 15, wherein the processing node is specifically configured to:
if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document number, write the full-text index file into the slow storage device of the instant processing system.
17. The instant processing system of any of claims 12 to 16, wherein the processing node is further configured to:
if the state condition meets at least one of the first writing condition and the second writing condition, start a new query engine for performing instant query on the real-time data.
18. The instant processing system of claim 17, wherein the merge node is further configured to:
receive a data query request, wherein the data query request comprises a query condition;
distribute the data query request to the at least one local node, so that each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively executes the calculation operation corresponding to the query condition to obtain a query result;
obtain the query results;
and merge the query results to obtain a final query result.
19. A real-time data processing system comprising a real-time processing system and the instant processing system of any one of claims 10 to 18; wherein,
and the real-time processing system is used for sending the real-time data to the instant processing system.
CN201410229319.1A 2014-05-28 2014-05-28 Real-time data processing method and system, and instant processing system Active CN105335362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410229319.1A CN105335362B (en) 2014-05-28 2014-05-28 Real-time data processing method and system, and instant processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410229319.1A CN105335362B (en) 2014-05-28 2014-05-28 Real-time data processing method and system, and instant processing system

Publications (2)

Publication Number Publication Date
CN105335362A true CN105335362A (en) 2016-02-17
CN105335362B CN105335362B (en) 2019-06-11

Family

ID=55285906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410229319.1A Active CN105335362B (en) 2014-05-28 2014-05-28 Real-time data processing method and system, and instant processing system

Country Status (1)

Country Link
CN (1) CN105335362B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107391541A (en) * 2017-05-16 2017-11-24 阿里巴巴集团控股有限公司 A kind of real time data merging method and device
CN109074377A (en) * 2016-03-29 2018-12-21 亚马逊科技公司 Managed function for real-time processing data stream executes
CN111310170A (en) * 2020-01-16 2020-06-19 深信服科技股份有限公司 Anti-leakage method and device for application program and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090313400A1 (en) * 2006-06-13 2009-12-17 International Business Machines Corp. Dynamic stabilization for a stream processing system
CN102118405A (en) * 2009-12-31 2011-07-06 比亚迪股份有限公司 P2P (Peer-to-Peer) network system applied to real-time video data transmission
CN102880475A (en) * 2012-10-23 2013-01-16 上海普元信息技术股份有限公司 Real-time event handling system and method based on cloud computing in computer software system
CN103152287A (en) * 2013-03-27 2013-06-12 恒生电子股份有限公司 Method and device for reliably receiving real-time data
CN103338261A (en) * 2013-07-04 2013-10-02 北京泰乐德信息技术有限公司 Storage and processing method and system of rail transit monitoring data
CN103560943A (en) * 2013-10-31 2014-02-05 北京邮电大学 Network analytic system and method supporting real-time mass data processing

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074377A (en) * 2016-03-29 2018-12-21 亚马逊科技公司 Managed function for real-time processing data stream executes
CN109074377B (en) * 2016-03-29 2024-04-16 亚马逊科技公司 Managed function execution for real-time processing of data streams
CN107391541A (en) * 2017-05-16 2017-11-24 阿里巴巴集团控股有限公司 A kind of real time data merging method and device
CN107391541B (en) * 2017-05-16 2020-10-20 创新先进技术有限公司 Real-time data merging method and device
CN111310170A (en) * 2020-01-16 2020-06-19 深信服科技股份有限公司 Anti-leakage method and device for application program and computer readable storage medium

Also Published As

Publication number Publication date
CN105335362B (en) 2019-06-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20211125

Address after: No. 699, Wangshang Road, Binjiang District, Hangzhou, Zhejiang

Patentee after: Alibaba (China) Network Technology Co.,Ltd.

Address before: Box 847, four, Grand Cayman capital, Cayman Islands, UK

Patentee before: ALIBABA GROUP HOLDING Ltd.