CN108197233A - Data management method, middleware, and data management system
- Publication number
- CN108197233A; CN201711473196.6A
- Authority
- CN
- China
- Prior art keywords
- data
- subject information
- gathered
- middleware
- gathered data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/116—Details of conversion of file system types or formats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application provides a data management method, middleware, and a data management system. The method and middleware obtain the collected data of a data acquisition side, generate topic information for the collected data, encapsulate the collected data together with the topic information, and send the collected data with its encapsulated topic information to a data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information. The collected data of the data acquisition side is thereby written to the data storage side. With this scheme, middleware can write the data collected by a data acquisition side such as Flume OG to a data storage side such as Kafka, which solves the prior-art problem that early Flume versions have no Kafka plug-in and therefore cannot write the logs they collect to Kafka.
Description
Technical field
The invention belongs to the field of middleware technology, and in particular relates to a data management method, middleware, and a data management system.
Background
Flume OG is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transmitting massive volumes of logs.
Flume supports customizing various data senders in a log system in order to collect data. The collected data needs to be written to a storage system such as Kafka for real-time computation and data cleaning. However, early Flume versions such as Flume OG have no Kafka plug-in and cannot write the collected logs to storage systems such as Kafka. Middleware therefore needs to be developed to receive the Flume logs and write them to a storage system such as Kafka.
Summary of the invention
In view of this, the purpose of the present invention is to provide a data management method, middleware, and a data management system, to solve the problem that early Flume versions such as Flume OG have no Kafka plug-in and cannot write the collected logs to Kafka.
To this end, the present invention discloses the following technical solutions:
A data management method, applied to middleware, the method comprising:
Obtaining the collected data of a data acquisition side;
Generating topic information for the collected data, and encapsulating the collected data and the topic information to obtain collected data that includes the topic information;
Sending the collected data that includes the topic information to a data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information of the collected data.
In the above method, preferably, the data acquisition side is a Flume log system, and obtaining the collected data of the data acquisition side comprises:
Receiving, based on a predetermined protocol, each log record collected by the Flume log system;
Buffering each received log record in a pre-created blocking queue.
In the above method, preferably, the data storage side is a Kafka distributed publish-subscribe messaging system, and sending the collected data that includes the topic information to the data storage side comprises:
Sending, based on a pre-created thread pool, the log data that includes the topic information to the Kafka distributed publish-subscribe messaging system.
In the above method, preferably, before sending the collected data that includes the topic information to the data storage side, the method further comprises:
Obtaining blacklist and whitelist topic information;
When the topic information of the collected data is blacklisted or is not whitelisted, filtering out the collected data;
When the topic information of the collected data is not blacklisted or is whitelisted, triggering the step of sending the collected data that includes the topic information to the storage location corresponding to the topic on the data storage side.
In the above method, preferably, the method further comprises:
Monitoring the data traffic of the middleware, and raising an alert when the data traffic is abnormal.
A middleware, comprising:
A data obtaining unit, configured to obtain the collected data of a data acquisition side;
A topic generation unit, configured to generate topic information for the collected data, and to encapsulate the collected data and the topic information to obtain collected data that includes the topic information;
A data sending unit, configured to send the collected data that includes the topic information to a data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information of the collected data.
In the above middleware, preferably, the data acquisition side is a Flume log system, and the data storage side is a Kafka distributed publish-subscribe messaging system;
The data obtaining unit is then specifically configured to:
Receive, based on a predetermined protocol, each log record collected by the Flume log system, and buffer each received log record in a pre-created blocking queue;
Correspondingly, the data sending unit is specifically configured to:
Send, based on a pre-created thread pool, the log data that includes the topic information to the Kafka distributed publish-subscribe messaging system.
In the above middleware, preferably, the middleware further comprises:
A blacklist and whitelist management unit, configured to:
Obtain blacklist and whitelist topic information; when the topic information of the collected data is blacklisted or is not whitelisted, filter out the collected data; when the topic information of the collected data is not blacklisted or is whitelisted, trigger the data sending unit.
In the above middleware, preferably, the middleware further comprises:
A traffic monitoring unit, configured to monitor the data traffic of the middleware and to raise an alert when the data traffic is abnormal.
A data management system, comprising a server cluster, wherein each server in the server cluster is deployed with one middleware as described above.
In the above system, preferably, the system further comprises:
A load balancing management device, configured to monitor and manage the load of each server in the server cluster so that the load is balanced across the servers in the cluster.
As can be seen from the above scheme, this application provides a data management method, middleware, and a data management system. The method and middleware obtain the collected data of a data acquisition side, generate topic information for the collected data, encapsulate the collected data together with the topic information, and send the collected data with its encapsulated topic information to a data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information. The collected data of the data acquisition side is thereby written to the data storage side. With this scheme, middleware can write the data collected by a data acquisition side such as Flume OG to a data storage side such as Kafka, which solves the prior-art problem that early Flume versions have no Kafka plug-in and therefore cannot write the logs they collect to Kafka.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a flowchart of a data management method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the working principle of the Flume log system;
Fig. 3 is a flowchart of another data management method provided by an embodiment of the present invention;
Fig. 4 is a flowchart of yet another data management method provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a middleware provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another middleware provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of yet another middleware provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a data management system provided by an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of another data management system provided by an embodiment of the present invention.
Detailed description
For ease of reference and understanding, the technical terms and abbreviations used below are explained as follows:
Flume: Flume is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transmitting massive volumes of logs. Flume supports customizing various data senders in a log system in order to collect data; it also provides the ability to perform simple processing on the data and to write it to various (customizable) data receivers.
Kafka: Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all of the action stream data of a consumer-scale website. Such actions (web page browsing, searches, and other user actions) are a key factor in many social functions on the modern web. Because of throughput requirements, this data is usually handled through log processing and log aggregation. Kafka is a feasible solution for log data that feeds offline analysis systems such as Hadoop but that must also be processed in real time. The purpose of Kafka is to unify online and offline message processing through Hadoop's parallel loading mechanism, and to provide real-time consumption across a cluster of machines.
Avro: Avro is a data serialization system. It provides rich data structure types, a fast and compressible binary data format, a container file for storing persistent data, remote procedure calls (RPC), and simple integration with dynamic languages. When Avro is combined with a dynamic language, code generation is needed neither to read and write data files nor to use the RPC protocol; code generation is an optional optimization that is only worth implementing in statically typed languages.
LVS: Linux Virtual Server. LVS uses clustering technology and the Linux operating system to implement a high-performance, highly available server with good scalability, reliability, and manageability.
ZooKeeper: ZooKeeper is an open-source distributed coordination service for distributed applications. It is an open-source implementation of Google's Chubby and an important component of Hadoop and HBase. It is software that provides consistency services for distributed applications; its functions include configuration maintenance, naming service, distributed synchronization, and group services.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the application first discloses a data management method. The method can be applied in middleware to solve the problem that early Flume versions such as Flume OG have no Kafka plug-in and therefore cannot write the collected logs to Kafka. Referring to the flowchart of the data management method shown in Fig. 1, the method includes:
Step 101: obtain the collected data of the data acquisition side.
The data management method of this embodiment can be implemented in the form of middleware, which writes the data collected by the data acquisition side to the data storage side.
The data acquisition side may be, but is not limited to, a Flume log system, for example Flume OG. Flume OG is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transmitting massive volumes of logs. As shown in Fig. 2, the official Flume logical architecture includes an agent, a collector, and storage. The agent is where data streams are generated in Flume and is used to collect data (for example, the log data of each application system). The collector aggregates the data of multiple agents and loads it into storage. Storage is the storage system, which may be an ordinary file, a Kafka distributed publish-subscribe messaging system, HDFS (Hadoop Distributed File System), or the like.
The data storage side is this storage system and may be, as described above, an ordinary file, a Kafka distributed publish-subscribe messaging system, HDFS, or the like; this embodiment does not limit it.
This embodiment next describes the scheme of the application using the example in which the data acquisition side is a Flume log system and the data storage side is a Kafka distributed publish-subscribe messaging system.
When the data acquisition side is a Flume log system, the collected data obtained in this step is the log data of the Flume log system, specifically the log data that the Flume log system collects from various application systems.
When the scheme is implemented as middleware, a previously agreed data transmission protocol, such as the Avro protocol or HTTP (HyperText Transfer Protocol), can be used to transmit log data from the Flume log system to the middleware. Taking the Avro protocol as an example, an AvroSource interface that follows the Avro protocol can be developed in the middleware in advance according to the Flume API (Application Programming Interface), together with an Avro service. The interface information of the completed AvroSource interface can then be configured in Flume, and the Avro service in the middleware started. After that, the Avro service obtains log data from the Flume log system through the AvroSource interface using the Avro protocol.
To handle the log data obtained by the Avro service, a LogHandler (log management) service can be developed and started in the middleware in advance, and this service creates a blocking queue. The LogHandler service can then receive the log data from the Avro service and buffer it in the blocking queue to await subsequent processing.
Step 102: generate topic information for the collected data, and encapsulate the collected data and the topic information to obtain collected data that includes the topic information.
The Kafka distributed publish-subscribe messaging system usually produces data by topic (here, data production means Flume sending logs to Kafka) and provides a separate data channel for each topic, so that log messages of different topics can be received or produced over different data channels. Therefore, in order to write Flume log data to Kafka and connect to Kafka's different data channels, in this embodiment, after the middleware obtains each log record from the Flume log system, it generates the topic information (topic) of that log record and encapsulates the generated topic information together with the log record.
The topic information of the log data may be a topic assigned according to the category to which the log data belongs, such as military, entertainment, or education; or it may be a topic assigned according to the source of the log data, such as application system 1 or application system 2. This embodiment does not limit it. In practice, the log topics generated in the middleware should follow the same topic division that a data storage side such as Kafka uses when storing data by topic.
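As a hedged illustration of topic generation and encapsulation, the Python sketch below derives a topic from the source of a log line and wraps both in one record. The record type `TopicRecord`, the functions `generate_topic` and `encapsulate`, the `logs.` prefix, and the assumption that the first whitespace-separated field names the source application are all hypothetical; the patent does not specify a log format or naming scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicRecord:
    topic: str    # generated topic information
    payload: str  # original log record, unchanged


def generate_topic(log_line):
    # Assumption: the first whitespace-separated field names the source application.
    source = log_line.split(maxsplit=1)[0] if log_line.strip() else "unknown"
    return f"logs.{source}"


def encapsulate(log_line):
    # Package the log record together with its topic information.
    return TopicRecord(topic=generate_topic(log_line), payload=log_line)


record = encapsulate("app1 user login ok")
```

Dividing by category instead of by source would only change `generate_topic`; the encapsulation step stays the same.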
Step 103: send the collected data that includes the topic information to the data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information of the collected data.
After the topic information of the log data has been generated and encapsulated together with the log data, the log data can be sent to a data storage side such as Kafka over the data channel corresponding to the encapsulated topic information, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information of the collected data.
In a specific implementation, the Kafka configuration can be read in advance and a thread pool initialized. Each thread in the pool works independently and creates a HashMap and a List for caching logs. Each time a thread takes a log record from the blocking queue, it caches the record in the List and stores the List in the HashMap with the topic as the key. When the number of logs stored in a List reaches a predetermined threshold, the logs in the List are produced (transmitted) to Kafka in one batch over the data channel corresponding to the topic, and the List is cleared. This approach effectively reduces the number of connections between Kafka and the middleware, thereby reducing the load on Kafka.
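The per-thread batching described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's code: `FakeProducer` is a stand-in that merely records batches instead of talking to Kafka, `BATCH_SIZE` and `NUM_WORKERS` are hypothetical values, a `dict` of lists stands in for the HashMap and List, and the `None` sentinel with a final flush of leftovers is an addition made so that the example terminates cleanly.

```python
import queue
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

BATCH_SIZE = 3   # hypothetical batch threshold
NUM_WORKERS = 2  # hypothetical thread pool size


class FakeProducer:
    """Stand-in for a Kafka producer; it only records what would be sent."""

    def __init__(self):
        self._lock = Lock()
        self.batches = []  # list of (topic, payloads) tuples

    def send_batch(self, topic, payloads):
        with self._lock:
            self.batches.append((topic, list(payloads)))


def worker(log_queue, producer):
    # Each worker keeps its own topic -> list cache.
    cache = defaultdict(list)
    while True:
        item = log_queue.get()
        if item is None:  # sentinel: stop consuming
            break
        topic, payload = item
        cache[topic].append(payload)
        if len(cache[topic]) >= BATCH_SIZE:
            producer.send_batch(topic, cache[topic])
            cache[topic].clear()
    # Flush whatever is left so no record is lost on shutdown.
    for topic, payloads in cache.items():
        if payloads:
            producer.send_batch(topic, payloads)


log_queue = queue.Queue()
for i in range(7):
    log_queue.put(("logs.app1", f"line {i}"))
for _ in range(NUM_WORKERS):
    log_queue.put(None)  # one sentinel per worker

producer = FakeProducer()
with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
    futures = [pool.submit(worker, log_queue, producer) for _ in range(NUM_WORKERS)]
    for future in futures:
        future.result()  # re-raise any worker exception

sent = sum(len(payloads) for _, payloads in producer.batches)
```

How the seven records are split between the two workers depends on thread scheduling, but every record is sent exactly once and no batch exceeds the threshold.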
The data management method provided in this embodiment obtains the collected data of the data acquisition side, generates topic information for the collected data, encapsulates the collected data together with the topic information, and sends the collected data with its encapsulated topic information to the data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information. The collected data of the data acquisition side is thereby written to the data storage side. With this scheme, middleware can write the data collected by a data acquisition side such as Flume OG to a data storage side such as Kafka, which solves the prior-art problem that early Flume versions have no Kafka plug-in and therefore cannot write the logs they collect to Kafka.
In another embodiment of the application, referring to the flowchart of another data management method shown in Fig. 3, the data management method may further include, before step 103:
Step 104: obtain blacklist and whitelist topic information;
Step 105: when the topic information of the collected data is blacklisted or is not whitelisted, filter out the collected data; when the topic information of the collected data is not blacklisted or is whitelisted, trigger step 103.
This embodiment also provides the middleware with a topic blacklist and whitelist management function. Specifically, the blacklist and whitelist of topics can be maintained according to Kafka's actual data production requirements. The topic blacklist and whitelist record the corresponding whitelist topic information or blacklist topic information: the whitelist topic information contains the topics of the data that Kafka needs, and the blacklist topic information contains the topics of the data that Kafka does not need.
To transmit to Kafka only the log data of the topics it needs, after the middleware obtains log data from the Flume log system and generates its topic, the topic of the log data can be matched against the maintained topic blacklist and whitelist. When the topic of the log data is blacklisted or is not whitelisted, the log data does not belong to a topic that Kafka needs, so it is filtered out. Conversely, if the topic of the log data is not blacklisted or is whitelisted, the log data belongs to a topic that Kafka needs, so it can be written to Kafka over the data channel corresponding to its topic, and Kafka then stores it at the storage location corresponding to that topic. For example, log data whose topic is "education" is written to the storage location in Kafka corresponding to the topic "education".
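The matching rule can be sketched as a small Python predicate. The function name `should_forward` and the precedence rule (a blacklist hit always drops the record, even if a whitelist is also configured) are assumptions made for illustration; the patent states the two conditions but not how they combine when both lists are present. The topic names come from the examples in the text.

```python
def should_forward(topic, blacklist=None, whitelist=None):
    """Return True if a record with this topic should be sent to the storage side."""
    # Blacklisted topics are always dropped (assumed precedence).
    if blacklist is not None and topic in blacklist:
        return False
    # If a whitelist is configured, only listed topics pass.
    if whitelist is not None and topic not in whitelist:
        return False
    return True


forwarded = [
    topic
    for topic in ["military", "entertainment", "education"]
    if should_forward(topic, blacklist={"military"})
]
```

Running the check before the record reaches the sending threads keeps filtered data out of the per-topic caches entirely.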
When implementing the application, Jetty, a lightweight Java-based web container, can be started, and a web service for managing the topic blacklist and whitelist can be run on Jetty. Log data is no longer produced for a topic that has been added to the blacklist (that is, log data of that topic is no longer transmitted to Kafka).
By maintaining the topic blacklist and whitelist, this embodiment provides a log management function based on the topic blacklist and whitelist. It can filter out the log data that a data storage side such as Kafka does not need, so that only the log data of the needed topics is transmitted to it.
In the next embodiment of the application, referring to the flowchart of yet another data management method shown in Fig. 4, the data management method may further include:
Step 106: monitor the data traffic of the middleware, and raise an alert when the data traffic is abnormal.
In this embodiment, a counting module is used to monitor the traffic of the middleware.
Specifically, while the middleware obtains Flume log data, the counting module counts in real time the number and size of the logs that the middleware receives (the log size can be obtained from the cumulative number of bytes received). The data traffic of the middleware can then be derived from the counted log number and log size, which allows the data traffic of the middleware to be monitored and an alert to be raised when abnormal traffic is detected. For example, a data traffic anomaly alert can be raised when the difference between the traffic counted in one monitoring period and the traffic counted in the previous monitoring period exceeds a set threshold, or when the counted traffic is not within a predetermined normal traffic range.
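The two example alert conditions can be written as one small check. This Python sketch is illustrative only; the function name `is_traffic_abnormal`, its parameters, and the sample numbers are assumptions, and the traffic unit (records or bytes per monitoring period) is left open, as it is in the text.

```python
def is_traffic_abnormal(current, previous, max_delta, normal_range):
    """Apply the two example conditions to the traffic of one monitoring period."""
    low, high = normal_range
    # Condition 1: traffic changed too much compared with the previous period.
    jumped = abs(current - previous) > max_delta
    # Condition 2: traffic is outside the predetermined normal range.
    out_of_range = not (low <= current <= high)
    return jumped or out_of_range


steady = is_traffic_abnormal(1000, 950, max_delta=200, normal_range=(100, 5000))
spike = is_traffic_abnormal(1000, 200, max_delta=200, normal_range=(100, 5000))
too_low = is_traffic_abnormal(50, 60, max_delta=200, normal_range=(100, 5000))
```

Checking the absolute difference treats a sudden drop the same as a sudden surge, which fits a log pipeline where a silent upstream failure is as important as an overload.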
As one possible implementation, each time a thread in the thread pool takes a log record from the blocking queue and caches it in the List, it can also add that record's information, such as the log count and log size, to the counting module. In this way the number and size of the logs received by the middleware are counted, and the data traffic of the middleware is monitored.
In practice, a ZooKeeper instance can be deployed and started in advance, and the ZooKeeper address and the interval at which the counting module sends (publishes) data to ZooKeeper can be configured in the counting module. On this basis, a counter can be started on the middleware for each topic, and a node can be created on ZooKeeper to store the count results published by the counters, thereby implementing traffic monitoring of the middleware. The counters' statistics can be displayed on ZooKeeper in a format such as JSON, and an alert raised when traffic is abnormal. The topic blacklist and whitelist management function can also be deployed in ZooKeeper to manage the blacklist and whitelist of topics.
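A per-topic counter with a JSON snapshot might look like the Python sketch below. The class `TopicCounter` and its methods are hypothetical, and publishing to ZooKeeper is deliberately left out because it would require a client library; the snapshot string is simply what a periodic publisher could write to a ZooKeeper node.

```python
import json
from collections import defaultdict
from threading import Lock


class TopicCounter:
    """Hypothetical counting module: tracks log count and byte size per topic."""

    def __init__(self):
        self._lock = Lock()
        self._stats = defaultdict(lambda: {"count": 0, "bytes": 0})

    def record(self, topic, payload):
        # Called by a worker thread each time it caches one log record.
        with self._lock:
            entry = self._stats[topic]
            entry["count"] += 1
            entry["bytes"] += len(payload)

    def snapshot_json(self):
        # Serialized statistics, suitable for periodic publication.
        with self._lock:
            return json.dumps(self._stats, sort_keys=True)


counter = TopicCounter()
counter.record("logs.app1", b"hello")
counter.record("logs.app1", b"hi!")
snapshot = json.loads(counter.snapshot_json())
```

The lock matters because several sending threads update the same counters concurrently.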
Through the counting module, this embodiment monitors the data traffic of the middleware in real time and can raise an alert promptly when the data traffic of the middleware is abnormal. This effectively overcomes the problem that the prior art cannot monitor data traffic in real time and therefore does not meet the needs of a production environment.
The next embodiment of the application discloses a middleware. Referring to the schematic structural diagram of the middleware shown in Fig. 5, the middleware includes:
A data obtaining unit 501, configured to obtain the collected data of the data acquisition side;
A topic generation unit 502, configured to generate topic information for the collected data, and to encapsulate the collected data and the topic information to obtain collected data that includes the topic information;
A data sending unit 503, configured to send the collected data that includes the topic information to the data storage side, so that the data storage side stores the collected data at the storage location corresponding to the topic, according to the topic information of the collected data.
In one implementation of this embodiment, the data obtaining unit 501 is specifically configured to: receive, based on a predetermined protocol, each log record collected by the Flume log system, and buffer each received log record in a pre-created blocking queue. Correspondingly, the data sending unit 503 is specifically configured to: send, based on a pre-created thread pool, the log data that includes the topic information to the storage location corresponding to the topic in the Kafka distributed publish-subscribe messaging system.
In one implementation of this embodiment, as shown in Fig. 6, the middleware may further include a blacklist and whitelist management unit 504, configured to: obtain blacklist and whitelist topic information; when the topic information of the collected data is blacklisted or is not whitelisted, filter out the collected data; when the topic information of the collected data is not blacklisted or is whitelisted, trigger the data sending unit.
In one implementation of this embodiment, as shown in Fig. 7, the middleware may further include a traffic monitoring unit 505, configured to monitor the data traffic of the middleware and to raise an alert when the data traffic is abnormal.
Because the middleware disclosed in the embodiments of the present invention corresponds to the data management method disclosed in the above embodiments, its description is relatively brief. For the relevant details, refer to the description of the data management method in the above embodiments, which is not repeated here.
The next embodiment of the application discloses a data management system. Referring to the schematic structural diagram of the data management system shown in Fig. 8, the data management system includes a server cluster 801, where each server in the server cluster is deployed with one middleware as described above. Based on the middleware deployed on each server, the data management system can write the data of a data acquisition side such as Flume to a data storage side such as Kafka.
Referring to the structural diagram of another data management system shown in Fig. 9, in addition to the server cluster 801, the data management system may further include a load balancing management device 802, configured to perform load monitoring and management on each server in the server cluster, so that the load across the servers in the server cluster is balanced.
In this embodiment, load balancing management of the middleware across multiple machines is achieved by configuring LVS (Linux Virtual Server).
Specifically, LVS may be deployed on one of the servers in the server cluster, or on an additional server independent of the server cluster, and a single virtual IP (Internet Protocol) address is configured uniformly for all servers in the server cluster on which the middleware is deployed. On this basis, load monitoring and management can be performed on each server in the server cluster based on the LVS, so that the load across the servers in the server cluster is balanced.
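To make the balancing idea concrete, the following Python sketch models least-connection dispatch, which is one of the scheduling policies LVS offers. The text does not state which policy is used, so the choice is an assumption, and the class `LeastConnectionBalancer` and the server names are hypothetical. The sketch only models the selection logic behind a virtual IP; it does not configure LVS.

```python
class LeastConnectionBalancer:
    """Dispatch each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        # Active connection count per middleware server.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick the least-loaded server and count one more connection on it."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Mark one connection on the given server as finished."""
        if self.active[server] > 0:
            self.active[server] -= 1
```

For example, with three middleware servers, three consecutive `acquire()` calls land on three different servers, and a server that releases a connection becomes the next candidate.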
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
For convenience of description, the above systems and devices are described as divided by function into various modules or units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application or in certain parts of the embodiments.
Finally, it should also be noted that, in this document, relational terms such as first, second, third, and fourth are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements that are inherent to such a process, method, article, or device. Absent further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The above is only the preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (11)
1. A data management method, characterized in that it is applied to a middleware, the method comprising:
obtaining gathered data from a data acquisition side;
generating subject information for the gathered data, and encapsulating the gathered data together with the subject information to obtain gathered data including subject information;
sending the gathered data including subject information to a data storage side, so that the data storage side stores the gathered data, according to its subject information, at the storage location corresponding to the respective subject.
2. The method according to claim 1, characterized in that the data acquisition side is a Flume log system, and obtaining the gathered data from the data acquisition side comprises:
receiving, based on a predetermined protocol, each piece of log data collected by the Flume log system;
caching each received piece of log data in a pre-created blocking queue.
3. The method according to claim 2, characterized in that the data storage side is a Kafka distributed publish-subscribe messaging system, and sending the gathered data including subject information to the data storage side comprises:
sending, based on a pre-created thread pool, the log data including subject information to the Kafka distributed publish-subscribe messaging system.
4. The method according to claim 1, characterized in that, before sending the gathered data including subject information to the data storage side, the method further comprises:
obtaining blacklist and whitelist subject information;
when the subject information corresponding to the gathered data is blacklisted subject information or non-whitelisted subject information, filtering out the gathered data;
when the subject information corresponding to the gathered data is non-blacklisted subject information or whitelisted subject information, triggering the step of sending the gathered data including subject information to the storage location corresponding to the respective subject on the data storage side.
5. The method according to claim 1, characterized in that it further comprises:
monitoring the data traffic of the middleware, and raising an alert when the data traffic is abnormal.
6. A middleware, characterized in that it comprises:
a data capture unit, configured to obtain gathered data from a data acquisition side;
a subject generation unit, configured to generate subject information for the gathered data, and to encapsulate the gathered data together with the subject information to obtain gathered data including subject information;
a data transmission unit, configured to send the gathered data including subject information to a data storage side, so that the data storage side stores the gathered data, according to its subject information, at the storage location corresponding to the respective subject.
7. The middleware according to claim 6, characterized in that the data acquisition side is a Flume log system, and the data storage side is a Kafka distributed publish-subscribe messaging system;
the data capture unit is specifically configured to:
receive, based on a predetermined protocol, each piece of log data collected by the Flume log system; and cache each received piece of log data in a pre-created blocking queue;
correspondingly, the data transmission unit is specifically configured to:
send, based on a pre-created thread pool, the log data including subject information to the Kafka distributed publish-subscribe messaging system.
8. The middleware according to claim 6, characterized in that it further comprises:
a blacklist and whitelist management unit, configured to:
obtain blacklist and whitelist subject information; when the subject information corresponding to the gathered data is blacklisted subject information or non-whitelisted subject information, filter out the gathered data; and when the subject information corresponding to the gathered data is non-blacklisted subject information or whitelisted subject information, trigger the data transmission unit.
9. The middleware according to claim 6, characterized in that it further comprises:
a traffic monitoring unit, configured to monitor the data traffic of the middleware and to raise an alert when the data traffic is abnormal.
10. A data management system, characterized in that it comprises a server cluster, wherein a middleware according to any one of claims 6-9 is deployed on each server in the server cluster.
11. The system according to claim 10, characterized in that it further comprises:
a load balancing management device, configured to perform load monitoring and management on each server in the server cluster, so that the load across the servers in the server cluster is balanced.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711473196.6A CN108197233A (en) | 2017-12-29 | 2017-12-29 | A kind of data managing method, middleware and data management system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108197233A true CN108197233A (en) | 2018-06-22 |
Family
ID=62586403
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711473196.6A Pending CN108197233A (en) | 2017-12-29 | 2017-12-29 | A kind of data managing method, middleware and data management system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108197233A (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108989314A (en) * | 2018-07-20 | 2018-12-11 | 北京木瓜移动科技股份有限公司 | A kind of Transmitting Data Stream, processing method and processing device |
CN109325200A (en) * | 2018-07-25 | 2019-02-12 | 北京京东尚科信息技术有限公司 | Obtain the method, apparatus and computer readable storage medium of data |
CN109525448A (en) * | 2019-01-10 | 2019-03-26 | 北京智信未来信息技术有限公司 | Log data acquisition system and method |
CN109657125A (en) * | 2018-12-14 | 2019-04-19 | 平安城市建设科技(深圳)有限公司 | Data processing method, device, equipment and storage medium based on web crawlers |
CN110502491A (en) * | 2019-07-25 | 2019-11-26 | 北京神州泰岳智能数据技术有限公司 | A kind of Log Collect System and its data transmission method, device |
CN110515619A (en) * | 2019-08-09 | 2019-11-29 | 济南浪潮数据技术有限公司 | Theme creation method, device and equipment and readable storage medium |
CN110569112A (en) * | 2019-09-12 | 2019-12-13 | 华云超融合科技有限公司 | Log data writing method and object storage daemon device |
CN110688383A (en) * | 2019-09-26 | 2020-01-14 | 中国银行股份有限公司 | Data acquisition method and system |
CN110889132A (en) * | 2019-11-04 | 2020-03-17 | 中盈优创资讯科技有限公司 | Distributed application permission verification method and device |
CN111143314A (en) * | 2019-12-26 | 2020-05-12 | 厦门服云信息科技有限公司 | Log analysis method and system based on high-speed streaming processing technology |
CN111200637A (en) * | 2019-12-20 | 2020-05-26 | 新浪网技术(中国)有限公司 | Cache processing method and device |
CN111625452A (en) * | 2020-05-22 | 2020-09-04 | 上海哔哩哔哩科技有限公司 | Flow playback method and system |
WO2020211622A1 (en) * | 2019-04-16 | 2020-10-22 | 深圳前海微众银行股份有限公司 | Blockchain-based message storage method and device |
CN112261069A (en) * | 2020-12-22 | 2021-01-22 | 国网江苏省电力有限公司信息通信分公司 | Message blacklist generation method for electric power internet of things management platform |
CN112527618A (en) * | 2020-12-17 | 2021-03-19 | 中国农业银行股份有限公司 | Log collection method and log collection system |
CN112615920A (en) * | 2020-12-18 | 2021-04-06 | 北京达佳互联信息技术有限公司 | Abnormality detection method, abnormality detection device, electronic apparatus, storage medium, and program product |
CN113760564A (en) * | 2020-10-20 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Data processing method, device and system |
CN115296973A (en) * | 2022-05-06 | 2022-11-04 | 北京数联众创科技有限公司 | Method, device and application for batch collection and sending of front-end logs |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103034541A (en) * | 2012-11-16 | 2013-04-10 | 北京奇虎科技有限公司 | Distributing type information system and equipment and method thereof |
US20150254328A1 (en) * | 2013-12-26 | 2015-09-10 | Webtrends Inc. | Methods and systems that categorize and summarize instrumentation-generated events |
CN105608223A (en) * | 2016-01-12 | 2016-05-25 | 北京中交兴路车联网科技有限公司 | Hbase database entering method and system for kafka |
CN105786683A (en) * | 2016-03-03 | 2016-07-20 | 四川长虹电器股份有限公司 | Customized log collecting system and method |
CN106776249A (en) * | 2016-11-28 | 2017-05-31 | 华迪计算机集团有限公司 | A kind of processing method and system of the business diary for concurrently generating |
-
2017
- 2017-12-29 CN CN201711473196.6A patent/CN108197233A/en active Pending
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108989314A (en) * | 2018-07-20 | 2018-12-11 | 北京木瓜移动科技股份有限公司 | A kind of Transmitting Data Stream, processing method and processing device |
CN109325200A (en) * | 2018-07-25 | 2019-02-12 | 北京京东尚科信息技术有限公司 | Obtain the method, apparatus and computer readable storage medium of data |
CN109657125A (en) * | 2018-12-14 | 2019-04-19 | 平安城市建设科技(深圳)有限公司 | Data processing method, device, equipment and storage medium based on web crawlers |
CN109525448A (en) * | 2019-01-10 | 2019-03-26 | 北京智信未来信息技术有限公司 | Log data acquisition system and method |
WO2020211622A1 (en) * | 2019-04-16 | 2020-10-22 | 深圳前海微众银行股份有限公司 | Blockchain-based message storage method and device |
CN110502491A (en) * | 2019-07-25 | 2019-11-26 | 北京神州泰岳智能数据技术有限公司 | A kind of Log Collect System and its data transmission method, device |
CN110515619A (en) * | 2019-08-09 | 2019-11-29 | 济南浪潮数据技术有限公司 | Theme creation method, device and equipment and readable storage medium |
CN110569112A (en) * | 2019-09-12 | 2019-12-13 | 华云超融合科技有限公司 | Log data writing method and object storage daemon device |
CN110569112B (en) * | 2019-09-12 | 2022-04-08 | 江苏安超云软件有限公司 | Log data writing method and object storage daemon device |
CN110688383A (en) * | 2019-09-26 | 2020-01-14 | 中国银行股份有限公司 | Data acquisition method and system |
CN110889132A (en) * | 2019-11-04 | 2020-03-17 | 中盈优创资讯科技有限公司 | Distributed application permission verification method and device |
CN111200637A (en) * | 2019-12-20 | 2020-05-26 | 新浪网技术(中国)有限公司 | Cache processing method and device |
CN111200637B (en) * | 2019-12-20 | 2022-07-08 | 新浪网技术(中国)有限公司 | Cache processing method and device |
CN111143314A (en) * | 2019-12-26 | 2020-05-12 | 厦门服云信息科技有限公司 | Log analysis method and system based on high-speed streaming processing technology |
CN111625452A (en) * | 2020-05-22 | 2020-09-04 | 上海哔哩哔哩科技有限公司 | Flow playback method and system |
CN111625452B (en) * | 2020-05-22 | 2024-04-16 | 上海哔哩哔哩科技有限公司 | Flow playback method and system |
CN113760564A (en) * | 2020-10-20 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Data processing method, device and system |
CN112527618A (en) * | 2020-12-17 | 2021-03-19 | 中国农业银行股份有限公司 | Log collection method and log collection system |
CN112615920A (en) * | 2020-12-18 | 2021-04-06 | 北京达佳互联信息技术有限公司 | Abnormality detection method, abnormality detection device, electronic apparatus, storage medium, and program product |
CN112615920B (en) * | 2020-12-18 | 2023-03-14 | 北京达佳互联信息技术有限公司 | Abnormality detection method, abnormality detection device, electronic apparatus, storage medium, and program product |
CN112261069A (en) * | 2020-12-22 | 2021-01-22 | 国网江苏省电力有限公司信息通信分公司 | Message blacklist generation method for electric power internet of things management platform |
CN115296973A (en) * | 2022-05-06 | 2022-11-04 | 北京数联众创科技有限公司 | Method, device and application for batch collection and sending of front-end logs |
CN115296973B (en) * | 2022-05-06 | 2024-10-22 | 北京清研兰亭科技有限公司 | Method, device and application for collecting and sending front-end journals in batches |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108197233A (en) | A kind of data managing method, middleware and data management system | |
US20220300354A1 (en) | System and method for tagging and tracking events of an application | |
CN103617038B (en) | A kind of service monitoring method and device of distribution application system | |
CN105224445B (en) | Distributed tracking system | |
CN106953740B (en) | Processing method, client, server and system for page access data in application | |
CN109074377B (en) | Managed function execution for real-time processing of data streams | |
CN107145489B (en) | Information statistics method and device for client application based on cloud platform | |
Tse et al. | Global zoom/pan estimation and compensation for video compression | |
CN109634818A (en) | Log analysis method, system, terminal and computer readable storage medium | |
CN106487596A (en) | Distributed Services follow the tracks of implementation method | |
CN107104840A (en) | A kind of daily record monitoring method, apparatus and system | |
WO2016206600A1 (en) | Information flow data processing method and device | |
CN106878064A (en) | Data monitoring method and device | |
CN106953758A (en) | A kind of dynamic allocation management method and system based on Nginx servers | |
CN107895009A (en) | One kind is based on distributed internet data acquisition method and system | |
CN110232010A (en) | A kind of alarm method, alarm server and monitoring server | |
CN108021809A (en) | A kind of data processing method and system | |
CN107168847A (en) | The full link application monitoring method and device of a kind of support distribution formula framework | |
CN113448812A (en) | Monitoring alarm method and device under micro-service scene | |
CN107291594A (en) | The device and method that openstack platforms are monitored and managed to ceph | |
US9054969B2 (en) | System and method for situation-aware IP-based communication interception and intelligence extraction | |
Gao | A General Logging Service for Symbian based Mobile Phones | |
CN107257289A (en) | A kind of risk analysis equipment, monitoring system and monitoring method | |
CN112395357A (en) | Data collection method and device and electronic equipment | |
CN116932148B (en) | Problem diagnosis system and method based on AI |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180622 |