
CN110502491A - Log collection system and data transmission method and device thereof - Google Patents

Log collection system and data transmission method and device thereof

Info

Publication number
CN110502491A
Authority
CN
China
Prior art keywords
sink
data
extension element
component
collect system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910677014.XA
Other languages
Chinese (zh)
Inventor
李明
李晓宇
张月鹏
张伟东
裴广超
刘立超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenzhou Intelligent Intelligent Data Technology Co Ltd
Original Assignee
Beijing Shenzhou Intelligent Intelligent Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenzhou Intelligent Intelligent Data Technology Co Ltd filed Critical Beijing Shenzhou Intelligent Intelligent Data Technology Co Ltd
Priority to CN201910677014.XA priority Critical patent/CN110502491A/en
Publication of CN110502491A publication Critical patent/CN110502491A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the present application disclose a log collection system and a data transmission method and device thereof. The data transmission method of the log collection system includes: configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements; using each Sink extension component to obtain raw data from the Sink component and process the raw data according to preset business processing rules to obtain target data; and transmitting the target data to the corresponding data receiver. Because the added Sink extension components process the raw data according to business requirements, only the needed, usable data is extracted and transmitted downstream, which avoids I/O bottlenecks and data backlog on back-end servers.

Description

Log collection system and data transmission method and device thereof
Technical field
The present application relates to the field of computer technology, and in particular to a log collection system and a data transmission method and device thereof.
Background
Flume is a highly available, highly reliable, distributed system for collecting, aggregating, and transmitting massive volumes of logs. Flume can perform simple processing on data. The core of a running Flume deployment is the Agent, which is Flume's smallest independently running unit; one Agent is one JVM (Java Virtual Machine). An Agent is a complete data collection tool containing three core components: Source, Channel, and Sink. Through these components, an Event can flow from one place to another. As shown in Fig. 1, the Source is the collecting end: it captures data, wraps it in an event, and pushes the event into the Channel. The Channel (pipeline) carries the data. The Sink component takes events out of the Channel and sends the data elsewhere, for example to a file system, a database, or the Source of another Agent. The arrows in Fig. 1 indicate the direction of data flow, for example Web server --> Source --> Channel --> Sink (component) --> data storage; the part framed by the rectangle in the middle of Fig. 1 is one Agent.
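The Source, Channel, and Sink flow described above can be illustrated with a minimal sketch. It is written in Python for brevity and does not use Flume's Java API; the class and function names (Channel, run_source, run_sink) and the sample log lines are assumptions made for illustration only.

```python
from collections import deque


class Channel:
    """Buffer between a Source and a Sink (an in-memory queue in this sketch)."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def take(self):
        # Return the oldest event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


def run_source(lines, channel):
    """Wrap each captured log line in an event and push it into the channel."""
    for line in lines:
        channel.put({"headers": {}, "body": line})


def run_sink(channel, storage):
    """Take events out of the channel and deliver their bodies to storage."""
    event = channel.take()
    while event is not None:
        storage.append(event["body"])
        event = channel.take()


channel, storage = Channel(), []
run_source(["GET /index", "POST /order"], channel)
run_sink(channel, storage)
print(storage)
```

In this sketch the Sink forwards every event unchanged, which is the behavior the following paragraphs identify as the source of back-end pressure.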
Existing Flume is mainly used to collect and transmit log data. To keep log collection efficient, its processing of log data is essentially independent of the business. When the volume of transmitted data is large, for example a per-second data volume in the tens, the back-end database server or business server may be unable to consume and process the data transmitted by Flume in real time, leading to problems such as data backlog and hung servers.
Summary of the invention
In view of this, embodiments of the present application provide a data transmission method and device for a log collection system. By adding Sink extension components, the log collection system processes raw data according to business requirements and extracts only the needed, usable data for transmission downstream, avoiding I/O bottlenecks and data backlog on back-end servers.
According to one aspect of the present application, a data transmission method of a log collection system is provided, comprising:
configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
using each Sink extension component to obtain raw data from the Sink component, and processing the raw data according to preset business processing rules to obtain target data; and
transmitting the target data to the corresponding data receiver.
According to another aspect of the present application, a data transmission device of a log collection system is provided, comprising:
a component extension module, configured to configure, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
a data processing module, configured to use each Sink extension component to obtain raw data from the Sink component, and to process the raw data according to preset business processing rules to obtain target data; and
a transmission module, configured to transmit the target data to the corresponding data receiver.
According to yet another aspect of the present application, a log collection system is provided, comprising a client and a log collection server, wherein the log collection server includes the data transmission device of the log collection system described in the preceding aspect.
According to still another aspect of the present application, a non-transitory computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method described in the first aspect.
Beneficial effects: in the data transmission scheme of the embodiments of the present application, Sink extension components corresponding to business requirements are configured in the Sink component of the log collection system Flume. Each Sink extension component obtains raw data from the Sink component and processes it according to preset business processing rules, and the resulting target data is transmitted to the corresponding data receiver. Business rule processing is thereby moved forward into the Sink component, which avoids the heavy back-end server load, data backlog, and hung servers that result when the Sink component merely forwards data without any business processing.
Brief description of the drawings
Fig. 1 is a schematic diagram of data transmission in a prior-art Flume system;
Fig. 2 is a flowchart of the data transmission method of the log collection system of Embodiment one of the present application;
Fig. 3 is a flowchart of the data transmission method of the log collection system of Embodiment two of the present application;
Fig. 4 is a flowchart of the data transmission method of the log collection system of Embodiment three of the present application;
Fig. 5 is a flowchart of the data transmission method of the log collection system of Embodiment four of the present application;
Fig. 6 is a flowchart of the data transmission method of the log collection system of Embodiment five of the present application;
Fig. 7 is a block diagram of the data transmission device of the log collection system of one embodiment of the present application;
Fig. 8 is a block diagram of the log collection system of one embodiment of the present application;
Fig. 9 is a schematic structural diagram of the non-transitory computer-readable storage medium of one embodiment of the present application.
Detailed description
To make the above objects, features, and advantages of the embodiments of the present application clearer and easier to understand, the embodiments are described in further detail below with reference to the accompanying drawings and specific implementations. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the embodiments of the present application.
Embodiment one
This embodiment proposes a data transmission method of a log collection system. Fig. 2 is a flowchart of the data transmission method of the log collection system of Embodiment one of the present application. Referring to Fig. 2, the data transmission method of this embodiment includes the following steps:
Step S201: configure, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
Step S202: use each Sink extension component to obtain raw data from the Sink component, and process the raw data according to the business processing rules in the Sink extension component to obtain target data;
Step S203: transmit the target data to the corresponding data receiver.
As Fig. 2 shows, the data transmission method of this embodiment configures Sink extension components corresponding to business requirements in the Sink component of the log collection system Flume, uses each Sink extension component to process the raw data according to preset business processing rules, and transmits the resulting target data to the corresponding data receiver. This is equivalent to moving the business processing flow forward: it extends the data processing capability of the existing Sink component, avoids the heavy back-end server load and data backlog caused by a Sink component that only forwards data without processing it, and helps improve the end-to-end processing efficiency of log data.
In this embodiment, configuring Sink extension components corresponding to business requirements in the Sink component of the log collection system Flume includes: creating a frame class that inherits the AbstractSink abstract class of the log collection system Flume and implements the Configurable interface of the log collection system Flume; defining a common interface that contains multiple business processing methods; and creating a Sink extension component according to the business requirement, where the Sink extension component inherits the frame class and implements the common interface.
Specifically, the configuration process is as follows:
Create a frame class, BDIAFlumeSinkFramework.
The frame class BDIAFlumeSinkFramework inherits the AbstractSink abstract class of the log collection system Flume and implements the Configurable interface of the log collection system Flume, so as to build a custom business framework.
Then define the common interface that all businesses are required to implement, for example a BusinService interface. The common interface contains different methods for the corresponding business processing, such as an initialization method, a business processing method, a distribution method, a filter method, a standardization method, a data encapsulation method, a relational database method, an early-warning method, a thread control method, a flow method, a dynamic configuration refresh method, and a destroy method.
Taking a log monitoring component as the extension component to be created, the configuration file of the log monitoring component in one embodiment is as follows:
Agent.sinks.log_monitor_sink.type=com.dm.software.sink.BDIAFlumeSinkFramework
Agent.sinks.log_monitor_sink.service=com.dm.software.serviceImpl.LogMonitorService
Agent.sinks.log_monitor_sink.assembly.jar=/home/bdia/LogMonitorServerV1.0.jar
Agent.sinks.log_monitor_sink.assembly.configure=/home/bdia/logmonitor.properties
Here, LogMonitorService implements the BusinService common interface;
the Agent.sinks.log_monitor_sink.assembly entries configure the custom component package;
Agent.sinks.log_monitor_sink.assembly.configure loads the dynamic configuration file information and is optional, because the default component package of the frame class contains default.properties, i.e. the default properties.
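How such a configuration could be read, including the fallback for the optional assembly.configure entry, is sketched below in Python. This is an illustration and not Flume's own configuration loader; the function name sink_settings is an assumption, the property values are the ones listed above with the line-break spaces removed, and the fallback value default.properties follows the description of the default component package.

```python
CONFIG_TEXT = """\
Agent.sinks.log_monitor_sink.type=com.dm.software.sink.BDIAFlumeSinkFramework
Agent.sinks.log_monitor_sink.service=com.dm.software.serviceImpl.LogMonitorService
Agent.sinks.log_monitor_sink.assembly.jar=/home/bdia/LogMonitorServerV1.0.jar
"""


def sink_settings(text, sink_name, agent="Agent"):
    """Collect the key=value pairs that belong to one named sink."""
    prefix = f"{agent}.sinks.{sink_name}."
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().startswith(prefix):
            settings[key.strip()[len(prefix):]] = value.strip()
    # assembly.configure is optional: fall back to the bundled default
    # (hypothetical handling; the source only says a default exists).
    settings.setdefault("assembly.configure", "default.properties")
    return settings


settings = sink_settings(CONFIG_TEXT, "log_monitor_sink")
for key, value in settings.items():
    print(key, "=", value)
```

The sample text deliberately omits assembly.configure so that the fallback path is exercised.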
With the frame class and the common interface defined, the frame class obtains data and configuration information from the data source (for example, from the Sink component) and handles the flow and control of the different businesses. The common interface encapsulates many methods, and each business chooses which of those methods to implement.
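The relationship between the frame class, the common interface, and a concrete extension component can be sketched as follows. The sketch is in Python rather than Java and does not use Flume's AbstractSink or Configurable types; SinkFramework stands in for the frame class, and the method names (init, process, destroy, configure, run), the LogMonitorSink class, and the alert_levels setting are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class BusinService(ABC):
    """Common interface: a business overrides only the methods it needs."""

    def init(self):
        pass

    @abstractmethod
    def process(self, raw):
        """Return the target record, or None to drop the raw record."""

    def destroy(self):
        pass


class SinkFramework:
    """Stand-in for the frame class: holds configuration and drives the flow."""

    def configure(self, config):
        self.config = dict(config)

    def run(self, raw_events, receiver):
        self.init()
        try:
            for raw in raw_events:
                target = self.process(raw)
                if target is not None:
                    receiver.append(target)
        finally:
            self.destroy()


class LogMonitorSink(SinkFramework, BusinService):
    """Extension component: inherits the frame class, implements the interface."""

    def process(self, raw):
        if raw["level"] in self.config["alert_levels"]:
            return {"level": raw["level"], "message": raw["message"]}
        return None


sink = LogMonitorSink()
sink.configure({"alert_levels": {"ERROR", "WARN"}})
alerts = []
sink.run(
    [{"level": "INFO", "message": "started"},
     {"level": "ERROR", "message": "disk full"}],
    alerts,
)
print(alerts)
```

Keeping the control flow in the frame class means a new business only has to supply its own process method.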
Embodiment two
In this embodiment, a filter component is added to the Sink component and is used to filter out invalid data, junk data, blacklisted entries, and similar data. That is, the aforementioned configuring of Sink extension components corresponding to business requirements in the Sink component of the log collection system Flume includes: according to a data filtering business requirement, configuring in the Sink component of the log collection system Flume a Sink filter component corresponding to that requirement. The aforementioned obtaining of raw data from the Sink component with each Sink extension component and processing it according to preset business processing rules to obtain target data includes: using the Sink filter component to obtain raw data from the Sink component through the frame class, and processing the raw data with the filter business processing method in the common interface to obtain target data. The filter business processing method is used for one or more of filtering invalid data, filtering junk data, and filtering blacklist data.
The configuring of Sink extension components corresponding to business requirements in the Sink component of the log collection system Flume may also include: for a first business requirement and a second business requirement that have a data flow relationship, configuring in the Sink component of the log collection system Flume a first Sink extension component corresponding to the first business requirement, and configuring in the Sink component a second Sink extension component corresponding to the second business requirement, where the first Sink extension component is the upstream Sink extension component of the second. In that case, obtaining raw data and processing it to obtain target data includes: using the first Sink extension component to obtain raw data from the Sink component and process it according to preset business processing rules; and using the second Sink extension component to obtain the process data from the first Sink extension component and process it according to preset business processing rules to obtain target data. Here the first Sink extension component is a filter component and the second Sink extension component is a standardization component. The standardization component can define standardization information such as the database storage format, field length, field position, field content, and format.
That is, this embodiment supports configuring multiple Sink extension components and the data flow relationship between each Sink extension component and the others.
The filter component, standardization component, database component, distribution component, and so on are all ready-made tool components that can be applied directly in real scenarios. The filter component cleans noise data out of the data stream in order to reduce I/O output; cleaning is performed according to the rules in the aforementioned configuration file. The standardization component performs adaptation: for example, if the storage structure of business A is fixed, with a fixed field order and fixed field types, the raw data must be standardized accordingly. The database component stores data in the corresponding database. The distribution component passes data on to the next stage of the flow. Which components are loaded depends on the actual business situation.
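A two-level chain, in which a filter component feeds a standardization component, can be sketched as below. This is a Python illustration only; the blacklist contents, the field names, and the fixed field order are hypothetical and are not taken from the source.

```python
BLACKLIST = {"10.0.0.66"}                # hypothetical blacklist
FIELD_ORDER = ("time", "ip", "message")  # hypothetical fixed layout for business A


def filter_component(records):
    """First-level component: drop invalid and blacklisted records."""
    for record in records:
        if not record.get("message"):
            continue  # invalid: empty message
        if record.get("ip") in BLACKLIST:
            continue  # blacklisted sender
        yield record


def standardize_component(records):
    """Second-level component: emit fields in a fixed order, as strings."""
    for record in records:
        yield tuple(str(record.get(name, "")) for name in FIELD_ORDER)


raw = [
    {"ip": "10.0.0.1", "message": "login ok", "time": 1564012800},
    {"ip": "10.0.0.66", "message": "probe", "time": 1564012801},
    {"ip": "10.0.0.2", "message": "", "time": 1564012802},
]
# The second component consumes the output of the first one.
target = list(standardize_component(filter_component(raw)))
print(target)
```

Because the filter runs first, the standardization step only touches records that will actually be stored.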
As shown in Fig. 3, the box represents one Flume agent. Each Flume agent includes a Source component, a Channel component, and a Sink component. The Source component collects log data from the client installed on the Web server and wraps the data in events; the events are then pushed into the Channel component. The Channel (pipeline) connects the Source component and the Sink component and acts as a buffer. The Sink component takes events out of the Channel component and sends them elsewhere.
To solve the technical problems of the prior art, the present application improves the conventional Sink component by adding Sink extension components. Each application scenario can define its own Sink extension components and realize the corresponding functions through configuration.
Fig. 3 shows two Sink extension components, a Sink filter component and a Sink standardization component. Note that Fig. 3 illustrates the case of two custom Sink extension components; in other embodiments of the present application the number of custom components is not limited to the two shown in Fig. 3 and should be set according to actual needs. For example, as shown in Fig. 4, three Sink extension components are added to the Sink component.
Embodiment three
Fig. 4 adds a Sink relational database component on top of Fig. 3. The relational database component serves the following scenario: if log data needs to be imported into a relational database, the relational database component can be added so that the data is distributed to the desired relational database (such as MySQL or Oracle).
For example, in log collection the collected log contains 50 fields in total, including time, information level, class, thread, method, output information, and exception information. This is the full data set of the whole flow; transmitting it directly can easily drive network I/O and storage I/O too high and cause stalls.
With the filter component introduced, only four fields are needed: time, level, output information, and exception information. The raw data goes through one pass of noise removal, which reduces the I/O stream.
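Reducing a 50-field log record to the four needed fields can be sketched as follows. The Python key names (time, level, output, exception) and the placeholder field_N names are assumptions; the source names the four fields only in prose.

```python
KEPT_FIELDS = ("time", "level", "output", "exception")  # hypothetical key names


def keep_fields(record, kept=KEPT_FIELDS):
    """Return a copy of the record that contains only the kept fields."""
    return {name: record[name] for name in kept if name in record}


# Build a 50-field record: 46 placeholder fields plus the four of interest.
record = {f"field_{i}": i for i in range(46)}
record.update(
    time="2019-07-25 10:00:00",
    level="ERROR",
    output="order failed",
    exception="TimeoutException",
)

slim = keep_fields(record)
print(len(record), "->", len(slim))
```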
The standardization component converts data according to the standardization configuration so that it fits subsequent business processing. For example, some time values in the log data use the format yyyy/MM/dd, some use yyyy-MM-dd, and others use yyMMdd, so the time format must be standardized to meet later processing needs. The output of the standardization component differs between business scenarios; for example, after standardization the data may be stored directly in an Oracle relational database, in which case no further flow is needed, which again reduces I/O.
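Standardizing the time formats mentioned above can be sketched with Python's standard datetime module. The three input patterns are read here as four-digit-year with slashes, four-digit-year with hyphens, and two-digit-year without separators; the choice of output format and the function name normalize_date are assumptions.

```python
from datetime import datetime

# strptime equivalents of yyyy/MM/dd, yyyy-MM-dd and yyMMdd.
INPUT_FORMATS = ("%Y/%m/%d", "%Y-%m-%d", "%y%m%d")


def normalize_date(text, target="%Y-%m-%d"):
    """Try each known input format and re-emit the date in the target format."""
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(target)
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


for sample in ("2019/07/25", "2019-07-25", "190725"):
    print(sample, "->", normalize_date(sample))
```

Raising on an unknown format, rather than passing the value through, keeps malformed dates out of the downstream store.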
Thus, by adding Sink extension components that perform business processing such as filtering and standardization, the Sink component no longer transmits large amounts of irrelevant data downstream. This reduces the pressure of the I/O stream, helps resolve data backlog, and improves data processing efficiency.
Embodiment four
The same data source may need to be handled by different businesses, so the same data may need to be distributed to multiple business components. As shown in Fig. 5, the Flume agent of this embodiment includes a Sink filter component, a Sink standardization component, and a Sink relational database component. Each of these three extension components obtains events from the Channel component, performs its own processing, and sends the result to the back end. For example, the Sink filter component obtains events from the Channel component, filters them, and distributes them to HDFS for storage; the Sink standardization component obtains events from the Channel component, standardizes them, and distributes them to Redis for storage; the Sink relational database component obtains events from the Channel component, processes them according to the storage rules of the relational database, and sends them to the relational database.
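The fan-out described above, in which each extension component processes the same events independently and delivers to its own back end, can be sketched as follows. This is a Python illustration under stated assumptions: the three lists stand in for HDFS, Redis, and a relational database, no real client libraries are used, and the per-component rules are hypothetical.

```python
def filter_component(event):
    """Drop debug-level events; keep everything else unchanged."""
    return None if event["level"] == "DEBUG" else event


def standardize_component(event):
    """Normalize the level field to upper case."""
    return {**event, "level": event["level"].upper()}


def relational_component(event):
    """Shape the event as a row for a fixed table layout."""
    return (event["time"], event["level"], event["msg"])


def fan_out(events, routes):
    """Give every event to every (component, receiver) pair independently."""
    for event in events:
        for component, receiver in routes:
            result = component(dict(event))  # copy so components cannot interfere
            if result is not None:
                receiver.append(result)


hdfs_store, redis_store, rdb_store = [], [], []  # in-memory stand-ins
routes = [
    (filter_component, hdfs_store),
    (standardize_component, redis_store),
    (relational_component, rdb_store),
]
events = [
    {"time": "2019-07-25", "level": "DEBUG", "msg": "tick"},
    {"time": "2019-07-25", "level": "error", "msg": "boom"},
]
fan_out(events, routes)
print(len(hdfs_store), len(redis_store), len(rdb_store))
```

Each route sees its own copy of the event, so a change made by one component never leaks into another back end.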
In this way, data from the same source is processed separately according to the different business needs and then sent to the corresponding database. This divide-and-conquer approach avoids the heavy I/O pressure on the back-end database that undifferentiated, centralized processing would cause, resolves the I/O bottleneck and low data processing efficiency, and achieves fast storage and response that meet business requirements.
Embodiment five
The data transmission method of the log collection system of this embodiment also supports distributed processing followed by centralized processing and data fusion. As shown in Fig. 6, different Flume agents obtain log data from different Web servers. The left side of Fig. 6 shows two Web servers, each connected to one Flume agent. Each of these Flume agents collects log data, filters it, standardizes it, encapsulates it, and then distributes it to the Source component of the Flume agent on the right side of Fig. 6. That Source component passes the data to the Sink filter component of the right-hand Flume agent for filtering; after standardization by the Sink standardization component the data is passed to the Sink relational database component, which processes it and distributes it to the relational database for storage.
The log collection system Flume supports distributed deployment. Therefore, to avoid the bottleneck of single-machine processing, the left side of Fig. 6 shows different Flume agents that collect and process data separately before the data is fused. In a concrete application, the upper-left Flume agent in Fig. 6 can obtain order log data from a Web server and filter it according to a preset rule that discards order log records with fewer than 10 fields, while the lower-left Flume agent in Fig. 6 can obtain product browsing log data from a Web server and filter it according to a preset rule that discards product browsing log records with fewer than 15 fields. The two left-hand Flume agents encapsulate their processed data and send it to the Source component of the back-end Flume agent for fusion.
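The two-stage arrangement above, with edge agents that filter by field count and a back-end agent that merges their output, can be sketched as follows. The Python function names and the list-based records are assumptions for illustration; the thresholds of 10 and 15 fields are the ones given in the text.

```python
def edge_agent(records, min_fields):
    """Discard records with fewer than min_fields fields, then wrap the rest."""
    return [{"body": record} for record in records if len(record) >= min_fields]


def central_agent(*batches):
    """Merge the encapsulated batches received from the edge agents."""
    merged = []
    for batch in batches:
        merged.extend(batch)
    return merged


# Hypothetical records, represented only by their field lists.
order_logs = [list(range(12)), list(range(8))]      # 12 fields, 8 fields
browse_logs = [list(range(15)), list(range(14))]    # 15 fields, 14 fields

merged = central_agent(
    edge_agent(order_logs, min_fields=10),    # order logs: drop < 10 fields
    edge_agent(browse_logs, min_fields=15),   # browsing logs: drop < 15 fields
)
print(len(merged))
```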
For big-data streams, matching and output are efficient and accurate; the needs of different businesses are addressed at the source, and servers are kept from hanging or stalling because of data I/O bottlenecks.
Belonging to the same technical concept as the aforementioned data transmission method of the log collection system, embodiments of the present application also provide a data transmission device of a log collection system. As shown in Fig. 7, the data transmission device 700 of the log collection system includes:
a component extension module 701, configured to configure, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
a data processing module 702, configured to use each Sink extension component to obtain raw data from the Sink component, and to process the raw data according to preset business processing rules to obtain target data; and
a transmission module 703, configured to transmit the target data to the corresponding data receiver.
In one embodiment of the present application, the component extension module 701 is specifically configured to: create a frame class that inherits the AbstractSink abstract class of the log collection system Flume and implements the Configurable interface of the log collection system Flume; define a common interface that contains multiple business processing methods; and create a Sink extension component according to the business requirement, where the Sink extension component inherits the frame class and implements the common interface.
In one embodiment of the present application, the component extension module 701 is specifically configured to configure, according to a data filtering business requirement, a Sink filter component corresponding to the data filtering business requirement in the Sink component of the log collection system Flume;
the data processing module 702 is specifically configured to use the Sink filter component to obtain raw data from the Sink component through the frame class, and to process the raw data with the filter business processing method in the common interface to obtain target data, where the filter business processing method is used for one or more of filtering invalid data, filtering junk data, and filtering blacklist data.
In one embodiment of the present application, the component extension module 701 is specifically configured to: for a first business requirement and a second business requirement that have a data flow relationship, configure in the Sink component of the log collection system Flume a first Sink extension component corresponding to the first business requirement; and, according to the second business requirement, configure in the Sink component of the log collection system Flume a second Sink extension component corresponding to the second business requirement, where the first Sink extension component is the upstream Sink extension component of the second Sink extension component;
the data processing module 702 is specifically configured to: use the first Sink extension component to obtain raw data from the Sink component and process the raw data according to preset business processing rules; and use the second Sink extension component to obtain the process data from the first Sink extension component and process the process data according to preset business processing rules to obtain target data, where the first Sink extension component is a filter component and the second Sink extension component is a standardization component.
The examples of the functions performed by the modules of the device shown in Fig. 7 are consistent with the examples given in the foregoing method embodiments and are not repeated here.
Fig. 8 is a block diagram of the log collection system of one embodiment of the present application. Referring to Fig. 8, the log collection system 800 of this embodiment includes a client 801 and a log collection server 802;
the log collection server 802 includes the data transmission device 700 of the log collection system of the foregoing embodiment.
By adding Sink extension components, the log collection system of this embodiment distributes and stores the collected raw data only after it has undergone business processing in the Sink extension components. This relieves the I/O bottleneck of the back-end system, improves business processing efficiency, and meets the needs of big-data log collection and platform log collection and analysis.
It should be understood that
Algorithm and display be not inherently related to any certain computer, virtual bench or other equipment provided herein. Various fexible units can also be used together with teachings based herein.As described above, it constructs required by this kind of device Structure be obvious.In addition, the embodiment of the present application is also not for any particular programming language.It should be understood that can benefit The content of the embodiment of the present application described herein is realized with various programming languages, and the description done above to language-specific is In order to disclose the preferred forms of the embodiment of the present application.
In the instructions provided here, numerous specific details are set forth.It is to be appreciated, however, that the implementation of the application Example can be practiced without these specific details.In some instances, well known method, structure is not been shown in detail And technology, so as not to obscure the understanding of this specification.
Similarly, it should be understood that in order to simplify the disclosure and help to understand one or more of each application aspect, In Above in the description of the exemplary embodiment of the application, each feature of the embodiment of the present application is grouped together into individually sometimes In embodiment, figure or descriptions thereof.However, the disclosed method should not be interpreted as reflecting the following intention: being wanted The embodiment of the present application of protection is asked to require features more more than feature expressly recited in each claim.More precisely It says, as reflected in the following claims, application aspect is all less than single embodiment disclosed above Feature.Therefore, it then follows thus claims of specific embodiment are expressly incorporated in the specific embodiment, wherein each power Benefit requires in itself all as the separate embodiments of the application.
Those skilled in the art will understand that can be carried out adaptively to the module in the equipment in embodiment Change and they are arranged in one or more devices different from this embodiment.It can be the module or list in embodiment Member or component are combined into a module or unit or component, and furthermore they can be divided into multiple submodule or subelement or Sub-component.Other than such feature and/or at least some of process or unit exclude each other, it can use any Combination is to all features disclosed in this specification (including adjoint claim, abstract and attached drawing) and so disclosed All process or units of what method or apparatus are combined.Unless expressly stated otherwise, this specification is (including adjoint power Benefit require, abstract and attached drawing) disclosed in each feature can carry out generation with an alternative feature that provides the same, equivalent, or similar purpose It replaces.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features of different embodiments are meant to fall within the scope of the embodiments of the present application and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the present application may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device according to the embodiments of the present application. The present application may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the embodiments of the present application may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
Fig. 9 is a schematic structural diagram of a non-transitory computer-readable storage medium according to one embodiment of the present application. The computer-readable storage medium 900 stores a computer program for performing the method steps according to the embodiments of the present application. The program can be read by a processor of the log acquisition server and, when run by the log acquisition server, causes the log acquisition server to perform each step of the method described above. Specifically, the computer program stored on the computer-readable storage medium can perform the method shown in any of the above embodiments. The computer program may be compressed in a suitable form.
It should be noted that the above embodiments illustrate rather than limit the embodiments of the present application, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The embodiments of the present application can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A data transmission method for a log collection system, characterized by comprising:
configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
obtaining raw data from the Sink component using each Sink extension component, and processing the raw data according to a preset business processing rule to obtain target data;
transmitting the target data to the corresponding data receiver.
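The three steps of claim 1 can be pictured with a minimal, self-contained Java sketch. It does not use the Flume API: `ExtensibleSink`, `SinkExtension`, `configure`, and `deliver` are hypothetical names chosen for illustration, and a plain function and callback stand in for the preset business processing rule and the data receiver.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical stand-in for one Sink extension component (not a Flume type). */
final class SinkExtension {
    private final Function<String, String> rule;  // preset business processing rule
    private final Consumer<String> receiver;      // corresponding data receiver

    SinkExtension(Function<String, String> rule, Consumer<String> receiver) {
        this.rule = rule;
        this.receiver = receiver;
    }

    /** Processes one raw record and transmits the resulting target data, if any. */
    void handle(String raw) {
        String target = rule.apply(raw);
        if (target != null) {          // a null result means the rule dropped the record
            receiver.accept(target);
        }
    }
}

/** Hypothetical stand-in for the Sink component that hosts the extensions. */
final class ExtensibleSink {
    private final Map<String, SinkExtension> extensions = new LinkedHashMap<>();

    /** Step 1: configure one extension per business requirement. */
    void configure(String requirement, SinkExtension extension) {
        extensions.put(requirement, extension);
    }

    /** Steps 2 and 3: every extension receives the raw record, processes it, and forwards it. */
    void deliver(String raw) {
        for (SinkExtension extension : extensions.values()) {
            extension.handle(raw);
        }
    }
}
```

Keeping the rule and the receiver as constructor arguments means a new business requirement only needs a new `configure` call; the hosting sink itself stays unchanged, which is the extensibility the claim is aiming at.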
2. The method according to claim 1, characterized in that configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements comprises:
creating a framework class that inherits the AbstractSink abstract class of the log collection system Flume and implements the Configurable interface of the log collection system Flume;
defining a common interface that includes multiple business processing methods;
creating a Sink extension component that inherits the framework class and implements the common interface.
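The class layering in claim 2 can be sketched as follows. To keep the example compilable without the Flume library, `StubAbstractSink` and `StubConfigurable` are simplified stand-ins (assumptions) for Flume's `AbstractSink` and `Configurable`; their method signatures are deliberately reduced and do not match the real API. `FrameworkSink`, `BusinessProcessor`, `UppercaseSinkExtension`, and the `receiver` setting are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Simplified stand-ins for Flume's AbstractSink and Configurable (assumption:
// the real process() returns a Status and reads events from a channel).
abstract class StubAbstractSink {
    abstract boolean process();
}

interface StubConfigurable {
    void configure(Map<String, String> settings);
}

// Common interface holding the business processing methods (hypothetical names).
interface BusinessProcessor {
    boolean filter(String raw);
    String standardize(String raw);
}

// Framework class: inherits the sink base class and implements the configuration interface.
abstract class FrameworkSink extends StubAbstractSink implements StubConfigurable {
    private final Queue<String> pending = new ArrayDeque<>();  // stands in for the channel
    protected String receiverName = "default";

    @Override
    public void configure(Map<String, String> settings) {
        receiverName = settings.getOrDefault("receiver", receiverName);
    }

    void offer(String raw) { pending.add(raw); }

    /** Lets extension components obtain raw data through the framework class. */
    protected String takeRaw() { return pending.poll(); }
}

// Sink extension component: inherits the framework class and implements the common interface.
final class UppercaseSinkExtension extends FrameworkSink implements BusinessProcessor {
    final List<String> sent = new ArrayList<>();  // records what would go to the receiver

    @Override
    public boolean filter(String raw) { return !raw.trim().isEmpty(); }

    @Override
    public String standardize(String raw) { return raw.trim().toUpperCase(); }

    @Override
    boolean process() {
        String raw = takeRaw();
        if (raw == null) {
            return false;                 // nothing left to process
        }
        if (filter(raw)) {
            sent.add(receiverName + ":" + standardize(raw));
        }
        return true;
    }
}
```

In this arrangement the framework class owns everything that is the same for all extensions (configuration and access to raw data), while each extension supplies only its business methods.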
3. The method according to claim 2, characterized in that configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements comprises:
configuring, according to a data filtering business requirement, a Sink filter component corresponding to the data filtering business requirement in the Sink component of the log collection system Flume;
and obtaining raw data from the Sink component using each Sink extension component, and processing the raw data according to a preset business processing rule to obtain target data comprises:
obtaining raw data from the Sink component through the framework class using the Sink filter component, and processing the raw data by the filtering business processing method in the common interface to obtain target data;
wherein the filtering business processing method is used for one or more of filtering invalid data, filtering spam data, and filtering blacklist data.
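A minimal sketch of the three filtering rules named in claim 3 follows. The record layout (`<source> <message>`), the class name `LogFilterSketch`, and the way spam and blacklist entries are recognized are all assumptions made for illustration; the claim itself does not specify them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical filter component; records are assumed to look like "<source> <message>". */
final class LogFilterSketch {
    private final Set<String> blacklist;    // sources whose records are dropped
    private final Set<String> spamMarkers;  // substrings that mark a record as spam

    LogFilterSketch(Set<String> blacklist, Set<String> spamMarkers) {
        this.blacklist = blacklist;
        this.spamMarkers = spamMarkers;
    }

    /** Invalid data: missing or blank records. */
    boolean isInvalid(String raw) {
        return raw == null || raw.trim().isEmpty();
    }

    /** Spam data: records containing any configured marker. */
    boolean isSpam(String raw) {
        for (String marker : spamMarkers) {
            if (raw.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Blacklist data: records whose source (text before the first space) is blacklisted. */
    boolean isBlacklisted(String raw) {
        int space = raw.indexOf(' ');
        String source = space < 0 ? raw : raw.substring(0, space);
        return blacklist.contains(source);
    }

    /** Returns only the records that pass all three checks, in their original order. */
    List<String> apply(List<String> rawRecords) {
        List<String> target = new ArrayList<>();
        for (String raw : rawRecords) {
            // isInvalid runs first so the later checks never see null
            if (!isInvalid(raw) && !isSpam(raw) && !isBlacklisted(raw)) {
                target.add(raw);
            }
        }
        return target;
    }
}
```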
4. The method according to claim 1, characterized in that configuring, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements comprises:
according to a first business requirement and a second business requirement that have a data flow relationship, configuring a first Sink extension component corresponding to the first business requirement in the Sink component of the log collection system Flume, and configuring a second Sink extension component corresponding to the second business requirement in the Sink component of the log collection system Flume, the first Sink extension component being the upstream Sink extension component of the second Sink extension component;
and obtaining raw data from the Sink component using each Sink extension component, and processing the raw data according to a preset business processing rule to obtain target data comprises:
obtaining raw data from the Sink component using the first Sink extension component, and processing the raw data according to a preset business processing rule;
obtaining intermediate data from the first Sink extension component using the second Sink extension component, and processing the intermediate data according to a preset business processing rule to obtain target data; wherein the first Sink extension component is a filter component and the second Sink extension component is a standardization component.
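The upstream and downstream relationship in claim 4 can be sketched as a pull-based chain in which the standardization stage reads from the filter stage, which in turn reads from the Sink component. `RecordSource`, `FilterStage`, and `StandardizeStage` are hypothetical names, and the concrete rules (dropping blank records, lowercasing and collapsing whitespace) are assumptions standing in for the preset business processing rules.

```java
import java.util.Locale;

/** Hypothetical source of records; poll() returns null when no data is left. */
interface RecordSource {
    String poll();
}

/** First Sink extension component: filters raw data taken from its upstream source. */
final class FilterStage implements RecordSource {
    private final RecordSource upstream;

    FilterStage(RecordSource upstream) {
        this.upstream = upstream;
    }

    @Override
    public String poll() {
        String raw;
        while ((raw = upstream.poll()) != null) {
            if (!raw.trim().isEmpty()) {
                return raw;            // first record that survives the filter
            }
        }
        return null;                   // upstream exhausted
    }
}

/** Second Sink extension component: standardizes the intermediate data from the first. */
final class StandardizeStage implements RecordSource {
    private final RecordSource upstream;

    StandardizeStage(RecordSource upstream) {
        this.upstream = upstream;
    }

    @Override
    public String poll() {
        String intermediate = upstream.poll();
        if (intermediate == null) {
            return null;
        }
        // Assumed rule: trim, lowercase, and collapse runs of whitespace.
        return intermediate.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}
```

Because both stages expose the same `RecordSource` interface, further extension components could be appended to the chain without changing the existing ones.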
5. A data transmission device for a log collection system, characterized in that the device comprises:
a component extension module, configured to configure, in the Sink component of the log collection system Flume, Sink extension components corresponding to business requirements;
a data processing module, configured to obtain raw data from the Sink component using each Sink extension component, and to process the raw data according to a preset business processing rule to obtain target data;
a transmission module, configured to transmit the target data to the corresponding data receiver.
6. The device according to claim 5, characterized in that the component extension module is specifically configured to: create a framework class that inherits the AbstractSink abstract class of the log collection system Flume and implements the Configurable interface of the log collection system Flume; define a common interface that includes multiple business processing methods; and create a Sink extension component that inherits the framework class and implements the common interface.
7. The device according to claim 6, characterized in that the component extension module is specifically configured to configure, according to a data filtering business requirement, a Sink filter component corresponding to the data filtering business requirement in the Sink component of the log collection system Flume;
the data processing module is specifically configured to obtain raw data from the Sink component through the framework class using the Sink filter component, and to process the raw data by the filtering business processing method in the common interface to obtain target data; wherein the filtering business processing method is used for one or more of filtering invalid data, filtering spam data, and filtering blacklist data.
8. The device according to claim 5, characterized in that the component extension module is specifically configured to: according to a first business requirement and a second business requirement that have a data flow relationship, configure a first Sink extension component corresponding to the first business requirement in the Sink component of the log collection system Flume, and configure a second Sink extension component corresponding to the second business requirement in the Sink component of the log collection system Flume, the first Sink extension component being the upstream Sink extension component of the second Sink extension component;
the data processing module is specifically configured to: obtain raw data from the Sink component using the first Sink extension component, and process the raw data according to a preset business processing rule; and obtain intermediate data from the first Sink extension component using the second Sink extension component, and process the intermediate data according to a preset business processing rule to obtain target data; wherein the first Sink extension component is a filter component and the second Sink extension component is a standardization component.
9. A log collection system, wherein the system comprises a client and a log acquisition server;
the log acquisition server includes the data transmission device for a log collection system according to any one of claims 5-8.
10. A non-transitory computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-4.
CN201910677014.XA 2019-07-25 2019-07-25 A kind of Log Collect System and its data transmission method, device Pending CN110502491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910677014.XA CN110502491A (en) 2019-07-25 2019-07-25 A kind of Log Collect System and its data transmission method, device

Publications (1)

Publication Number Publication Date
CN110502491A true CN110502491A (en) 2019-11-26

Family

ID=68587276

Country Status (1)

Country Link
CN (1) CN110502491A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191126