CN113900886A - Abnormal log monitoring method - Google Patents
Abnormal log monitoring method Download PDFInfo
- Publication number
- CN113900886A CN113900886A CN202111039082.7A CN202111039082A CN113900886A CN 113900886 A CN113900886 A CN 113900886A CN 202111039082 A CN202111039082 A CN 202111039082A CN 113900886 A CN113900886 A CN 113900886A
- Authority
- CN
- China
- Prior art keywords
- bit
- adopting
- monitoring method
- elements
- hash
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 48
- 238000012544 monitoring process Methods 0.000 title claims abstract description 32
- 230000002159 abnormal effect Effects 0.000 title claims abstract description 17
- 238000013461 design Methods 0.000 claims abstract description 8
- 238000012163 sequencing technique Methods 0.000 claims abstract description 4
- 230000006870 function Effects 0.000 claims description 24
- 238000012545 processing Methods 0.000 claims description 4
- 238000013507 mapping Methods 0.000 claims description 3
- 230000005856 abnormality Effects 0.000 claims 1
- 230000008569 process Effects 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000007547 defect Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 241000272201 Columbiformes Species 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000010187 selection method Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1734—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention particularly relates to an abnormal log monitoring method. The abnormal log monitoring method comprises the steps of storing all known elements to form a set R, and judging whether an element x exists in the set R by adopting a Bloom Filter as a probability algorithm; separating data into different areas by adopting a double-layer bucket algorithm design idea, and then respectively sequencing elements in each area by adopting a Bit-map method; and searching and storing a set R corresponding to the element x in the document set and the position of the element x in the set R by adopting an inverted index searching method. The abnormal log monitoring method can timely monitor abnormal log information in a log environment with high concurrency and large data volume, effectively collect and classify the log information, assist workers to find problems in the shortest time and ensure normal and stable operation of the system.
Description
Technical Field
The invention relates to the technical field of system monitoring, in particular to an abnormal log monitoring method.
Background
Improving the service quality and ensuring the safe operation of the network in the network management is an important research subject. Network management needs the support of network log data, and a log data acquisition technology becomes an important research content.
The system log is a very critical component, and can record information of hardware, software and system problems in the system, including the system log, the application log and the security log. Originally, the main use-oriented object of logs was software engineers, who examined problems by reading log information, because system log information was critical to determine the root cause of a failure or to narrow down the scope of system attacks. The system log can enable engineers to quickly know all events before a fault or attack occurs, and can be used for checking the reason of the error or searching traces left by an attacker when the attack occurs. Of course, it is also critical to develop a good set of system logging policies for a virtualized environment, as system logs need to be associated with many different external components.
Aiming at the problems of large-batch log monitoring and log data acquisition, at present, several schemes are mainly popular, including fluent, Logstash, Flume, scriber and the like, wherein LogAgent is adopted in the interior of the Alibara, and LogTail is adopted in the Aliskiu cloud. Fluent in these products takes absolute advantage and successfully resides in CNCF (Cloud Native Computing Foundation), and the proposed Unified Logging Layer (Unified Logging Layer) greatly reduces the complexity of the whole log collection and analysis. Fluentd considers that most existing log formats are poorly structured, which benefits from the excellent ability of humans to parse log data, since log data is initially human-oriented, humans being their primary log data consumers. Therefore, the fluent hopes to reduce the complexity of the whole log collection access by unifying the log storage format, supposing that the log data input under the assumption is in M formats, and N kinds of storage are accessed at the rear end of the log collection Agent (Agent), so that each storage system needs to realize the function of analyzing the M kinds of log formats, the total complexity is M × N, and if the log collection Agent unifies the log formats, the total complexity is M + N. This is the core idea of fluent, and its plug-in mechanism is a favorable place. Logstack and fluent are similar to the ELK technology stack and are widely used in the industry.
In consideration of the characteristic of large log data magnitude, an effective system log strategy can effectively help technicians to better organize a log structure and accurately find log information. The system logging strategy can send warning information to the user just when the fault occurs, and help find the problem in the shortest time. Today, a large number of machines process log data day and night for use by offline and online analysis systems to generate readable reports to assist humans in making decisions.
The invention provides an abnormal log monitoring method aiming at monitoring abnormal conditions and acquiring abnormal information data in the running process of an e-government affair system, and aims to accurately find abnormal information generated in the running process of the system when the system generates large service data volume and log information at TB level and is distributed at different nodes.
Disclosure of Invention
In order to make up for the defects of the prior art, the invention provides a simple and efficient abnormal log monitoring method.
The invention is realized by the following technical scheme:
an abnormal log monitoring method is characterized in that: the method for optimizing and improving the network monitoring point deployment algorithm and the log data acquisition method in the log data acquisition specifically comprises the following steps:
firstly, storing all known elements to form a set R, and judging whether an element x exists in the set R by adopting a Bloom Filter (BF) as a probability algorithm;
secondly, separating data into different areas by adopting a double-layer bucket algorithm design idea, and then respectively sequencing elements in each area by adopting a Bit-map method;
and thirdly, searching and storing a set R corresponding to the element x in the document set and the position of the element x in the set R by adopting an inverted index searching method.
In the first step, a Bloom Filter (BF) maps URLs (Uniform Resource locators) of elements in the set R to a certain bit in a binary bit array (bitmap array) by using a Hash table data structure, and if a bit of an element x corresponding to the binary bit array is already set to 1, it indicates that the element x exists in the set R.
Hash has a collision problem, and the values of two URLs obtained by using the same Hash are probably the same. In order to reduce the conflict, in the first step, a plurality of different Hash functions are used to obtain different Hash table data structures to judge whether the element x exists in the set R;
when the Hash table data structure obtained by any Hash function judges that the element x does not exist in the set R, the element x can be determined not to exist in the set R;
when the Hash table data structures obtained by all Hash functions judge that the element x exists in the set R, the element x is determined to exist in the set R.
In the first step, the implementation steps of judging whether the element x exists in the set R by a Bloom Filter (BF) are as follows:
1) a Bloom Filter (BF) uses a binary digit array with m bits to store information, and in an initial state, the binary digit array comprises the bit array with m bits, each bit is 0, namely the elements of the whole binary digit array are all set to be 0;
2) mapping each element in the set R ═ { x1, x2, …, xn } into a range of {1, …, m } using k mutually independent Hash functions (Hash functions), respectively;
3) when judging whether the element x belongs to the set R, k hash values are obtained by using k hash functions for the element x, if the positions of all hashi (x) are all 1(i, k are natural numbers, i is more than or equal to 1 and less than or equal to k), namely k positions are all set to be 1, the element x is considered to be an element in the set R, otherwise, the element x is considered not to be an element in the set R.
In the step 2), when any element y is added to the Bloom Filter (BF), k hash functions are used to obtain k hash values, and then the corresponding bit in the binary bit array is set to 1, that is, the position hashi (y) mapped by the ith hash function is set to 1(i is greater than or equal to 1 and is less than or equal to k).
In the step 2), when one position is set to be 1 for multiple times, only the first setting is valid, and the later settings are all invalid.
In the second step, a large amount of data to be processed is divided for multiple times by adopting a design idea of a double-layer barrel algorithm, the range is determined step by step, and finally data units which can be processed independently are formed; and when the elements need to be sequenced, processing each data unit by adopting a Bit-map method respectively.
In the second step, when n (m, n are natural numbers, m > n) elements in the m elements need to be sorted, the implementation steps of sorting by using a Bit Map algorithm are as follows:
1) opening up the space of m Bit positions by adopting a Bit-map method, and setting the m Bit positions as 0;
2) and traversing n elements needing to be sorted in sequence, and setting the corresponding bit positions of the elements to be 1.
The invention has the beneficial effects that: the abnormal log monitoring method can timely monitor abnormal log information in a log environment with high concurrency and large data volume, effectively collect and classify the log information, assist workers to find problems in the shortest time and ensure normal and stable operation of the system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of a sorting method using a Bit Map algorithm according to the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the embodiment of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Monitoring (Monitoring) and Logging (Logging) are among the most critical infrastructures in large distributed systems. Because of no monitoring, there is no way to know the operation of the service, and there is no Down machine in the cluster, whether the CPU usage and load of the machine are normal, whether the Traffic of the website is normal, and whether the error rate of the service is within a tolerable range. In short, monitoring allows us to know the operation and availability of a website in real time. Therefore, for the detection of the network and the system operation condition, the collection and processing condition of the log are already one of the standards for measuring the stable operation condition of the system.
Bloom Filter (BF) is a space-efficient random data structure that uses bit arrays to represent a set very compactly and to determine whether an element belongs to the set. It is a fast probabilistic algorithm that determines whether a set of elements exists. The Bloom Filter may make a false determination, but does not miss the determination. That is, the Bloom Filter decision element is no longer aggregated, and that is certainly not. If the judgment element exists in the set, the judgment is wrong with a certain probability. Thus, Bloom filters are not suitable for "zero error" applications. And in the application occasion that can tolerate low error rate, the Bloom Filter greatly saves space compared with other common algorithms (such as hash, half-searching). The method has the advantages that the space efficiency and the query time far exceed those of a common algorithm, and the defects of certain misrecognition rate and difficulty in deletion are overcome.
For the problem of network monitoring point deployment, the problem that monitoring points arranged in an original network are not easy to change after the topological structure of a distributed network is expanded is considered, and an increment selection method of the increment network monitoring points is optimized.
The abnormal log monitoring method optimizes and improves a network monitoring point deployment algorithm and a log data acquisition method in log data acquisition, and specifically comprises the following steps:
firstly, storing all known elements to form a set R, and judging whether an element x exists in the set R by adopting a Bloom Filter (BF) as a probability algorithm;
secondly, separating data into different areas by adopting a double-layer bucket algorithm design idea, and then respectively sequencing elements in each area by adopting a Bit-map method;
and thirdly, searching and storing a set R corresponding to the element x in the document set and the position of the element x in the set R by adopting an inverted index searching method.
Calculating whether a certain element x is in a set, firstly, the conceivable method is to store all known elements to form a set R, and then, comparing the element x with the elements in the set R one by one to judge whether the element x exists in the set R; the method can be realized by adopting a data structure such as a linked list. However, as the number of elements in the set R increases, the memory occupied by the elements increases. If tens of millions of different web pages need to be downloaded, the required memory can occupy the memory address space of the whole process. Even if the methods of MD5 and UUID are used to convert URLs into fixed short strings, the memory usage is quite large.
In the first step, a Bloom Filter (BF) maps URLs (Uniform Resource locators) of elements in the set R to a certain bit in a binary bit array (bitmap array) by using a Hash table data structure, and if a bit of an element x corresponding to the binary bit array is already set to 1, it indicates that the element x exists in the set R.
Hash has a collision problem, and the values of two URLs obtained by using the same Hash are probably the same. In order to reduce the conflict, in the first step, a plurality of different Hash functions are used to obtain different Hash table data structures to judge whether the element x exists in the set R;
when the Hash table data structure obtained by any Hash function judges that the element x does not exist in the set R, the element x can be determined not to exist in the set R;
when the Hash table data structures obtained by all Hash functions judge that the element x exists in the set R, the element x is determined to exist in the set R.
In the first step, the implementation steps of judging whether the element x exists in the set R by a Bloom Filter (BF) are as follows:
1) a Bloom Filter (BF) uses a binary digit array with m bits to store information, and in an initial state, the binary digit array comprises the bit array with m bits, each bit is 0, namely the elements of the whole binary digit array are all set to be 0;
2) mapping each element in the set R ═ { x1, x2, …, xn } into a range of {1, …, m } using k mutually independent Hash functions (Hash functions), respectively;
when any element y is added in Bloom Filter (BF), k hash functions are used to obtain k hash values, and then corresponding bits in the binary bit array are set to be 1, namely the position hashi (y) mapped by the ith hash function is set to be 1(i, k are natural numbers, i is more than or equal to 1 and less than or equal to k);
in the step 2), when one position is set to be 1 for multiple times, only the first setting is valid, and the later settings are all invalid.
3) When judging whether the element x belongs to the set R, k hash values are obtained by using k hash functions for the element x, if the positions of all hashi (x) are all 1(i is more than or equal to 1 and less than or equal to k), namely k positions are all set to be 1, the element x is considered to be an element in the set R, otherwise, the element x is not considered to be an element in the set R.
Double-layer buckets are an algorithm design idea. When a pile of large amount of data cannot be processed directly without using a direct addressing table, the data can be divided into small units, and then the small units are processed according to a certain strategy, thereby achieving the purpose. In the second step, a large amount of data to be processed is divided for multiple times by adopting a design idea of a double-layer barrel algorithm, the range is determined step by step, and finally data units which can be processed independently are formed; and when the elements need to be sequenced, processing each data unit by adopting a Bit-map method respectively.
By reducing multiple times, a double layer is only an example, and divide and conquer is the root (only "divide and conquer"). This idea can also be used when it is sometimes necessary to construct a large data with a small range of data, in contrast to the inverse of this.
For example, to find the number of non-repeating integers out of 2.5 million integers, the memory space is not sufficient to accommodate the 2.5 million integers. Just like the pigeon nest principle, when the integer number is 2^32, the 2^32 number can be divided into 2^8 ^ 256 areas (for example, a single file represents an area), then the data is separated into different areas, and then the different areas are processed by using the Bit Map algorithm. The solution can be conveniently realized as long as enough disk space is available.
In the second step, when n (m, n are natural numbers, m > n) elements in the m elements need to be sorted, the implementation steps of sorting by using a Bit Map algorithm are as follows:
1) opening up the space of m Bit positions by adopting a Bit-map method, and setting the m Bit positions as 0;
2) and traversing n elements needing to be sorted in sequence, and setting the corresponding bit positions of the elements to be 1.
For example, 5 elements (4, 7, 2, 5, 3) within 0-7 are to be sorted (assuming there is no repetition of these elements). To represent 8 numbers, only 8 bits (1Bytes) are needed, and first a space of 1Byte is created, and all Bit positions of the space are set to 0.
Then go through these 5 elements, first the first element is 4, then the corresponding position of 4 is 1 (p + (i/8) | (0x01< (i% 8) can be operated in this way), where the operation relates to the case of Big-ending and Little-ending, here the Big-ending is defaulted), because it is from zero, the fifth position is 1 (as shown in fig. 1):
then the second element 7 is processed again, the eighth Bit is set to 1, then the third element is processed again until all elements are processed finally, the corresponding position is 1, and the state of the Bit of the memory at this time is as shown in fig. 1.
The index is a data structure for fast data search, and a hash table, a binary search and a block search can also be regarded as an index, and the index has the value of obtaining the most relevant, the most complete and the deepest data set in a shorter time. Most commonly used indexes are those based on a sequence table, a hash table, or a B + tree. The inverted index (InvertedIndex) is mainly used in the field of information retrieval, and is a most commonly used data index storage structure, and is used to store a document set corresponding to a word in the document set and a position of the word existing in a file, and therefore is also called a reverse index, and corresponding to the reverse index, is a forward index, and the forward index is used to store a word list of each file and record the position thereof, and the following files and corresponding sentence contents are exemplified as follows.
And searching and storing the file corresponding to the element (algorithm, data and mathematics) in the document set and the position of the element x in the file by using an inverted index searching method.
The above-described embodiment is only one specific embodiment of the present invention, and general changes and substitutions by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention.
Claims (8)
1. An abnormal log monitoring method is characterized in that: the method for optimizing and improving the network monitoring point deployment algorithm and the log data acquisition method in the log data acquisition specifically comprises the following steps:
firstly, storing all known elements to form a set R, and judging whether an element x exists in the set R by adopting a Bloom Filter as a probability algorithm;
secondly, separating data into different areas by adopting a double-layer bucket algorithm design idea, and then respectively sequencing elements in each area by adopting a Bit-map method;
and thirdly, searching and storing a set R corresponding to the element x in the document set and the position of the element x in the set R by adopting an inverted index searching method.
2. The anomaly log monitoring method according to claim 1, wherein: in the first step, the Bloom Filter maps the URL of each element in the set R to a bit in the binary digit array by using the Hash table data structure, and if the bit of the binary digit array corresponding to the element x is already set to 1, it indicates that the element x exists in the set R.
3. The anomaly log monitoring method according to claim 2, wherein: in the first step, different Hash table data structures are obtained by utilizing a plurality of different Hash functions to judge whether the element x exists in the set R or not;
when the Hash table data structure obtained by any Hash function judges that the element x does not exist in the set R, the element x can be determined not to exist in the set R;
when the Hash table data structures obtained by all Hash functions judge that the element x exists in the set R, the element x is determined to exist in the set R.
4. The anomaly log monitoring method according to claim 3, wherein: in the first step, the implementation steps of judging whether the element x exists in the set R by the Bloom Filter are as follows:
1) the Bloom Filter uses a binary digit array with m bits to store information, and in an initial state, the binary digit array comprises the bit array with m bits, each bit is 0, namely, the elements of the whole binary digit array are all set to be 0;
2) mapping each element in a set R ═ { x1, x2, …, xn } into a range of {1, …, m } respectively using k mutually independent hash functions;
3) when judging whether the element x belongs to the set R, k hash values are obtained by using k hash functions for the element x, if the positions of all hashi (x) are all 1, namely k positions are set to be 1, the element x is considered to be an element in the set R, otherwise, the element x is considered not to be an element in the set R.
5. The anomaly log monitoring method according to claim 4, wherein: in the step 2), when any element y is added to the Bloom Filter, k hash functions are used to obtain k hash values, and then the corresponding bit in the binary bit array is set to 1, that is, the position hashi (y) mapped by the ith hash function is set to 1.
6. The anomaly log monitoring method according to claim 5, wherein: in the step 2), when one position is set to be 1 for multiple times, only the first setting is valid, and the later settings are all invalid.
7. The anomaly log monitoring method according to claim 1, wherein: in the second step, a large amount of data to be processed is divided for multiple times by adopting a design idea of a double-layer barrel algorithm, the range is determined step by step, and finally data units which can be processed independently are formed; and when the elements need to be sequenced, processing each data unit by adopting a Bit-map method respectively.
8. The abnormality log monitoring method according to claim 1 or 7, characterized in that: in the second step, when n elements in the m elements need to be sorted, the implementation steps of sorting by using a Bit Map algorithm are as follows:
1) opening up the space of m Bit positions by adopting a Bit-map method, and setting the m Bit positions as 0;
2) and traversing n elements needing to be sorted in sequence, and setting the corresponding bit positions of the elements to be 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111039082.7A CN113900886A (en) | 2021-09-06 | 2021-09-06 | Abnormal log monitoring method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111039082.7A CN113900886A (en) | 2021-09-06 | 2021-09-06 | Abnormal log monitoring method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113900886A true CN113900886A (en) | 2022-01-07 |
Family
ID=79188744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111039082.7A Pending CN113900886A (en) | 2021-09-06 | 2021-09-06 | Abnormal log monitoring method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113900886A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520410A (en) * | 2023-11-03 | 2024-02-06 | 华青融天(北京)软件股份有限公司 | Service data processing method, device, electronic equipment and computer readable medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101799783A (en) * | 2009-01-19 | 2010-08-11 | 中国人民大学 | Data storing and processing method, searching method and device thereof |
CN102821164A (en) * | 2012-08-31 | 2012-12-12 | 河海大学 | Efficient parallel-distribution type data processing system |
CN105577455A (en) * | 2016-03-07 | 2016-05-11 | 达而观信息科技(上海)有限公司 | Method and system for performing real-time UV statistic of massive logs |
EP2487610B1 (en) * | 2011-02-10 | 2019-01-16 | Deutsche Telekom AG | A method for generating a randomized data structure for representing sets, based on bloom filters |
-
2021
- 2021-09-06 CN CN202111039082.7A patent/CN113900886A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101799783A (en) * | 2009-01-19 | 2010-08-11 | 中国人民大学 | Data storing and processing method, searching method and device thereof |
EP2487610B1 (en) * | 2011-02-10 | 2019-01-16 | Deutsche Telekom AG | A method for generating a randomized data structure for representing sets, based on bloom filters |
CN102821164A (en) * | 2012-08-31 | 2012-12-12 | 河海大学 | Efficient parallel-distribution type data processing system |
CN105577455A (en) * | 2016-03-07 | 2016-05-11 | 达而观信息科技(上海)有限公司 | Method and system for performing real-time UV statistic of massive logs |
Non-Patent Citations (3)
Title |
---|
佚名: "Bitmap的原理和应用", pages 1, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/67920410> * |
樊重俊等: "大数据分析与应用", vol. 1, 31 January 2016, 立信会计出版社, pages: 247 - 251 * |
苏高: "大数据时代的营销与商业分析", vol. 1, 31 October 2014, 中国铁道出版社, pages: 311 - 312 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520410A (en) * | 2023-11-03 | 2024-02-06 | 华青融天(北京)软件股份有限公司 | Service data processing method, device, electronic equipment and computer readable medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109740037B (en) | Multi-source heterogeneous flow state big data distributed online real-time processing method and system | |
JP6716727B2 (en) | Streaming data distributed processing method and apparatus | |
US20100312749A1 (en) | Scalable lookup service for distributed database | |
CN102648468B (en) | Table search device, table search method, and table search system | |
CN102142032B (en) | Method and system for reading and writing data of distributed file system | |
CN104077423A (en) | Consistent hash based structural data storage, inquiry and migration method | |
CN105069111A (en) | Similarity based data-block-grade data duplication removal method for cloud storage | |
CN112434000A (en) | Small file merging method, device and equipment based on HDFS | |
Moia et al. | Similarity digest search: A survey and comparative analysis of strategies to perform known file filtering using approximate matching | |
CN110598467A (en) | Memory data block integrity checking method | |
CN107180034A (en) | The group system of MySQL database | |
CN107391761A (en) | A kind of data managing method and device based on data de-duplication technology | |
CN113900886A (en) | Abnormal log monitoring method | |
CN107133334B (en) | Data synchronization method based on high-bandwidth storage system | |
Feng et al. | An efficient caching mechanism for network-based url filtering by multi-level counting bloom filters | |
Blustein et al. | Bloom filters. a tutorial, analysis, and survey | |
CN111078975B (en) | Multi-node incremental data acquisition system and acquisition method | |
CN118394269A (en) | Distributed storage design implementation method based on artificial intelligence | |
CN116226139B (en) | Distributed storage and processing method and system suitable for large-scale ocean data | |
CN106709045B (en) | Node selection method and device in distributed file system | |
CN106250440B (en) | Document management method and device | |
CN116318800A (en) | BGP route data monitoring method and device and electronic equipment | |
CN114880297A (en) | Distributed data deduplication method and system based on fingerprints | |
CN112445771A (en) | Data processing method, device and equipment of network flow and storage medium | |
CN117573428B (en) | Disaster recovery backup method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |