CN104424229A - Calculating method and system for multi-dimensional division - Google Patents

Calculating method and system for multi-dimensional division

Info

Publication number
CN104424229A
Authority
CN
China
Prior art keywords
data
view
checked
dimension
combination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310376344.8A
Other languages
Chinese (zh)
Other versions
CN104424229B (en)
Inventor
李�浩
武磊
曾伟纪
蔡馥晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201310376344.8A
Publication of CN104424229A
Application granted
Publication of CN104424229B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a calculating method and system for multi-dimensional division, relates to the technical field of multi-dimensional division, and aims to perform multi-dimensional division calculation on mass data in real time while reducing computational complexity. The method comprises: generating a recursive topology according to preset indicators and dimension information; determining, according to a preset solidification strategy, the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination; receiving mass data reported by data reporting points; preprocessing the mass data to obtain processed data; and performing different accumulation calculations on the processed data in real time according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, so as to obtain an accumulation calculation result. The method and system are applied to multi-dimensional division.

Description

Calculating method and system for multi-dimensional division
Technical field
The present invention relates to the technical field of multi-dimensional division, and in particular to a calculating method and system for multi-dimensional division.
Background
At present, large volumes of data are generally divided by dimension offline on an N+1 basis: all data received during one day are first collected, and the multi-dimensional division of that day's data is carried out on the following day. When performing multi-dimensional division calculation, the prior art usually adopts an OLAP (on-line analytical processing) system built on a relational database. A relational database is a data organization made up of two-dimensional tables and the relationships between them. When multi-dimensional division is calculated with the prior art, multiple two-dimensional tables of the relational database are jointly computed through the relationships between them according to the user's data request, thereby obtaining the multi-dimensional combination result that satisfies the request.
However, the prior art cannot perform multi-dimensional division on large volumes of data in real time: an OLAP system can only support multi-dimensional division of small data volumes, and because the calculation is based on multiple two-dimensional tables of a relational database, the computational complexity is high and the amount of data that can be processed is small.
Summary of the invention
Embodiments of the present invention provide a calculating method and system for multi-dimensional division, which can perform multi-dimensional division calculation on mass data in real time and reduce computational complexity.
In a first aspect, an embodiment of the present invention provides a calculating method for multi-dimensional division, comprising:
generating a recursive topology according to preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions;
determining, according to a preset solidification strategy, the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination;
receiving mass data reported by data reporting points, and preprocessing the mass data to obtain processed data;
performing, according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, different accumulation calculations on the processed data in real time to obtain an accumulation calculation result.
In a second aspect, an embodiment of the present invention provides a calculating system for multi-dimensional division, comprising:
an operation decision module, configured to generate a recursive topology according to preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions;
the operation decision module being further configured to determine, according to a preset solidification strategy, the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination;
a preprocessing module, configured to receive mass data reported by data reporting points and to preprocess the mass data to obtain processed data;
a real-time dimension combination calculation service module, configured to perform, according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, different accumulation calculations on the processed data in real time to obtain an accumulation calculation result.
In the calculating method and system for multi-dimensional division provided by the embodiments of the present invention, a recursive topology is generated according to preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions; the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination are determined according to a preset solidification strategy; mass data reported by data reporting points are received and preprocessed to obtain processed data; and different accumulation calculations are performed on the processed data in real time according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, to obtain an accumulation calculation result.
In the prior art, multi-dimensional division of large volumes of data cannot be performed in real time, an OLAP system can only support multi-dimensional division of small data volumes, and because the calculation is based on multiple two-dimensional tables of a relational database, the computational complexity is high and the amount of data that can be processed is small. By contrast, the present invention performs different accumulation calculations on the processed data in real time to obtain an accumulation calculation result, which makes it possible to perform multi-dimensional division calculation on mass data in real time and reduces computational complexity.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the accompanying drawings used in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a calculating method for multi-dimensional division provided by one embodiment of the present invention;
Fig. 2 is a schematic diagram of a recursive topology provided by one embodiment of the present invention;
Fig. 3 is a flowchart of a calculating method for multi-dimensional division provided by another embodiment of the present invention;
Fig. 4 is a flowchart, provided by another embodiment of the present invention, of performing different accumulation calculations on the processed data to obtain an accumulation calculation result;
Fig. 5 is a schematic diagram, provided by another embodiment of the present invention, of the record-column list converted from the TopView list to be calculated;
Fig. 6 is a schematic diagram of the in-memory cache storage structure of the View storage service provided by another embodiment of the present invention;
Fig. 7 is a flowchart, provided by another embodiment of the present invention, of obtaining the corresponding data according to a user's data query request;
Fig. 8 is a block diagram of a calculating system for multi-dimensional division provided by another embodiment of the present invention;
Fig. 9 is a block diagram of another calculating method for multi-dimensional division provided by another embodiment of the present invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a calculating method for multi-dimensional division. The method may be executed by a server, specifically a multi-dimensional division system. The method comprises:
Step 101, generate a recursive topology according to preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions.
Optionally, the preset indicators and dimension information can be set according to the service currently to be processed, and different services can set different indicators and dimension information. Before this step is executed, a configuration administrator can configure the multi-dimensional division system with the indicators and dimension information set for the service currently to be processed.
When the recursive topology is generated according to the preset indicators and dimension information, the topology can be derived by rolling up along the hierarchical relationships between attributes or by deleting certain attributes. For example, consider generating a recursive topology for the dimension combination of user city, user age and user gender. Optionally, the user city can be rolled up: if the user city is Shijiazhuang and the province to which it belongs is Hebei, the attribute "province of the user city" can be obtained from the attribute "user city" by roll-up. Optionally, the user age can be rolled up: if the user age is 13 and the age band to which it belongs is 10 to 20, the attribute "age band of the user" can be obtained from the attribute "user age" by roll-up. Optionally, certain attributes can be deleted to obtain a new dimension combination: for example, deleting the attribute "user gender" yields the new dimension combination of user city and user age.
Optionally, in the schematic diagram of the recursive topology shown in Fig. 2, ABC is the received mass preprocessed data, where ABC can be a dimension combination (View) composed of three dimensions. The second layer comprises the two dimension combinations View_AB and View_AC obtained from View_ABC, and the third layer comprises View_A and View_B obtained from View_AB and View_C obtained from View_AC. The arrows between dimension combinations represent the recursive paths between attributes.
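The two derivation modes described above (roll-up along an attribute hierarchy and deletion of attributes) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `age_band` and `build_topology` and the attribute names are hypothetical, and the sketch enumerates every attribute-deletion edge, whereas Fig. 2 shows only a selected subset of paths.

```python
from itertools import combinations


def age_band(age):
    """Roll a user age up to its 10-year band, e.g. 13 -> '10-20' (assumed banding)."""
    low = age // 10 * 10
    return f"{low}-{low + 10}"


def build_topology(dimensions):
    """Map each dimension combination to the combinations it can be derived from.

    A child combination is obtained by deleting exactly one attribute from a parent,
    so its parents are the combinations with one more attribute.
    """
    topology = {}
    for size in range(len(dimensions), 0, -1):
        for combo in combinations(dimensions, size):
            child = frozenset(combo)
            topology[child] = [child | {d} for d in dimensions if d not in child]
    return topology


dims = ("user_city", "user_age", "user_gender")
topology = build_topology(dims)
```

A solidification strategy would then prune this full set of edges down to the paths that are actually computed.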
Step 102, determine, according to a preset solidification strategy, the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination.
The preset solidification strategy indicates the data query source of the solidified dimension combinations.
A solidified dimension combination is a preset combination of dimensions, for example the combination of the three dimensions user age, user gender and user city, or the combination of the two dimensions user age and user gender.
The operation path of the optimal solidified dimension combination can be the operation path with the lowest calculation cost for obtaining the required dimension combination.
Optionally, when the operation path that forms the optimal solidified dimension combination is determined according to the preset solidification strategy, the dimension combinations at the end points of the path can be regarded as solidified dimension combinations.
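As a rough illustration of choosing the path with the lowest calculation cost, the sketch below picks, for a requested dimension combination, the cheapest solidified combination that contains all of its dimensions. The cost model (estimated row count of each solidified View) and the name `cheapest_source` are assumptions made for this example; the source does not specify how cost is measured.

```python
def cheapest_source(target, solidified_sizes):
    """Return the solidified view with the fewest rows that covers `target`.

    `solidified_sizes` maps a frozenset of dimensions to an estimated row count
    (a hypothetical cost measure). Returns None when no solidified view covers it.
    """
    candidates = [view for view in solidified_sizes if target <= view]
    if not candidates:
        return None
    return min(candidates, key=lambda view: solidified_sizes[view])


ABC, AB, AC = frozenset("ABC"), frozenset("AB"), frozenset("AC")
sizes = {ABC: 1000, AB: 200, AC: 300}
source_for_a = cheapest_source(frozenset("A"), sizes)  # View_A is derived from View_AB
```

With these example sizes, View_A and View_B are derived from View_AB and View_C from View_AC, which matches the layering described for Fig. 2.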
Step 103, receive mass data reported by data reporting points, and preprocess the mass data to obtain processed data.
Optionally, the configuration administrator registers the data reporting points with the multi-dimensional division system in advance, and the multi-dimensional division system receives the pre-configured data reporting points.
According to the pre-configured data reporting points, the mass data reported by the data reporting points through different data channels are received, where a data channel performs packet merging, failed-transmission retransmission or compression on the mass data it carries. Optionally, different data reporting points can receive data in different formats and transmit the data through different data channels.
Optionally, packet merging combines small packets into large packets, which reduces handshake verification and similar overhead during transmission over the data channel and can therefore reduce the average transmission delay of cross-city transmission. Failed-transmission retransmission means transmitting again when a transmission fails, so as to ensure that the transmission succeeds and to improve transmission reliability.
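The two channel behaviours described above, packet merging and retransmission on failure, can be sketched as follows. The helper names, the size threshold and the retry limit are illustrative assumptions; a real channel would also frame each packet (for example with a length prefix) so that the receiver can split the merged payload again.

```python
def merge_packets(packets, max_bytes):
    """Greedily merge small packets into larger batches of at most `max_bytes`.

    A single packet larger than `max_bytes` is sent as its own batch.
    """
    batches, current, size = [], [], 0
    for packet in packets:
        if current and size + len(packet) > max_bytes:
            batches.append(b"".join(current))
            current, size = [], 0
        current.append(packet)
        size += len(packet)
    if current:
        batches.append(b"".join(current))
    return batches


def send_with_retry(send, payload, max_attempts=3):
    """Call `send(payload)` until it reports success; return the attempt number."""
    for attempt in range(1, max_attempts + 1):
        if send(payload):
            return attempt
    raise RuntimeError("transmission failed after retries")


batches = merge_packets([b"aa", b"bb", b"cc"], max_bytes=4)
```

Fewer, larger batches mean fewer round trips per record, which is where the reduction in average delay would come from.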
Optionally, this step can preprocess the received mass data promptly. The mass data can be the data of different users over a period of time, for example the data of operations performed by different users between 9:00 and 10:00.
Optionally, preprocessing the received mass data comprises:
A, determine the preprocessing rule corresponding to the mass data according to the mass data and a preset preprocessing-rule selection rule.
Optionally, the preset preprocessing-rule selection rule specifies which preprocessing rule is used to process the mass data. Optionally, the mass data can be the data of one user within a preset time period, in which case preprocessing rule 1 can be used; preprocessing rule 1 can be to filter illegal fields out of the mass data. The mass data can also be the data of multiple users within a preset time period, in which case preprocessing rule 2 can be used; preprocessing rule 2 can be to first group the mass data by user, obtaining each user's data within the preset time period, and then filter illegal fields out of each group.
B, clean the mass data according to the preprocessing rule to obtain first processed data.
Optionally, the different mass data are cleaned according to the preprocessing rules described above, that is, the illegal fields in the mass data are filtered out.
C, extract from the first processed data, according to a preset field extraction rule, at least one field or at least one field combination required for multi-dimensional division.
Optionally, the preset field extraction rule can be a rule, set from experience, that specifies the fields to be extracted; that is, each required dimension combination is extracted from the first processed data. For example, if the dimension combination of user gender and user age is required, and the dimension combination of user gender and user address is also required, then according to the preset field extraction rule the two fields user gender and user age are extracted from the first processed data, and the two fields user gender and user address are extracted from the first processed data as well. Further optionally, the extracted field or field combination can be cached briefly.
D, generate a wide table from the extracted field or field combination, and save the field or field combination in the wide table.
A wide table is a table that stores data in columnar form; that is, each field occupies one column of the wide table.
Optionally, one field combination is stored in one wide table. Understandably, one wide table is used no matter how many fields the field combination contains. For example, if the extracted field combination is user gender and user age, the corresponding wide table stores the user gender and user age data; if the extracted field combination is user gender and user address, the corresponding wide table stores the user gender and user address data.
Optionally, duplicate wide tables can be merged.
E, compress the field or field combination saved in the wide table to obtain the processed data.
Optionally, to reduce the space needed to store the wide table and the volume of data written to disk, the field combinations saved in the wide table can be compressed, for example by heavy compression or light compression. Heavy compression compresses the data with a larger compression ratio, and light compression compresses the data with a smaller compression ratio. The same data occupy less space after heavy compression than after light compression, but the calculation performance of heavily compressed data is worse: because the compression ratio is larger, more time is needed for decompression when the compressed data are subsequently processed. Lightly compressed data have better calculation performance. Therefore, preferably, the field or field combination saved in the wide table is lightly compressed to obtain the processed data.
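Steps B to E above can be sketched end to end as follows. This is only an illustration under stated assumptions: the field whitelist `ALLOWED_FIELDS`, the record layout and the helper names are hypothetical, JSON is used purely as a convenient serialization, and zlib compression levels stand in for "light" (level 1) and "heavy" (level 9) compression.

```python
import json
import zlib

ALLOWED_FIELDS = {"user_id", "gender", "age", "address"}  # hypothetical legal fields


def clean(records):
    """Step B: drop illegal fields from every record."""
    return [{k: v for k, v in r.items() if k in ALLOWED_FIELDS} for r in records]


def to_wide_table(records, field_combo):
    """Steps C and D: extract a field combination and store it column by column."""
    return {field: [r.get(field) for r in records] for field in field_combo}


def compress(table, level=1):
    """Step E: serialize and compress a wide table (level 1 = light, 9 = heavy)."""
    return zlib.compress(json.dumps(table).encode("utf-8"), level)


def decompress(blob):
    """Inverse of `compress`, used when the wide table is read back."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


raw = [
    {"user_id": 1, "gender": "M", "age": 13, "debug": "x"},
    {"user_id": 2, "gender": "F", "age": 25, "address": "Beijing"},
]
cleaned = clean(raw)
table = to_wide_table(cleaned, ("gender", "age"))
blob = compress(table, level=1)
```

One wide table is built per field combination, so a second call with `("gender", "address")` would produce the second wide table from the example in step C.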
Further optionally, the processed data corresponding to the wide table can be saved in a real-time wide table service, either fully in memory or in in-memory tables. Optionally, the real-time wide table service is a large-table storage system composed of distributed in-memory tables and a large table (Big Table) based on a distributed file system. Data that require frequent random reads and writes with low latency, such as data used for de-duplication checks, are stored fully in memory; other data are stored in in-memory tables and periodically merged into the Big Table on the distributed file system.
Step 104, perform, according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, different accumulation calculations on the processed data in real time to obtain an accumulation calculation result.
Optionally, the preset calculation rule can be to decide, according to the indicator, whether the processed data are accumulated simply or accumulated with de-duplication. For example, a male user A in Beijing logs in to QQ from a mobile phone at 8:00 a.m., having already logged in to QQ from a computer at 7:00, and the indicator to be calculated is the number of QQ logged-in users. For View [user city - user gender], user A has already been counted, so this login must not be counted again, and de-duplicated accumulation is required. For View [user city - user gender - login mode], however, user A's login mode is different, so the View value is different; the value [Beijing - male - mobile login] under this View needs to be incremented by one, that is, simple accumulation is used.
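The QQ login example above can be reproduced with a small counter. This is a sketch, not the patented implementation: the class name `ViewCounter` and the event fields are hypothetical. A counter that de-duplicates on the pair (View value, user) yields both outcomes of the example, and the same class with de-duplication switched off models an indicator such as the number of logins.

```python
from collections import defaultdict


class ViewCounter:
    """Accumulate one indicator for one dimension combination (View)."""

    def __init__(self, dims, dedup):
        self.dims = dims
        self.dedup = dedup
        self.counts = defaultdict(int)
        self.seen = set()  # (view value, user) pairs already counted

    def add(self, event):
        key = tuple(event[d] for d in self.dims)
        if self.dedup:
            marker = (key, event["user_id"])
            if marker in self.seen:
                return  # this user was already counted under this View value
            self.seen.add(marker)
        self.counts[key] += 1


events = [
    {"user_id": "A", "city": "Beijing", "gender": "male", "mode": "computer"},  # 7:00
    {"user_id": "A", "city": "Beijing", "gender": "male", "mode": "mobile"},    # 8:00
]
users_by_city_gender = ViewCounter(("city", "gender"), dedup=True)
users_by_mode = ViewCounter(("city", "gender", "mode"), dedup=True)
logins_by_city_gender = ViewCounter(("city", "gender"), dedup=False)
for event in events:
    users_by_city_gender.add(event)
    users_by_mode.add(event)
    logins_by_city_gender.add(event)
```

The `seen` set is the kind of data that needs frequent low-latency random access, which is why the text keeps de-duplication data fully in memory.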
Further optionally, the accumulation calculation result is saved in a View storage service, either fully in memory or in in-memory tables. Optionally, the View storage service is a large-table storage system composed of distributed in-memory tables and a large table (Big Table) based on a distributed file system. Data that require frequent random reads and writes with low latency, such as data used for de-duplication checks, are stored fully in memory; other data are stored in in-memory tables and periodically merged into the Big Table on the distributed file system.
In the calculating method for multi-dimensional division provided by this embodiment of the present invention, a recursive topology is generated according to preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions; the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination are determined according to a preset solidification strategy; mass data reported by data reporting points are received and preprocessed to obtain processed data; and different accumulation calculations are performed on the processed data in real time according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, to obtain an accumulation calculation result. This makes it possible to perform multi-dimensional division calculation on mass data in real time and reduces computational complexity.
An embodiment of the present invention provides a calculating method for multi-dimensional division. As shown in Fig. 3, the method comprises:
Step 301, receive the data reporting points, indicators, dimension information and preset multi-dimensional division calculation frequency configured by the configuration administrator.
Optionally, before the calculating method for multi-dimensional division is executed, the configuration administrator can configure the multi-dimensional division system with the information related to the service currently to be processed. For example, the indicators can be the indicators to be calculated for the service currently to be processed, such as the number of logged-in users, the number of logins, the number of users who clicked and the number of clicks. The dimension information can be the dimensions to be calculated for the service currently to be processed, and the recursive topology can be generated according to these dimensions and the hierarchical relationships between them.
Optionally, the multi-dimensional division calculation frequency can be, for example, 5 minutes, 15 minutes or 1 hour. According to this frequency, the multi-dimensional division system can buffer as many raw records as possible and perform merged calculation and merged read-write operations, so that resources are scheduled more effectively.
It should be noted that different indicators, dimension information and multi-dimensional division calculation frequencies can be obtained for different services.
Step 302, generate a recursive topology according to the preset indicators and dimension information, wherein the recursive topology comprises dimension combinations and the recursive paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions.
Optionally, this step is the same as step 101 in Fig. 1 and is not described again here.
Step 303, determine, according to a preset solidification strategy, the solidified dimension combinations and the operation path that forms the optimal solidified dimension combination.
Optionally, this step is the same as step 102 in Fig. 1 and is not described again here.
Step 304, receive mass data reported by data reporting points, and preprocess the mass data to obtain processed data.
Optionally, according to the pre-configured data reporting points, the mass data reported by the data reporting points through different data channels are received, where a data channel performs packet merging, failed-transmission retransmission or compression on the mass data it carries.
Optionally, this step is the same as step 103 in Fig. 1 and is not described again here.
Step 305, save the processed data corresponding to the wide table, either fully in memory or in in-memory tables.
Optionally, the preprocessed data can be persisted in the real-time wide table service. Optionally, the real-time wide table service is a large-table storage system composed of distributed in-memory tables and a large table (Big Table) based on a distributed file system. Data that require frequent random reads and writes with low latency, such as data used for de-duplication checks, are stored fully in memory; other data are stored in in-memory tables and periodically merged into the Big Table on the distributed file system.
Optionally, the wide tables saved in the real-time wide table service can be used by each computing node to perform de-duplication during real-time calculation.
Step 306, according to preset calculation rules, the solidified dimension combinations and the operation path of the optimal solidified dimension combination, each computing node performs different accumulation calculations on the processed data in real time to obtain an accumulation calculation result.
As shown in Fig. 4, this step comprises the following sub-steps 401-410:
Step 401, distribute the processed data to different computing nodes according to consistent hashing.
Optionally, while the processed data are distributed to the different computing nodes, they are exchanged in memory, that is, the processed data are not written to disk, so that they can be processed promptly; the different computing nodes calculate simultaneously, which improves processing speed.
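Distribution by consistent hashing can be sketched as follows. The class name `HashRing`, the node names, the number of virtual nodes and the choice of SHA-256 are assumptions made for this example; the source only states that consistent hashing is used.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring `replicas` times to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the node owning the first ring point after the key's hash."""
        index = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user-42")
```

Routing by a stable key such as a user identifier sends all records for that key to the same computing node, and removing a node only remaps the keys that node owned.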
Step 402, obtain a partial-order structure (Lattice), where the Lattice comprises the solidified dimension combinations and the operation path of the optimal solidified dimension combination.
Optionally, the Lattice is a partial-order structure made up of Views and roll-ups (Rollup), and can also be understood as a topology graph. There are hierarchical relationships between Views, and a fine-grained View can be rolled up to a coarse-grained View, for example through the hierarchical relationship between a user city and the province to which it belongs, or between a user age and the age band to which it belongs.
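Rolling a fine-grained View up to a coarse-grained View can be sketched as re-keying and summing its values. The mapping table, the function name `rollup` and the sample counts are hypothetical, and the approach is only valid for additive indicators such as sums and counts; distinct counts cannot be rolled up by simple addition.

```python
from collections import Counter

CITY_TO_PROVINCE = {"Shijiazhuang": "Hebei", "Baoding": "Hebei"}  # hypothetical hierarchy


def rollup(view, coarsen):
    """Aggregate a fine-grained view into a coarse one by re-keying and summing."""
    coarse = Counter()
    for key, value in view.items():
        coarse[coarsen(key)] += value
    return coarse


city_gender = {
    ("Shijiazhuang", "male"): 3,
    ("Baoding", "male"): 2,
    ("Shijiazhuang", "female"): 4,
}
province_gender = rollup(city_gender, lambda k: (CITY_TO_PROVINCE[k[0]], k[1]))
```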
Step 403, determine the list of top-level dimension combinations (TopView) to be calculated according to the preset calculation rules and the partial-order structure.
Optionally, a TopView is a View that can only be obtained from the original activity information (Action); for example, in Fig. 2, View_AB and View_AC are TopViews. The list of top-level dimension combinations (TopView) to be calculated is the list formed by all TopView dimension combinations, where the TopView dimension combinations can be the solidified dimension combinations involved in the operation path of the optimal solidified dimension combination.
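Given the operation paths of Fig. 2, the TopView list can be read off as the Views whose source is the raw data itself. The dictionary layout, the `RAW` marker and the function name `top_views` are assumptions made for this sketch.

```python
RAW = "ABC"  # the received preprocessed data in Fig. 2

# Each View mapped to the source it is computed from, following Fig. 2.
PATHS = {
    "View_AB": RAW,
    "View_AC": RAW,
    "View_A": "View_AB",
    "View_B": "View_AB",
    "View_C": "View_AC",
}


def top_views(paths, raw=RAW):
    """Return the Views that can only be computed directly from the raw data."""
    return sorted(view for view, source in paths.items() if source == raw)


top_view_list = top_views(PATHS)
```

Only the Views in this list need to touch the raw records; every other View is derived from an already computed one.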
Optionally, the TopView list to be calculated is sent to each computing node, and each computing node performs the specific calculation according to the TopView list it receives, as in the following steps 404 to 410:
Step 404, determine the indicator type, where the indicator type comprises simple accumulation and de-duplicated accumulation.
Optionally, an indicator is a Measure, which provides a quantitative description of an Action and is data that can be accumulated. Accumulation calculations include the sum function (sum), the count function (count), count distinct, the average function (avg) and so on; examples are payment amount, number of payments, number of paying users and average online duration.
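The four accumulation functions named above can be illustrated on a handful of payment records. The function name `aggregate` and the record fields are hypothetical; the example computes payment amount (sum), number of payments (count), number of paying users (count distinct) and average payment (avg).

```python
def aggregate(rows, field, kind):
    """Apply one accumulation function to a single field of the given rows."""
    values = [row[field] for row in rows]
    if kind == "sum":
        return sum(values)
    if kind == "count":
        return len(values)
    if kind == "count_distinct":
        return len(set(values))
    if kind == "avg":
        return sum(values) / len(values) if values else 0.0
    raise ValueError(f"unknown accumulation kind: {kind}")


payments = [
    {"user": "a", "amount": 10},
    {"user": "a", "amount": 5},
    {"user": "b", "amount": 15},
]
total_amount = aggregate(payments, "amount", "sum")
paying_users = aggregate(payments, "user", "count_distinct")
```

Sum and count are simple accumulations, while count distinct needs the set of values already seen, which corresponds to the de-duplicated accumulation of step 404.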
Step 405, when described pointer type be described simple cumulative time, according to described index and the process data that receive, from described TopView list to be calculated, extract related column.
Such as, certain male user A of Beijing in the morning 8:00 mobile phone logs in QQ, simultaneously at 7:00 computer log QQ, when index is login times, when TopView be user city, user's sex and login mode time, value under this TopView needs to add one, then in TopView list to be calculated, extract row corresponding to user city, user's sex and login mode.
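The login example can be sketched as a simple accumulation. This is an illustrative sketch only: the field names, the record layout and the function `accumulate_logins` are assumptions, not part of the source.

```python
from collections import Counter

# Hypothetical TopView: the dimension columns to extract from each record.
TOP_VIEW = ("user_city", "user_gender", "login_mode")

# Hypothetical processed records for user A (field names are assumptions).
records = [
    {"user": "A", "user_city": "Beijing", "user_gender": "male",
     "login_mode": "computer", "time": "07:00"},
    {"user": "A", "user_city": "Beijing", "user_gender": "male",
     "login_mode": "mobile", "time": "08:00"},
]


def accumulate_logins(records, view, result=None):
    """Add one to the login counter of the View key derived from each record."""
    result = Counter() if result is None else result
    for record in records:
        key = tuple(record[column] for column in view)  # extract related columns
        result[key] += 1  # metric: number of logins
    return result


view_result = accumulate_logins(records, TOP_VIEW)
```

Each login adds one to the counter of its own dimension combination value, so the two logins above increment two different keys.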
Step 406, write the related columns and the metric value into the dimension combination (View) result table, and accumulate the data in the View result table with the current day's View data to obtain the accumulation result, where the accumulation result is saved in full-memory mode or in-memory-list mode.
Optionally, the computing node writes the obtained related columns into the View result table. The View result table is kept in the View storage service, which accumulates the newly written data in the View result table with the current day's View data.
Further optionally, the accumulation result is saved in the View storage service in full-memory mode or in-memory-list mode. Optionally, the View storage service is a large list storage system composed of distributed in-memory lists and a big table (Big Table) based on a distributed file system. Data that needs many random reads and writes with low latency, such as data used to check for duplicates, is kept entirely in memory; other data is stored in in-memory lists and periodically merged into the Big Table of the distributed file system.
Optionally, the in-memory cache structure in the View storage service is shown in Figure 6. The Key holds the dimension combination identifier and the dimension combination values, and the second and subsequent columns hold the various metrics and their corresponding values. When the data in the View result table is accumulated with the current day's View data, the columns corresponding to the dimension combination and the metric are located first, and the metric value in the corresponding column is then incremented.
Each accumulation request only needs to operate on the in-memory cache table. At fixed time intervals the cache table is merged into sequential data files, which are in turn merged into the big table of the distributed database, thereby relieving the distributed database of the pressure of a large volume of random writes and reads.
It should be noted that when new processed data arrives, it first updates the in-memory View values of the local computing node, and is written to the View storage service only when the cache time expires, which greatly reduces the update pressure on the Views.
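The buffering behaviour described above (update memory first, write to View storage only after a cache interval) resembles a write-behind cache. The following sketch assumes a plain dictionary as a stand-in for the View storage service; the class `ViewCache`, its parameters and the flush policy are hypothetical.

```python
import time


class ViewCache:
    """Hypothetical node-local cache that batches updates to View storage."""

    def __init__(self, store, flush_interval=5.0, clock=time.monotonic):
        self._store = store                # stands in for the View storage service
        self._pending = {}                 # (view key, metric) -> buffered delta
        self._flush_interval = flush_interval
        self._clock = clock
        self._last_flush = clock()

    def add(self, view_key, metric, delta=1):
        # Every accumulation request touches only the in-memory table.
        slot = (view_key, metric)
        self._pending[slot] = self._pending.get(slot, 0) + delta
        if self._clock() - self._last_flush >= self._flush_interval:
            self.flush()

    def flush(self):
        # Merge the buffered deltas into the backing store in one pass.
        for slot, delta in self._pending.items():
            self._store[slot] = self._store.get(slot, 0) + delta
        self._pending.clear()
        self._last_flush = self._clock()
```

Batching turns many small random writes into one periodic merge, which is the effect the text attributes to the cache table.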
Step 407, when the metric type is deduplicated accumulation, determine the list of non-duplicate primary keys (Key) according to the TopView list to be calculated and the received processed data.
Optionally, the TopView list to be calculated is converted into a flow column list, shown in Figure 5, in which the Key holds the flow identifier and the flow primary key and the remaining columns hold the flow data, that is, the data corresponding to individual attributes. The received processed data is converted into TopView value list 1. Using the Key of the TopView value list, the user's earlier flow information is retrieved and assembled into TopView value list 2. Value list 1 is then compared with value list 2, and the TopViews that appear in value list 1 but not in value list 2 are selected. The list formed by the selected TopViews is the non-duplicate Key list of this step.
Step 408, compare the Key list with the saved wide table corresponding to the processed data; when a first column of the Key list does not exist in the wide table, insert the first column into the wide table, where the first column denotes any column in the Key list.
Optionally, the first column is any column in the Key list; "first" here does not imply an ordering and is used only for convenience of description.
Step 409, determine the new related columns of the TopView according to the new Key list formed by the columns inserted into the wide table.
Step 410, write the new related columns and the metric value into the View result table, and accumulate the data in the View result table with the current day's View data to obtain the accumulation result, where the accumulation result is saved in full-memory mode or in-memory-list mode.
Further optionally, the accumulation result is saved in the View storage service in full-memory mode or in-memory-list mode.
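Steps 407 to 410 can be illustrated with a much-simplified sketch of deduplicated accumulation, for example counting distinct users per dimension combination. It assumes that a Python set stands in for the wide table and that a key is the pair of user and View value; these choices, and all names below, are hypothetical simplifications rather than the structure shown in Figure 5.

```python
from collections import Counter

TOP_VIEW = ("user_city", "user_gender")  # hypothetical TopView


def dedup_accumulate(records, view, wide_table, result):
    """Count each user at most once per View key (e.g. number of paying users)."""
    for record in records:
        view_key = tuple(record[column] for column in view)
        flow_key = (record["user"], view_key)  # key of the user's flow record
        if flow_key in wide_table:
            continue                            # already seen: not counted again
        wide_table.add(flow_key)                # insert the new key
        result[view_key] += 1                   # accumulate only for new keys
    return result


records = [
    {"user": "A", "user_city": "Beijing", "user_gender": "male"},
    {"user": "A", "user_city": "Beijing", "user_gender": "male"},
    {"user": "B", "user_city": "Beijing", "user_gender": "male"},
]
distinct_users = dedup_accumulate(records, TOP_VIEW, set(), Counter())
```

Only keys that are new to the wide table contribute to the View result, so user A is counted once even though two records arrive.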
After the accumulation result is obtained, the subsequent steps are performed.
Step 307, periodically merge the saved wide table corresponding to the processed data and the accumulation result into a distributed database.
Optionally, the wide table in the real-time wide table service and the accumulation result in the View storage service are periodically merged into the distributed database and stored in a big table (Big Table) of a distributed file system, that is, the calculation results are stored in a way that supports column storage. A distributed database is a database whose data is stored on different computers in a computer network; it can hold a large volume of data and provide efficient access. It should be noted that when calculation results are stored in a way that supports column storage, columns without data occupy no storage; only columns that contain data do. Compared with the prior-art way of storing data, this saves a large amount of storage, which can be used for valid data.
Step 308, receive a data query request sent by a user, and obtain the corresponding data according to the request.
The data query request is used to query the data related to a dimension combination, which may include each dimension value and the metric values; the request contains the attribute name of each dimension of the dimension combination.
As shown in Figure 7, this step comprises steps 701 to 710:
Step 701, parse the data query request sent by the user to obtain the metric to be queried, the View to be queried, and the time points to be queried.
Optionally, parsing the user's data query request yields the metric to be queried, for example the number of logged-in users; the View to be queried, for example user city and user gender; and the time points to be queried, for example 9:00, 9:30, 10:00, 10:30, 11:00, 11:30 and 12:00. In this example the user wants to see how the number of logged-in users changes over a period of time.
Step 702, according to the metric to be queried and the View to be queried, check whether the data to be queried corresponding to that metric and that View is held in memory.
Optionally, when a query is made according to the user's data query request, the in-memory cache is checked first. If the data to be queried is held in memory it can be returned directly, which increases query speed and improves the user experience.
Step 703, when the data to be queried corresponding to the metric and the View to be queried is held in memory, read the data to be queried directly.
Step 704, when the data to be queried corresponding to the metric and the View to be queried is not held in memory, determine the type of the View to be queried according to the Lattice, where the type is either TopView or roll-up View (RollupView).
Optionally, the Lattice contains all dimension combinations, so the type of the View to be queried can be looked up in the Lattice.
Step 705, when the type of the View to be queried is TopView, directly obtain the data corresponding to the View to be queried, and save the obtained data in full-memory mode or in-memory-list mode.
Optionally, the obtained data is stored in the View storage service in full-memory mode or in-memory-list mode, to facilitate subsequent user queries.
Step 706, when the type of the View to be queried is RollupView, determine whether the View to be queried is an already-calculated View.
Determining whether the View to be queried is an already-calculated View means checking whether the Lattice contains the View to be queried. Because the Lattice contains the solidified dimension combinations, this amounts to checking whether the solidified dimension combinations include the View to be queried.
Step 707, when the View to be queried is an already-calculated View, directly obtain the data corresponding to the View to be queried.
Step 708, when the View to be queried is not an already-calculated View, query for an already-calculated parent View of the View to be queried from which the View to be queried can be calculated at the lowest cost.
Optionally, the computation path of the optimal solidified dimension combination is used to look for the already-calculated parent View with the lowest cost of calculating the View to be queried. If such a parent View exists, the data corresponding to the View to be queried can be obtained more quickly.
Step 709, determine the calculation tasks according to the recursion path between the View to be queried and the parent View.
Optionally, for example, if the View to be queried is A1B1, the parent View is A2B2, and the recursion path between the View to be queried and the parent View is A1B1->A2B1->A2B2, then the calculation tasks may be A2B1->A2B2 and A1B1->A2B1.
Optionally, different calculation tasks are assigned to different computing nodes.
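The decomposition of a recursion path into calculation tasks, one per hop, can be sketched as follows. The function name and the plain-string View labels are assumptions made for illustration.

```python
def split_into_tasks(path):
    """Turn a recursion path into (from_view, to_view) tasks, one per hop."""
    hops = list(zip(path, path[1:]))
    # Listed starting from the parent end, matching the order used in the text.
    return list(reversed(hops))


tasks = split_into_tasks(["A1B1", "A2B1", "A2B2"])
# tasks == [("A2B1", "A2B2"), ("A1B1", "A2B1")]
```

Each resulting pair can then be handed to a different computing node.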
Step 710, obtain the data corresponding to each task according to the calculation tasks.
According to the calculation tasks, the different computing nodes obtain the corresponding data.
Step 711, perform the roll-up operation on the data corresponding to each task to obtain the data corresponding to the View to be queried, and save the data corresponding to the View to be queried in full-memory mode or in-memory-list mode.
From the data corresponding to each task obtained by the computing nodes, the data corresponding to the parent View is obtained segment by segment and then rolled up to obtain the View to be queried.
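A roll-up from a parent View to a coarser View can be sketched as a grouped sum. The data values, the city-to-province mapping and the function name below are hypothetical, and the sketch assumes an additive metric such as a sum or a count; a deduplicated count cannot in general be rolled up by simple addition.

```python
from collections import defaultdict

# Hypothetical parent View data keyed by (user_city, user_gender).
parent_data = {
    ("Shenzhen", "male"): 120,
    ("Guangzhou", "male"): 80,
    ("Beijing", "female"): 50,
}

# Hypothetical hierarchy: city -> province (the roll-up relation).
CITY_TO_PROVINCE = {
    "Shenzhen": "Guangdong",
    "Guangzhou": "Guangdong",
    "Beijing": "Beijing",
}


def roll_up(parent, city_to_province):
    """Aggregate a (city, gender) View into a coarser (province, gender) View."""
    rolled = defaultdict(int)
    for (city, gender), value in parent.items():
        rolled[(city_to_province[city], gender)] += value
    return dict(rolled)


province_view = roll_up(parent_data, CITY_TO_PROVINCE)
# province_view[("Guangdong", "male")] == 200
```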
Step 712, when the View to be queried is not an already-calculated View and no already-calculated parent View of the View to be queried is found, merge the data query requests sent by at least one user within a preset time period, start in batch the calculation tasks for computing dimension combinations according to the solidified dimension combinations and the computation path of the optimal solidified dimension combination, and execute those tasks in batch to obtain a calculation result that includes the data corresponding to the View to be queried.
Optionally, when the query does not locate a solidified View, on-demand calculation is triggered in real time; that is, the View to be queried is calculated according to the solidified dimension combinations and the computation path of the optimal solidified dimension combination. Preferably, the data query requests sent by at least one user within a preset time period are merged, so that the Views to be queried by the individual users are combined and calculated in batch, which improves system throughput.
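The decision flow of steps 702 to 712 can be summarized in a small sketch. It only selects a strategy and does not perform any calculation. The function `resolve_query`, its parameters and the strategy labels are hypothetical, and the candidate parent Views are assumed to be supplied already sorted by ascending calculation cost.

```python
def resolve_query(view, metric, cache, lattice_types, computed_views, parents_by_cost):
    """Return (strategy, detail) for a query, following steps 702 to 712."""
    if (view, metric) in cache:
        return ("read_cache", cache[(view, metric)])      # steps 702 and 703
    if lattice_types[view] == "TopView":
        return ("read_top_view", view)                    # step 705
    if view in computed_views:
        return ("read_computed_view", view)               # step 707
    # Candidate parents are assumed to be pre-sorted by ascending cost.
    for parent in parents_by_cost.get(view, []):
        if parent in computed_views:
            return ("roll_up_from_parent", parent)        # steps 708 to 711
    return ("batch_compute", view)                        # step 712


cache = {("V_cached", "logins"): 42}
types = {
    "V_cached": "RollupView", "V_top": "TopView", "V_done": "RollupView",
    "V_child": "RollupView", "V_orphan": "RollupView",
}
computed = {"V_done", "V_parent"}
parents = {"V_child": ["V_far", "V_parent"]}

strategy = resolve_query("V_child", "logins", cache, types, computed, parents)
# strategy == ("roll_up_from_parent", "V_parent")
```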
An embodiment of the present invention provides a computing method for multi-dimensional division. Different accumulation calculations are performed on the processed data in real time according to a preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination, to obtain an accumulation result. This makes it possible to perform real-time multi-dimensional division calculations on mass data while reducing computational complexity. In business scenarios such as a new version release or the promotion of a new campaign, multi-dimensional division calculations can be performed quickly to find the key influencing factors and support decisions. The calculation results can also be used to group users in real time and provide decision data for a real-time advertisement recommendation system.
An embodiment of the present invention provides a computing system for multi-dimensional division. As shown in Figure 8, the system comprises: a computation decision module 801, a preprocessing module 802, and a dimension combination real-time calculation service module 803.
The computation decision module 801 is configured to generate a recursion topology according to preset metrics and dimension information, where the recursion topology comprises the dimension combinations and the recursion paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions;
the computation decision module 801 is further configured to determine, according to a preset solidification strategy, the solidified dimension combinations and the computation path forming the optimal solidified dimension combination;
the preprocessing module 802 is configured to receive mass data reported by data reporting points and to preprocess the mass data to obtain processed data;
the dimension combination real-time calculation service module 803 is configured to perform different accumulation calculations on the processed data in real time according to a preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination, to obtain an accumulation result.
Further optionally, when generating the recursion topology according to the preset metrics and dimension information, the computation decision module 801 is configured to:
determine and pre-configure the metrics to be calculated and the dimension information according to the business currently to be processed, where the dimension information comprises the attributes of each dimension and the hierarchical relationships between dimensions;
generate the recursion topology according to the metrics and the dimension information.
Further optionally, the preprocessing module 802 is configured to:
receive, according to the pre-configured data reporting points, the mass data reported by the data reporting points through different data channels, where a data channel is used to perform packet merging, retransmission on transmission failure, or compression on the received mass data.
Further optionally, as shown in Figure 9, the system further comprises a real-time wide table service module 804;
the real-time wide table service module 804 is configured to save the wide table corresponding to the processed data in full-memory mode or in-memory-list mode.
Further optionally, as shown in Figure 9, the dimension combination real-time calculation service module 803 comprises: a distribution unit 8031, an obtaining unit 8032, a determining unit 8033, and a computing node 8034;
the distribution unit 8031 is configured to distribute the processed data to different computing nodes according to consistent hashing (Hash);
the obtaining unit 8032 is configured to obtain a partially ordered structure (Lattice), where the Lattice comprises the solidified dimension combinations and the computation path of the optimal solidified dimension combination;
the determining unit 8033 is configured to determine the list of top-level dimension combinations (TopView) to be calculated according to the preset computation rule and the partially ordered structure;
the computing node 8034 is configured to perform different accumulation calculations on the processed data in real time according to the type of the metric to be calculated, to obtain an accumulation result, where the metric type is either simple accumulation or deduplicated accumulation.
Further optionally, as shown in Figure 9, the computing node 8034 comprises:
an extraction subunit 80341, configured to, when the metric type is simple accumulation, extract the related columns from the TopView list to be calculated according to the metric and the received processed data;
a calculation subunit 80342, configured to write the related columns and the metric value into the dimension combination (View) result table;
the system further comprises a View storage service module 805, configured to accumulate the data in the View result table with the current day's View data to obtain an accumulation result, where the accumulation result is saved in full-memory mode or in-memory-list mode.
Further optionally, as shown in Figure 9, the computing node 8034 comprises:
a determining subunit 80343, configured to, when the metric type is deduplicated accumulation, determine the list of non-duplicate primary keys (Key) according to the TopView list to be calculated and the received processed data;
an insertion subunit 80344, configured to compare the Key list with the saved wide table corresponding to the processed data and, when a first column of the Key list does not exist in the wide table, insert the first column into the wide table, where the first column denotes any column in the Key list;
the determining subunit 80343 is further configured to determine the new related columns of the TopView according to the new Key list formed by the columns inserted into the wide table;
a calculation subunit 80345, configured to write the new related columns and the metric value into the View result table;
the system further comprises the View storage service module 805, configured to accumulate the data in the View result table with the current day's View data to obtain an accumulation result, where the accumulation result is saved in full-memory mode or in-memory-list mode.
Further optionally, as shown in Figure 9, the system further comprises a distributed database 806;
the distributed database 806 is configured to periodically receive and merge the wide table corresponding to the processed data and the accumulation result.
Further optionally, as shown in Figure 9, the system further comprises a query service cluster module 807, and the query service cluster module 807 comprises:
a parsing unit 8071, configured to parse the data query request sent by a user to obtain the metric to be queried, the View to be queried, and the time points to be queried;
a query unit 8072, configured to check, according to the metric to be queried and the View to be queried, whether the data to be queried corresponding to that metric and that View is held in memory;
a reading unit 8073, configured to read the data to be queried directly when the data to be queried corresponding to the metric and the View to be queried is held in memory;
a judging unit 8074, configured to determine the type of the View to be queried according to the Lattice when that data is not held in memory, where the type is either TopView or roll-up View (RollupView);
a processing unit 8075, configured to obtain the data corresponding to the View to be queried in different ways according to the type of the View to be queried;
the system further comprises the View storage service module 805, configured to save the data.
Further optionally, the processing unit 8075 is configured to:
directly obtain the data corresponding to the View to be queried when the type of the View to be queried is TopView;
the View storage service module 805 is configured to save the obtained data in full-memory mode or in-memory-list mode.
Further optionally, the processing unit 8075 comprises:
a judging unit, configured to determine, when the type of the View to be queried is RollupView, whether the View to be queried is an already-calculated View;
an obtaining unit, configured to directly obtain the data corresponding to the View to be queried when the View to be queried is an already-calculated View;
the obtaining unit is further configured to, when the View to be queried is not an already-calculated View, query for an already-calculated parent View of the View to be queried from which the View to be queried can be calculated at the lowest cost;
a determining unit, configured to determine the calculation tasks according to the recursion path between the View to be queried and the parent View;
the obtaining unit is further configured to obtain the data corresponding to each task according to the calculation tasks;
a calculation unit, configured to perform the roll-up operation on the data corresponding to each task to obtain the data corresponding to the View to be queried;
the View storage service module is configured to save the data corresponding to the View to be queried in full-memory mode or in-memory-list mode.
Further optionally, the processing unit 8075 is further configured to, when the View to be queried is not an already-calculated View and no already-calculated parent View of the View to be queried is found, merge the data query requests sent by at least one user within a preset time period, start in batch the calculation tasks for computing dimension combinations according to the solidified dimension combinations and the computation path of the optimal solidified dimension combination, and execute those tasks in batch to obtain a calculation result that includes the data corresponding to the View to be queried.
It should be noted that, for the apparatus shown in Figure 8 or Figure 9, the specific implementation of each module and the information exchanged between modules are based on the same inventive concept as the method embodiments of the present invention; see the method embodiments for details, which are not repeated here.
An embodiment of the present invention provides a computing system for multi-dimensional division. The computation decision module generates a recursion topology according to preset metrics and dimension information and, according to a preset solidification strategy, determines the solidified dimension combinations and the computation path forming the optimal solidified dimension combination. The preprocessing module receives the mass data reported by data reporting points and preprocesses it to obtain processed data. The dimension combination real-time calculation service module performs different accumulation calculations on the processed data in real time according to a preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination, to obtain an accumulation result. This makes it possible to perform real-time multi-dimensional division calculations on mass data while reducing computational complexity.
It should be noted that the apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
From the description of the above embodiments, a person skilled in the art will clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, and of course also by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components and the like, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
The embodiments in this specification are described progressively; for parts that are the same or similar between embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, see the description of the method embodiments.
The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (24)

1. A computing method for multi-dimensional division, characterized by comprising:
generating a recursion topology according to preset metrics and dimension information, where the recursion topology comprises the dimension combinations and the recursion paths between the dimension combinations, and each dimension combination comprises the attribute name of each of its dimensions;
determining, according to a preset solidification strategy, the solidified dimension combinations and the computation path forming the optimal solidified dimension combination;
receiving mass data reported by data reporting points, and preprocessing the mass data to obtain processed data;
performing different accumulation calculations on the processed data in real time according to a preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination, to obtain an accumulation result.
2. The method according to claim 1, characterized in that generating the recursion topology according to the preset metrics and dimension information comprises:
determining and pre-configuring the metrics to be calculated and the dimension information according to the business currently to be processed, where the dimension information comprises the attributes of each dimension and the hierarchical relationships between dimensions;
generating the recursion topology according to the metrics and the dimension information.
3. The method according to claim 1, characterized in that receiving the mass data reported by the data reporting points comprises:
receiving, according to the pre-configured data reporting points, the mass data reported by the data reporting points through different data channels, where a data channel is used to perform packet merging, retransmission on transmission failure, or compression on the received mass data.
4. The method according to claim 3, characterized in that, after preprocessing the mass data to obtain the processed data, the method further comprises:
saving the wide table corresponding to the processed data in full-memory mode or in-memory-list mode.
5. method according to claim 2, it is characterized in that, combine according to default computation rule and described solidification dimension and the described optimum arithmetic path solidifying dimension combination, in real time different accumulation calculating is performed to described process data, obtain accumulation calculating result, comprising:
According to consistance Hash (Hash) by computing nodes extremely different for described process Data dissemination;
Obtain half sequence structure (Lattice), described Lattice comprises the arithmetic path of the combination of described solidification dimension and the combination of described optimum solidification dimension;
According to described default computation rule and described half sequence structure, determine top layer dimension combination (TopView) list to be calculated;
Each computing node, according to pointer type to be calculated, performs different accumulation calculating to described process data in real time, obtains accumulation calculating result, and described pointer type comprises simple cumulative sum duplicate removal and adds up.
6. method according to claim 5, is characterized in that, each computing node described, according to pointer type to be calculated, performs different accumulation calculating to described process data in real time, obtains accumulation calculating result, comprising:
When described pointer type be described simple cumulative time, according to described index and the process data that receive, from described TopView list to be calculated, extract related column;
By described related column and described finger target value write dimension combination (View) result table, by the data in described View result table and the same day View data accumulation, obtain accumulation calculating result, described accumulation calculating result adopts the mode of full internal memory or internal memory list to preserve.
7. The method according to claim 5, characterized in that each computing node performing, according to the type of index to be calculated, a different accumulation calculation on the processed data in real time to obtain an accumulation result comprises:
When the index type is deduplicated accumulation, determining a list of non-duplicate primary keys (Key) according to the TopView list to be calculated and the received processed data;
Comparing the Key list with the saved wide table corresponding to the processed data, and, when a first row of the Key list is not present in the wide table, inserting the first row into the wide table, the first row being any row of the Key list;
Determining the new related columns of the TopView according to the new Key list formed by the rows inserted into the wide table;
Writing the new related columns and the value of the index into the View result table, and accumulating the data in the View result table with the current day's View data to obtain the accumulation result, the accumulation result being stored fully in memory or as an in-memory list.
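The deduplicated accumulation of claim 7 can be sketched in the same way. The sketch below is an assumption-laden illustration: the `uid` key, the dimension names and the function `accumulate_distinct` are hypothetical, a dictionary stands in for the wide table, and each key is attributed to the dimension values of its first occurrence only.

```python
from collections import defaultdict

# Hypothetical TopView list: each TopView is a tuple of dimension names.
TOP_VIEWS = [("region",), ("region", "platform")]


def accumulate_distinct(wide_table, view_results, records, key_field):
    """Count each key once: only keys not yet in the wide table contribute."""
    # Step 1: build the list of distinct keys in this batch (first occurrence wins).
    batch = {}
    for rec in records:
        batch.setdefault(rec[key_field], rec)
    # Step 2: compare against the saved wide table and insert only unseen keys.
    new_rows = []
    for key, rec in batch.items():
        if key not in wide_table:
            wide_table[key] = rec
            new_rows.append(rec)
    # Step 3: only the newly inserted rows add to each TopView's distinct count.
    for rec in new_rows:
        for view in TOP_VIEWS:
            view_results[view][tuple(rec[d] for d in view)] += 1
    return new_rows


wide_table = {}
view_results = defaultdict(lambda: defaultdict(int))
batch1 = [
    {"uid": "u1", "region": "south", "platform": "ios"},
    {"uid": "u1", "region": "south", "platform": "ios"},
    {"uid": "u2", "region": "south", "platform": "android"},
]
batch2 = [
    {"uid": "u1", "region": "south", "platform": "ios"},
    {"uid": "u3", "region": "north", "platform": "ios"},
]
new1 = accumulate_distinct(wide_table, view_results, batch1, "uid")
new2 = accumulate_distinct(wide_table, view_results, batch2, "uid")
```

The repeated `u1` rows are filtered twice: once inside the first batch and once against the wide table when the second batch arrives, so the distinct count never double counts a key.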
8. The method according to claim 6 or 7, characterized in that, after obtaining the accumulation result, the method further comprises:
Periodically merging the saved wide table corresponding to the processed data and the accumulation result into a distributed database.
9. The method according to claim 1, characterized in that, after performing different accumulation calculations on the processed data in real time according to the preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination to obtain the accumulation result, the method further comprises:
Parsing a data query request sent by a user to obtain the index to be queried, the View to be queried and the time point to be queried;
Checking, according to the index to be queried and the View to be queried, whether the data to be queried that corresponds to the index to be queried and the View to be queried is stored in memory;
When the data to be queried that corresponds to the index to be queried and the View to be queried is stored in memory, reading the data to be queried directly;
When the data to be queried that corresponds to the index to be queried and the View to be queried is not stored in memory, determining the type of the View to be queried according to the Lattice, the type of the View to be queried comprising TopView and roll-up View (RollupView);
Obtaining the data corresponding to the View to be queried in a manner that depends on the type of the View to be queried, and saving the data.
10. The method according to claim 9, characterized in that obtaining the data corresponding to the View to be queried in a manner that depends on the type of the View to be queried, and saving the data, comprises:
When the type of the View to be queried is TopView, directly obtaining the data corresponding to the View to be queried, and saving the obtained data fully in memory or as an in-memory list.
11. The method according to claim 9, characterized in that obtaining the data corresponding to the View to be queried in a manner that depends on the type of the View to be queried, and saving the data, comprises:
When the type of the View to be queried is RollupView, judging whether the View to be queried is an already computed View;
When the View to be queried is an already computed View, directly obtaining the data corresponding to the View to be queried;
When the View to be queried is not an already computed View, querying for an already computed parent View from which the View to be queried can be calculated at the least cost; determining calculation tasks according to the recursion path between the View to be queried and the parent View; obtaining the data corresponding to each task according to the calculation tasks; performing a roll-up operation on the data corresponding to each task to obtain the data corresponding to the View to be queried, and saving the data corresponding to the View to be queried fully in memory or as an in-memory list.
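The query path of claims 9 to 11 (read from memory if present, otherwise roll up from the cheapest already computed parent View) can be sketched as below. This is an illustrative sketch under stated assumptions: the functions `roll_up` and `query_view` are hypothetical, cost is approximated by the parent's row count, the roll-up is done in a single step rather than along a multi-step recursion path, and the approach is only valid for additive indexes, since distinct counts cannot be rolled up by summation.

```python
from collections import defaultdict


def roll_up(parent_rows, parent_dims, child_dims):
    """Aggregate a parent View's rows down to a child View with fewer dimensions."""
    idx = [parent_dims.index(d) for d in child_dims]
    out = defaultdict(int)
    for key, value in parent_rows.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def query_view(cache, view):
    """Return data for `view`, rolling up from the cheapest cached parent if needed."""
    if view in cache:  # already computed: read directly from memory
        return cache[view]
    # Candidate parents: cached Views whose dimensions are a superset of the query.
    parents = [p for p in cache if set(view) <= set(p)]
    if not parents:
        raise LookupError("no computed parent View; a batch computation is needed")
    # Cost assumption: fewer rows in the parent means a cheaper roll-up.
    best = min(parents, key=lambda p: len(cache[p]))
    cache[view] = roll_up(cache[best], best, view)  # save for later queries
    return cache[view]


cache = {
    ("region", "platform"): {
        ("south", "ios"): 5,
        ("south", "android"): 2,
        ("north", "ios"): 1,
    }
}
by_region = query_view(cache, ("region",))
grand_total = query_view(cache, ())
```

The second query picks the freshly cached ("region",) View as its parent because it has fewer rows than the ("region", "platform") View.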
12. The method according to claim 11, characterized in that:
When the View to be queried is not an already computed View and no already computed parent View of the View to be queried is found, merging the data query requests sent by at least one user within a preset time period, starting in batch the calculation tasks for dimension combination calculation according to the solidified dimension combinations and the computation path of the optimal solidified dimension combination, and executing the calculation tasks for dimension combination calculation in batch to obtain a calculation result, the calculation result comprising the data corresponding to the View to be queried.
13. A computing system for multi-dimensional division, characterized by comprising:
A computing decision module, configured to generate a recursion topology according to the preset index and dimension information, the recursion topology comprising the dimension combinations and the recursion paths between the dimension combinations, each dimension combination comprising the attribute names of its dimensions;
The computing decision module being further configured to determine, according to a preset solidification strategy, the solidified dimension combinations and the computation path that forms the optimal solidified dimension combination;
A preprocessing module, configured to receive the mass data reported by data reporting points and to preprocess the mass data to obtain processed data;
A real-time dimension combination calculation service module, configured to perform different accumulation calculations on the processed data in real time according to the preset computation rule, the solidified dimension combinations and the computation path of the optimal solidified dimension combination, to obtain an accumulation result.
14. The system according to claim 13, characterized in that the computing decision module is configured to:
Determine and preconfigure, according to the service currently to be processed, the index to be calculated and the dimension information, the dimension information comprising the attributes of each dimension and the hierarchical relationships between dimensions;
Generate the recursion topology according to the index and the dimension information.
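One way to picture the recursion topology is as a lattice in which every dimension combination is a node and every edge drops one dimension. The sketch below is a simplified illustration, not the claimed generation procedure: the function `build_lattice` and the dimension names are hypothetical, and the hierarchical relationships between dimensions mentioned in claim 14 are ignored.

```python
from itertools import combinations


def build_lattice(dimensions):
    """Return (views, edges): every dimension combination and parent-to-child edges."""
    # Enumerate combinations from the full set of dimensions down to the empty one.
    views = [
        combo
        for size in range(len(dimensions), -1, -1)
        for combo in combinations(dimensions, size)
    ]
    edges = []
    for parent in views:
        for dropped in parent:
            # Dropping one dimension gives a View reachable by a single roll-up.
            child = tuple(d for d in parent if d != dropped)
            edges.append((parent, child))
    return views, edges


views, edges = build_lattice(("region", "platform", "version"))
```

With three dimensions the lattice has 8 Views and 12 single-step roll-up edges; a solidification strategy would then choose which of these Views to keep computed.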
15. The system according to claim 13, characterized in that the preprocessing module is configured to:
Receive, according to the preconfigured data reporting points, the mass data reported by the data reporting points through different data channels, the data channels being used to perform packet merging, retransmission on transmission failure, or compression on the received mass data.
16. The system according to claim 15, characterized in that the system further comprises:
A real-time wide table service module, configured to save the wide table corresponding to the processed data fully in memory or as an in-memory list.
17. The system according to claim 14, characterized in that the real-time dimension combination calculation service module comprises:
A distribution unit, configured to distribute the processed data to different computing nodes according to consistent hashing (Hash);
An acquiring unit, configured to obtain the partial-order structure (Lattice), the Lattice comprising the solidified dimension combinations and the computation path of the optimal solidified dimension combination;
A determining unit, configured to determine, according to the preset computation rule and the partial-order structure, the list of top-level dimension combinations (TopView) to be calculated;
A computing node, configured to perform, according to the type of index to be calculated, a different accumulation calculation on the processed data in real time to obtain an accumulation result, the index types comprising simple accumulation and deduplicated accumulation.
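The consistent-hash distribution performed by the distribution unit can be illustrated with a small hash ring. The patent text does not specify the hash function or ring layout, so everything below is an assumption: the class `HashRing`, the node names, the use of SHA-256 and the 100 virtual points per node are illustrative choices only.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (an illustrative sketch)."""

    def __init__(self, nodes, replicas=100):
        # Place `replicas` virtual points per node on the ring, sorted by hash.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # First virtual point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("uid:10001")
```

Because the same key always maps to the same computing node, that node can hold the key's wide table rows in memory, and adding a node only moves the keys that the new node takes over.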
18. The system according to claim 17, characterized in that the computing node comprises:
An extraction subunit, configured to, when the index type is simple accumulation, extract the related columns from the TopView list to be calculated according to the index and the received processed data;
A computation subunit, configured to write the related columns and the value of the index into a dimension combination (View) result table;
The system further comprises: a View storage service module, configured to accumulate the data in the View result table with the current day's View data to obtain the accumulation result, the accumulation result being stored fully in memory or as an in-memory list.
19. The system according to claim 17, characterized in that the computing node comprises:
A determining subunit, configured to, when the index type is deduplicated accumulation, determine a list of non-duplicate primary keys (Key) according to the TopView list to be calculated and the received processed data;
An insertion subunit, configured to compare the Key list with the saved wide table corresponding to the processed data and, when a first row of the Key list is not present in the wide table, insert the first row into the wide table, the first row being any row of the Key list;
The determining subunit being further configured to determine the new related columns of the TopView according to the new Key list formed by the rows inserted into the wide table;
A computation subunit, configured to write the new related columns and the value of the index into the View result table;
The system further comprises: a View storage service module, configured to accumulate the data in the View result table with the current day's View data to obtain the accumulation result, the accumulation result being stored fully in memory or as an in-memory list.
20. The system according to claim 18 or 19, characterized in that the system further comprises:
A distributed database, configured to periodically receive and merge the wide table corresponding to the processed data and the accumulation result.
21. The system according to claim 13, characterized in that the system further comprises a query service cluster module, the query service cluster module comprising:
A parsing module, configured to parse a data query request sent by a user to obtain the index to be queried, the View to be queried and the time point to be queried;
A query module, configured to check, according to the index to be queried and the View to be queried, whether the data to be queried that corresponds to the index to be queried and the View to be queried is stored in memory;
A reading module, configured to read the data to be queried directly when the data to be queried that corresponds to the index to be queried and the View to be queried is stored in memory;
A judging module, configured to determine the type of the View to be queried according to the Lattice when the data to be queried that corresponds to the index to be queried and the View to be queried is not stored in memory, the type of the View to be queried comprising TopView and roll-up View (RollupView);
A processing module, configured to obtain the data corresponding to the View to be queried in a manner that depends on the type of the View to be queried;
The system further comprises: a View storage service module, configured to save the data.
22. The system according to claim 21, characterized in that the processing module is configured to:
When the type of the View to be queried is TopView, directly obtain the data corresponding to the View to be queried;
The View storage service module is configured to save the obtained data fully in memory or as an in-memory list.
23. The system according to claim 21, characterized in that the processing module comprises:
A judging unit, configured to judge, when the type of the View to be queried is RollupView, whether the View to be queried is an already computed View;
An acquiring unit, configured to directly obtain the data corresponding to the View to be queried when the View to be queried is an already computed View;
The acquiring unit being further configured to, when the View to be queried is not an already computed View, query for an already computed parent View from which the View to be queried can be calculated at the least cost; a determining unit, configured to determine calculation tasks according to the recursion path between the View to be queried and the parent View; the acquiring unit being further configured to obtain the data corresponding to each task according to the calculation tasks; a computing unit, configured to perform a roll-up operation on the data corresponding to each task to obtain the data corresponding to the View to be queried;
The View storage service module is configured to save the data corresponding to the View to be queried fully in memory or as an in-memory list.
24. The system according to claim 23, characterized in that:
The processing module is further configured to, when the View to be queried is not an already computed View and no already computed parent View of the View to be queried is found, merge the data query requests sent by at least one user within a preset time period, start in batch the calculation tasks for dimension combination calculation according to the solidified dimension combinations and the computation path of the optimal solidified dimension combination, and execute the calculation tasks for dimension combination calculation in batch to obtain a calculation result, the calculation result comprising the data corresponding to the View to be queried.
CN201310376344.8A 2013-08-26 2013-08-26 Calculating method and system for multi-dimensional division Active CN104424229B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310376344.8A CN104424229B (en) 2013-08-26 2013-08-26 Calculating method and system for multi-dimensional division

Publications (2)

Publication Number Publication Date
CN104424229A (en) 2015-03-18
CN104424229B (en) 2019-02-22

Family

ID=52973224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310376344.8A Active CN104424229B (en) 2013-08-26 2013-08-26 Calculating method and system for multi-dimensional division

Country Status (1)

Country Link
CN (1) CN104424229B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6424967B1 (en) * 1998-11-17 2002-07-23 At&T Corp. Method and apparatus for querying a cube forest data structure
CN1564160A (en) * 2004-04-22 2005-01-12 重庆市弘越科技有限公司 Method of seting up and inquirying multiple-demensional data cube
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
EP2290594A1 (en) * 2009-08-31 2011-03-02 Accenture Global Services GmbH Adaptative analytics multidimensional processing system
CN102467559A (en) * 2010-11-19 2012-05-23 金蝶软件(中国)有限公司 Multilevel and multidimensional data attribute analysis method and device
CN102982103A (en) * 2012-11-06 2013-03-20 东南大学 On-line analytical processing (OLAP) massive multidimensional data dimension storage method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
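Zhu Jie: "Research and Application of Data Cube Storage Based on Dimension-Hierarchy Clustering Hierarchical Block Trees in OLAP", China Master's Theses Full-text Database, Information Science and Technology *
Zhao Bin: "Research and Implementation of a GraphOLAP-Based Literature Analysis and Visualization System", China Master's Theses Full-text Database, Information Science and Technology *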

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794162A (en) * 2015-03-25 2015-07-22 中国人民大学 Real-time data storage and query method
CN104794162B (en) * 2015-03-25 2018-02-23 中国人民大学 Real-time data memory and querying method
CN106933902A (en) * 2015-12-31 2017-07-07 北京国双科技有限公司 Data multidimensional free analysis query method and device
CN106933902B (en) * 2015-12-31 2020-02-07 北京国双科技有限公司 Data multidimensional free analysis query method and device
CN107122369A (en) * 2016-02-25 2017-09-01 阿里巴巴集团控股有限公司 Business data processing method, device and system
CN107153651A (en) * 2016-03-03 2017-09-12 阿里巴巴集团控股有限公司 Multi-dimensional cross data processing method and device
CN106250226A (en) * 2016-08-02 2016-12-21 福建华渔未来教育科技有限公司 Task scheduling mechanism and system based on a consistent hashing algorithm
CN106570064A (en) * 2016-10-10 2017-04-19 上海瀚之友信息技术服务有限公司 Real time calculating system and method for general structural data
CN107025542A (en) * 2016-10-27 2017-08-08 阿里巴巴集团控股有限公司 The method and apparatus that the integration capability of mix is provided
CN106649687A (en) * 2016-12-16 2017-05-10 飞狐信息技术(天津)有限公司 Method and device for on-line analysis and processing of large data
CN106649687B (en) * 2016-12-16 2023-11-21 飞狐信息技术(天津)有限公司 Big data online analysis processing method and device
CN107229730A (en) * 2017-06-08 2017-10-03 北京奇虎科技有限公司 Data query method and device
CN108304472A (en) * 2017-12-28 2018-07-20 中国银联股份有限公司 A kind of data compression storage method and compression storing data device
CN108334554A (en) * 2017-12-29 2018-07-27 上海跬智信息技术有限公司 A kind of novel OLAP precomputations model and construction method
CN108334554B (en) * 2017-12-29 2021-10-01 上海跬智信息技术有限公司 Novel OLAP pre-calculation model and construction method
CN108460094A (en) * 2018-01-30 2018-08-28 上海天旦网络科技发展有限公司 The method and system of storage statistical data
WO2019165671A1 (en) * 2018-02-27 2019-09-06 平安科技(深圳)有限公司 Method for rapidly importing big data, apparatus, terminal device, and storage medium
CN109597842A (en) * 2018-12-14 2019-04-09 深圳前海微众银行股份有限公司 Data real-time computing technique, device, equipment and computer readable storage medium
CN109710610A (en) * 2018-12-17 2019-05-03 北京三快在线科技有限公司 Data processing method, device and calculating equipment
CN109902132B (en) * 2019-02-26 2023-03-03 维正知识产权科技有限公司 Relation model establishing method and system for intellectual property multi-dimensional data
CN109902132A (en) * 2019-02-26 2019-06-18 维正知识产权服务有限公司 A kind of relational model method for building up and its system for intellectual property multidimensional data
CN110032582A (en) * 2019-03-07 2019-07-19 阿里巴巴集团控股有限公司 A kind of data processing method, device, equipment and system
CN110032582B (en) * 2019-03-07 2023-10-27 创新先进技术有限公司 Data processing method, device, equipment and system
CN109960560B (en) * 2019-03-29 2019-12-10 北京九章云极科技有限公司 Index processing method and system
CN109960560A (en) * 2019-03-29 2019-07-02 北京九章云极科技有限公司 A kind of index processing method and system
CN111125109A (en) * 2019-12-24 2020-05-08 广州德久信息科技有限公司 Real-time statistical report system based on time grouping accumulation algorithm
CN113392130A (en) * 2020-03-13 2021-09-14 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN112307024A (en) * 2020-10-29 2021-02-02 平安普惠企业管理有限公司 Data curing method and device, computer equipment and computer readable storage medium
CN112307024B (en) * 2020-10-29 2024-04-02 平安普惠企业管理有限公司 Data solidification method, device, computer equipment and computer readable storage medium
CN112434036A (en) * 2020-11-24 2021-03-02 上海浦东发展银行股份有限公司 Account management system data processing method
CN112579655A (en) * 2020-12-15 2021-03-30 中国建设银行股份有限公司 Method, device and equipment for integrating customer portrait indexes

Also Published As

Publication number Publication date
CN104424229B (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN104424229A (en) Calculating method and system for multi-dimensional division
CN101192227B (en) Log file analytical method and system based on distributed type computing network
CN101853287B (en) Data compression quick retrieval file system and method thereof
CN108509437B (en) ElasticSearch query acceleration method
CN102411533A (en) Log-management optimizing method for clustered storage system
CN103838867A (en) Log processing method and device
CN105992338B (en) Positioning method and device
CN113360554A (en) Method and equipment for extracting, converting and loading ETL (extract transform load) data
CN105049287A (en) Log processing method and log processing devices
CN104598495A (en) Hierarchical storage method and system based on distributed file system
CN102208991A (en) Blog processing method, device and system
CN102346751B (en) Information transmitting method and equipment
CN102750326A (en) Log management optimization method of cluster system based on downsizing strategy
JP6100900B2 (en) Method, device and system for online processing of data
CN111459986A (en) Data computing system and method
CN105512320A (en) User ranking obtaining method and device and server
CN103310087A (en) Service data statistic analysis method and device
CN104462430A (en) Relational database data processing method and device
CN103678293A (en) Data storage method and device
CN103455560A (en) Data query method and system
US20190050435A1 (en) Object data association index system and methods for the construction and applications thereof
CN104660427A (en) Method and device for real-time statistics of logs
CN103338260A (en) Distributed analytical system and analytical method for URL logs in network auditing
CN110727727A (en) Statistical method and device for database
CN106649691A (en) Stream data storage method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190805

Address after: 35th Floor, Tencent Building, Kejizhongyi Road, Science and Technology Park, Nanshan District, Shenzhen, Guangdong 518000

Co-patentee after: Tencent Cloud Computing (Beijing) Co., Ltd.

Patentee after: Tencent Technology (Shenzhen) Co., Ltd.

Address before: Room 403, East Block 2, SEG Science Park, Zhenxing Road, Futian District, Shenzhen, Guangdong 518000

Patentee before: Tencent Technology (Shenzhen) Co., Ltd.