US20200387511A1 - Architecture of hybrid in-memory and paged dictionary - Google Patents
- Publication number: US20200387511A1 (application US 15/931,063)
- Authority
- US
- United States
- Prior art keywords
- value
- dictionary
- page
- different
- computing devices
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G06F16/221—Column-oriented storage; Management thereof
- G06F16/2379—Updates performed during online database operations; commit processing
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/0246—Memory management in non-volatile memory in block erasable memory, e.g. flash memory
- G06F12/0875—Addressing of a memory level with dedicated cache, e.g. instruction or stack
- G06F12/0882—Page mode
- G06F16/2219—Large Object storage; Management thereof
- G06F16/2228—Indexing structures
- G06F16/2237—Vectors, bitmaps or matrices
- G06F16/2255—Hash tables
- G06F16/2272—Management thereof
- G06F16/2282—Tablespace storage structures; Management thereof
- G06F16/24552—Database cache management
- G06F16/24558—Binary matching operations
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/064—Management of blocks
- G06F3/0673—Single storage device
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F2212/1024—Latency reduction
- G06F2212/1044—Space efficiency improvement
Definitions
- Hybrid in-memory and paged storage configurations allow storage of data in dynamic storage devices as well as in persistent storage.
- providing on-demand access to such data requires loading the data stored in persistent storage into in-memory storage. This can leave a large memory footprint and can be burdensome on a computing system.
- database columns can store compressed values rather than the entire value of the data.
- a dictionary corresponding to a database column can store the entire value.
- the compressed value can act as a value identifier (value ID) in the dictionary.
- Conventional systems loaded the entire dictionary into memory when attempting to retrieve the value corresponding to a value ID. But this can be burdensome on a computing system and can waste operational resources, such as memory.
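The compression scheme described above can be sketched in a few lines. This is a minimal illustration, not code from the application; the function name and the 1-based value IDs are assumptions.

```python
def dictionary_encode(values):
    """Build a sorted dictionary of distinct values and a column of value IDs.

    The dictionary stores each distinct value once; the column stores only
    the (smaller) value ID for every row.
    """
    distinct = sorted(set(values))
    # Value IDs are 1-based positions in the sorted dictionary.
    value_id = {v: i + 1 for i, v in enumerate(distinct)}
    column = [value_id[v] for v in values]
    return distinct, column

heights = [2.50, 3.14, 2.50, 1.56, 3.14]
dictionary, column = dictionary_encode(heights)
print(dictionary)  # [1.56, 2.5, 3.14]
print(column)      # [2, 3, 2, 1, 3]
```

Reading a row back means mapping its value ID through the dictionary, which is why loading the entire dictionary into memory for a single lookup is wasteful.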
- FIG. 1 is a block diagram of an architecture of in-memory and paged dictionary, according to some embodiments.
- FIG. 2 is a block diagram of an example helper vector, in-memory dictionary, and paged dictionary, according to some embodiments.
- FIG. 3 is a flowchart illustrating a process for loading a page of a dictionary for executing a query, according to some embodiments.
- FIG. 4 is a flowchart illustrating a process for retrieving a value from a dictionary using a value ID, according to some embodiments.
- FIG. 5 is a flowchart illustrating a process for generating a new helper vector and value ID index, according to some embodiments.
- FIG. 6 is an example computer system useful for implementing various embodiments.
- a server receives a query to be executed.
- the query includes a value for executing the query.
- the server executes a query on a dictionary to retrieve a value ID corresponding to the value.
- the dictionary includes pages including value IDs and corresponding values.
- the server executes a binary search on a helper vector of the dictionary based on the value.
- the helper vector includes the last value for each page of a dictionary.
- the server identifies a page of the dictionary including the value.
- the server loads the page into temporary memory and retrieves the value ID of the value from the page.
- the server executes the query on a column using the value ID.
- This configuration provides for loading into memory only the relevant page that includes the requested value, rather than the entire dictionary (in some embodiments, multiple relevant pages are loaded). This solves the technical problem of reducing the burden on the computing system. Furthermore, it reduces the memory footprint and improves operational efficiency.
- the described architecture also provides a unified persistency format and the ability to use transient paged and in-memory structures.
- FIG. 1 is a block diagram of an architecture of in-memory and paged dictionary, according to some embodiments.
- the architecture can include a server 100 , database 128 , and client device 130 .
- Server 100 can be in communication with database 128 and client device 130 .
- Server 100 , database 128 , and client device 130 can be connected through wired connections, wireless connections, or a combination of wired and wireless connections.
- the network can be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless wide area network (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, any other type of network, or a combination of two or more such networks.
- Server 100 can include a database management system (DMS) 102 , temporary memory 104 , main memory 108 , and secondary memory 123 .
- main memory 108 can be Random Access Memory (RAM) or any other type of dynamic memory.
- Secondary memory 123 can be persistent memory such as a non-volatile storage device.
- Temporary memory 104 can be a buffer or cache memory. The temporary memory 104 may reside within the main memory 108 .
- Each of the main memory 108 , secondary memory 123 , and temporary memory 104 can be configured to store partial or entire copies of a dictionary corresponding to a column of the database 128 .
- the column of the database 128 can be configured to store data including compressed versions of certain values (i.e., a value ID) rather than the entire value.
- the dictionary can store the value ID and the corresponding entire value.
- the dictionary can store the value and the value ID may be inferred based on the position of the value in the dictionary.
- the dictionary can be used to correlate and retrieve values of different value IDs so that these value IDs can be looked up in the column.
- the dictionary can include multiple pages of value IDs and values.
- An example dictionary 116 can be stored in secondary memory 123 .
- Dictionary 116 can include page 106 , 118 , and 120 .
- an example dictionary 124 can be stored in main memory 108 .
- Dictionary 124 can store pages 106 - a , 118 - a , and 120 - a , which are copies of pages 106 , 118 , and 120 , stored separately or in contiguous memory space.
- dictionary 124 can be an in-memory version of dictionary 116 .
- main memory 108 can be dynamic memory.
- dictionary 124 may be an in-memory dictionary which offers in-memory processing.
- Secondary memory 123 may be a persistent storage device.
- dictionary 116 may be a paged dictionary which offers paged processing. This may be referred to as a hybrid storage configuration or hybrid column store.
- This hybrid storage configuration offers in-memory processing for performance critical operations and buffer-managed paged processing for less critical, economical data access.
- This hybrid capability extends up from a unified persistence format which can be used to load paged or in-memory primitive structures, which in turn form paged or in-memory column store structures (data vectors, dictionaries 116 or 124 , and indexes) that are ultimately arranged by the hybrid storage configuration or hybrid column store according to each column's load unit configuration.
- Hybrid columns can be configured to load all in-memory structures, all paged structures, or a mixture of both.
- a user can control which portions of the dictionary are stored in main memory 108 or secondary memory 123 , in whichever configuration is necessary.
- a user can use client device 130 to manage the storage of the dictionaries 116 and 124 in main memory 108 and secondary memory 123 .
- the storage of the dictionaries 116 and 124 can be automatically configured based on workload, data access algorithms, and intelligent algorithms.
- Dictionary 124 can be referred to as an in-memory dictionary and dictionary 116 can be a paged dictionary.
- an entire copy of dictionary 116 can be stored in main memory 108 .
- a copy of a portion of dictionary 116 can be stored in main memory 108 .
- different portions of a dictionary can be stored in main memory 108 and secondary memory 123 .
- columns of database 128 can also be stored in a similar configuration as described above.
- the columns of database 128 can be stored in secondary memory 123 and an entire copy of columns can be stored in main memory 108 .
- the columns of database 128 can be stored in secondary memory 123 and a copy of a portion of the columns of database 128 can be stored in main memory 108 .
- different portions of the columns can be stored in main memory 108 and secondary memory 123 .
- Dictionary 116 can be a multi-page vector.
- the multi-page vector can be a type of paged primitive that provides a uniform interface with its in-memory counterparts to read-write algorithms. This allows the codebase to operate seamlessly on either format with minimal adaptation, hiding the details of operating on data in native store extension (NSE). Furthermore, paged primitives can be enriched with auxiliary, performant search structures and indexes.
- a multi-page vector is a large vector that can be stored on a single page chain with fixed-size pages. Having a fixed number of objects per page simplifies identifying the page that has the content for a given vector position.
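The simplification a fixed number of objects per page buys can be shown with plain arithmetic. This is an illustrative sketch with 0-based positions assumed, not code from the application.

```python
def page_for_position(position, per_page):
    """With a fixed number of entries per page, the page that holds a
    0-based vector position is simple integer division."""
    return position // per_page

def offset_in_page(position, per_page):
    """Offset of the entry within its page."""
    return position % per_page

# With 3 entries per page, position 4 lives on the second page (index 1),
# at offset 1 within that page.
print(page_for_position(4, 3), offset_in_page(4, 3))  # 1 1
```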
- the multi-page vector is configured to store more than one vector on each page chain, with each vector having its own metadata. Once a vector is sorted, each multi-page vector can be extended with a helper structure (e.g., a helper vector) to facilitate search and to avoid loading pages that are guaranteed not to have a value satisfying the search.
- Dictionary 116 can include a value ID index 126 .
- the value ID index 126 can be a data structure that stores the last value ID of each page of the dictionary 116 for data types of variable length.
- dictionary 124 may not include a value ID index.
- the value ID can act as the identifier of the value.
- the columns of database 128 can store a value ID representing the value, while the dictionaries can store both the value ID and value.
- Each page of dictionary 116 and dictionary 124 can store value IDs and the corresponding values.
- each page of dictionary 116 can include a fixed number of value IDs and values.
- page 106 , page 118 , and page 120 can each include three value IDs and their corresponding values.
- Dictionary 116 and dictionary 124 can be used to retrieve a given value ID so that the value ID can be used to execute a query on a given column.
- the secondary memory 123 can also include a helper vector 122 .
- The helper vector can also be a paged primitive.
- helper vector 122 can be an arbitrary-sized item. For an arbitrary-sized item, a block is stored fully on a single page, and the number of blocks stored on a page depends on the block sizes. For data items larger than the page size, the data is internally divided into multiple blocks.
- Helper vector 122 can also be referred to as a compact, memory-resident sorted vector. Helper vector 122 is sorted and stores the last value of each page stored in dictionary 116 . Helper vector 122 can be used to quickly identify the particular page that includes a given value.
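As a sketch of how such a sorted last-value-per-page vector supports search, the following uses Python's `bisect` on the example data from FIG. 2; the function name is illustrative, not from the application.

```python
import bisect

def find_page(helper_vector, value):
    """Binary-search the sorted last-value-per-page vector.

    Returns the index of the first page whose last value is >= the searched
    value, i.e. the only page that can contain it, or None if the value is
    greater than every value in the dictionary.
    """
    page = bisect.bisect_left(helper_vector, value)
    return page if page < len(helper_vector) else None

# Last values of pages 106, 118, and 120 from the example in FIG. 2.
helper = [3.14, 6.50, 10.88]
print(find_page(helper, 2.50))  # 0  (page 106)
print(find_page(helper, 8.88))  # 2  (page 120)
print(find_page(helper, 99.0))  # None
```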
- DMS 102 can include a wrapper 110 , unification engine 112 , paging engine 114 , and query engine 115 .
- DMS 102 can receive and process query requests by retrieving the value ID of a given value, or the value for a given value ID, from a given page of a dictionary and executing the query.
- Wrapper 110 can be configured to determine whether the dictionary to be queried is stored in main memory 108 or secondary memory 123 .
- Unification engine 112 can generate a new helper vector and value ID index in response to an occurrence of an event, such as a save operation of the dictionary from main memory 108 to secondary memory 123 .
- Paging engine 114 can be configured to load a particular page of dictionary 116 into temporary memory 104 .
- Query engine 115 can be configured to execute queries against a given column.
- DMS 102 can receive a request to execute a query including a given value, 2.50.
- the query may be received from client device 130 .
- the request can be to retrieve all plants which are 2.50 inches tall.
- Wrapper 110 can determine whether to search dictionary 116 or dictionary 124 for the value ID corresponding to 2.50.
- the request can indicate whether to query the in-memory dictionary (e.g., dictionary 124 ) or paged dictionary (e.g., dictionary 116 ).
- wrapper 110 can determine that dictionary 116 is to be queried.
- Paging engine 114 can execute a binary search on helper vector 122 to identify the particular page of dictionary 116 on which the value, 2.50, is located. To execute the binary search, paging engine 114 compares the target value (2.50) to the middle value of the helper vector; if they are not equal, the half in which the value cannot lie is eliminated and the search continues on the remaining half. Paging engine 114 repeats this process, comparing the middle element of the remaining half to the target value, until the target value is found or a single value remains. As described above, helper vector 122 stores the last value of each page in dictionary 116 and the corresponding page. In the event there is a single value remaining, paging engine 114 can determine that 2.50 is on the page corresponding to that remaining value. In this example, the paging engine 114 can determine that 2.50 is included in page 106 .
- Paging engine 114 can load page 106 from secondary memory 123 into temporary memory 104 (which can be part of the buffer cache) and retrieve the value ID, 2, for the value 2.50. By doing so, the paging engine 114 avoids having to load the entire dictionary 116 into memory. This reduces the memory footprint.
- Query engine 115 can execute the query for retrieving all plants that are 2.50 inches, using the value ID, 2.
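The walkthrough above can be condensed into a sketch. The page contents match the example in FIG. 2; loading a page into the buffer cache is simulated by a plain list access, and all names are illustrative.

```python
import bisect

# Pages of the example paged dictionary (value ID -> value), as in FIG. 2.
pages = [
    {1: 1.56, 2: 2.50, 3: 3.14},   # page 106
    {4: 4.98, 5: 6.11, 6: 6.50},   # page 118
    {7: 6.70, 8: 8.88, 9: 10.88},  # page 120
]
helper_vector = [3.14, 6.50, 10.88]  # last value of each page

def value_id_for(value):
    """Binary-search the helper vector, then load and scan only the one
    candidate page rather than the entire dictionary."""
    page_no = bisect.bisect_left(helper_vector, value)
    if page_no == len(helper_vector):
        return None                # value exceeds every dictionary value
    page = pages[page_no]          # stands in for loading into the buffer cache
    for vid, v in page.items():
        if v == value:
            return vid
    return None

print(value_id_for(2.50))  # 2
```

The returned value ID (2 for the value 2.50) is what the query engine then uses to scan the compressed column.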
- DMS 102 can receive a request to materialize a value corresponding to the value identifier observed in a data vector at the desired row position.
- the request can include the value ID. Since each page includes a fixed number of values, the paging engine 114 can identify the particular page that includes the value based on the value ID included in the request and the fixed number of values per page.
- the dictionary 116 can include values that are variable size data types (e.g., VARCHAR). Each page can include a different number of value IDs and corresponding values.
- the value ID index 126 can store the last value ID of each page.
- when DMS 102 receives a request to materialize a value corresponding to the value ID observed in a data vector at the desired row position, the paging engine 114 can identify the particular page that includes the value based on the value ID index 126 and the value ID included in the request. For example, the paging engine 114 may execute a binary search on the value ID index 126 using the value identifier to identify the page that includes the value.
- wrapper 110 can receive a request to execute a query from an upper layer.
- the upper layer may be an API that is agnostic to the configuration of dictionary 116 and dictionary 124 . That is, the upper layer may query dictionary 116 or 124 without distinguishing between them.
- the lower layer includes services or APIs that are configured to distinguish dictionary 116 or 124 when querying either dictionary. In an example, these requests may specify a particular dictionary to query.
- the wrapper 110 can determine which dictionary to query. This allows the upper layer to make a single API call to a dictionary and the wrapper 110 can identify which implementation of the dictionary, in-memory (e.g., dictionary 124 ) or paged (e.g., dictionary 116 ) to query.
- wrapper 110 can allow a query for a value that matches a certain pattern rather than a specific value. Furthermore, the wrapper 110 allows such algorithms to be implemented and maintained once, independent of the storage location (in-memory vs paged).
- unification engine 112 can determine that dictionary 124 is to be stored in paged format and can generate auxiliary data structures.
- the auxiliary data structures can include metadata such as the value ID index 126 and a value index (i.e., the helper vector 122 ).
- the value ID index 126 stores the last value ID on each page, and the value index stores the last value on each page.
- the save operation occurs any time there is a delta merge, optimized compression, or any data definition language operation. This ensures when there is a write operation performed on dictionary 124 , the data persisted to the secondary memory 123 is accurately reflected in the value ID index 126 and helper vector 122 , so that the next received query is processed accurately.
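Recomputing both auxiliary structures from the paged dictionary after a save operation can be sketched as follows; the page layout and function name are illustrative, not from the application.

```python
def build_auxiliary_structures(pages):
    """Rebuild the value ID index and helper vector after a save operation
    (e.g. a delta merge), so the next query sees current data.

    Each page is a list of (value_id, value) pairs sorted by value.
    """
    value_id_index = [page[-1][0] for page in pages]  # last value ID per page
    helper_vector = [page[-1][1] for page in pages]   # last value per page
    return value_id_index, helper_vector

pages = [
    [(1, 1.56), (2, 2.50), (3, 3.14)],
    [(4, 4.98), (5, 6.11), (6, 6.50)],
    [(7, 6.70), (8, 8.88), (9, 10.88)],
]
print(build_auxiliary_structures(pages))  # ([3, 6, 9], [3.14, 6.5, 10.88])
```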
- FIG. 2 is a block diagram of an example helper vector, in-memory dictionary, and paged dictionary, according to some embodiments.
- dictionary 116 can include pages 106 , 118 , and 120 . Additionally, dictionary 116 can include a value ID index 126 , and a helper vector 122 .
- the value ID index 126 can include the last value ID for each page. In the event the values stored in dictionary 116 are of a variable-size data type, the value ID index 126 can be used to identify the page that includes a value using the corresponding value ID.
- the helper vector 122 can include the last value for each page.
- page 106 can include value IDs 1, 2, 3 and corresponding values 1.56, 2.50, 3.14.
- Page 118 can include value IDs 4, 5, 6 and corresponding values 4.98, 6.11, and 6.50.
- Page 120 can include value IDs 7, 8, 9 and corresponding values 6.70, 8.88, and 10.88.
- Helper vector 122 can include the last value of 3.14 for page 106 , 6.50 as the last value of page 118 , and 10.88 as the last value for page 120 .
- the DMS 102 as shown in FIG. 1 receives a query including the value 8.88.
- the paging engine 114 as shown in FIG. 1 can execute a binary search on helper vector 122 to identify the page number and subsequently the value ID.
- the paging engine can identify the middle value of the helper vector 122 , which is 6.50.
- the paging engine can determine that 8.88 is greater than 6.50 so the paging engine would eliminate pages 106 and 118 .
- the last remaining page is page 120 .
- the paging engine can load page 120 into temporary memory 104 as shown in FIG. 1 .
- the paging engine can determine the value ID for value 8.88 is 8.
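The halving steps described above can be written out explicitly. This is an illustrative sketch; it assumes the searched value is not larger than the last value of the last page.

```python
def binary_search_page(helper_vector, value):
    """Halving search as described above: compare the target against the
    middle last-value and discard the half that cannot contain it, until
    a single candidate page remains."""
    lo, hi = 0, len(helper_vector) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if value > helper_vector[mid]:
            lo = mid + 1   # value lies after this page's last value
        else:
            hi = mid       # this page could still contain the value
    return lo

# 8.88 > 6.50 (the middle value), so pages 106 and 118 are eliminated
# and page 120 (index 2) remains.
print(binary_search_page([3.14, 6.50, 10.88], 8.88))  # 2
```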
- the DMS 102 receives a request to materialize a value based on a value ID 5.
- the paging engine can determine that each page stores 3 value IDs. Based on this, the paging engine can determine that page 106 stores the first three value IDs, page 118 stores the next three value IDs, and page 120 stores the last three value IDs.
- the paging engine can determine that value ID 5 is located on page 118 .
- the paging engine can load page 118 into temporary memory and determine the corresponding value for value ID 5 is 6.11.
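This materialization path can be sketched as follows, with the FIG. 2 page contents and 1-based value IDs assumed; only the one computed page is loaded.

```python
# Pages of the example dictionary (value ID -> value), as in FIG. 2.
pages = [
    {1: 1.56, 2: 2.50, 3: 3.14},   # page 106
    {4: 4.98, 5: 6.11, 6: 6.50},   # page 118
    {7: 6.70, 8: 8.88, 9: 10.88},  # page 120
]
VALUES_PER_PAGE = 3

def materialize(value_id):
    """With a fixed number of entries per page, the page number is pure
    arithmetic; only that page is loaded to look up the value."""
    page_no = (value_id - 1) // VALUES_PER_PAGE  # value IDs are 1-based
    return pages[page_no][value_id]

print(materialize(5))  # 6.11
```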
- FIG. 3 is a flowchart illustrating a process for loading a page of a dictionary for executing a query, according to an embodiment.
- Method 300 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps may be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 3 , as will be understood by a person of ordinary skill in the art.
- Method 300 shall be described with reference to FIG. 1 . However, method 300 is not limited to that example embodiment.
- DMS 102 receives a request to execute a query including a value.
- the request can be directed to executing a query against a particular column.
- columns may not store entire values. Rather, columns can store value IDs representing the value.
- the request can be received from the client device 130 .
- the paging engine 114 queries a dictionary for a value ID corresponding to the value received in the request.
- the dictionary includes multiple pages including value IDs and the corresponding values.
- the wrapper 110 can determine whether to query an in-memory dictionary (dictionary 124 ) or a paged dictionary (dictionary 116 ). In this embodiment, the paged dictionary is queried.
- the paged dictionary includes a helper vector 122 .
- the paging engine 114 executes a binary search on the helper vector 122 of the dictionary 116 using the value.
- the helper vector 122 includes a last value for each page of a dictionary 116 .
- the binary search can search for a value in the helper vector based on the last value for each page.
- the paging engine 114 identifies a page of the dictionary including the value.
- the paging engine identifies the page by executing the binary search on the helper vector. For example, based on the binary search the paging engine 114 can eliminate pages on which the value is not included and based on processes of elimination identify the page on which the value is included.
- the paging engine 114 loads the page into temporary memory 104 .
- Temporary memory 104 can be buffer or cache memory.
- the page can include a list of value IDs and corresponding values.
- the paging engine 114 retrieves the value ID of the value from the page loaded in the temporary memory.
- the paging engine 114 can execute a query to retrieve the value ID from the page. Since a single page is loaded into temporary memory, executing this query is not burdensome on the operational resources.
- the query engine 116 executes the requested query on the column using the value ID.
- the query engine 116 can receive the results from the query and transmit the results to the client device 130 .
- FIG. 4 is a flowchart illustrating a process for retrieving a value from a dictionary using a value ID, according to an embodiment.
- Method 400 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps can be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 4 , as will be understood by a person of ordinary skill in the art.
- Method 400 shall be described with reference to FIG. 1 . However, method 400 is not limited to that example embodiment.
- the DMS 102 receives a request to materialize a value using a value ID.
- a (paged) dictionary 116 can include multiple pages of value IDs and corresponding values.
- the dictionary 116 include values of fixed data type, the dictionary 116 can store a fixed amount of value IDs and corresponding values per page.
- the paging engine 114 determines a number of values per page in the dictionary. As described above, each page of the dictionary can store a fixed amount of value IDs and values.
- the paging engine 114 identifies a page including the requested value based on the value ID and the number of value IDs per page in the dictionary. For example, paging engine 114 can divide the value ID by the number of value IDs per page (and take the floor of the result) to identify the page including the value ID and requested value.
- the paging engine 114 loads the identified page into the temporary memory 104 .
- temporary memory 104 can be buffer or cache configured to store data for a short amount of time.
- the paging engine 114 retrieves the value corresponding to the value ID from the page stored in the temporary memory 104.
- the paging engine 114 can execute a query to retrieve the value from the page.
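The page arithmetic in the steps above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the page contents are hypothetical except for the mapping of value ID 5 to the value 6.11 used elsewhere in this document.

```python
# Sketch of method 400 for fixed-size data types: each page holds a fixed
# number of value IDs, so the page containing a value ID follows from
# integer division. Values marked "placeholder" are assumed for the example.

IDS_PER_PAGE = 3
pages = [
    [1.00, 2.50, 3.14],    # page 106: value IDs 1-3 (1.00 is a placeholder)
    [5.00, 6.11, 6.50],    # page 118: value IDs 4-6 (5.00 is a placeholder)
    [7.00, 8.88, 10.88],   # page 120: value IDs 7-9 (7.00 is a placeholder)
]

def materialize(value_id):
    """Load only the page containing value_id and return its value."""
    page_index = (value_id - 1) // IDS_PER_PAGE   # which page to load
    page = pages[page_index]                      # load just this one page
    return page[(value_id - 1) % IDS_PER_PAGE]    # offset within the page

assert materialize(5) == 6.11  # matches the example in the text
```

Only the single identified page is touched, which mirrors the reduced memory footprint described above.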
- FIG. 5 is a flowchart illustrating a process for generating a new value ID index and helper vector based on an event, according to an embodiment.
- Method 500 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps can be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 5 , as will be understood by a person of ordinary skill in the art.
- Method 500 shall be described with reference to FIG. 1 . However, method 500 is not limited to that example embodiment.
- the unification engine 112 detects an event causing an update to the values and value IDs in the dictionary.
- a merge, optimized compression, or DDL operation can be executed on the in-memory dictionary (e.g., dictionary 124 ).
- the dictionary 124 can be persisted or saved from the main memory 108 to the secondary memory 123 . This can cause the values or value IDs to be updated in the dictionary 116 .
- the unification engine 112 identifies a change in the last values for each page in the dictionary based on the event. For example, based on a merge, more data can be added in the dictionary. This can shift the positions of the values such that the last value for each page can change.
- the unification engine 112 identifies a change in the last value ID of each page in the dictionary. Continuing with the earlier example, based on a merge, more data can be added in the dictionary. This can shift the positions of the value IDs such that the last value ID for each page can change.
- the unification engine 112 generates a new helper vector to reflect the changed last values for each page in the dictionary.
- the new helper vector can store the updated last values for each page in the dictionary.
- the new helper vector will be persisted in the secondary memory 123 .
- the unification engine 112 generates a new value ID index to reflect the changed last value IDs for each page in the dictionary.
- the new value ID index can store the updated last value ID for each page in the dictionary in the event the dictionary includes values of variable size data types.
- the new value ID index will be persisted in the secondary memory 123 .
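The regeneration described in this flow can be sketched as below. This is an illustrative Python sketch; the page layout and sample contents are assumptions, not from the patent.

```python
# Sketch of method 500: after an event (e.g., a delta merge) rewrites the
# dictionary pages, the helper vector and value ID index are rebuilt from
# the last entry of each page.

def rebuild_auxiliary(pages):
    """pages: list of pages, each a list of (value_id, value) pairs.
    Returns (helper_vector, value_id_index): each page's last value
    and last value ID, respectively."""
    helper_vector = [page[-1][1] for page in pages]   # last value per page
    value_id_index = [page[-1][0] for page in pages]  # last value ID per page
    return helper_vector, value_id_index

pages = [
    [(1, 1.00), (2, 2.50), (3, 3.14)],
    [(4, 5.00), (5, 6.11), (6, 6.50)],
    [(7, 7.00), (8, 8.88), (9, 10.88)],
]
helper, id_index = rebuild_auxiliary(pages)
assert helper == [3.14, 6.50, 10.88]
assert id_index == [3, 6, 9]
```

Regenerating both structures on every save keeps subsequent lookups consistent with the persisted pages.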
- Computer system 600 can be used, for example, to implement method 300 of FIG. 3, 400 of FIG. 4, and 500 of FIG. 5 .
- Computer system 600 can be at least part of server 100 as shown in FIG. 1 .
- computer system 600 can identify a page from dictionary 116 or 124 as shown in FIGS. 1-2, load the page, and retrieve a requested value from dictionary 116 or 124.
- Computer system 600 can be any computer capable of performing the functions described herein.
- Computer system 600 includes one or more processors (also called central processing units, or CPUs), such as a processor 604 .
- Processor 604 is connected to a communication infrastructure or bus 606 .
- One or more processors 604 can each be a graphics processing unit (GPU).
- a GPU is a processor that is a specialized electronic circuit designed to process mathematically intensive applications.
- the GPU can have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc.
- Computer system 600 also includes user input/output device(s) 603 , such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 606 through user input/output interface(s) 602 .
- Computer system 600 also includes a main or primary memory 608 , such as random access memory (RAM).
- Main memory 608 can include one or more levels of cache.
- Main memory 608 has stored therein control logic (i.e., computer software) and/or data.
- Computer system 600 can also include one or more secondary storage devices or memory 610 .
- Secondary memory 610 can include, for example, a hard disk drive 612 and/or a removable storage device or drive 614 .
- Removable storage drive 614 can be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup device, and/or any other storage device/drive.
- Removable storage drive 614 can interact with a removable storage unit 618 .
- Removable storage unit 618 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data.
- Removable storage unit 618 can be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device.
- Removable storage drive 614 reads from and/or writes to removable storage unit 618 in a well-known manner.
- secondary memory 610 can include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 600 .
- Such means, instrumentalities or other approaches can include, for example, a removable storage unit 622 and an interface 620 .
- the removable storage unit 622 and the interface 620 can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and USB port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface.
- Computer system 600 can further include a communication or network interface 624 .
- Communication interface 624 enables computer system 600 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced by reference number 628 ).
- communication interface 624 can allow computer system 600 to communicate with remote devices 628 over communications path 626 , which can be wired and/or wireless, and which can include any combination of LANs, WANs, the Internet, etc. Control logic and/or data can be transmitted to and from computer system 600 via communication path 626 .
- a tangible, non-transitory apparatus or article of manufacture comprising a tangible, non-transitory computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device.
- control logic when executed by one or more data processing devices (such as computer system 600 ), causes such data processing devices to operate as described herein.
- references herein to “one embodiment,” “an embodiment,” “an example embodiment,” or similar phrases indicate that the embodiment described can include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of persons skilled in the relevant art(s) to incorporate such feature, structure, or characteristic into other embodiments whether or not explicitly mentioned or described herein. Additionally, some embodiments can be described using the expressions “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other.
- Coupled can also mean that two or more elements are not in direct contact with each other, yet still co-operate or interact with each other.
Description
- This application claims priority to U.S. Provisional Application No. 62/858,693, filed on Jun. 7, 2019, the contents of which are incorporated herein in their entirety.
- Hybrid in-memory and paged storage configurations allow storage of data in dynamic storage devices as well as persistent storage. However, providing on-demand access to such data requires loading the data stored in persistent storage to in-memory storage. This can leave a large memory footprint and can be burdensome on a computing system.
- For example, to save space, database columns can store compressed values rather than the entire value of data. A dictionary corresponding to a database column can store the entire value. The compressed value can act as a value identifier in the dictionary. However, it can be difficult to retrieve a value from the dictionary as the dictionary can be voluminous and can include multiple different pages. Conventional systems load the entire dictionary into memory when attempting to retrieve the value corresponding to a value ID. But this can be burdensome on a computing system and waste operational resources, such as memory.
- The accompanying drawings are incorporated herein and form a part of the specification.
-
FIG. 1 is a block diagram of an architecture of in-memory and paged dictionary, according to some embodiments. -
FIG. 2 is a block diagram of an example helper vector, in-memory dictionary, and paged dictionary, according to some embodiments. -
FIG. 3 is a flowchart illustrating a process for loading a page of a dictionary for executing a query, according to some embodiments. -
FIG. 4 is a flowchart illustrating a process for retrieving a value from a dictionary using a value ID, according to some embodiments. -
FIG. 5 is a flowchart illustrating a process for generating a new helper vector and value ID index, according to some embodiments. -
FIG. 6 is an example computer system useful for implementing various embodiments. - In the drawings, like reference numbers generally indicate identical or similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
- Provided herein are system, apparatus, device, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for identifying and loading a relevant page of a dictionary into temporary memory.
- In an embodiment, a server receives a query to be executed. The query includes a value for executing the query. The server executes a query on a dictionary to retrieve a value ID corresponding to the value. The dictionary includes pages including value IDs and corresponding values. The server executes a binary search on a helper vector of the dictionary based on the value. The helper vector includes the last value for each page of a dictionary. The server identifies a page of the dictionary including the value. The server loads the page into temporary memory and retrieves the value ID of the value from the page. The server executes the query on a column using the value ID.
- This configuration provides for loading the relevant page which includes the requested value, rather than an entire dictionary, into memory (in some embodiments, plural relevant pages are loaded). This solves the technical problem of reducing the burden on the computing system. Furthermore, this reduces the memory footprint and improves operational efficiency. The described architecture also provides for a unified persistency format and the ability to use transient paged and in-memory structures.
-
FIG. 1 is a block diagram of an architecture of in-memory and paged dictionary, according to some embodiments. In an embodiment, the architecture can include a server 100, database 128, and client device 130. Server 100 can be in communication with database 128 and client device 130. Server 100, database 128, and client device 130 can be connected through wired connections, wireless connections, or a combination of wired and wireless connections. - As an
example, server 100, database 128, and client device 130 can be connected through a network. The network can be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless wide area network (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, a wireless network, a WiFi network, a WiMax network, any other type of network, or a combination of two or more such networks. -
Server 100 can include a database management system (DMS) 102, temporary memory 104, main memory 108, and secondary memory 123. As an example, main memory 108 can be Random Access Memory (RAM) or any other type of dynamic memory. Secondary memory 123 can be persistent memory such as a non-volatile storage device. Temporary memory 104 can be a buffer or cache memory. The temporary memory 104 may reside within the main memory 108. - Each of the
main memory 108, secondary memory 123, and temporary memory 104 can be configured to store portions or entire copies of a dictionary corresponding to a column of the database 128. The column of the database 128 can be configured to store data including compressed versions of certain values (i.e., a value ID) rather than the entire value. The dictionary can store the value ID and the corresponding entire value. Alternatively, the dictionary can store the value and the value ID may be inferred based on the position of the value in the dictionary. In this regard, the dictionary can be used to correlate and retrieve values of different value IDs so that these value IDs can be looked up in the column. - The dictionary can include multiple pages of value IDs and values. An
example dictionary 116 can be stored in secondary memory 123. Dictionary 116 can include pages 106, 118, and 120. An example dictionary 124 can be stored in main memory 108. Dictionary 124 can store a copy of pages 106-a, 118-a, and 120-a, which are copies of pages 106, 118, and 120. In this regard, dictionary 124 can be an in-memory version of dictionary 116. As described above, main memory 108 can be dynamic memory. In this regard, dictionary 124 may be an in-memory dictionary which offers in-memory processing. Secondary memory 123 may be a persistent storage device. In this regard, dictionary 116 may be a paged dictionary which offers paged processing. This may be referred to as a hybrid storage configuration or hybrid column store. - This hybrid storage configuration offers in-memory processing for performance critical operations and buffer-managed paged processing for less critical, economical data access. This hybrid capability extends up from a unified persistence format which can be used to load paged or in-memory primitive structures, which in turn form paged or in-memory column store structures (data vectors,
dictionaries, etc.).
main memory 108 orsecondary memory 123, in whichever configuration necessary. For example, a user can useclient device 130 to manage the storage of thedictionaries main memory 108 andsecondary memory 123. Alternatively, the storage of thedictionaries Dictionary 124 can be referred to as an in-memory dictionary anddictionary 116 can be a paged dictionary. - In an example, an entire copy of
dictionary 116 can be stored in main memory 108. Alternatively, a copy of a portion of dictionary 116 can be stored in main memory 108. In another example, different portions of a dictionary can be stored in main memory 108 and secondary memory 123. - Furthermore, columns of
database 128 can also be stored in a similar configuration as described above. In an example, the columns of database 128 can be stored in secondary memory 123 and an entire copy of the columns can be stored in main memory 108. Alternatively, the columns of database 128 can be stored in secondary memory 123 and a copy of a portion of the columns of database 128 can be stored in main memory 108. In another example, different portions of the columns can be stored in main memory 108 and secondary memory 123. -
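The dictionary encoding described above — columns hold small value IDs while the dictionary holds the full values, with the value ID inferred from a value's position — can be sketched as follows. This is an illustrative Python sketch; the function names and sample data are assumptions, not from the patent.

```python
# Dictionary encoding sketch: the column stores compact value IDs and the
# sorted dictionary stores each distinct full value exactly once.

def build_dictionary(values):
    """Return a sorted dictionary (list of distinct values)."""
    return sorted(set(values))

def encode_column(values, dictionary):
    """Replace each full value with its value ID (1-based position)."""
    return [dictionary.index(v) + 1 for v in values]

def decode_value(value_id, dictionary):
    """Materialize the full value for a value ID."""
    return dictionary[value_id - 1]

heights = [2.50, 3.14, 2.50, 6.11]          # full column values
dictionary = build_dictionary(heights)       # [2.5, 3.14, 6.11]
column = encode_column(heights, dictionary)  # [1, 2, 1, 3]
assert decode_value(3, dictionary) == 6.11
```

Repeated values compress to a single dictionary entry, which is the space saving the background section describes.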
Dictionary 116 can be a multi-page vector. The multi-page vector can be a type of paged primitive that provides a uniform interface with in-memory counterparts to read-write algorithms. This allows the codebase to seamlessly operate on either format with minimal adaptation, hiding the details of operating on data in native store extension (NSE). Furthermore, paged primitives can be enriched with auxiliary and performant search structures and indexes. - A multi-page vector is a large vector that can be stored on a single page chain with fixed-size pages. Having a fixed number of objects per page simplifies identifying the page that has the content for a given vector position. The multi-page vector is configured to store more than one vector on each page chain, with each vector having its own metadata. Once a vector is sorted, each multi-page vector can be extended with a helper structure (e.g., helper vector) to facilitate search and to avoid loading pages that are guaranteed not to contain a value that satisfies the search.
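With a fixed number of objects per page, locating the page for a given vector position reduces to integer division. A minimal sketch (the page capacity of 3 is an assumed example, not from the patent):

```python
# Fixed-size-page arithmetic for a multi-page vector: map a vector
# position to the page that holds it and the offset within that page.

OBJECTS_PER_PAGE = 3  # assumed page capacity for the example

def locate(position, per_page=OBJECTS_PER_PAGE):
    """Map a 0-based vector position to (page_number, offset_in_page)."""
    return position // per_page, position % per_page

assert locate(0) == (0, 0)  # first object: first page, first slot
assert locate(4) == (1, 1)  # fifth object: second page, second slot
assert locate(8) == (2, 2)  # ninth object: third page, last slot
```

No search structure is needed for this lookup; the helper vector is only required when searching by value rather than by position.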
-
Dictionary 116 can include a value ID index 126. The value ID index 126 can be a data structure that stores the last value ID of each page of the dictionary 116 for data types of variable length. In an embodiment, dictionary 124 may not include a value ID index. The value ID can act as the identifier of the value. As described above, the columns of database 128 can store a value ID representing the value, while the dictionaries can store both the value ID and value. Each page of dictionary 116 and dictionary 124 can store value IDs and the corresponding values. - The values can be fixed data types such as integers, doubles, longs, shorts, or the like. Furthermore, in the event the values are fixed size data types, each page of
dictionary 116 can include a fixed number of value IDs and values. As a non-limiting example, page 106, page 118, and page 120 can each include three value IDs and their corresponding values. Dictionary 116 and dictionary 124 can be used to retrieve a given value ID so that the value ID can be used to execute a query on a given column. - The
secondary memory 123 can also include a helper vector 122. Helper vector 122 can also be a paged primitive. In particular, helper vector 122 can be an arbitrary sized item. For an arbitrary sized item, a block is fully stored on a single page. The number of blocks stored on a page depends on block sizes. For data items larger than the page size, data is internally divided into multiple blocks. -
Helper vector 122 can also be referred to as a compact memory resident sorted vector. Helper vector 122 is sorted and stores the last value of each page stored in dictionary 116. Helper vector 122 can be used to quickly identify a particular page including a given value. -
DMS 102 can include a wrapper 110, unification engine 112, paging engine 114, and query engine 115. DMS 102 can receive and process query requests by retrieving the value ID of a given value, or the value for a given value ID, from a given page of a dictionary and executing the query. Wrapper 110 can be configured to determine whether the dictionary to be queried is stored in main memory 108 or secondary memory 123. Unification engine 112 can generate a new helper vector and value index in response to an occurrence of an event, such as a save operation of the dictionary from main memory 108 to secondary memory 123. Paging engine 114 can be configured to load a particular page of dictionary 116 into temporary memory 104. Query engine 115 can be configured to execute queries against a given column. - As a non-limiting example,
DMS 102 can receive a request to execute a query including a given value, 2.50. The query may be received from client device 130. For example, the request can be to retrieve all plants which are 2.50 inches tall. Wrapper 110 can determine whether to search dictionary 116 or dictionary 124 for the value ID corresponding to 2.50. In some instances, the request can indicate whether to query the in-memory dictionary (e.g., dictionary 124) or paged dictionary (e.g., dictionary 116). In this example, wrapper 110 can determine that dictionary 116 is to be queried. - Paging
engine 114 can execute a binary search on helper vector 122 to identify a particular page of dictionary 116 on which the value, 2.50, is located. To execute the binary search, paging engine 114 compares the target value (2.50) to the middle value of the helper vector. If they are not equal, the half in which the value cannot lie is eliminated and the search continues on the remaining half. Paging engine 114 repeats this process, taking the middle element to compare to the target value, until the target value is found or there is a single value remaining. As described above, helper vector 122 stores the last value of each page in dictionary 116 and the corresponding page. In the event there is a single value remaining, paging engine 114 can determine that 2.50 is on the page corresponding to the remaining value. In this example, the paging engine 114 can determine that 2.50 is included in page 106. - Paging
engine 114 can load page 106 from secondary memory 123 to temporary memory 104 (which can be from the buffer cache) and retrieve the value ID, 2, for the value 2.50. By doing so, the paging engine 114 avoids having to load the entire dictionary 116 into memory. This reduces the memory footprint. Query engine 115 can execute the query for retrieving all plants that are 2.50 inches tall, using the value ID, 2. - In an embodiment,
DMS 102 can receive a request to materialize a value corresponding to the value identifier observed in a data vector at the desired row position. The request can include the value ID. Since each page includes a fixed number of values, the paging engine 114 can identify a particular page that includes the value based on the value ID included in the request and the fixed number of values per page. - In an alternative embodiment, the
dictionary 116 can include values that are variable size data types (e.g., VARCHAR). Each page can include a different number of value IDs and corresponding values. The value ID index 126 can store the last value ID of each page. In this regard, in the event DMS 102 receives a request to materialize a value corresponding to the value ID observed in a data vector at the desired row position, the paging engine 114 can identify a particular page that includes the value based on the value ID index 126 and the value ID included in the request. For example, the paging engine 114 may execute a binary search on the value ID index 126 using the value identifier to identify the page that includes the value. - In an embodiment,
wrapper 110 can receive a request to execute a query from an upper layer. The upper layer may be an API, which is agnostic to the configuration of dictionary 116 and dictionary 124. That is, the upper layer may query the dictionary without specifying which dictionary to use, and the wrapper 110 can determine which dictionary to query. This allows the upper layer to make a single API call to a dictionary, and the wrapper 110 can identify which implementation of the dictionary, in-memory (e.g., dictionary 124) or paged (e.g., dictionary 116), to query. This also allows for more complex queries on the dictionaries. For example, the wrapper 110 can allow a query for a value that matches a certain pattern rather than a specific value. Furthermore, the wrapper 110 allows such algorithms to be implemented and maintained once, independent of the storage location (in-memory vs. paged). - In an embodiment, when
dictionary 124 is being saved or persisted from main memory 108 to secondary memory 123, unification engine 112 determines that dictionary 124 is being stored in paged format and can generate auxiliary data structures. The auxiliary data structures can include metadata such as a value ID index 126 and a value index (i.e., the helper vector 122). The value ID index 126 is the last value ID on a page. The value index is the last value on a page. The save operation occurs any time there is a delta merge, optimized compression, or any data definition language operation. This ensures when there is a write operation performed on dictionary 124, the data persisted to the secondary memory 123 is accurately reflected in the value ID index 126 and helper vector 122, so that the next received query is processed accurately. -
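For the variable-size case described earlier, the value ID index plays the same role for value IDs that the helper vector plays for values: it stores each page's last value ID, and a binary search identifies the only candidate page. A sketch (the uneven page boundaries are illustrative assumptions, not from the patent):

```python
from bisect import bisect_left

# Variable-size data types: pages hold different numbers of entries, so the
# page for a value ID cannot be computed arithmetically. Instead, search
# the sorted list of last value IDs per page.

value_id_index = [2, 6, 9]   # last value ID on each page (uneven pages)
pages = [106, 118, 120]      # corresponding page identifiers

def page_for_value_id(value_id):
    """Return the page containing value_id, or None if out of range."""
    i = bisect_left(value_id_index, value_id)
    return pages[i] if i < len(value_id_index) else None

assert page_for_value_id(5) == 118  # 5 > 2 and 5 <= 6, so page 118
```

`bisect_left` finds the first page whose last value ID is greater than or equal to the target, which is exactly the page that must hold it.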
FIG. 2 is a block diagram of an example helper vector, in-memory dictionary, and paged dictionary, according to some embodiments. As described in FIG. 1, dictionary 116 can include pages 106, 118, and 120. In addition, dictionary 116 can include a value ID index 126 and a helper vector 122. The value ID index 126 can include the last value ID for each page. In the event the values stored in the dictionary 116 are of variable size data type, the value ID index 126 can be used to identify a page including a value using the corresponding value ID. The helper vector 122 can include the last value for each page. - As a non-limiting example,
page 106 can include value IDs 1, 2, and 3. Page 118 can include value IDs 4, 5, and 6. Page 120 can include value IDs 7, 8, and 9. Helper vector 122 can include the last value of 3.14 for page 106, 6.50 as the last value of page 118, and 10.88 as the last value for page 120. - In the event the
DMS 102, as shown in FIG. 1, receives a query including the value 8.88, the paging engine 114 as shown in FIG. 1 can execute a binary search on helper vector 122 to identify the page number and subsequently the value ID. The paging engine can identify the middle value of the helper vector 122, which is 6.50. The paging engine can determine that 8.88 is greater than 6.50 so the paging engine would eliminate pages 106 and 118. The last remaining page is page 120. The paging engine can load page 120 into temporary memory 104 as shown in FIG. 1. The paging engine can determine the value ID for value 8.88 is 8. - In the event the
DMS 102 receives a request to materialize a value based on a value ID 5, the paging engine can determine that each page stores 3 value IDs. Based on this, the paging engine can determine page 106 stores the first three value IDs, page 118 can store the next three value IDs, and page 120 can store the last three value IDs. The paging engine can determine that value ID 5 is located on page 118. The paging engine can load page 118 into temporary memory and determine the corresponding value for value ID 5 is 6.11. -
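The page elimination in the example above can be sketched with a binary search over the helper vector's last values. This is an illustrative Python sketch using the numbers from FIG. 2:

```python
from bisect import bisect_left

# The helper vector holds the last (largest) value of each page, so the
# first entry >= the target value names the only page that can contain it.

helper_vector = [3.14, 6.50, 10.88]   # last value on each page
pages = [106, 118, 120]               # corresponding page identifiers

def find_page(value):
    """Return the page whose value range covers `value`, or None."""
    i = bisect_left(helper_vector, value)
    return pages[i] if i < len(helper_vector) else None

assert find_page(8.88) == 120  # 8.88 > 6.50 eliminates pages 106 and 118
assert find_page(2.50) == 106  # 2.50 <= 3.14, so only page 106 can hold it
```

Only the single surviving page needs to be loaded into temporary memory before reading off the value ID.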
FIG. 3 is a flowchart illustrating a process for loading a page of a dictionary for executing a query, according to an embodiment. Method 300 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps may be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 3, as will be understood by a person of ordinary skill in the art. -
Method 300 shall be described with reference to FIG. 1. However, method 300 is not limited to that example embodiment. - In 302,
DMS 102 receives a request to execute a query including a value. The request can be directed to executing a query against a particular column. As described above, in some embodiments, columns may not store entire values. Rather, columns can store value IDs representing the value. For example, the request can be received from the client device 130. - In 304, the
paging engine 114 queries a dictionary for a value ID corresponding to the value received in the request. The dictionary includes multiple pages including value IDs and the corresponding values. Before executing the query, the wrapper 110 can determine whether to query an in-memory dictionary (dictionary 124) or a paged dictionary (dictionary 116). In this embodiment, the paged dictionary is queried. The paged dictionary includes a helper vector 122. - In 306, the
paging engine 114 executes a binary search on the helper vector 122 of the dictionary 116 using the value. The helper vector 122 includes a last value for each page of the dictionary 116. The binary search can search for a value in the helper vector based on the last value for each page. - In 308, the
paging engine 114 identifies a page of the dictionary including the value. The paging engine identifies the page by executing the binary search on the helper vector. For example, based on the binary search, the paging engine 114 can eliminate pages on which the value is not included and, by process of elimination, identify the page on which the value is included. - In 310, the
paging engine 114 loads the page into temporary memory 104. Temporary memory 104 can be buffer or cache memory. The page can include a list of value IDs and corresponding values. - In 312, the
paging engine 114 retrieves the value ID of the value from the page loaded in the temporary memory. When the page is loaded into memory, the paging engine 114 can execute a query to retrieve the value ID from the page. Since only a single page is loaded into temporary memory, executing this query places little burden on operational resources. - In 314, the
query engine 116 executes the requested query on the column using the value ID. The query engine 116 can receive the results from the query and transmit the results to the client device 130. -
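The steps of method 300 can be sketched end to end. This is a rough illustration under assumed names and page contents, using Python's bisect for the binary search over the helper vector of last values:

```python
import bisect

# Assumed structures: sorted dictionary pages of (value, value_id) pairs,
# a helper vector holding the LAST value of each page, and a
# dictionary-encoded column storing value IDs instead of values.
pages = [
    [("apple", 0), ("cherry", 1)],     # page 0
    [("fig", 2), ("mango", 3)],        # page 1
    [("pear", 4), ("zucchini", 5)],    # page 2
]
helper_vector = ["cherry", "mango", "zucchini"]  # last value per page
column = [5, 3, 0, 3, 2]                         # value IDs per row

def lookup_value_id(value: str) -> int:
    # Steps 306-308: binary search the helper vector; the first page whose
    # last value is >= the search value is the only page that can hold it.
    page_no = bisect.bisect_left(helper_vector, value)
    if page_no == len(helper_vector):
        raise KeyError(value)
    for stored_value, value_id in pages[page_no]:  # steps 310-312
        if stored_value == value:
            return value_id
    raise KeyError(value)

def execute_query(value: str) -> list:
    vid = lookup_value_id(value)
    # Step 314: the column scan compares integers, never strings.
    return [row for row, v in enumerate(column) if v == vid]

print(execute_query("mango"))  # rows 1 and 3 encode "mango"
```

Only one dictionary page is touched per lookup; the column scan itself needs no dictionary access at all once the value ID is known.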
FIG. 4 is a flowchart illustrating a process for retrieving a value from a dictionary using a value ID, according to an embodiment. Method 400 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps may be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 4, as will be understood by a person of ordinary skill in the art. -
Method 400 shall be described with reference to FIG. 1. However, method 400 is not limited to that example embodiment. - In 402, the
DMS 102 receives a request to materialize a value using a value ID. As described above, a (paged) dictionary 116 can include multiple pages of value IDs and corresponding values. In the event the dictionary 116 includes values of a fixed data type, the dictionary 116 can store a fixed number of value IDs and corresponding values per page. - In 404, the
paging engine 114 determines a number of values per page in the dictionary. As described above, each page of the dictionary can store a fixed number of value IDs and values. - In 406, the
paging engine 114 identifies a page including the requested value based on the value ID and the number of value IDs per page in the dictionary. For example, paging engine 114 can divide the value ID by the number of value IDs per page (and take the floor of the result) to identify the page including the value ID and requested value. - In 408, the
paging engine 114 loads the identified page into the temporary memory 104. As described above, temporary memory 104 can be a buffer or cache configured to store data for a short amount of time. - In 410, the
paging engine 114 retrieves the value corresponding to the value ID from the page stored in the temporary memory 104. When the page is loaded into memory, the paging engine 114 can execute a query to retrieve the value from the page. -
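Steps 404-410 of method 400 can be sketched as follows; the buffer and page layout are assumptions for illustration, with only the single page containing the value ID copied into temporary memory:

```python
# Stand-ins (assumed): pages persisted in secondary memory, a dict acting
# as the temporary buffer/cache, and a fixed count of value IDs per page.
persisted_pages = {
    0: [10.0, 20.0, 30.0],
    1: [40.0, 50.0, 60.0],
}
IDS_PER_PAGE = 3
temporary_memory = {}

def materialize(value_id: int) -> float:
    page_no = value_id // IDS_PER_PAGE        # step 406: identify the page
    if page_no not in temporary_memory:       # step 408: load only that page
        temporary_memory[page_no] = persisted_pages[page_no]
    return temporary_memory[page_no][value_id % IDS_PER_PAGE]  # step 410

print(materialize(4))            # 50.0
print(sorted(temporary_memory))  # [1]: only one page was buffered
```

Materializing value ID 4 buffers page 1 alone, so memory cost stays at one page regardless of dictionary size.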
FIG. 5 is a flowchart illustrating a process for generating a new value ID index and helper vector based on an event, according to an embodiment. Method 500 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions executing on a processing device), or a combination thereof. It is to be appreciated that not all steps may be needed to perform the disclosure provided herein. Further, some of the steps can be performed simultaneously, or in a different order than shown in FIG. 5, as will be understood by a person of ordinary skill in the art. -
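The regeneration that FIG. 5 illustrates can be sketched roughly as below; the page layout (a list of (value_id, value) pairs per page) is an assumption for illustration, not the patent's on-disk format:

```python
# After an event (e.g., a merge) rewrites the dictionary pages, both
# per-page indexes are rebuilt: the helper vector records the last VALUE
# on each page, and the value ID index records the last VALUE ID.
def rebuild_indexes(pages):
    """pages: list of pages, each a list of (value_id, value) pairs."""
    helper_vector = [page[-1][1] for page in pages]    # last value per page
    value_id_index = [page[-1][0] for page in pages]   # last value ID per page
    return helper_vector, value_id_index

merged_pages = [
    [(0, "ant"), (1, "bee")],
    [(2, "cat"), (3, "dog"), (4, "elk")],
]
hv, vidx = rebuild_indexes(merged_pages)
print(hv)    # ['bee', 'elk']
print(vidx)  # [1, 4]
```

Both structures are derived entirely from the new pages, so they can be regenerated after any event that shifts page boundaries.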
Method 500 shall be described with reference to FIG. 1. However, method 500 is not limited to that example embodiment. - In 502, the
unification engine 112 detects an event causing an update to the values and value IDs in the dictionary. For example, a merge, optimized compression, or DDL operation can be executed on the in-memory dictionary (e.g., dictionary 124). The dictionary 124 can be persisted or saved from the main memory 108 to the secondary memory 123. This can cause the values or value IDs to be updated in the dictionary 116. - In 504, the
unification engine 112 identifies a change in the last values for each page in the dictionary based on the event. For example, based on a merge, more data can be added to the dictionary. This can shift the positions of the values such that the last value for each page can change. - In 506, the
unification engine 112 identifies a change in the last value ID of each page in the dictionary. Continuing with the earlier example, based on a merge, more data can be added to the dictionary. This can shift the positions of the value IDs such that the last value ID for each page can change. - In 508, the
unification engine 112 generates a new helper vector to reflect the changed last values for each page in the dictionary. The new helper vector can store the updated last values for each page in the dictionary. The new helper vector will be persisted in the secondary memory 123. - In 510, the
unification engine 112 generates a new value ID index to reflect the changed last value IDs for each page in the dictionary. The new value ID index can store the updated last value ID for each page in the dictionary in the event the dictionary includes values of variable-size data types. The new value ID index will be persisted in the secondary memory 123. - Various embodiments can be implemented, for example, using one or more computer systems, such as
computer system 600 shown in FIG. 6. Computer system 600 can be used, for example, to implement method 300 of FIG. 3, method 400 of FIG. 4, and method 500 of FIG. 5. Furthermore, computer system 600 can be at least part of server 100 as shown in FIG. 1. For example, computer system 600 can identify a page from the dictionary of FIGS. 1-2, load the page, and retrieve a requested value from the dictionary. Computer system 600 can be any computer capable of performing the functions described herein. -
Computer system 600 includes one or more processors (also called central processing units, or CPUs), such as a processor 604. Processor 604 is connected to a communication infrastructure or bus 606. - One or
more processors 604 can each be a graphics processing unit (GPU). In an embodiment, a GPU is a processor that is a specialized electronic circuit designed to process mathematically intensive applications. The GPU can have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc. -
Computer system 600 also includes user input/output device(s) 603, such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 606 through user input/output interface(s) 602. -
Computer system 600 also includes a main or primary memory 608, such as random access memory (RAM). Main memory 608 can include one or more levels of cache. Main memory 608 has stored therein control logic (i.e., computer software) and/or data. -
Computer system 600 can also include one or more secondary storage devices or memory 610. Secondary memory 610 can include, for example, a hard disk drive 612 and/or a removable storage device or drive 614. Removable storage drive 614 can be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup device, and/or any other storage device/drive. -
Removable storage drive 614 can interact with a removable storage unit 618. Removable storage unit 618 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data. Removable storage unit 618 can be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device. Removable storage drive 614 reads from and/or writes to removable storage unit 618 in a well-known manner. - According to an exemplary embodiment,
secondary memory 610 can include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 600. Such means, instrumentalities or other approaches can include, for example, a removable storage unit 622 and an interface 620. Examples of the removable storage unit 622 and the interface 620 can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and USB port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface. -
Computer system 600 can further include a communication or network interface 624. Communication interface 624 enables computer system 600 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced by reference number 628). For example, communication interface 624 can allow computer system 600 to communicate with remote devices 628 over communications path 626, which can be wired and/or wireless, and which can include any combination of LANs, WANs, the Internet, etc. Control logic and/or data can be transmitted to and from computer system 600 via communication path 626. - In an embodiment, a tangible, non-transitory apparatus or article of manufacture comprising a tangible, non-transitory computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device. This includes, but is not limited to,
computer system 600, main memory 608, secondary memory 610, and removable storage units. - Based on the teachings contained in this disclosure, it will be apparent to persons skilled in the relevant art(s) how to make and use embodiments of this disclosure using data processing devices, computer systems and/or computer architectures other than that shown in
FIG. 6. In particular, embodiments can operate with software, hardware, and/or operating system implementations other than those described herein. - It is to be appreciated that the Detailed Description section, and not any other section, is intended to be used to interpret the claims. Other sections can set forth one or more but not all exemplary embodiments as contemplated by the inventor(s), and thus, are not intended to limit this disclosure or the appended claims in any way.
- While this disclosure describes exemplary embodiments for exemplary fields and applications, it should be understood that the disclosure is not limited thereto. Other embodiments and modifications thereto are possible, and are within the scope and spirit of this disclosure. For example, and without limiting the generality of this paragraph, embodiments are not limited to the software, hardware, firmware, and/or entities illustrated in the figures and/or described herein. Further, embodiments (whether or not explicitly described herein) have significant utility to fields and applications beyond the examples described herein.
- Embodiments have been described herein with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined as long as the specified functions and relationships (or equivalents thereof) are appropriately performed. Also, alternative embodiments can perform functional blocks, steps, operations, methods, etc. using orderings different than those described herein.
- References herein to "one embodiment," "an embodiment," "an example embodiment," or similar phrases, indicate that the embodiment described can include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of persons skilled in the relevant art(s) to incorporate such feature, structure, or characteristic into other embodiments whether or not explicitly mentioned or described herein. Additionally, some embodiments can be described using the expressions "coupled" and "connected" along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, some embodiments can be described using the terms "connected" and/or "coupled" to indicate that two or more elements are in direct physical or electrical contact with each other. The term "coupled," however, can also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
- The breadth and scope of this disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/931,063 US20200387511A1 (en) | 2019-06-07 | 2020-05-13 | Architecture of hybrid in-memory and paged dictionary |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962858693P | 2019-06-07 | 2019-06-07 | |
US15/931,063 US20200387511A1 (en) | 2019-06-07 | 2020-05-13 | Architecture of hybrid in-memory and paged dictionary |
Publications (1)
Publication Number | Publication Date |
---|---|
US20200387511A1 true US20200387511A1 (en) | 2020-12-10 |
Family
ID=73650063
Family Applications (14)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/857,982 Active 2040-10-29 US11514027B2 (en) | 2019-06-07 | 2020-04-24 | Paged hybrid LOBs |
US16/858,344 Active 2040-06-04 US11194789B2 (en) | 2019-06-07 | 2020-04-24 | Content agnostic memory pageable storage model |
US16/863,834 Active 2041-02-26 US11514028B2 (en) | 2019-06-07 | 2020-04-30 | Hybrid data storage and load system with ROWID lookup |
US16/863,860 Abandoned US20200387451A1 (en) | 2019-06-07 | 2020-04-30 | Hybrid data storage and load system with position lookup |
US16/866,766 Active US11151126B2 (en) | 2019-06-07 | 2020-05-05 | Hybrid column store providing both paged and memory-resident configurations |
US15/931,179 Active 2040-05-27 US11341120B2 (en) | 2019-06-07 | 2020-05-13 | Hash composite indexes |
US15/931,063 Abandoned US20200387511A1 (en) | 2019-06-07 | 2020-05-13 | Architecture of hybrid in-memory and paged dictionary |
US16/890,020 Active 2041-10-27 US11726985B2 (en) | 2019-06-07 | 2020-06-02 | Hybrid in-memory/pageable spatial column data |
US16/893,697 Active US11386082B2 (en) | 2019-06-07 | 2020-06-05 | Space efficient vector for columnar data storage |
US16/893,703 Active 2040-10-02 US11372845B2 (en) | 2019-06-07 | 2020-06-05 | In-place load unit conversion |
US17/496,108 Active US11755565B2 (en) | 2019-06-07 | 2021-10-07 | Hybrid column store providing both paged and memory-resident configurations |
US17/540,921 Active US11663200B2 (en) | 2019-06-07 | 2021-12-02 | Content agnostic memory pageable storage model |
US18/216,218 Active US12019622B2 (en) | 2019-06-07 | 2023-06-29 | Hybrid in-memory/pageable spatial column data |
US18/674,387 Pending US20240311371A1 (en) | 2019-06-07 | 2024-05-24 | Hybrid in-memory/pageable spatial column data |
Country Status (1)
Country | Link |
---|---|
US (14) | US11514027B2 (en) |
Family Cites Families (89)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5625711A (en) * | 1994-08-31 | 1997-04-29 | Adobe Systems Incorporated | Method and apparatus for producing a hybrid data structure for displaying a raster image |
US5729637A (en) * | 1994-08-31 | 1998-03-17 | Adobe Systems, Inc. | Method and apparatus for producing a hybrid data structure for displaying a raster image |
US6307962B1 (en) * | 1995-09-01 | 2001-10-23 | The University Of Rochester | Document data compression system which automatically segments documents and generates compressed smart documents therefrom |
US6112286A (en) * | 1997-09-19 | 2000-08-29 | Silicon Graphics, Inc. | Reverse mapping page frame data structures to page table entries |
US20020099691A1 (en) * | 1998-06-24 | 2002-07-25 | Michael Dean Lore | Method and apparatus for aggregation of data in a database management system |
US6191710B1 (en) | 1998-12-30 | 2001-02-20 | Intel Corp. | Data compression and decompression method and system for data compression and decompression |
US7054504B2 (en) * | 1999-02-25 | 2006-05-30 | Ludwig Lester F | Relative optical path phase reconstruction in the correction of misfocused images using fractional powers of the fourier transform |
US20030041110A1 (en) * | 2000-07-28 | 2003-02-27 | Storymail, Inc. | System, Method and Structure for generating and using a compressed digital certificate |
US20020199096A1 (en) * | 2001-02-25 | 2002-12-26 | Storymail, Inc. | System and method for secure unidirectional messaging |
US7089567B2 (en) * | 2001-04-09 | 2006-08-08 | International Business Machines Corporation | Efficient RPC mechanism using XML |
US8413205B2 (en) * | 2001-09-19 | 2013-04-02 | Tvworks, Llc | System and method for construction, delivery and display of iTV content |
US8042132B2 (en) * | 2002-03-15 | 2011-10-18 | Tvworks, Llc | System and method for construction, delivery and display of iTV content |
US9033569B2 (en) * | 2010-11-22 | 2015-05-19 | Tseng-Lu Chien | Lamp holder has built-in night light |
US7383414B2 (en) * | 2004-05-28 | 2008-06-03 | Oracle International Corporation | Method and apparatus for memory-mapped input/output |
US8200887B2 (en) * | 2007-03-29 | 2012-06-12 | Violin Memory, Inc. | Memory management system and method |
US8224026B2 (en) * | 2005-12-08 | 2012-07-17 | Lenel Systems International, Inc. | System and method for counting people near external windowed doors |
JP5041458B2 (en) * | 2006-02-09 | 2012-10-03 | 本田技研工業株式会社 | Device for detecting three-dimensional objects |
FR2899353B1 (en) * | 2006-03-31 | 2008-06-27 | Infovista Sa Sa | MEMORY MANAGEMENT SYSTEM FOR REDUCING MEMORY FRAGMENTATION |
US8526049B2 (en) * | 2006-03-31 | 2013-09-03 | Konica Minolta Laboratory U.S.A., Inc. | Systems and methods for display list management |
US20090273598A1 (en) * | 2008-05-01 | 2009-11-05 | M.E.P. Cad, Inc. | Methods and apparatuses for automatically converting objects in CAD drawing from two-dimensions to three-dimensions |
US8032499B2 (en) | 2007-05-21 | 2011-10-04 | Sap Ag | Compression of tables based on occurrence of values |
US20090102924A1 (en) * | 2007-05-21 | 2009-04-23 | Masten Jr James W | Rapidly Deployable, Remotely Observable Video Monitoring System |
TW200907764A (en) * | 2007-08-01 | 2009-02-16 | Unique Instr Co Ltd | Three-dimensional virtual input and simulation apparatus |
US8427552B2 (en) * | 2008-03-03 | 2013-04-23 | Videoiq, Inc. | Extending the operational lifetime of a hard-disk drive used in video data storage applications |
US20090327621A1 (en) * | 2008-06-27 | 2009-12-31 | Microsoft Corporation | Virtual memory compaction and compression using collaboration between a virtual memory manager and a memory manager |
US8495036B2 (en) * | 2008-10-24 | 2013-07-23 | Microsoft Corporation | Blob manipulation in an integrated structured storage system |
US10509304B2 (en) * | 2008-11-12 | 2019-12-17 | Tseng-Lu Chien | LED projection light has features |
US9292549B2 (en) * | 2008-12-22 | 2016-03-22 | Sap Se | Method and system for index serialization |
US8060857B2 (en) * | 2009-01-31 | 2011-11-15 | Ted J. Biggerstaff | Automated partitioning of a computation for parallel or other high capability architecture |
US8326893B2 (en) * | 2009-02-24 | 2012-12-04 | International Business Machines Corporation | Allocating data sets to a container data set |
US8260030B2 (en) * | 2009-03-30 | 2012-09-04 | Koh Young Technology Inc. | Inspection method |
US8935223B2 (en) * | 2009-04-30 | 2015-01-13 | Oracle International Corporation | Structure of hierarchical compressed data structure for tabular data |
KR101725044B1 (en) * | 2010-05-27 | 2017-04-11 | 삼성전자주식회사 | Imaging display apparatus |
US20120089775A1 (en) * | 2010-10-08 | 2012-04-12 | Sandeep Ranade | Method and apparatus for selecting references to use in data compression |
US8819056B2 (en) * | 2010-11-19 | 2014-08-26 | International Business Machines Corporation | Facilitation of search, list, and retrieval operations on persistent data set using distributed shared memory |
US9201652B2 (en) * | 2011-05-03 | 2015-12-01 | Qualcomm Incorporated | Methods and apparatus for storage and translation of entropy encoded software embedded within a memory hierarchy |
US20130346700A1 (en) * | 2012-06-21 | 2013-12-26 | Alexander I. Tomlinson | Systems and methods for managing memory |
DE102012022728A1 (en) * | 2012-11-21 | 2014-05-22 | Unify Gmbh & Co. Kg | A method of controlling a flash memory for mass storage comprised of a communication device connectable to a host, and computer program product for executing the method |
US9009439B2 (en) | 2013-03-12 | 2015-04-14 | Sap Se | On-disk operations on fragments to support huge data sizes |
US10474652B2 (en) * | 2013-03-14 | 2019-11-12 | Inpixon | Optimizing wide data-type storage and analysis of data in a column store database |
US8972465B1 (en) * | 2013-03-15 | 2015-03-03 | Emc Corporation | Burst buffer appliance with small file aggregation |
US20160085955A1 (en) * | 2013-06-10 | 2016-03-24 | Doosra, Inc. | Secure Storing and Offline Transferring of Digitally Transferable Assets |
RU2538941C1 (en) * | 2013-06-14 | 2015-01-10 | Общество с ограниченной ответственностью "Аби Девелопмент" | Recognition quality enhancements by increasing image resolution |
US9128968B2 (en) * | 2013-07-25 | 2015-09-08 | Facebook, Inc. | Systems and methods for data compression |
US9350550B2 (en) * | 2013-09-10 | 2016-05-24 | M2M And Iot Technologies, Llc | Power management and security for wireless modules in “machine-to-machine” communications |
US10311154B2 (en) * | 2013-09-21 | 2019-06-04 | Oracle International Corporation | Combined row and columnar storage for in-memory databases for OLTP and analytics workloads |
US9684682B2 (en) * | 2013-09-21 | 2017-06-20 | Oracle International Corporation | Sharding of in-memory objects across NUMA nodes |
US9430390B2 (en) * | 2013-09-21 | 2016-08-30 | Oracle International Corporation | Core in-memory space and object management architecture in a traditional RDBMS supporting DW and OLTP applications |
US9977802B2 (en) * | 2013-11-21 | 2018-05-22 | Sap Se | Large string access and storage |
US9384206B1 (en) * | 2013-12-26 | 2016-07-05 | Emc Corporation | Managing data deduplication in storage systems |
US9317208B2 (en) * | 2013-12-31 | 2016-04-19 | Sybase, Inc. | Data row cache for an acid compliant in-memory row store in a page-based RDBMS engine |
DE102014201571B4 (en) * | 2014-01-29 | 2022-08-04 | Carl Zeiss Meditec Ag | Module for data mirroring in a visualization device, visualization device and method for adapting the device |
US9342544B2 (en) * | 2014-01-30 | 2016-05-17 | International Business Machines Corporation | Parallel load in a column-store database |
EP3107429B1 (en) * | 2014-02-20 | 2023-11-15 | MBL Limited | Methods and systems for food preparation in a robotic cooking kitchen |
US9875180B2 (en) * | 2014-02-24 | 2018-01-23 | Sandisk Technologies Llc | Systems and methods for managing storage compression operations |
US9898551B2 (en) | 2014-11-25 | 2018-02-20 | Sap Se | Fast row to page lookup of data table using capacity index |
US20160148000A1 (en) * | 2014-11-25 | 2016-05-26 | Freescale Semiconductor, Inc. | Method and apparatus for encoding image data |
US9811549B2 (en) * | 2014-11-25 | 2017-11-07 | Sap Se | Applying a database transaction log record directly to a database table container |
US9891831B2 (en) | 2014-11-25 | 2018-02-13 | Sap Se | Dual data storage using an in-memory array and an on-disk page structure |
US9817858B2 (en) | 2014-12-10 | 2017-11-14 | Sap Se | Generating hash values |
US20160188176A1 (en) * | 2014-12-31 | 2016-06-30 | Valepro, LLC | Systems and Methods for Resident Space Object Visualization |
US10387414B2 (en) * | 2015-04-13 | 2019-08-20 | Risk Management Solutions, Inc. | High performance big data computing system and platform |
US10037270B2 (en) * | 2015-04-14 | 2018-07-31 | Microsoft Technology Licensing, Llc | Reducing memory commit charge when compressing memory |
US10552058B1 (en) * | 2015-07-17 | 2020-02-04 | Radian Memory Systems, Inc. | Techniques for delegating data processing to a cooperative memory controller |
US10372706B2 (en) * | 2015-07-29 | 2019-08-06 | Oracle International Corporation | Tracking and maintaining expression statistics across database queries |
US10204135B2 (en) * | 2015-07-29 | 2019-02-12 | Oracle International Corporation | Materializing expressions within in-memory virtual column units to accelerate analytic queries |
US9990308B2 (en) * | 2015-08-31 | 2018-06-05 | Oracle International Corporation | Selective data compression for in-memory databases |
US11249968B2 (en) * | 2016-05-09 | 2022-02-15 | Sap Se | Large object containers with size criteria for storing mid-sized large objects |
US10002010B2 (en) * | 2016-05-13 | 2018-06-19 | International Business Machines Corporation | Multi-byte compressed string representation |
US10691688B2 (en) * | 2016-06-17 | 2020-06-23 | Sap Se | Cracking page-loadable columns for in-memory data management |
US10733045B2 (en) * | 2016-07-14 | 2020-08-04 | Microsoft Technology Licensing, Llc | Online repair of metadata for structured data including file systems |
US11204895B1 (en) * | 2016-09-28 | 2021-12-21 | Amazon Technologies, Inc. | Data payload clustering for data storage systems |
US10642841B2 (en) * | 2016-11-17 | 2020-05-05 | Sap Se | Document store utilizing partial object compression |
US10762071B2 (en) * | 2016-11-29 | 2020-09-01 | Sap Se | Value-ID-based sorting in column-store databases |
US10901639B2 (en) * | 2016-11-29 | 2021-01-26 | Sap Se | Memory allocation in multi-core processors |
US11132128B2 (en) * | 2017-03-24 | 2021-09-28 | Veritas Technologies Llc | Systems and methods for data placement in container-based storage systems |
US10831787B2 (en) * | 2017-06-30 | 2020-11-10 | Sap Se | Security of a computer system |
EP3646193B1 (en) * | 2017-06-30 | 2023-03-15 | Microsoft Technology Licensing, LLC | Online schema change of range-partitioned index in distributed storage system |
US11884934B2 (en) * | 2017-07-21 | 2024-01-30 | Washington University | Methods and compositions for T cell activation |
US10803040B2 (en) * | 2017-08-28 | 2020-10-13 | International Business Machines Corporation | Efficient and accurate lookups of data by a stream processor using a hash table |
FI20176151A1 (en) * | 2017-12-22 | 2019-06-23 | Vuolearning Ltd | A heuristic method for analyzing content of an electronic document |
US11550485B2 (en) * | 2018-04-23 | 2023-01-10 | Sap Se | Paging and disk storage for document store |
TWI671631B (en) * | 2018-08-01 | 2019-09-11 | 大陸商深圳大心電子科技有限公司 | Memory management method and storage controller |
US10803087B2 (en) * | 2018-10-19 | 2020-10-13 | Oracle International Corporation | Language interoperable runtime adaptable data collections |
US11562085B2 (en) * | 2018-10-19 | 2023-01-24 | Oracle International Corporation | Anisotropic compression as applied to columnar storage formats |
US20200241792A1 (en) * | 2019-01-29 | 2020-07-30 | Sap Se | Selective Restriction of Large Object Pages in a Database |
US20200320159A1 (en) * | 2019-03-22 | 2020-10-08 | Robert R. Matthews | Magnetic-enthalpy symbol H gasiform species dilute, or solution a heater fluid dynamic mechanic class system transfer the integral sub-plasma thermal cycle state of matter dilute, or solution containing the greater amount of solute synthesis extraction the solution pipeline standard a heater fluid. |
US11514027B2 (en) | 2019-06-07 | 2022-11-29 | Sap Se | Paged hybrid LOBs |
US11502705B2 (en) * | 2019-06-21 | 2022-11-15 | Sap Se | Advanced database decompression |
2020
- 2020-04-24 US US16/857,982 patent/US11514027B2/en active Active
- 2020-04-24 US US16/858,344 patent/US11194789B2/en active Active
- 2020-04-30 US US16/863,834 patent/US11514028B2/en active Active
- 2020-04-30 US US16/863,860 patent/US20200387451A1/en not_active Abandoned
- 2020-05-05 US US16/866,766 patent/US11151126B2/en active Active
- 2020-05-13 US US15/931,179 patent/US11341120B2/en active Active
- 2020-05-13 US US15/931,063 patent/US20200387511A1/en not_active Abandoned
- 2020-06-02 US US16/890,020 patent/US11726985B2/en active Active
- 2020-06-05 US US16/893,697 patent/US11386082B2/en active Active
- 2020-06-05 US US16/893,703 patent/US11372845B2/en active Active
2021
- 2021-10-07 US US17/496,108 patent/US11755565B2/en active Active
- 2021-12-02 US US17/540,921 patent/US11663200B2/en active Active
2023
- 2023-06-29 US US18/216,218 patent/US12019622B2/en active Active
2024
- 2024-05-24 US US18/674,387 patent/US20240311371A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120221528A1 (en) * | 2011-01-14 | 2012-08-30 | Sap Ag | Logging scheme for column-oriented in-memory databases |
US20130124491A1 (en) * | 2011-11-11 | 2013-05-16 | Gerald Pepper | Efficient Pipelined Binary Search |
US20140222418A1 (en) * | 2012-04-30 | 2014-08-07 | Martin Richtarsky | Fixed string dictionary |
US20160012089A1 (en) * | 2014-07-10 | 2016-01-14 | Reza Sherkat | Main memory database management using page index vectors |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12072888B1 (en) * | 2023-05-12 | 2024-08-27 | Sap Se | Cooperative memory management in database management systems |
Also Published As
Publication number | Publication date |
---|---|
US12019622B2 (en) | 2024-06-25 |
US11663200B2 (en) | 2023-05-30 |
US11194789B2 (en) | 2021-12-07 |
US20230350879A1 (en) | 2023-11-02 |
US11372845B2 (en) | 2022-06-28 |
US20240311371A1 (en) | 2024-09-19 |
US11151126B2 (en) | 2021-10-19 |
US20200387490A1 (en) | 2020-12-10 |
US20200387451A1 (en) | 2020-12-10 |
US20220092057A1 (en) | 2022-03-24 |
US20200387487A1 (en) | 2020-12-10 |
US11726985B2 (en) | 2023-08-15 |
US11755565B2 (en) | 2023-09-12 |
US11341120B2 (en) | 2022-05-24 |
US11514028B2 (en) | 2022-11-29 |
US20220027354A1 (en) | 2022-01-27 |
US20200387305A1 (en) | 2020-12-10 |
US20200387502A1 (en) | 2020-12-10 |
US11514027B2 (en) | 2022-11-29 |
US20200387495A1 (en) | 2020-12-10 |
US20200387486A1 (en) | 2020-12-10 |
US11386082B2 (en) | 2022-07-12 |
US20200387488A1 (en) | 2020-12-10 |
US20200387509A1 (en) | 2020-12-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20200387511A1 (en) | Architecture of hybrid in-memory and paged dictionary | |
US9886464B2 (en) | Versioned bloom filter | |
US11226965B2 (en) | Non-homogenous synopsis for efficient partition pruning | |
US20170228373A1 (en) | Dynamic Hash Table Size Estimation During Database Aggregation Processing | |
US11036517B2 (en) | Database management system performing column operations using a set of SIMD processor instructions selected based on performance | |
US20180349422A1 (en) | Database management system, database server, and database management method | |
US10565205B2 (en) | Incrementally building hash collision tables | |
EP3173947B1 (en) | Paged inverted index | |
US11775496B2 (en) | Data transfer and management system for in-memory database | |
US10565204B2 (en) | Hash collision tables for relational join operations | |
US9449046B1 (en) | Constant-vector computation system and method that exploits constant-value sequences during data processing | |
US11442862B2 (en) | Fair prefetching in hybrid column stores | |
US9471634B2 (en) | Execution of negated conditions using a bitmap | |
US12072903B2 (en) | Data management system for managing inferences | |
US11836121B1 (en) | Delta merge with location data | |
US20200311062A1 (en) | Data Partitioning and Transfer System | |
CN114625548A (en) | Look-ahead ranking for accelerated data extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: SAP SE, GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHERKAT, REZA;FLORENDO, COLIN;GOTTIPATI, CHAITANYA;AND OTHERS;SIGNING DATES FROM 20200427 TO 20200511;REEL/FRAME:052851/0401 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |