
CN118626418A - Method and computing device for memory caching system - Google Patents

Method and computing device for memory caching system

Info

Publication number
CN118626418A
Authority
CN
China
Prior art keywords: cache, queue, access, key, memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411081350.5A
Other languages
Chinese (zh)
Inventor
吴迪
时建军
王少军
牛昕宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Corerain Technologies Co Ltd
Original Assignee
Shenzhen Corerain Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Corerain Technologies Co Ltd filed Critical Shenzhen Corerain Technologies Co Ltd
Priority to CN202411081350.5A priority Critical patent/CN118626418A/en
Publication of CN118626418A publication Critical patent/CN118626418A/en
Pending legal-status Critical Current

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a method and a computing device for a memory caching system that stores key-value objects. The method comprises the following steps: creating a cache instance according to a designated cache capacity and an estimated number of objects, wherein the cache instance comprises a small queue and a main queue, the small queue stores recently added objects or objects whose access frequency is below an access threshold, and the main queue stores objects whose access frequency is equal to or greater than the access threshold; writing an object to be cached into the cache instance, with write operations always performed on the small queue; and, when the cache capacity is insufficient, preferentially evicting cached key-value objects from the small queue. The technical scheme of the invention ensures the cache hit rate, reduces cache pollution, guarantees memory safety and concurrency, and simplifies the development and maintenance of a cache system.

Description

Method and computing device for memory caching system
Technical Field
The invention relates to the technical field of software development, and in particular to a method and a computing device for an efficient memory caching system.
Background
With the rapid growth of the internet, web applications face increasing performance challenges. Requirements such as high concurrency, massive data volumes, and low latency place higher demands on traditional cache systems. Existing caching algorithms such as LRU (Least Recently Used) are simple to implement, but have low hit rates and high memory consumption under certain workloads.
The traditional LFU (Least Frequently Used) algorithm must maintain an access-frequency counter for each object, giving it high space complexity: as the number of cached objects grows, the memory occupied by the frequency counters grows sharply and puts the system under excessive memory pressure. Traditional LFU also evicts the least frequently accessed objects, which can cause cache pollution in some scenarios. For example, under a scan access pattern, LFU may keep a large number of one-off objects resident in the cache for a long time, occupying the space of hot objects and reducing the overall hit rate of the cache.
Therefore, a technical scheme is needed that can ensure the cache hit rate, reduce cache pollution, guarantee memory safety and concurrency, and simplify the development and maintenance of the cache system.
Disclosure of Invention
The invention aims to provide a method and a computing device for an efficient memory caching system that can ensure the cache hit rate, reduce cache pollution, guarantee memory safety and concurrency, and simplify the development and maintenance of the cache system.
According to an aspect of the present invention, there is provided a method for a memory caching system used to store key-value objects, the method comprising:
creating a cache instance according to a designated cache capacity and an estimated number of objects, wherein the cache instance comprises a small queue and a main queue, the small queue is used for storing recently added objects or objects whose access frequency is below an access threshold, and the main queue is used for storing objects whose access frequency is equal to or greater than the access threshold;
writing an object to be cached into the cache instance, with write operations always performed on the small queue;
and, when the cache capacity is insufficient, preferentially evicting cached key-value objects from the small queue.
According to some embodiments, writing the object to be cached into the cache instance includes:
calling a read method that queries the value corresponding to the key of the object to be cached, wherein if the key exists, the corresponding value is returned and the access frequency of the object is increased; if the key does not exist, a null value is returned.
According to some embodiments, the cached key-value object is moved from the small queue to the main queue when its access count exceeds the access threshold.
According to some embodiments, preferentially evicting cached key-value objects from the small queue when the cache capacity is insufficient includes:
preferentially evicting from the small queue the cached key-value objects whose access frequency is below a frequency eviction threshold.
According to some embodiments, preferentially evicting cached key-value objects from the small queue when the cache capacity is insufficient includes:
evicting cached key-value objects from the main queue if the small queue is empty.
According to some embodiments, when the small queue is empty, the cached key-value objects whose access frequency is below a frequency eviction threshold and/or whose residence time is below a residence-time eviction threshold are evicted.
According to some embodiments, when a cached key-value object is read, the corresponding object metadata is updated, the object metadata including the access frequency and the queue the object belongs to.
According to some embodiments, the write and eviction policies of the memory caching system are adjusted dynamically:
when the workload of the memory caching system changes, the capacities of the small queue and the main queue and/or the residence-time and/or access-frequency requirements of cached key-value objects are adjusted adaptively.
According to another aspect of the invention, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method as claimed in any one of the above.
According to another aspect of the present invention, there is provided a computing device comprising:
A processor; and
A memory storing a computer program which, when executed by the processor, implements the method of any of the above.
According to an embodiment of the present invention, a high-performance caching method is provided that aims to improve the cache hit rate and resource utilization in a multithreaded environment. The method combines several data structures and algorithms: after a cache instance is created according to a designated cache capacity and an estimated number of objects, a write method is called to write objects into the cache, and when the cache capacity is insufficient, objects with low access frequency are evicted automatically according to the algorithm to free space for new objects. Through a flexible cache management strategy and an optimized concurrency mechanism, memory overhead and cache replacement cost are minimized while high performance is maintained. The invention ensures a high cache hit rate, reduces cache pollution, guarantees memory safety and concurrency, and simplifies the development and maintenance of a cache system.
According to some embodiments, setting an upper limit on cache capacity prevents unbounded cache growth and ensures reasonable control and efficient use of system resources. At the same time, an eviction strategy based on access frequency preferentially retains hot objects, improves the cache hit rate, and reduces frequent access to the back-end data source, thereby improving the response speed and performance of the whole system.
According to some embodiments, the policy can dynamically adjust the cache content according to the real-time access pattern and cache state, keeping the cache closer to the needs of the user or application. When new objects are added to the cache, objects with low access frequency can be evicted, ensuring the freshness and relevance of the cached content.
According to some embodiments, an automated eviction mechanism reduces the need for manual intervention, lowering the complexity and cost of cache management. Because the selection and eviction of cache objects are completed automatically by the algorithm, performance bottlenecks or wasted resources caused by human misjudgment are also reduced.
According to some embodiments, depending on the application scenario and its requirements, the cache capacity can be adjusted flexibly, different eviction algorithms can be selected, and several policies can even be combined to achieve the best caching effect.
According to some embodiments, by serving frequently accessed data quickly, latency is reduced, front-end response speed is improved, and the user experience is significantly improved. This is particularly important in applications with high concurrency or strict real-time requirements.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments will be briefly described below.
FIG. 1 illustrates a method flow diagram for an efficient memory caching system, according to an example embodiment.
FIG. 2 illustrates a memory caching system workflow diagram according to an example embodiment.
FIG. 3 illustrates a flow diagram for removing queue objects according to an example embodiment.
FIG. 4 illustrates a block diagram of a computing device in accordance with an exemplary embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various components, these components should not be limited by these terms. These terms are used to distinguish one element from another element. Accordingly, a first component discussed below could be termed a second component without departing from the teachings of the present inventive concept. As used herein, the term "and/or" includes any one of the associated listed items and all combinations of one or more.
The user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present invention are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of related data is required to comply with the relevant laws and regulations and standards of the relevant country and region, and is provided with corresponding operation entries for the user to select authorization or rejection.
Those skilled in the art will appreciate that the drawings are schematic representations of example embodiments and that the modules or flows in the drawings are not necessarily required to practice the invention and therefore should not be taken to limit the scope of the invention.
Existing memory caching systems, while having significant advantages in terms of improving data access speed and system performance, present some inherent disadvantages and challenges. Memory is small relative to disk storage, which means that the cache may not be able to hold all the data, limiting the size and effectiveness of the cache. The data in the memory is non-persistent and is lost once the system is powered down or restarted. This requires that the cache system must have a data backup and restore mechanism to ensure data integrity and consistency.
If the data in the cache and the database become inconsistent, business logic errors may result. Cache update policies, such as write-through, write-bypass, and read repair, need to be carefully designed to maintain data consistency. The update policy must balance data timeliness, data consistency, and the efficiency of cache updates, which increases system complexity. Cache penetration refers to queried data that exists in neither the database nor the cache, so every request falls through to the database and increases its load. Cache avalanche refers to a large amount of cached data expiring at the same time, so that a flood of requests reaches the database and may crash the system. In a multi-threaded or distributed environment, concurrent access to the same cached data can cause resource contention and lock contention, degrading system performance. A memory cache also requires regular management and maintenance, including monitoring, optimization, and fault recovery, which increases operating cost. Memory is typically more expensive than disk storage, so deploying memory caches at scale can incur high hardware costs. A single-node memory caching system may struggle with large data volumes and highly concurrent requests, requiring a distributed caching scheme to improve scalability and fault tolerance. Configuring and tuning a cache system can be complex, requiring a deep understanding of the business requirements and the system architecture to achieve optimal performance and efficiency.
Therefore, the invention provides a method for an efficient memory caching system that can ensure the cache hit rate, reduce cache pollution, guarantee memory safety and concurrency, and simplify the development and maintenance of the cache system.
Before describing embodiments of the present invention, some terms or concepts related to the embodiments of the present invention are explained.
LRU policy: the LRU (Least Recently Used) policy is a common cache-eviction algorithm used to determine which entry should be removed to make room for new data when the cache is full. Its basic idea is: if data has not been accessed in the recent past, it is less likely to be accessed in the future, so such data can be considered "least recently used" and is a good candidate for eviction.
LFU policy: the LFU (Least Frequently Used) policy is a cache-eviction algorithm primarily used to determine which entry should be removed when the cache is full. Unlike the LRU (least recently used) policy, LFU focuses on the access frequency of data rather than its access time. Its core idea is: if data has been accessed infrequently in the past, it is less likely to be accessed in the future, so such data can be evicted preferentially.
FIFO policy: the FIFO (First In, First Out) policy is a simple cache-eviction or data-structure management algorithm that follows the principle that the earliest incoming data should be removed first. This policy is used in a variety of systems, including cache management in computer science, process scheduling in operating systems, transaction processing in database management, and data buffering in hardware design.
TTL policy: the TTL (Time To Live) policy is a cache management and data-eviction policy mainly used to handle data entries with a limited lifetime. Under a TTL policy, each data entry is given an expiration time; when that time arrives, the entry is considered expired and is removed from the cache. The policy is particularly suitable for scenarios with strict data-freshness requirements, such as dynamic pricing, real-time data analysis, and short-lived authentication tokens.
Rust language: Rust is a systems programming language focused on speed, memory safety, and concurrency. Rust is designed to solve problems common in C and C++, especially memory safety, without sacrificing performance. Its syntax is somewhat similar to C++, but it provides a stricter memory management and ownership model that lets programmers write safer, more reliable code while still enjoying runtime performance close to C or C++.
Compact structure: Compact refers to a data structure or storage layout optimized to reduce memory footprint, improve data-access efficiency, or simplify data processing.
Fast data structure: a data structure optimized for efficiency on specific operations, such as fast lookup, insertion, and deletion.
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings.
FIG. 1 illustrates a method flow diagram for an efficient memory caching system, according to an example embodiment.
Referring to fig. 1, the present invention proposes a high-performance memory caching system that significantly reduces cache pollution while ensuring the cache hit rate. Furthermore, the caching system provides flexible configuration options, so users can trade off between time and space efficiency depending on the actual scenario. The invention is implemented in the Rust language, which ensures memory safety and concurrency and greatly simplifies the development and maintenance of the cache system.
In S101, a cache instance is created according to the specified cache capacity and the estimated number of objects.
The cache instance includes a small queue, which stores recently added objects or objects whose access frequency is below an access threshold, and a main queue, which stores objects whose access frequency is equal to or greater than the access threshold.
When the program is initialized and started, a cache instance is created from its configuration. Creating the instance from a designated cache capacity and an estimated number of objects generally involves several key steps: determining the cache policy, selecting an appropriate cache data structure, and configuring the cache parameters.
First, the cache policy is determined; common policies include, but are not limited to, the LRU, LFU, FIFO, and TTL policies. The LRU policy is least-recently-used: when the cache is full, the least recently used objects are removed. The LFU policy is least-frequently-used: the least frequently accessed objects are removed. The FIFO policy is first-in, first-out: the earliest added objects are removed. The TTL policy is time-based: each object has a time to live and is removed automatically when it expires.
An appropriate cache data structure is then selected according to the chosen policy. The invention combines FIFO and LFU: an LFU policy can be implemented with a hash table combined with a priority queue (min-heap), and a FIFO policy can be implemented with a queue.
Finally, cache parameters such as the cache capacity and the number of objects are configured. The cache size (total_weight_limit) defines the maximum size of the cache, typically in bytes or number of objects. The estimated number of objects pre-sizes the cache and guides the choice of data structure. The estimated object size predicts the average size of each object so that the buffer space can be managed better. Details of the replacement policy, such as the LFU count-update rules, are also configured.
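As an illustration of this step, the following is a minimal sketch of creating a cache instance from the two parameters named above (the cache capacity and the estimated number of objects). All type and field names here are illustrative assumptions for exposition rather than an existing API, and the 1:9 capacity split between the small queue and the main queue is likewise only an example.
use std::collections::VecDeque;

// Hypothetical cache instance holding a small queue and a main queue of keys.
#[allow(dead_code)]
struct CacheInstance {
    total_weight_limit: usize, // maximum cache size (bytes or object count)
    small_weight_limit: usize, // budget for recent / low-frequency objects
    main_weight_limit: usize,  // budget for hot objects
    small: VecDeque<u64>,      // keys currently held by the small queue
    main: VecDeque<u64>,       // keys currently held by the main queue
}

impl CacheInstance {
    fn new(total_weight_limit: usize, estimated_items: usize) -> Self {
        let small_weight_limit = total_weight_limit / 10; // example split: 10% for the small queue
        CacheInstance {
            total_weight_limit,
            small_weight_limit,
            main_weight_limit: total_weight_limit - small_weight_limit,
            small: VecDeque::with_capacity(estimated_items / 10),
            main: VecDeque::with_capacity(estimated_items),
        }
    }
}

fn main() {
    // e.g. a 64 MiB cache expected to hold around 100,000 objects
    let cache = CacheInstance::new(64 * 1024 * 1024, 100_000);
    println!("small queue budget: {} bytes", cache.small_weight_limit);
}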
According to some embodiments, data objects of any type, such as int (integer), string (text), byte, or struct, may be cached. When to use the cache system and when to invoke a write operation is up to the developer, who should be clear about the usage scenario and the benefits the cache provides: objects are written to the cache where appropriate and read from the cache wherever the cached object is needed elsewhere.
At S103, the object to be cached is written to the cache instance.
The object to be cached is written into the cache instance, and the write operation is always performed on the small queue. During writing, if the cache capacity is insufficient, some objects are evicted according to the FIFO policy until there is enough space to accommodate the new object.
The object to be written to the cache is determined, together with a unique key to reference it. The key should be unique enough that the object can later be retrieved from the cache accurately. The write method provided by the memory caching system is then called, passing the key and the value as parameters, and the object is written into the cache according to the cache policy.
According to some embodiments, a read method (get) is called to query the value corresponding to a key: if the key exists, the corresponding value is returned and the access frequency of the cached object is increased; if the key does not exist, a null value (None) is returned.
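The write and read paths just described can be sketched as follows, assuming a simplified single-threaded cache: a standard HashMap stands in for the real sharded storage, and a per-entry freq field stands in for the TinyLFU estimate.
use std::collections::HashMap;

struct Entry<V> {
    value: V,
    freq: u32, // per-object access frequency (stand-in for the TinyLFU estimate)
}

struct SimpleCache<V> {
    map: HashMap<String, Entry<V>>,
}

impl<V: Clone> SimpleCache<V> {
    // write path: new objects enter with frequency 0 (conceptually, into the small queue)
    fn put(&mut self, key: String, value: V) {
        self.map.insert(key, Entry { value, freq: 0 });
    }

    // read path: on a hit, bump the access frequency and return the value; on a miss, return None
    fn get(&mut self, key: &str) -> Option<V> {
        match self.map.get_mut(key) {
            Some(entry) => {
                entry.freq += 1;
                Some(entry.value.clone())
            }
            None => None,
        }
    }
}

fn main() {
    let mut cache = SimpleCache { map: HashMap::new() };
    cache.put("user:1".to_string(), "alice");
    assert_eq!(cache.get("user:1"), Some("alice")); // hit: value returned, frequency increased
    assert_eq!(cache.get("user:2"), None);          // miss: a null value (None) is returned
}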
According to some embodiments, when a cached key-value object is read, the caching system updates the object's metadata, such as its access frequency and the queue it belongs to. This metadata guides eviction decisions and ensures efficient use of the cache.
When the access count of a cached key-value object exceeds the access threshold, the object is moved from the small queue to the main queue.
In S105, when the cache capacity is insufficient, cached key-value objects are preferentially evicted from the small queue.
According to some embodiments, cached key-value objects whose access frequency is below a frequency eviction threshold are preferentially evicted from the small queue. If the small queue is empty, cached key-value objects are evicted from the main queue, namely those whose access frequency is below a frequency eviction threshold and/or whose residence time is below a residence-time eviction threshold.
According to some embodiments, when the cache capacity is insufficient, the caching system automatically evicts some objects to make room for new ones. The eviction process follows the FIFO strategy and preferentially evicts cache objects with low access frequency from the small queue; if the small queue is empty, cache objects with low access frequency and long residence time are evicted from the main queue.
According to some embodiments, falling back to the main queue when the small queue is empty is how the design handles complex caching scenarios, for the following reasons: it prevents deadlock, since if the small queue is empty or all of its objects have been promoted to the main queue, the system still needs a way to continue evicting; it maintains the total weight limit, since even if the small queue has room, objects must still be evicted when the total weight exceeds the limit; it adapts to dynamic change, since in a concurrent environment the queue state may change between the check and the execution; and it optimizes the cache content, since evicting rarely used objects from the main queue is sometimes more beneficial than adding new objects to the small queue.
The system distinguishes and handles objects of different frequencies and residence times through several mechanisms: a TinyLFU estimator tracks the access frequency of each object, and a uses counter tracks how many times an object has been accessed.
Evicted objects are reported to the method caller through a return value. The memory caching system dynamically adjusts its write and eviction policies: by combining the LFU and FIFO strategies, it adapts the admission and eviction of the cache, and as the workload changes it adaptively adjusts the capacities of the small queue and the main queue as well as the residence-time and access-frequency requirements of cached key-value objects.
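The eviction order described above can be sketched as follows, with simplified stand-in queue types rather than the actual implementation: weight is reclaimed from the small queue first, the main queue is used only once the small queue is exhausted, and the evicted keys are reported back to the caller through the return value.
use std::collections::VecDeque;

struct Evictor {
    small: VecDeque<(u64, usize)>, // (key, weight) of recent / low-frequency objects
    main: VecDeque<(u64, usize)>,  // (key, weight) of hot objects
}

impl Evictor {
    // Evict until at least `needed` weight is reclaimed; returns the evicted keys.
    fn make_room(&mut self, needed: usize) -> Vec<u64> {
        let mut freed = 0;
        let mut evicted = Vec::new();
        while freed < needed {
            // prefer the small queue; fall back to the main queue only when it is empty
            match self.small.pop_front().or_else(|| self.main.pop_front()) {
                Some((key, weight)) => {
                    freed += weight;
                    evicted.push(key); // the caller learns of evictions via this return value
                }
                None => break, // both queues are empty: nothing left to evict
            }
        }
        evicted
    }
}

fn main() {
    let mut e = Evictor {
        small: VecDeque::from([(1, 100), (2, 200)]),
        main: VecDeque::from([(3, 300)]),
    };
    println!("evicted keys: {:?}", e.make_room(250)); // frees keys 1 and 2 from the small queue
}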
FIG. 2 illustrates a memory caching system workflow diagram according to an example embodiment.
Existing cache systems have high space complexity: the traditional LFU algorithm must maintain an access-frequency counter for each object, and as the number of cached objects grows, the memory occupied by the frequency counters grows sharply, putting the system under excessive memory pressure. The traditional LFU algorithm evicts the objects with the lowest access frequency, and in some scenarios this strategy can cause cache pollution; for example, under a scan access pattern, LFU may keep a large number of one-off objects resident in the cache for a long time, occupying the space of hot objects and reducing the overall hit rate of the cache.
The invention provides an efficient caching system that aims to improve the cache hit rate and resource utilization in a multithreaded environment. The caching system combines multiple data structures and algorithms; through a flexible cache management strategy and an optimized concurrency mechanism, it minimizes memory overhead and cache replacement cost while maintaining high performance.
The caching system of the present invention employs a sharded skip list design. An N-shard skip list divides one large skip list into N smaller skip lists (shards), each responsible for storing a portion of the data. The design is memory efficient: the skip list itself uses less memory than a hash table, and sharding makes it more flexible. The lookup complexity of a single skip list over n elements is O(log n); with sharding, each lookup operates on roughly n/N elements, i.e., O(log(n/N)), so when N is large the average lookup approaches constant time. Different shards can be accessed concurrently, which reduces lock contention and gives good concurrency performance.
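The sharding idea can be sketched as follows under simplified assumptions: a key is hashed once to select a shard and then looked up in that shard's ordered map. Here BTreeMap stands in for the skip-list map (SkipMap) discussed below, the shard count of 16 is arbitrary, and a real concurrent version would give each shard its own lock or use a lock-free map.
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const N_SHARDS: usize = 16; // arbitrary shard count for illustration

struct ShardedMap<V> {
    shards: Vec<BTreeMap<u64, V>>, // one ordered map per shard
}

impl<V> ShardedMap<V> {
    fn new() -> Self {
        ShardedMap { shards: (0..N_SHARDS).map(|_| BTreeMap::new()).collect() }
    }

    fn shard_for<K: Hash>(&self, key: &K) -> usize {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        (h.finish() as usize) % N_SHARDS // each shard holds roughly 1/N of the keys
    }

    fn insert(&mut self, key: u64, value: V) {
        let idx = self.shard_for(&key);
        self.shards[idx].insert(key, value); // O(log(n/N)) work inside a single shard
    }

    fn get(&self, key: &u64) -> Option<&V> {
        self.shards[self.shard_for(key)].get(key)
    }
}

fn main() {
    let mut m: ShardedMap<&str> = ShardedMap::new();
    m.insert(42, "value");
    println!("shard of key 42: {}, lookup: {:?}", m.shard_for(&42u64), m.get(&42));
}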
Efficient memory cache management is realized through the Compact structure. Compact uses a SkipMap (a skip-list-based map) across multiple shards, with each shard containing one SkipMap instance, providing fast lookup and update operations when memory usage is tight. The design exploits the ordering of the skip list and the parallelism of the shards, improving lookup speed and the efficiency of insertions and deletions.
The caching system also provides a Fast data structure, which implements cache access through a concurrent HashMap. Fast suits scenarios that require high throughput, offering fast read and write operations and supporting concurrent multi-threaded access. By choosing between the Compact and Fast data structures, the caching system can achieve optimal performance and memory use in different application scenarios. As storage backends, Fast<T> is based on a concurrent hash map and Compact<T> on an N-shard skip list; Fast uses more memory but provides faster access, while Compact uses less memory at a slight cost in access speed. Fast offers the best query performance and suits high-frequency access scenarios, whereas Compact offers better memory efficiency at some cost in performance and suits large numbers of cache entries or memory-constrained environments.
The caching system records access frequency through multiple hash functions and a counter matrix, and periodically applies decay (aging) to the frequency data, which improves the accuracy of the frequency estimates. The invention provides a data structure and algorithm called "Count-Min Write" for efficiently estimating the occurrence frequency of elements in a data stream and recording access frequencies.
Frequency estimation plays several roles. It tracks object access patterns: the frequency estimator records how often each object is accessed (or attempted to be written), which helps identify hot data (data that is accessed often). It assists cache replacement decisions: when the cache is full, frequency information is used to decide whether an existing cache entry should be replaced. And it implements the LFU (Least Frequently Used) policy, an improved cache replacement approach that uses frequency information to make smarter caching decisions.
Frequency estimation plays a key role in the cache system, which does not trigger write operations directly, but rather optimizes cache performance by affecting write decisions. The system can make more intelligent cache replacement decisions, improve the cache efficiency and adapt to various data access modes. This mechanism allows the cache system to consider not only the most recent use of data, but also its long-term access frequency, thereby achieving more efficient cache management.
An embodiment of an algorithmic implementation is described below.
The data structure of Count-Min Write is a two-dimensional table with a small number of rows (typically 4-5) and a larger number of columns (for example, hundreds or thousands); each cell is a counter initialized to 0. One independent hash function is prepared per row, each mapping an input element to a column index. When an element arrives, each hash function computes a column index and the counter in the corresponding row and column is incremented by 1 (the insert operation). A query estimates the frequency of an element by computing the column indices with the same hash functions and taking the minimum of all the corresponding counters.
Because of hash collisions, a counter may be shared by several different elements, which inflates its count. Taking the minimum over the rows limits this effect, since it is unlikely that every row collides badly for the same element.
Count-Min Write is space efficient: it does not need to allocate storage for every possible element, which makes it suitable for very large data sets. It is time efficient: insert and query operations take constant time. It can be parallelized: operations on different rows can proceed in parallel, improving processing speed. And it is tunable: by adjusting the number of rows and columns, accuracy can be balanced against resource consumption. The Count-Min Write algorithm thus provides an innovative way to estimate element frequencies in large-scale data streams quickly and with limited resources, offering sufficiently accurate estimates for many practical applications while remaining highly efficient, and opening new possibilities for big-data processing and analysis.
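A minimal sketch of this counter matrix follows, under simplified assumptions: a per-row seed emulates the independent hash functions, and the depth and width values are only illustrative. Incrementing touches one counter per row, and the estimate is the minimum over the rows.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct CountMin {
    rows: Vec<Vec<u32>>, // depth x width counter matrix, all counters start at 0
    width: usize,
}

impl CountMin {
    fn new(depth: usize, width: usize) -> Self {
        CountMin { rows: vec![vec![0; width]; depth], width }
    }

    fn index<T: Hash>(&self, row: usize, key: &T) -> usize {
        let mut h = DefaultHasher::new();
        row.hash(&mut h); // per-row seed, emulating an independent hash function per row
        key.hash(&mut h);
        (h.finish() as usize) % self.width
    }

    // insert operation: add 1 to one counter in every row
    fn incr<T: Hash>(&mut self, key: &T) {
        for row in 0..self.rows.len() {
            let idx = self.index(row, key);
            self.rows[row][idx] = self.rows[row][idx].saturating_add(1);
        }
    }

    // query operation: the minimum over the rows bounds the over-count from collisions
    fn estimate<T: Hash>(&self, key: &T) -> u32 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.index(row, key)])
            .min()
            .unwrap_or(0)
    }
}

fn main() {
    let mut cm = CountMin::new(4, 1024);
    for _ in 0..3 {
        cm.incr(&"key_a");
    }
    println!("estimated frequency of key_a: {}", cm.estimate(&"key_a")); // at least 3
}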
The aging (decay) process is described by way of example below.
Decay trigger mechanism:
the decay process is triggered in the incr method of TinyLfu:
pub fn incr<T: Hash>(&self, key: T) -> u8 {
    let window_size = self.window_counter.fetch_add(1, Ordering::Relaxed);
    // when the window is full (or the counter has overshot), reset it and age the counters
    if window_size == self.window_limit || window_size > self.window_limit * 2 {
        self.window_counter.store(0, Ordering::Relaxed);
        self.estimator.age(1); // right shift by 1 bit
    }
    self.estimator.incr(key)
}
Every time incr is called, window_counter is incremented by 1; when window_counter reaches window_limit, the decay process is triggered and window_counter is reset to 0.
The decay is implemented as follows:
the decay operation is implemented in the age method of Estimator:
pub fn age(&self, shift: u8) {
    for (slot, _) in self.estimator.iter() {
        for counter in slot.iter() {
            let c = counter.load(Ordering::Relaxed);
            counter.store(c >> shift, Ordering::Relaxed);
        }
    }
}
All counters are traversed and each is shifted right (here by 1 bit), which halves the count value and produces the decay effect.
Decay serves several purposes: it models time decay, so recent accesses matter more than old ones; it prevents overflow, keeping counters from reaching their maximum value under long-running operation; and it supports adaptation, allowing the cache to follow changes in the data access pattern. For example, a counter of 12 becomes 6 after one aging round and 3 after the next, so stale popularity fades unless the object keeps being accessed.
The invention thus realizes an adaptive frequency-estimation system that can follow changing data access patterns while maintaining estimation accuracy. The mechanism strikes a good balance among memory efficiency, computational efficiency, and accuracy, so the caching system performs well in a variety of complex caching scenarios.
FIG. 3 illustrates a flow diagram for removing queue objects according to an example embodiment.
The caching system designs an efficient hierarchical cache queue structure, FiFoQueues, consisting of a small queue and a main queue. The small queue mainly stores recent or less frequently accessed cache objects; as their access frequency increases, these objects are promoted to the main queue, which reduces their probability of being replaced. The main queue stores objects with higher access frequency, and this hierarchical management improves the cache hit rate and management efficiency.
FiFoQueues is the core component implementing the eviction policy in the caching system. It is an improved FIFO (First-In-First-Out) cache replacement policy that aims to improve the hit rate and performance of the cache. Its structure is as follows:
struct FiFoQueues<T> {
    total_weight_limit: usize, // upper bound on the total cache weight
    small: SegQueue<Key>,      // small queue: newly admitted items
    small_weight: AtomicUsize, // current weight held by the small queue
    main: SegQueue<Key>,       // main queue: items promoted from the small queue
    main_weight: AtomicUsize,  // current weight held by the main queue
    estimator: TinyLfu,        // frequency estimator guiding admission and promotion
    _t: PhantomData<T>,
}
In the dual-queue system, the small queue stores items that have newly entered the cache, and the main queue stores hot items promoted from the small queue. The estimator (TinyLFU) is a frequency estimator used to decide whether an item should be admitted or promoted.
The FiFoQueues strategy has several advantages: fast admission, since new items quickly enter the small queue, which suits bursty access patterns; hot-spot protection, since frequently accessed items are promoted to the main queue and are not easily evicted; efficient cleanup, since rarely used items are quickly removed from the small queue; and scan resistance, since it copes well with large numbers of one-off accesses.
FiFoQueues implements an efficient dual-queue system that combines the simplicity of FIFO with the frequency awareness of LFU (Least Frequently Used). Through the cooperation of the small queue and the main queue, assisted by the estimator, it provides good cache performance under various access patterns and balances the cache's response speed and hit rate.
The caching system uses a frequency-promotion policy to manage the lifecycle of cached objects: when the access frequency of a cached object increases, the object is promoted from the small queue to the main queue, which prevents frequently accessed objects from being evicted by mistake. During cache replacement, an adaptive strategy is used: low-frequency objects are evicted from the small queue first, and objects are evicted from the main queue only if that is not sufficient. This strategy achieves a higher cache hit rate and more effective cache management.
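The frequency-promotion part of this policy can be sketched as follows, with simplified stand-in types rather than the actual FiFoQueues implementation; the promotion threshold value is only illustrative.
use std::collections::VecDeque;

const PROMOTE_THRESHOLD: u8 = 1; // illustrative: promote once an item is accessed again

struct Queues {
    small: VecDeque<u64>, // keys of recent / low-frequency objects
    main: VecDeque<u64>,  // keys of hot objects
}

impl Queues {
    // Called on access with the item's estimated frequency: a hot item still sitting in
    // the small queue is promoted to the main queue so it is less likely to be replaced.
    fn on_access(&mut self, key: u64, estimated_freq: u8) {
        if estimated_freq > PROMOTE_THRESHOLD {
            if let Some(pos) = self.small.iter().position(|k| *k == key) {
                let _ = self.small.remove(pos);
                self.main.push_back(key);
            }
        }
    }
}

fn main() {
    let mut q = Queues { small: VecDeque::from([7]), main: VecDeque::new() };
    q.on_access(7, 2); // estimated frequency 2 exceeds the threshold, so key 7 is promoted
    println!("main queue now holds: {:?}", q.main);
}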
The caching system offers flexible configuration of its cache structure: users can choose the most suitable implementation (such as Compact or Fast) for a specific application scenario. If an application is insensitive to memory use and needs the highest performance, Fast is chosen; if an application must handle a large number of cache entries or runs in a memory-constrained environment, such as an embedded system or a highly concurrent server, Compact is chosen. Through this flexible design, the best balance of performance and memory footprint can be achieved under different resource environments and application requirements.
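As an illustration of this configuration choice, the following sketch selects between the two backends named in the text; the enum and selection function are assumptions for exposition, not an actual configuration API.
// Fast: concurrent hash map backend (fastest lookups, higher memory use).
// Compact: N-shard skip-list backend (lower memory use, slightly slower access).
enum StorageBackend {
    Fast,
    Compact,
}

fn choose_backend(memory_constrained: bool, latency_critical: bool) -> StorageBackend {
    if memory_constrained && !latency_critical {
        StorageBackend::Compact // embedded systems, very large numbers of cache entries
    } else {
        StorageBackend::Fast // high-frequency access, throughput-sensitive services
    }
}

fn main() {
    match choose_backend(true, false) {
        StorageBackend::Compact => println!("using the Compact backend"),
        StorageBackend::Fast => println!("using the Fast backend"),
    }
}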
FIG. 4 illustrates a block diagram of a computing device in accordance with an exemplary embodiment.
As shown in fig. 4, computing device 30 includes processor 12 and memory 14. Computing device 30 may also include a bus 22, a network interface 16, and an I/O interface 18. The processor 12, memory 14, network interface 16, and I/O interface 18 may communicate with each other via a bus 22.
The processor 12 may include one or more general-purpose CPUs (Central Processing Units), microprocessors, application-specific integrated circuits, or the like for executing the associated program instructions. According to some embodiments, the computing device 30 may also include a high-performance display adapter (GPU) 20 that accelerates the processor 12.
The memory 14 may include machine-readable media in the form of volatile memory, such as Random Access Memory (RAM) and/or cache memory, as well as Read-Only Memory (ROM). The memory 14 stores one or more programs comprising instructions as well as data. The processor 12 may read the instructions stored in the memory 14 to perform the methods according to the embodiments of the invention described above.
Computing device 30 may also communicate with one or more networks through network interface 16. The network interface 16 may be a wireless network interface.
Bus 22 may be a bus including an address bus, a data bus, a control bus, etc. Bus 22 provides a path for exchanging information between the components.
It should be noted that, in the implementation, the computing device 30 may further include other components necessary to achieve normal operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method. The computer readable storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, DVDs, CD-ROMs, micro-drives, and magneto-optical disks, ROM, RAM, EPROM, EEPROM, DRAM, VRAM, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), network storage devices, cloud storage devices, or any type of media or device suitable for storing instructions and/or data.
Embodiments of the present invention also provide a computer program product comprising a computer program operable to cause a computer to perform part or all of the steps of any one of the methods described in the method embodiments above.
It will be clear to a person skilled in the art that the solution according to the invention can be implemented by means of software and/or hardware. "Unit" and "module" in this specification refer to software and/or hardware capable of performing a specific function, either alone or in combination with other components, where the hardware may be, for example, a field programmable gate array, an integrated circuit, or the like.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some service interface, device, or unit, and may be electrical or take other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable memory. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in whole or in part in the form of a software product stored in a memory, comprising several instructions for causing a computer device (which may be a personal computer, a server or a network device, etc.) to perform all or part of the steps of the method of the various embodiments of the present invention.
The exemplary embodiments of the present invention have been particularly shown and described above. It is to be understood that this invention is not limited to the precise arrangements and instrumentalities described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A method for a memory caching system used to store key-value objects, the method comprising:
creating a cache instance according to a designated cache capacity and an estimated number of objects, wherein the cache instance comprises a small queue and a main queue, the small queue is used for storing recently added objects or objects whose access frequency is below an access threshold, and the main queue is used for storing objects whose access frequency is equal to or greater than the access threshold;
writing an object to be cached into the cache instance, with write operations always performed on the small queue;
and, when the cache capacity is insufficient, preferentially evicting cached key-value objects from the small queue.
2. The method of claim 1, wherein writing the object to be cached into the cache instance comprises:
calling a read method that queries the value corresponding to the key of the object to be cached, wherein if the key exists, the corresponding value is returned and the access frequency of the object to be cached is increased; if the key does not exist, a null value is returned.
3. The method as recited in claim 1, further comprising:
when the access count of the cached key-value object is greater than the access threshold, moving the cached key-value object from the small queue to the main queue.
4. The method of claim 1, wherein preferentially evicting cached key-value objects from the small queue when the cache capacity is insufficient comprises:
preferentially evicting from the small queue the cached key-value objects whose access frequency is below a frequency eviction threshold.
5. The method of claim 1, wherein preferentially evicting cached key-value objects from the small queue when the cache capacity is insufficient comprises:
evicting cached key-value objects from the main queue if the small queue is empty.
6. The method of claim 5, wherein evicting cached key-value objects from the main queue comprises:
when the small queue is empty, evicting the cached key-value objects whose access frequency is below a frequency eviction threshold and/or whose residence time is below a residence-time eviction threshold.
7. The method as recited in claim 1, further comprising:
when the cached key-value object is read, updating the corresponding object metadata, wherein the object metadata comprises the access frequency and the queue the object belongs to.
8. The method as recited in claim 1, further comprising:
dynamically adjusting the write and eviction policies of the memory caching system;
and, when the workload of the memory caching system changes, adaptively adjusting the capacities of the small queue and the main queue and/or the residence-time and/or access-frequency requirements of cached key-value objects.
9. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-8.
10. A computing device, comprising:
A processor; and
Memory storing a computer program which, when executed by the processor, implements the method according to any of claims 1-8.
CN202411081350.5A 2024-08-08 2024-08-08 Method and computing device for memory caching system Pending CN118626418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411081350.5A CN118626418A (en) 2024-08-08 2024-08-08 Method and computing device for memory caching system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411081350.5A CN118626418A (en) 2024-08-08 2024-08-08 Method and computing device for memory caching system

Publications (1)

Publication Number Publication Date
CN118626418A true CN118626418A (en) 2024-09-10

Family

ID=92610687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411081350.5A Pending CN118626418A (en) 2024-08-08 2024-08-08 Method and computing device for memory caching system

Country Status (1)

Country Link
CN (1) CN118626418A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110232049A (en) * 2019-06-12 2019-09-13 腾讯科技(深圳)有限公司 A kind of metadata cache management method and device
CN110837480A (en) * 2019-11-07 2020-02-25 北京沃东天骏信息技术有限公司 Processing method and device of cache data, computer storage medium and electronic equipment
CN111367833A (en) * 2020-03-31 2020-07-03 中国建设银行股份有限公司 Data caching method and device, computer equipment and readable storage medium
CN114116634A (en) * 2022-01-26 2022-03-01 苏州浪潮智能科技有限公司 Caching method and device and readable storage medium
CN114741630A (en) * 2021-01-07 2022-07-12 华为云计算技术有限公司 Method and device for eliminating data, cache node and cache system
US11983121B1 (en) * 2023-04-19 2024-05-14 Metisx Co., Ltd. Cache memory device and method for implementing cache scheduling using same
CN118051449A (en) * 2024-02-28 2024-05-17 苏州元脑智能科技有限公司 Method and device for generating data cache queue, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US10176057B2 (en) Multi-lock caches
EP2478442B1 (en) Caching data between a database server and a storage system
US9235508B2 (en) Buffer management strategies for flash-based storage systems
US9177028B2 (en) Deduplicating storage with enhanced frequent-block detection
USRE45086E1 (en) Method and apparatus for prefetching recursive data structures
EP3507694B1 (en) Message cache management for message queues
JP5142995B2 (en) Memory page management
US9229869B1 (en) Multi-lock caches
US8443149B2 (en) Evicting data from a cache via a batch file
US9274963B2 (en) Cache replacement for shared memory caches
US8819074B2 (en) Replacement policy for resource container
US20130290636A1 (en) Managing memory
CN107888687B (en) Proxy client storage acceleration method and system based on distributed storage system
CN101404649B (en) Data processing system based on CACHE and its method
CN107562806B (en) Self-adaptive sensing acceleration method and system of hybrid memory file system
JP4189342B2 (en) Storage apparatus, storage controller, and write-back cache control method
CN101853218A (en) Method and system for reading redundant array of inexpensive disks (RAID)
JP2017162194A (en) Data management program, data management device, and data management method
Liu et al. FLAP: Flash-aware prefetching for improving SSD-based disk cache
US10606795B2 (en) Methods for managing a buffer cache and devices thereof
CN118626418A (en) Method and computing device for memory caching system
CN116027982A (en) Data processing method, device and readable storage medium
CN115563235A (en) Hotspot-aware log structure merged tree read-write performance optimization method and related equipment
Banerjee et al. A New Proposed Hybrid Page Replacement Algorithm (HPRA) in Real Time Systems.
CN118550853B (en) Cache replacement method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination