8000 GitHub - dinosaure/art: Adaptive Radix Tree in OCaml
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

dinosaure/art

Repository files navigation

Adaptive Radix Tree (ART) in OCaml

This is an implementation in OCaml of ART. Adaptive Radix Tree is like a simple Hashtbl with order:

# let tree = Art.make () ;;
# Art.insert tree (Art.key "foo") 42 ;;
# Art.insert tree (Art.key "bar") 21 ;;
# Art.find tree (Art.key "foo")
- : int = 42

Operation like minimum or maximum are available (which don't exist for a simple Hashtbl.t):

# let tree = Art.make () ;;
# Art.insert tree (Art.key "0") 0 ;;
# Art.insert tree (Art.key "1") 1 ;;
# Art.insert tree (Art.key "2") 2 ;;
# Art.minimum tree
- : int = 0
# Art.maximum tree
- : int = 2

If you want the order and the speed of Hashtbl.t, Art is your library:

The function prefix_iter is also available if you want to get a subset of your tree:

# let t = Art.make () ;;
# Art.insert t (Art.key
# Art.insert t (Art.key "Dalton Joe") 0 ;;
# Art.insert t (Art.key "Dalton Jack") 1 ;;
# Art.insert t (Art.key "Dalton William") 2 ;;
# Art.insert t (Art.key "Dalton Averell") 3 ;;
# Art.insert t (Art.key "Rantanplan") 4 ;;
# let dalton = Art.prefix_iter ~prefix:(Art.key "Dalton")
  (fun k _ a -> (k :> string) :: a) [] t ;;
- : string list = [ "Dalton Joe"
                  ; "Dalton Jack"
		  ; "Dalton William"
		  ; "Dalton Averell" ]

Read Optimised Write Exclusion (ROWEX) in OCaml

ROWEX is a second implementation of ART with atomic operations. It's a functor which expects an implementation of atomic operations such as load or store.

Parallelism, atomic operation & OCaml

The current version of OCaml has a global lock for the GC. By this way, it's not possible for us to execute ROWEX operations (find/insert) with true parallelism if we use the same OCaml runtime. Even if you use LWT or ASYNC, you execute jobs concurrently.

However, ROWEX wants to provide an implementation where find/insert can be executed in parallel without any problems (race condition or ABA problem). So ROWEX provides an implementation, persistent, which implements atomic operations on a memory area. Then, we are able, as parmap, to simulate true parallelism as long as each operations are executed into their own fork().

The goal of this library is provide:

  • the most easy way to switch the implementation to ocaml-multicore
  • a baby step to be able to manipulate a file by several processes (consumers/find, producers/insert) in parallel

ROWEX follows two main papers:

  • The initial implementation of ROWEX
  • A derivation of it to be persistent: PART

Tools

The distribution comes with some tools to manipulate an index:

$ opam pin add -y https://github.com/dinosaure/art
$ opam install rowex
$ part.make index.idx
$ ls -lh
-rw-r--r-- 1 user user 8,0M ----- -- --:-- index.idx
prw------- 1 user user    0 ----- -- --:-- index.idx.socket
prw------- 1 user user    0 ----- -- --:-- index.idx-truncate.socket
$ part.insert index.idx foo 1
$ part.find index.idx foo
1

On the OCaml side, a Part module exists which implements these functions:

type 'a t constraint 'a = [< `Rd | `Wr ]

val create : ?len:int -> string -> unit
val insert : [> `Rd | `Wr ] t -> string -> int -> unit
val lookup : [> `Rd ] t -> string -> int

part is Unix dependent (and it need an Unix named pipe). It ensures with explained internal mechanisms to use multiple readers and one writer:

  • The writer can take the exclusive ownership on the index file and its named pipe
  • readers don't need to take the ownership but they must send a signal into the named pipe (to the writer) that they start to introspect the index

For readers, some functions exist to signal their existence to the write:

val append_reader : Ipc.t -> unit
val delete_reader : Ipc.t -> unit

val ipc : _ t -> Ipc.t

Status: experimental

This part of the distribution is experimental - even if the distribution comes with several tests to ensure that the implementation works, ROWEX is fragile! It still need a synchronization mechanism fsync() which is added pervasively in some parts of the code according to outcomes of errors.

About

Adaptive Radix Tree in OCaml

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •  
0