Project name

marisa-trie

Project summary

MARISA: Matching Algorithm with Recursively Implemented StorAge

Latest version

0.2.7

Description

Matching Algorithm with Recursively Implemented StorAge (MARISA) is a static and space-efficient trie data structure. libmarisa is a C++ library that implements MARISA. The libmarisa package also contains a set of command line tools for building and operating a MARISA-based dictionary.

A MARISA-based dictionary supports not only lookup but also reverse lookup, common prefix search and predictive search.

  • Lookup checks whether a given string exists in a dictionary.
  • Reverse lookup restores a key from its ID.
  • Common prefix search finds the keys that are prefixes of a given string.
  • Predictive search finds the keys that start with a given string.
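
The four operations can be illustrated with a small, self-contained C++ sketch. It does not use libmarisa: `ToyDictionary`, its member functions and `demo_checks` are hypothetical names, and the sorted `std::vector` behind them is a naive stand-in chosen only to show what each operation returns. In particular, assigning IDs by sorted position is an assumption of this sketch, not a statement about how MARISA assigns IDs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a static dictionary (NOT libmarisa's API or layout).
// Keys are sorted and deduplicated once; a key's ID is its sorted position.
class ToyDictionary {
 public:
  explicit ToyDictionary(std::vector<std::string> keys)
      : keys_(std::move(keys)) {
    std::sort(keys_.begin(), keys_.end());
    keys_.erase(std::unique(keys_.begin(), keys_.end()), keys_.end());
  }

  // Lookup: returns true and stores the ID if `query` is a registered key.
  bool lookup(const std::string& query, std::size_t* id) const {
    auto it = std::lower_bound(keys_.begin(), keys_.end(), query);
    if (it == keys_.end() || *it != query) return false;
    *id = static_cast<std::size_t>(it - keys_.begin());
    return true;
  }

  // Reverse lookup: restores the key registered under `id`.
  // Throws std::out_of_range for an unknown ID.
  const std::string& reverse_lookup(std::size_t id) const {
    return keys_.at(id);
  }

  // Common prefix search: every registered key that is a prefix of `query`.
  std::vector<std::string> common_prefix_search(
      const std::string& query) const {
    std::vector<std::string> result;
    for (std::size_t len = 0; len <= query.size(); ++len) {
      const std::string prefix = query.substr(0, len);
      if (std::binary_search(keys_.begin(), keys_.end(), prefix)) {
        result.push_back(prefix);
      }
    }
    return result;
  }

  // Predictive search: every registered key that starts with `query`.
  // Such keys form one contiguous run in sorted order.
  std::vector<std::string> predictive_search(const std::string& query) const {
    std::vector<std::string> result;
    for (auto it = std::lower_bound(keys_.begin(), keys_.end(), query);
         it != keys_.end() && it->compare(0, query.size(), query) == 0;
         ++it) {
      result.push_back(*it);
    }
    return result;
  }

 private:
  std::vector<std::string> keys_;
};

// Usage example: exercises each operation on four keys.
inline void demo_checks() {
  const std::vector<std::string> keys = {"banana", "apple", "a", "app"};
  const ToyDictionary dict(keys);

  std::size_t id = 0;
  assert(dict.lookup("app", &id));   // "app" is a key
  assert(!dict.lookup("ap", &id));   // "ap" is only a prefix of keys
  assert(dict.lookup("apple", &id));
  assert(dict.reverse_lookup(id) == "apple");

  const std::vector<std::string> prefixes = {"a", "app", "apple"};
  assert(dict.common_prefix_search("apple") == prefixes);

  const std::vector<std::string> completions = {"app", "apple"};
  assert(dict.predictive_search("ap") == completions);
}
```

With the keys `a`, `app`, `apple` and `banana`, a common prefix search for `apple` yields `a`, `app` and `apple`, while a predictive search for `ap` yields `app` and `apple`.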

The biggest advantage of libmarisa is that its dictionaries are considerably more compact than those of other implementations. The sizes below compare dictionaries built from the same input.

  • Input
    • Source: enwiki-20121101-all-titles-in-ns0.gz
    • Contents: all page titles of English Wikipedia (Nov. 2012)
    • Number of keys: 9,805,576
    • Total size: 200,435,403 bytes (plain) / 54,933,690 bytes (gzipped)

| Implementation | Size (bytes) | Remarks                     |
|----------------|-------------:|-----------------------------|
| darts-clone    |  376,613,888 | Compacted double-array trie |
| tx-trie        |  127,727,058 | LOUDS-based trie            |
| marisa-trie    |   50,753,560 | MARISA trie                 |
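
The figures above can be turned into ratios with a few lines of arithmetic. The sketch below is only a convenience for reading the comparison; the constant names and the `ratio` helper are made up for illustration, and the sizes are copied from the comparison as given.

```cpp
#include <cstdint>

// Sizes in bytes, copied from the comparison above.
constexpr std::uint64_t kPlainInput = 200435403;
constexpr std::uint64_t kGzippedInput = 54933690;
constexpr std::uint64_t kDartsClone = 376613888;
constexpr std::uint64_t kTxTrie = 127727058;
constexpr std::uint64_t kMarisaTrie = 50753560;

// Ratio of one size to another, as a double.
constexpr double ratio(std::uint64_t numerator, std::uint64_t denominator) {
  return static_cast<double>(numerator) / static_cast<double>(denominator);
}

// Compile-time check: the MARISA dictionary is smaller than the gzipped input.
static_assert(ratio(kMarisaTrie, kGzippedInput) < 1.0,
              "marisa-trie dictionary should be smaller than gzipped input");
```

By this arithmetic the marisa-trie dictionary is roughly 25% of the plain input size and slightly smaller than the gzipped input. The darts-clone dictionary is about 7.4 times as large, and the tx-trie dictionary about 2.5 times as large.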

Documentation

Build instructions

You can get the latest version via git clone. Then, you can build and install libmarisa and its command line tools via cmake.

$ git clone https://github.com/s-yata/marisa-trie.git
$ cd marisa-trie
$ cmake -S. -Bbuild-rel -DCMAKE_BUILD_TYPE=Release -DENABLE_NATIVE_CODE=ON -DBUILD_TESTING=OFF
$ cmake --build build-rel
$ sudo cmake --install build-rel

To install just the library without the binaries:

$ sudo cmake --install build-rel --component Library

To install just the binaries:

$ sudo cmake --install build-rel --component Binaries

Source code license

Licensed under BSD-2-Clause OR LGPL-2.1-or-later.

See COPYING for details.
