8000 memory optimization · Issue #7 · elmindreda/duff · GitHub
memory optimization #7
Open
@marcin-gryszkalis

Description

Hi,
I'm doing some work on duff because I found it useful for fixing broken rsnapshot repositories (I will open a pull request in a few days). Unfortunately, such repositories are a bit unusual: millions of files, mostly hardlinked in groups of 30-50.

It seems I'm having a problem with large buckets (long lists): because each sampled file allocates 4KB of data that is only freed at the end of bucket processing, I'm getting "out of memory" errors once 3GB of memory is allocated (the box is a lightweight 32-bit Atom-based system).

As sizeof(FileList) == 12, I see no problem increasing HASH_BITS to 16 (~800KB) or even 20 (~13MB).
I wonder what you think: would it be a good idea to add an option making it runtime-configurable?

Another idea is to (optionally?) replace the sample with some simple, fast running checksum (CRC64?).
