
A list of generic tools for parsing binary data structures, such as file formats, network protocols or bitstreams


dloss/binary-parsing

Awesome Binary Parsing

A curated collection of tools and resources for parsing and analyzing binary data structures, such as file formats, network protocols or bitstreams.

Libraries and Tools by Programming Language

Python

  • Construct: library for parsing and building of data structures (binary or textual). Define your data structures in a declarative manner
  • Hachoir: view and edit a binary stream field by field. Long list of parsers for all kinds of formats
  • Caterpillar: Python 3.12+ library to pack and unpack structured binary data
  • Scapy: send, sniff, dissect and forge network packets. Usable interactively or as a library
  • Mr. Crowbar: Django-esque model framework for reading and writing binary file formats. Includes a suite of command-line tools for visualising and digging through binary data

JavaScript

  • Binary-parser: binary parser builder library which enables you to write efficient parsers in a simple & declarative way
  • jBinary: High-level API for working with binary data.

C/C++

  • Hammer (C): bit-oriented parsing library
  • FormatFuzzer (C++): framework for high-efficiency, high-quality generation and parsing of binary inputs
  • Marpa (C/C++, Perl, Go): general context-free parsing algorithm; libmarpa is its C library
  • Wuffs: a memory-safe programming language (and a standard library written in that language) for Wrangling Untrusted File Formats Safely. Wrangling includes parsing, decoding and encoding.
  • libtins (C++): crafting, sending, sniffing and interpreting raw network packets
  • libcrafter (C++): high level library for C++ designed to create and decode network packets

Java

Go

  • restruct: library for reading and writing binary data

Rust

  • Nom: Rust parser combinator framework
  • Deku: bit-level, symmetric, serialization/deserialization implementations for structs and enums
  • binrw: binrw helps you write maintainable & easy-to-read declarative binary data readers and writers using ✨macro magic✨.

Ruby

  • BinData: provides a declarative way to read and write structured binary data

Other Programming Languages

  • FlexT (Delphi): a DSL and a tool for generating parsers in Delphi
  • Haka (Lua): open source security-oriented language that lets you describe protocols and apply security policies to (live) captured traffic
  • binarylang (Nim): extensible Nim DSL for creating binary parsers/encoders in a symmetric fashion
  • binaryparse (Nim): In-language DSL for reading and writing binary data supporting all sorts of patterns. Generates an efficient stream based reader and writer for the runtime execution.
  • Gloss (Clojure): turn complicated byte formats into Clojure data structures and Clojure data structures into compact byte representations
  • scodec (Scala): Combinator library for working with binary data
  • attoparsec and attoparsec-binary (Haskell): fast parser combinator library, aimed particularly at dealing efficiently with network protocols and complicated text/binary file formats
  • Parsifal (OCaml): OCaml-based parsing engine. Paper: A pragmatic solution to the binary parsing problem. Olivier Levillain

Language-Agnostic Tools

Binary Format Description Languages

  • Kaitai Struct (DSL): declarative language used to describe various binary data structures, laid out in files or in memory
  • RecordFlux: toolset for the formal specification of messages and the generation of verifiable binary parsers and message generators (Ada-inspired).
  • Spicy (DSL, C/C++, Zeek): a next-generation parser generator for network protocols and file formats
  • DataScript Tools (DSL): DataScript is a formal language for modelling binary datatypes, bitstreams or file formats. (PDF)
  • Dogma (DSL): human-friendly metalanguage for describing data formats in documentation using the familiar patterns of Backus-Naur Form.
  • EverParse: a framework for generating verified secure parsers and formatters from domain-specific format specification languages

Standalone Applications

Hex Editors with Grammars

  • Synalyze It! (macOS): hex editor with grammar-based binary format parsing
  • Hexinator (Windows): hex editor with grammar-based binary format parsing
  • 010 Editor (Windows/macOS/Linux): hex editor with C-style binary templates and large template library
  • Kiewtai: plugin for the Hiew hex editor that makes the Kaitai parsers available
  • Hobbits: multi-platform GUI for bit-based analysis, processing, and visualization. Has a Kaitai plugin.
  • ImHex (Windows/macOS/Linux): A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.

Binary Analysis Tools

  • GNU poke: The extensible editor for structured binary data
  • fq: jq for binary formats - tool, language and decoders for working with binary and text formats
  • radare2 (C, with bindings/pipe for almost all languages): Unix-like reverse engineering framework and commandline tools. See Parsing a fileformat with radare2 and Types.
  • Veles: open source tool for binary analysis

Network Protocol Analysis

  • Wireshark: network protocol analyzer that includes dissectors for over two thousand protocols.

    • TShark: command line version, can easily be called from shell scripts.
    • Wireshark Generic Dissector: add-on, allows dissection of a protocol based on a text description of the protocol elements
    • Wireshark Lua: dissectors can be written in Lua (Examples)
    • pyreshark: plugin providing a simple interface for writing Wireshark dissectors in Python
    • Sharktools (Python, Matlab): Tools for programmatic parsing of packet captures using Wireshark functionality
  • Netzob: open source tool for reverse engineering, traffic generation and fuzzing of communication protocols
  • Cat Karat Packet Builder: packet generation tool that lets you build custom packets for firewall or target testing
  • Scapy: send, sniff, dissect and forge network packets.

Research papers

  • Interval Parsing Grammars for File Format Parsing (2023): Jialun Zhang, Greg Morrisett, Gang Tan
  • LangSec Platform (2021): Towards a Platform to Compare Binary Parser Generators. Olivier Levillain, Sébastien Naud, Aina Toky Rasoamanana (Video)
  • Narcissus (2019): Correct-By-Construction Derivation of Decoders and Encoders from Binary Formats. Benjamin Delaware, Sorawit Suriyakarn, Clément Pit-Claudel, Qianchuan Ye, Adam Chlipala
  • EverParse (2019): Verified Secure Zero-Copy Parsers for Authenticated Message Formats. Tahina Ramananandro et al.
  • Generic packet descriptions (2017): Verified parsing and pretty printing of low-level data. Marcell van Geest, Wouter Swierstra
  • Nail (2014): A Practical Tool for Parsing and Generating Data Formats. Julian Bangert and Nickolai Zeldovich
  • FlowSifter (2014): High-Speed Application Protocol Parsing and Extraction for Deep Flow Inspection. Alex X. Liu, Chad R. Meiners, Eric Norige, and Eric Torng
  • Zebra (2013): Improving the Performance of Message Parsers for Embedded Systems. Jigar Solanki et al.
  • W. Underwood (2012): Grammar-Based Specification and Parsing of Binary File Formats. William Underwood
  • Yakker (2010): Semantics and Algorithms for Data-dependent Grammars. Trevor Jim, Yitzhak Mandelbaum, David Walker
  • Zebu (2009): A Language-Based Approach for Improving the Robustness of Network Application Protocol Implementations. Laurent Burgy et al.
  • z2z (2009): Automatic Generation of Network Protocol Gateways. Yerom-David Bromberg, Laurent Reveillere, Julia L. Lawall, Gilles Muller
  • Tupni (2008): Automatic Reverse Engineering of Input Formats. Weidong Cui et al.
  • PADS/ML (2007): a functional data description language. Y. Mandelbaum, K. Fisher, D. Walker, M. F. Fernandez, and A. Gleyzer.
  • BinPAC (2006): Superseded by BinPAC++, which is now known as Spicy
  • NetPDL (2006): Markup Language that aims to describe Protocols from OSI layer 2 to OSI layer 7
  • TSN.1 (2005): Transfer Syntax Notation One (TSN.1). A formal notation for describing messages in binary protocols
  • GAPA (2005): Generic Application-Level Protocol Analyzer and its Language. Nikita Borisov, David J. Brumley, Helen J. Wang, Chuanxiong Guo
  • PacketTypes (2000): P. J. McCann and S. Chandra. Packet types: Abstract specification of network protocol messages.

Binary Format References

Related Topics
