This project is focused on processing and segmenting OCR (Optical Character Recognition) data. It includes tools and scripts to parse OCR outputs and prepare them for further analysis or processing.
- Extract text from videos of documents.
- Parse raw OCR outputs into structured formats.
- Clean and preprocess OCR data for downstream AI/ML.
- Export video scans to PDF.
The make parse_OCR
target is designed to parse OCR data and extract meaningful information. This target automates the process of handling raw OCR outputs, cleaning the data, and organizing it into a structured format.
Put your videos in Videos/
before continuing. These are videos of flipping pages of documents. The OCR will be run on these videos, and the output will be saved in the test_frames/
directory.
Once the videos are in the appropriate directory, simply execute the following command (from the main project directory):
make parse_OCR
To parse OCR data from a sample video, follow these steps:
- Place your video file in the
Videos/
directory (e.g.,sample_video.mov
). - Run the following command from the main project directory:
make parse_OCR
- The output will be saved in the
test_frames/
directory. - Check the structured output in
test_frames/sample_video.mov/
.
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Commit your changes and push the branch.
- Submit a pull request.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.