OCR WebUI
OCR WebUI lets users upload an image, crop a region of it, and extract the text in that region using Optical Character Recognition (OCR). It is a lightweight Go application that uses gosseract for OCR and HTMX for page interactions.
Installation
Docker
- Run
docker run -p 3000:3000 ghcr.io/purylte/ocr-webui:latest
- Open http://localhost:3000/app
Local
- Ensure Tesseract and Leptonica are installed
- Add the languages you need by placing their traineddata files in the tessdata directory of your Tesseract installation
- Run
./ocr-webui
- Open http://localhost:3000/app
Development
Using Dev Container (VS Code)
- Ensure Docker and the Dev Containers extension are installed
- Clone this project and open it in VS Code
git clone https://github.com/purylte/ocr-webui.git
code ocr-webui
- Run "Dev Containers: Reopen in Container" in VS Code
- Start hot reload by running
air
Manually
- Clone the repository
git clone https://github.com/purylte/ocr-webui.git
cd ocr-webui
- Install Tesseract and Leptonica
- Install the required Go tools:
go install github.com/a-h/templ/cmd/templ@latest
go install github.com/air-verse/air@latest
- Start hot reload by running
air
Todo
- Preprocess images with gocv before running OCR
- Add tests
- Improve logging and error handling
Contributing
Feel free to fork this project, submit issues, and create pull requests. Contributions are welcome!
License
This project is licensed under the MIT License - see the LICENSE file for details.