Hugo Persson

Image to Markdown OCR

About the Project

This tool converts images containing mathematical formulas or text into Markdown or LaTeX format. It supports multiple OCR providers to give users flexibility in their workflow:

  • Texify - A self-hosted service converting images to Markdown (Recommended)
  • SimpleTex - A hosted service specializing in LaTeX conversion
  • pix2tex - A self-hosted alternative for LaTeX conversion
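
As a rough illustration of how the three providers differ, the sketch below models them as a small registry. It is written in Python purely for illustration and is not taken from the project's code: the `Provider` class, the `PROVIDERS` table, and the `pick_provider` helper are all hypothetical names, and only the hosting model, output format, and "recommended" flag come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Hypothetical description of one OCR provider (not the project's real type)."""
    name: str
    self_hosted: bool
    output_format: str  # "markdown" or "latex"
    recommended: bool = False


# Values mirror the provider list above; the dictionary keys are assumed identifiers.
PROVIDERS = {
    "texify": Provider("Texify", self_hosted=True, output_format="markdown", recommended=True),
    "simpletex": Provider("SimpleTex", self_hosted=False, output_format="latex"),
    "pix2tex": Provider("pix2tex", self_hosted=True, output_format="latex"),
}


def pick_provider(key=None):
    """Return the provider for `key`, or the recommended one when no key is given."""
    if key is None:
        return next(p for p in PROVIDERS.values() if p.recommended)
    try:
        return PROVIDERS[key.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown provider {key!r}; expected one of {sorted(PROVIDERS)}"
        ) from None
```

Calling `pick_provider()` with no argument falls back to Texify, the recommended option, while `pick_provider("pix2tex")` selects the self-hosted LaTeX alternative.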

How It Works

The conversion process is straightforward:

  1. Capture an image or formula by taking a screenshot
  2. The image is processed through your chosen OCR provider
  3. The result is returned as clean Markdown or LaTeX code

Figure: Demo of the Image to Markdown OCR conversion process
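
The three steps above can be sketched as a single function that takes image bytes, hands them to a provider, and tidies the result. This is a minimal Python sketch under stated assumptions, not the project's implementation: `convert`, `OcrBackend`, and `fake_backend` are hypothetical names, the backend here is a stand-in that returns a fixed string rather than calling a real OCR service, and "clean" is assumed to mean only that surrounding whitespace is trimmed.

```python
from typing import Callable

# Hypothetical provider interface: takes raw image bytes, returns recognized text.
OcrBackend = Callable[[bytes], str]


def fake_backend(image_bytes: bytes) -> str:
    """Stand-in for a real provider; always returns the same formula."""
    return "  E = mc^2 \n"


def convert(image_bytes: bytes, backend: OcrBackend) -> str:
    """Run the capture -> OCR -> cleaned-text pipeline on one image."""
    # Step 1: the captured screenshot arrives as raw bytes.
    if not image_bytes:
        raise ValueError("No image data was captured")
    # Step 2: the chosen OCR provider processes the image.
    raw_text = backend(image_bytes)
    # Step 3: return the result with surrounding whitespace removed (assumed cleanup).
    return raw_text.strip()
```

Passing the backend in as a plain callable keeps the pipeline independent of any one provider, so swapping services only means supplying a different function.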

Technology Stack

This project is built on the following technologies and services:

  • Node.js for backend processing
  • Docker support for containerization
  • Python-based OCR engines
  • RESTful APIs for service integration

Future Development

Planned improvements include:

  • Support for batch processing multiple images
  • Enhanced recognition accuracy
  • Additional output format options
  • Improved error handling and validation