Texify

Texify is an OCR model that converts images and PDFs containing mathematical content into Markdown and LaTeX that can be rendered with MathJax. It is designed to handle both block and inline equations.

Questions & Answers

What is Texify?
Texify is an OCR model that converts images or PDFs containing mathematical content into Markdown and LaTeX. It supports both block equations and inline math mixed with text, and its output can be rendered with MathJax.
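To make the block versus inline distinction concrete, here is a minimal sketch that separates the two kinds of math in a Markdown string. It assumes the common MathJax convention of $$...$$ for block equations and $...$ for inline math; the split_math helper and the sample string are hypothetical illustrations and are not part of Texify.

```python
import re

# Assumed delimiters (MathJax convention): $$...$$ for block math, $...$ for inline math.
BLOCK_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)
INLINE_MATH = re.compile(r"\$([^$\n]+)\$")


def split_math(markdown: str) -> dict:
    """Return the block and inline math expressions found in a Markdown string."""
    blocks = [m.strip() for m in BLOCK_MATH.findall(markdown)]
    # Remove block math first so its delimiters are not read as inline math.
    without_blocks = BLOCK_MATH.sub(" ", markdown)
    inline = [m.strip() for m in INLINE_MATH.findall(without_blocks)]
    return {"block": blocks, "inline": inline}


# Hypothetical sample of OCR output mixing text, inline math, and a block equation.
sample = (
    "The energy $E$ of a mass $m$ is\n"
    "$$E = mc^2$$\n"
    "where $c$ is the speed of light."
)
result = split_math(sample)
```

A simple regular expression like this does not handle escaped dollar signs, so it is only suitable for quick inspection of output.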
Who is Texify intended for?
Texify is useful for anyone who needs to extract mathematical expressions and accompanying text from image-based documents into editable digital formats like LaTeX or Markdown. This includes academics, researchers, and students working with scientific or technical content.
How does Texify compare to other OCR tools like Pix2tex or Nougat?
The three tools target different inputs. Pix2tex handles only block LaTeX equations and tends to hallucinate on text. Nougat is designed to OCR entire pages and is less effective on small images that contain only math. Texify is trained on diverse web data and handles a range of images, including equations together with their surrounding text.
When should I use Texify?
Use Texify when you need to OCR sections of a document that contain mathematical formulas, whether standalone or embedded within text. It is optimized for image crops containing math rather than entire pages of general text.
What are the installation requirements for Texify?
Texify requires Python 3.9+ and PyTorch, and can be installed with pip. Model weights are downloaded automatically on the first run. Default settings, such as the PyTorch device (e.g., CPU, CUDA, or MPS), can be overridden with environment variables.
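The environment-variable override described above can be sketched as follows. This is an illustration of the pattern, not Texify's own code: the variable name TORCH_DEVICE is an assumption, and resolve_device is a hypothetical helper. In real use the two availability flags would come from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
import os


def resolve_device(env=None, cuda_available=False, mps_available=False) -> str:
    """Pick a device name: an explicit override wins, then simple auto-detection."""
    env = os.environ if env is None else env
    override = env.get("TORCH_DEVICE")  # assumed variable name, not confirmed by the source
    if override:
        return override
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# With no override and no accelerator, the sketch falls back to the CPU.
default_device = resolve_device({})
# An explicit override takes priority over whatever hardware is detected.
forced_device = resolve_device({"TORCH_DEVICE": "mps"}, cuda_available=True)
```

Passing the environment as a dictionary keeps the helper easy to test without changing the real process environment.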