 
[English] | [中文版]
 
Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it already covers Mathpix's core functionality. P2T can recognize layouts, tables, images, text, and mathematical formulas, and combine all of this content into a single Markdown document. It can also convert an entire PDF file (including scanned PDFs) into Markdown. The text recognition engine supports 80+ languages, including English, Simplified Chinese, Traditional Chinese, and Vietnamese.
Pix2Text (P2T) integrates the following models:
  • Text Recognition Engine: Supports 80+ languages such as English, Simplified Chinese, Traditional Chinese, Vietnamese, etc. For English and Simplified Chinese recognition, it uses the open-source OCR tool CnOCR, while for other languages, it uses the open-source OCR tool EasyOCR.
Several models are contributed by other open-source authors, and their contributions are highly appreciated.
For detailed explanations, please refer to the Models.

Online Service

 
Everyone can use the P2T Online Service for free, with a daily limit of 10,000 characters per account, which should be sufficient for normal use. Please refrain from bulk API calls, as machine resources are limited, and this could prevent others from accessing the service.
 
Due to hardware constraints, the Online Service currently supports only Simplified Chinese and English. To try the models on other languages, please use the Online Demo described below.

Demo 🤗

 
You can also try the Online Demo to see how P2T performs in various languages. The demo runs on less powerful hardware, so it may be slower. For Simplified Chinese or English images, the P2T Online Service is recommended instead.

Documentation

Available Models

P2T includes two kinds of models: Math Formula Detection (MFD) and Math Formula Recognition (MFR). For details, see the project description. By default, P2T uses the free open-source models and downloads them automatically on first use. Beyond the free models, I continue to optimize the models, and the latest versions must be purchased before they can be downloaded and used. If you are not deploying locally, it is recommended to use the P2T Online Service directly, since it always runs the most recent models.
 
The models currently used in the Online Service (both are the latest versions) are:
  • MFR-Plus/MFR-Pro V1.0
  • MFD-Pro V1.1.1: version-20240618
The paid models used in the Online Service perform better than the open-source models. If you need to deploy the P2T service on your own, it's advisable to purchase the same models used in the Online Service.
 
To thank our Planet Members for their support, all models (only for personal use) are available at a 20% discount for Planet Members. To purchase, add the assistant as a friend, and after arranging payment, the assistant will provide the model files directly. Note: No discounts are offered for the enterprise versions.
 
Things to note before purchasing:
📌
For personal use, follow the “Individual Purchase” column of the tables. For business or commercial use, follow the “Commercial Purchase” column, or contact the author (Email: breezedeus AT gmail.com).

Model Stores

Model purchases are available at the following two stores:
Store
Description
Only sells models for personal use. Cannot issue invoices.
Sells models for commercial and personal use. The platform can issue invoices (US-style invoices).
Here are more specific instructions.

Purchasing the Math Formula Detection (MFD) models

Here are the purchase links for different versions. It is recommended to try the Online Demo to verify the model's performance before making a purchase. Each version has a different License; please click the links in the table to view the product details. If you have any issues, you can contact the author. The Enterprise version includes both the MFD and MFR models, so there is no need to buy them separately.
MFD Model Version
Commercial Purchase
Individual Purchase
For Planet Members
Free Download
mfd
✖️
✖️
✔️
✔️
mfd-advanced
✖️
✔️ Free
✖️
mfd-pro
✔️ 20% off for personal use from Bilibili
✖️
📌
These models are only compatible with Pix2Text V1.1.1.
 

Purchasing the Math Formula Recognition (MFR) models

Here are the purchase links for different versions. It is recommended to try the Online Demo to verify the model's performance before making a purchase. Each version has a different License; please click the links in the table to view the product details. If you have any issues, you can contact the author. The Enterprise version includes both the MFD and MFR models, so there is no need to buy them separately.
MFR Model Version
Commercial Purchase
Individual Purchase
For Planet Members
Free Download
mfr
✖️
✖️
✔️
✔️
mfr-pro
✔️ 20% off for personal use from Bilibili
✖️
mfr-plus
✖️
✖️
✖️
📌
These models are compatible with Pix2Text V1.0, V1.1, and V1.1.1.
 
 
Pix2Text V1.1/V1.0 offers two enterprise editions; the differences between them are shown in the figure below. The Enterprise Pro Edition is a one-time purchase, and models released later must be purchased separately. It may be used only internally within a company or to provide free services externally (for example, by educational institutions), and it may not be used to offer paid services. The Enterprise Plus Edition comes with free access to all new models for one year after purchase and provides both the Pro and the Plus models. It also includes PyTorch versions of all models, so enterprises can fine-tune the models on their own data or convert them into other formats, such as CoreML. The Enterprise Plus Edition permits offering paid services.
For more detailed information, please visit the Model Store (specific details are available on the product detail pages).
[Figure: comparison of the Enterprise Pro and Enterprise Plus editions]
 

Usage Instructions After Purchase

After purchasing the Enterprise Pro/Plus Edition through the Model Store, you can download two compressed model files. The file whose name starts with p2t-mfd- is the MFD (Math Formula Detection) model, and the one starting with p2t-mfr- is the MFR (Math Formula Recognition) model. Unzipping the MFD file yields a folder named yolov7-model that contains the model file, for example mfd-yolov7-20230613.pt. Suppose the path to this file is abc/def/yolov7-model/mfd-yolov7-20230613.pt. Unzipping the MFR file yields a folder named mfr-pro-onnx, which contains the model file and related configuration files. Assume the path to the mfr-pro-onnx folder is abc/def/mfr-pro-onnx.
 
The usage instructions for the various versions of Pix2Text are as follows (using the latest version is always recommended):
If you are using P2T V1.0, please refer to: Pix2Text V1.0 New Release: The Best Open-Source Formula Recognition Model.
When initializing Pix2Text, pass the parameters as follows. The usage after initialization is the same as the open-source model, and the structure of the detection and recognition results is also the same.
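A minimal sketch of such an initialization follows. It assumes a Pix2Text V1.1-style API in which `Pix2Text.from_config(total_configs=...)` accepts a nested dictionary. The key names (`text_formula`, `mfd`, `formula`, `model_path`, `model_name`, `model_backend`, `model_dir`) are assumptions that should be checked against the documentation for your installed version. `build_total_config` and `load_p2t` are hypothetical helpers introduced only for illustration, and the paths are the placeholder locations described above.

```python
def build_total_config(mfd_model_path, mfr_model_dir):
    """Hypothetical helper: assemble a config dict pointing at the paid models.

    The key names below are assumptions modeled on the Pix2Text V1.1
    configuration style; verify them for your installed version.
    """
    text_formula_config = {
        # MFD (Math Formula Detection): path to the unzipped model file
        "mfd": {"model_path": mfd_model_path},
        # MFR (Math Formula Recognition): folder with the model and its config files
        "formula": {
            "model_name": "mfr-pro",
            "model_backend": "onnx",
            "model_dir": mfr_model_dir,
        },
    }
    return {"text_formula": text_formula_config}


def load_p2t(total_config):
    """Create a Pix2Text instance from the config (requires pix2text to be installed)."""
    from pix2text import Pix2Text  # imported lazily so the config can be built without it

    return Pix2Text.from_config(total_configs=total_config)


# Placeholder paths; replace them with where you unzipped the model files.
total_config = build_total_config(
    mfd_model_path="abc/def/yolov7-model/mfd-yolov7-20230613.pt",
    mfr_model_dir="abc/def/mfr-pro-onnx",
)
# p2t = load_p2t(total_config)  # uncomment once pix2text and the model files are in place
```

Keeping the config construction separate from the model loading makes it possible to inspect or log the paths before any model files are read.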
 
If you purchase the Enterprise Pro Subscription Edition, you will have access to more model files (currently 5), including the PyTorch version of MFR and the latest paid CnOCR (text OCR) model in both ONNX and PyTorch versions, which recognizes English and Simplified Chinese text more accurately. Pass the corresponding models in as follows.
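As a hedged sketch only, the paid CnOCR text model could be attached through a separate `text_config` dictionary that is merged into the overall configuration. The key names (`rec_model_name`, `rec_model_backend`, `rec_model_fp`), the model name, and the CnOCR file path are all assumptions or placeholders, not values confirmed by this page. `add_text_model` is a hypothetical helper.

```python
import copy


def add_text_model(total_config, text_config=None):
    """Hypothetical helper: return a copy of total_config, adding the CnOCR
    text model settings under 'text_formula' -> 'text' when text_config is given."""
    merged = copy.deepcopy(total_config)  # leave the caller's dict untouched
    if text_config is not None:
        merged.setdefault("text_formula", {})["text"] = dict(text_config)
    return merged


# Base config for the paid MFD/MFR models (key names and paths are assumptions).
base_config = {
    "text_formula": {
        "mfd": {"model_path": "abc/def/yolov7-model/mfd-yolov7-20230613.pt"},
        "formula": {
            "model_name": "mfr-pro",
            "model_backend": "onnx",
            "model_dir": "abc/def/mfr-pro-onnx",
        },
    }
}

# Settings for the paid CnOCR model; the name and path are placeholders.
text_config = {
    "rec_model_name": "doc-densenet_lite_666-gru_large",
    "rec_model_backend": "onnx",
    "rec_model_fp": "abc/def/cnocr-model/your-cnocr-model.onnx",
}

config_en_zh = add_text_model(base_config, text_config)  # English / Simplified Chinese
config_other = add_text_model(base_config)               # other languages: no text_config
```

Either resulting dictionary would then be passed to Pix2Text at initialization, in the same way as the configuration for the MFD and MFR models.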
📌
The CnOCR text model only supports English and Simplified Chinese. If you need to recognize text in other languages, do not use the CnOCR model. Simply remove the text_config from the code above.
 

Code Repo

 
📌
P2T uses CnOCR or EasyOCR to recognize the text part in images. For more information on CnOCR, refer to this link.
 
📌
Make sure you've successfully run Pix2Text using the open-source models. Otherwise, after downloading the paid models, you might encounter problems getting them to work. Detailed installation and usage instructions can be found in the Pix2Text project documentation. If you face any issues, feel free to comment here or join the group chat to communicate with me. However, please note that helping you to get the code running is not within the services provided by the Planet host (refer to Planet Description).