PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supporting Google/DeepL/Ollama/OpenAI and other services, available via CLI/GUI/MCP/Docker/Zotero

<div align="center">

English | 简体中文 | 繁體中文 | 日本語 | 한국어

<img src="./docs/images/banner.png" width="320px" alt="PDF2ZH"/> <h2 id="title">PDFMathTranslate</h2> <p> <!-- PyPI --> <a href="https://pypi.org/project/pdf2zh/"> <img src="https://img.shields.io/pypi/v/pdf2zh"></a> <a href="https://pepy.tech/projects/pdf2zh"> <img src="https://static.pepy.tech/badge/pdf2zh"></a> <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh"> <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a> <a href="https://hellogithub.com/repository/8ec2cfd3ef744762bf531232fa32bc47" target="_blank"><img src="https://api.hellogithub.com/v1/widgets/recommend.svg?rid=8ec2cfd3ef744762bf531232fa32bc47&claim_uid=JQ0yfeBNjaTuqDU&theme=small" alt="Featured|HelloGitHub" /></a> <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview"> <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a> <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker"> <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"></a> <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate"> <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a> <a href="https://github.com/Byaidu/PDFMathTranslate/pulls"> <img src="https://img.shields.io/badge/contributions-welcome-green"></a> <a href="https://t.me/+Z9_SgnxmsmA5NzBl"> <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"></a> <!-- License --> <a href="./LICENSE"> <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"></a> </p>

<a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

PDF scientific paper translation and bilingual comparison.

Feel free to provide feedback through GitHub Issues or the Telegram group.

For details on how to contribute, please consult the Contribution Guide.

<h2 id="updates">Updates</h2>
  • [May 9, 2025] pdf2zh 2.0 Preview Version #586: The Windows ZIP file and Docker image are now available.

[!NOTE]

Version 2.0 has moved to a new repository under the organization: PDFMathTranslate/PDFMathTranslate-next

The official release of version 2.0 has been published.

  • [Mar. 3, 2025] The new backend BabelDOC is now available in the WebUI as an experimental option (by @awwaawwa)
  • [Feb. 22, 2025] Better release CI and a well-packaged windows-amd64 exe (by @awwaawwa)
  • [Dec. 24, 2024] The translator now supports local models on Xinference (by @imClumsyPanda)
  • [Dec. 19, 2024] Non-PDF/A documents are now supported using -cp (by @reycn)
  • [Dec. 13, 2024] Support for an additional translation backend (by @YadominJinta)
  • [Dec. 10, 2024] The translator now supports OpenAI models on Azure (by @yidasanqian)
<h2 id="preview">Preview</h2> <div align="center"> <img src="./docs/images/preview.gif" width="80%"/> </div> <h2 id="demo">Online Service 🌟</h2>

You can try our application out using either of the following demos:

Note that the computing resources of the demo are limited, so please avoid abusing them.

<h2 id="install">Installation and Usage</h2>

<h3>Methods</h3>

For different use cases, we provide distinct methods to use our program:

<details open> <summary>1. UV install</summary>
  1. Python installed (3.10 <= version <= 3.12)
  2. Install our package:
pip install uv
uv tool install --python 3.12 pdf2zh
  3. Run the translation; the output files are generated in the current working directory:
pdf2zh document.pdf
</details> <details> <summary>2. Windows exe</summary>
  1. Download pdf2zh-version-win64.zip from the release page.
  2. Unzip and double-click pdf2zh.exe to run.
</details> <details> <summary>3. Graphical user interface</summary>
  1. Python installed (3.10 <= version <= 3.12)
  2. Install our package:
pip install pdf2zh
  3. Start the GUI in your browser:
pdf2zh -i
  4. If your browser does not open automatically, go to
http://localhost:7860/
<img src="./docs/images/gui.gif" width="500"/>

See the GUI documentation for more details.

</details> <details> <summary>4. Docker</summary>
  1. Pull and run:
docker pull byaidu/pdf2zh
docker run -d -p 7860:7860 byaidu/pdf2zh
  2. Open in browser:
http://localhost:7860/

For Docker deployment on a cloud service:

<div> <a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate"> <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a> <a href="https://render.com/deploy"> <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a> <a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn"> <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a> <a href="https://template.sealos.io/deploy?templateName=pdf2zh"> <img src="https://sealos.io/Deploy-on-Sealos.svg" alt="Deploy on Sealos" height="26"></a> <a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate"> <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a> </div> </details> <details> <summary>5. Zotero Plugin</summary>

See Zotero PDF2zh for more details.

</details> <details> <summary>6. Command line</summary>
  1. Python installed (3.10 <= version <= 3.12)
  2. Install our package:
pip install pdf2zh
  3. Run the translation; the output files are generated in the current working directory:
pdf2zh document.pdf
</details>
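
Several of the methods above require Python 3.10 to 3.12. As a minimal sketch, the following check can be run before installing to confirm that the active interpreter falls inside that range. The function name `is_supported` is hypothetical and not part of pdf2zh; only the version bounds come from the steps above.

```python
import sys

# Supported range quoted from the installation steps: 3.10 <= version <= 3.12.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the documented range.

    `version` may be any tuple starting with (major, minor); when omitted,
    the running interpreter's version is used.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside 3.10-3.12.")
```

If the check fails, the UV method is probably the simplest way out, since `uv tool install --python 3.12 pdf2zh` selects the interpreter version explicitly.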

[!TIP]

If Docker Hub is unreachable, the image is also available on GitHub Container Registry:

docker pull ghcr.io/byaidu/pdfmathtranslate
docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate

<h3>Unable to install?</h3>

The program needs an AI model (wybxc/DocLayout-YOLO-DocStructBench-onnx) before it can run, and some users cannot download it because of network issues. If you have trouble downloading this model, set the following environment variable as a workaround:

set HF_ENDPOINT=https://hf-mirror.com

For PowerShell users:

$env:HF_ENDPOINT = "https://hf-mirror.com"

If this workaround does not work for you, or you encounter other issues, please refer to the frequently asked questions.
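
When pdf2zh is launched from a Python script, the same workaround can be applied to a single call instead of the whole shell session. The sketch below is illustrative only: `env_with_mirror` and `run_with_mirror` are hypothetical helpers, and it assumes the `pdf2zh` executable is on `PATH`. The mirror URL is the one given above.

```python
import os
import subprocess

HF_MIRROR = "https://hf-mirror.com"  # mirror URL taken from the text above


def env_with_mirror(base_env=None, endpoint=HF_MIRROR):
    """Return a copy of the environment with HF_ENDPOINT set.

    The input mapping is not modified, so the mirror only affects
    processes that are started with the returned environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HF_ENDPOINT"] = endpoint
    return env


def run_with_mirror(pdf_path):
    """Hypothetical helper: run pdf2zh once with the mirror applied.

    Assumes the pdf2zh executable is installed and on PATH.
    """
    return subprocess.run(["pdf2zh", pdf_path], env=env_with_mirror(), check=True)
```

Passing the environment per call keeps the setting out of the parent process, which may be preferable when other tools in the same session should keep using the default endpoint.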

<h2 id="usage">Advanced Options</h2>

Run the translation command in the command line to generate the translated document example-mono.pdf and the bilingual document example-dual.pdf in the current working directory. Google is used as the default translation service. More supported translation services are listed HERE.

<img src="./docs/images/cmd.explained.png" width="580px" alt="cmd"/>
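
For scripts that post-process the results, it can help to compute the output file names in advance. The sketch below generalizes the `example-mono.pdf` / `example-dual.pdf` naming described above to any input file; that generalization, the `output_dir` handling, and the helper name `expected_outputs` are assumptions rather than documented behavior.

```python
from pathlib import Path


def expected_outputs(pdf_path, output_dir="."):
    """Guess the (mono, dual) output paths for a given input PDF.

    Assumption: the input's base name is kept and "-mono" / "-dual" is
    appended, as in example.pdf -> example-mono.pdf, example-dual.pdf.
    """
    stem = Path(pdf_path).stem
    out = Path(output_dir)
    return out / f"{stem}-mono.pdf", out / f"{stem}-dual.pdf"


mono, dual = expected_outputs("papers/example.pdf")
print(mono.name, dual.name)
```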

In the following table, we list all advanced options for reference:

| Option | Function | Example |
| --- | --- | --- |
| files | Local files | `pdf2zh ~/local.pdf` |
| links | Online files | `pdf2zh http://arxiv.org/paper.pdf` |
| `-i` | Enter GUI | `pdf2zh -i` |
| `-p` | Partial document translation | `pdf2zh example.pdf -p 1` |
| `-li` | Source language | `pdf2zh example.pdf -li en` |
| `-lo` | Target language | `pdf2zh example.pdf -lo zh` |
| `-s` | Translation service | `pdf2zh example.pdf -s deepl` |
| `-t` | Multi-threads | `pdf2zh example.pdf -t 1` |
| `-o` | Output dir | `pdf2zh example.pdf -o output` |
| `-f`, `-c` | Exceptions | `pdf2zh example.pdf -f "(MS.*)"` |
| `-cp` | Compatibility mode | `pdf2zh example.pdf --compatible` |
| `--skip-subset-fonts` | Skip font subset | `pdf2zh example.pdf --skip-subset-fonts` |
| `--ignore-cache` | Ignore translate cache | `pdf2zh example.pdf --ignore-cache` |
| `--share` | Public link | `pdf2zh -i --share` |
| `--authorized` | Authorization | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt` | Custom prompt | `pdf2zh --prompt [prompt.txt]` |
| `--onnx` | Use custom DocLayout-YOLO ONNX model | `pdf2zh --onnx [onnx/model/path]` |
| `--serverport` | Use custom WebUI (Gradio) server port | `pdf2zh --serverport 7860` |
| `--dir` | Batch translate | `pdf2zh --dir /path/to/translate/` |
| `--config` | Configuration file | `pdf2zh --config /path/to/config/config.json` |
| `--babeldoc` | Use experimental backend BabelDOC to translate | `pdf2zh --babeldoc -s openai example.pdf` |
| `--mcp` | Enable MCP STDIO mode | `pdf2zh --mcp` |
| `--sse` | Enable MCP SSE mode | `pdf2zh --mcp --sse` |

For detailed explanations and a full list of options, please refer to our Advanced Usage document.
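
When translations are driven from another script, the options in the table can be assembled programmatically rather than typed by hand. The following is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, not part of pdf2zh, and it covers only a handful of flags (`-li`, `-lo`, `-s`, `-p`, `-t`, `-o`) copied from the table. It builds the argument list without running anything.

```python
import shlex


def build_command(source, *, lang_in=None, lang_out=None, service=None,
                  pages=None, threads=None, output_dir=None):
    """Assemble a pdf2zh argument list from a subset of the table's options."""
    cmd = ["pdf2zh", source]
    # (flag, value) pairs; the flags are copied from the options table,
    # and a value of None means the flag is left out.
    for flag, value in (("-li", lang_in), ("-lo", lang_out), ("-s", service),
                        ("-p", pages), ("-t", threads), ("-o", output_dir)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_command("example.pdf", lang_in="en", lang_out="zh",
                    service="deepl", threads=1)
print(shlex.join(cmd))
# prints: pdf2zh example.pdf -li en -lo zh -s deepl -t 1
```

With pdf2zh installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell quoting problems with file names that contain spaces.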

<h2 id="downstream">Secondary Development (APIs)</h2>

For downstream applications, please refer to our document about API Details for further information about:

  • Python API, how to use the program in other Python programs
  • HTTP API, how to communicate with a server with the program installed
<h2 id="todo">TODOs</h2>
  • Parse layout with DocLayNet based models, PaddleX, PaperMage, SAM2
  • Fix page rotation, table of contents, format of lists
  • Fix pixel formula in old papers
  • Async retry except KeyboardInterrupt
  • Knuth–Plass algorithm for western languages
  • Support non-PDF/A files
  • Plugins of Zotero and Obsidian
<h2 id="acknowledgement">Acknowledgements</h2> <h2 id="contrib">Contributors</h2> <a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors"> <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" /> </a>


<h2 id="star_hist">Star History</h2> <a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" /> <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" /> <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/> </picture> </a>

Repository: Byaidu/PDFMathTranslate (by Byaidu)

Created: September 6, 2024

Updated: July 7, 2025

Language: Python

Category: AI