How to Extract Text from an Image

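To extract text from an image, you can use optical character recognition (OCR), a technique that converts the text in an image into editable, searchable text. Several OCR tools and libraries work with Python; a common choice is pytesseract, a Python wrapper around the Tesseract OCR engine. The steps below show how to extract text from an image with pytesseract: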

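1. Install the pytesseract and Pillow libraries by running the following commands on the command line. pytesseract only wraps the Tesseract engine, so Tesseract itself must also be installed separately: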

pip install pytesseract
pip install Pillow

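2. Import the required libraries and, if Tesseract is not on your PATH, set the path to the Tesseract executable. Enter the following in your Python code: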

from PIL import Image
import pytesseract

# If needed, set the path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
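
The hard-coded Windows path in step 2 is only needed when the tesseract executable is not on the PATH. As a minimal sketch of how that decision could be automated, the helper below looks on the PATH first and falls back to the path from step 2. The name `find_tesseract` and the `WINDOWS_DEFAULT` constant are hypothetical and not part of pytesseract; the fallback location is an assumption that may not match your installation.

```python
import os
import shutil

# Assumed install location, taken from step 2; adjust it if yours differs.
WINDOWS_DEFAULT = r'C:\Program Files\Tesseract-OCR\tesseract.exe'


def find_tesseract(fallback=WINDOWS_DEFAULT):
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer whatever is already reachable through the PATH.
    on_path = shutil.which('tesseract')
    if on_path:
        return on_path
    # Otherwise use the fallback location, but only if it really exists.
    if os.path.isfile(fallback):
        return fallback
    return None


# Usage (requires pytesseract to be installed):
#   import pytesseract
#   path = find_tesseract()
#   if path:
#       pytesseract.pytesseract.tesseract_cmd = path
```

If the helper returns None, Tesseract is probably not installed, or it lives somewhere other than the assumed location.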

3. Open the image and extract the text with pytesseract. Enter the following in your Python code:

image = Image.open('path/to/your/image.jpg')
# For images with Simplified Chinese text, use the 'chi_sim' language pack (it must be installed for Tesseract)
text = pytesseract.image_to_string(image, lang='chi_sim')
print(text)
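
Recognition in step 3 can be poor on low-contrast or noisy images. One common remedy, which the original answer does not cover, is to convert the image to black and white before passing it to pytesseract. The sketch below uses only Pillow. The helper name `preprocess_for_ocr` is hypothetical, and the default threshold of 160 is an arbitrary starting point rather than a recommended value.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image, threshold=160):
    """Convert an image to grayscale, then to pure black and white."""
    gray = ImageOps.grayscale(image)  # mode 'L': one 0-255 value per pixel
    # Pixels brighter than the threshold become white, all others black.
    return gray.point(lambda p: 255 if p > threshold else 0)


# Usage (requires pytesseract and Tesseract to be installed):
#   import pytesseract
#   image = Image.open('path/to/your/image.jpg')
#   cleaned = preprocess_for_ocr(image)
#   text = pytesseract.image_to_string(cleaned, lang='chi_sim')
```

Whether this helps depends on the image, so it is worth comparing the output with and without preprocessing and trying a few threshold values.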

4. Save the extracted text to a file. Enter the following in your Python code:

with open('output.txt', 'w', encoding='utf-8') as f:
    f.write(text)
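
Raw OCR output often contains blank lines and trailing spaces, and some Tesseract versions may append a form feed character at the end. As an optional step before saving, the text can be tidied first. This is only a sketch; `clean_ocr_text` is a hypothetical helper, not a pytesseract function.

```python
def clean_ocr_text(text):
    """Remove form feeds, trailing whitespace, and blank lines from OCR output."""
    text = text.replace('\x0c', '')  # page separator that Tesseract may emit
    lines = [line.rstrip() for line in text.splitlines()]
    # Keep only lines that still have content after stripping.
    return '\n'.join(line for line in lines if line)


# Usage:
#   with open('output.txt', 'w', encoding='utf-8') as f:
#       f.write(clean_ocr_text(text))
```

Dropping every blank line also removes paragraph breaks, so keep the raw text if the layout of the original matters.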

That is how to extract text from an image with pytesseract.
