linux - tesseract (v3.03) output as PDF
Published: 2020-12-14 00:53:43 Category: Linux Source: compiled from the web
Why is this error returned?

    root@amd-3700-2gb ~/ocr_test # tesseract -l dan pdf.png out pdf
    Tesseract Open Source OCR Engine v3.03 with Leptonica
    Error opening data file /usr/local/share/tessdata/osd.traineddata
    Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
    Failed loading language 'osd'
    Tesseract couldn't load any languages!
    Warning: Auto orientation and script detection requested, but osd language failed to load

Language list

    root@amd-3700-2gb ~/ocr_test # tesseract --list-langs
    List of available languages (3):
    eng
    dan
    dan-frak

Output as txt

This works fine and writes the recognized text to out.txt:

    tesseract -l dan pdf.png out

Output as PDF

This creates out.pdf, but it also returns the error shown above, and the searchable text in the PDF is meaningless:

    tesseract -l dan pdf.png out pdf

Solution
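The error message is clear: Tesseract needs the osd.traineddata file, which is missing from /usr/local/share/tessdata (the list of available languages shows only eng, dan and dan-frak). You can install it, or download the Orientation & Script Detection data for Tesseract from https://github.com/tesseract-ocr/tessdata.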
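Before re-running the PDF command, it can help to confirm whether the file is really in place. The following is a minimal sketch, not part of the original answer: `TESSDATA_DIR` and `has_traineddata` are hypothetical names introduced here for illustration (they are not Tesseract settings), and the default directory is simply the one printed in the error message.

```shell
#!/bin/sh
# Hypothetical helper variable: the directory is taken from the error message
# above. Override it if your tessdata directory lives somewhere else.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# has_traineddata NAME: succeeds if NAME.traineddata exists in TESSDATA_DIR.
has_traineddata() {
    [ -f "$TESSDATA_DIR/$1.traineddata" ]
}

if has_traineddata osd; then
    echo "osd.traineddata found in $TESSDATA_DIR"
else
    echo "osd.traineddata is missing from $TESSDATA_DIR"
    echo "Get it from https://github.com/tesseract-ocr/tessdata and copy it there."
fi
```

Once the check reports the file as found, `tesseract -l dan pdf.png out pdf` should no longer print the "Failed loading language 'osd'" message. If the data files are stored in a different location, the error text itself points to the TESSDATA_PREFIX environment variable as the way to tell Tesseract where to look.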
(Editor: 李大同)