android - How do I convert a PDF to text in an Android application?

I want to build a PDF reader that converts text to speech. I already have this working for .txt files, but I'm stuck on how to convert a PDF file to text.

And what about PDF files that are scanned documents?

Best answer

To do this, your code needs something that can recognize text in the scanned pages, i.e. OCR. According to Wikipedia:

Optical character recognition
Optical Character Recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned or photographed images of typewritten or printed text into machine-encoded/computer-readable text. It is widely used as a form of data entry from some sort of original paper data source, whether passport documents, invoices, bank statement, receipts, business card, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data extraction and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Some references:

A tutorial: http://kurup87.blogspot.nl/2012/03/android-ocr-tutorial-image-to-text.html
Sample apps: https://github.com/rmtheis/android-ocr https://github.com/GautamGupta/Simple-Android-OCR
An API: http://ocrapiservice.com
A library: http://www.abbyy.com/mobileocr/android/

If you can't decide which one to use, there are plenty of Stack Overflow posts about this; just Google "android ocr stackoverflow".
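Since some PDFs carry a real text layer and others are only scanned images, one way to combine the two cases is to try the embedded text first and fall back to OCR per page. Below is a minimal sketch of that decision only, in plain Java. The class name `PdfTextFallback`, the `MIN_TEXT_CHARS` threshold, and the two function parameters are all hypothetical; the actual text extraction and OCR calls depend on whichever library you pick from the list above, so they are passed in as functions here rather than tied to any specific API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class PdfTextFallback {
    // Assumed threshold (not from any library): a page with fewer visible
    // characters than this is treated as a scanned image that needs OCR.
    static final int MIN_TEXT_CHARS = 20;

    // Returns true when the page's embedded text layer is missing or nearly empty.
    static boolean needsOcr(String textLayer) {
        if (textLayer == null) {
            return true;
        }
        long visible = textLayer.chars()
                .filter(c -> !Character.isWhitespace(c))
                .count();
        return visible < MIN_TEXT_CHARS;
    }

    // textLayer and ocr are placeholders for your PDF library and OCR engine:
    // each takes a zero-based page index and returns that page's text.
    static List<String> extractPages(int pageCount,
                                     IntFunction<String> textLayer,
                                     IntFunction<String> ocr) {
        List<String> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            String text = textLayer.apply(i);
            // Only pay the cost of OCR when the text layer is unusable.
            pages.add(needsOcr(text) ? ocr.apply(i) : text);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Stub data standing in for a two-page PDF: page 0 has text, page 1 is a scan.
        String[] embedded = {"This page has a real, selectable text layer.", "  "};
        List<String> pages = extractPages(
                embedded.length,
                i -> embedded[i],
                i -> "[OCR result for page " + i + "]");
        for (String page : pages) {
            System.out.println(page);
        }
    }
}
```

The resulting list of page strings can then be handed to the same text-to-speech code that already works for .txt files. The threshold is a guess; tune it against your own documents.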

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/22794264/
