ios - Voice to string in Swift

Tags: ios swift string voice-recognition

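I'm currently developing an app in Swift that will help blind people explore the world with a comprehensive solution. I want to write a general-purpose function for the app that, when called, immediately starts recording, listens to what the user says, automatically stops recording once the user stops speaking, converts the recording to a string, and returns it. This function should be usable multiple times within a single view controller.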

I tried the technique from this article, but it didn't work: https://medium.com/ios-os-x-development/speech-recognition-with-swift-in-ios-10-50d5f4e59c48

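The recorder will be collecting the names of buildings, or of rooms within a building, so it doesn't need to record for long; even a fixed length of 5 seconds would be fine. I'd prefer to use a framework like Speech, which works with Siri, but I'm not opposed to using an external framework such as Watson if it works better. Please help!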

Best answer

There is a nice AppCoda tutorial here that fits this well.

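This is the code they use to update a text view with the speech results. It shouldn't be too difficult to route the text from the text view into whatever variable or function you use to handle the result.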

//
//  ViewController.swift
//  Siri
//
//  Created by Sahand Edrisian on 7/14/16.
//  Copyright © 2016 Sahand Edrisian. All rights reserved.
//

import UIKit
import Speech

class ViewController: UIViewController, SFSpeechRecognizerDelegate {

    @IBOutlet weak var textView: UITextView!
    @IBOutlet weak var microphoneButton: UIButton!

    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!

    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?
    private let audioEngine = AVAudioEngine()

    override func viewDidLoad() {
        super.viewDidLoad()

        microphoneButton.isEnabled = false

        speechRecognizer.delegate = self

        SFSpeechRecognizer.requestAuthorization { (authStatus) in

            var isButtonEnabled = false

            switch authStatus {
            case .authorized:
                isButtonEnabled = true

            case .denied:
                isButtonEnabled = false
                print("User denied access to speech recognition")

            case .restricted:
                isButtonEnabled = false
                print("Speech recognition restricted on this device")

            case .notDetermined:
                isButtonEnabled = false
                print("Speech recognition not yet authorized")
            }

            OperationQueue.main.addOperation() {
                self.microphoneButton.isEnabled = isButtonEnabled
            }
        }
    }

    @IBAction func microphoneTapped(_ sender: AnyObject) {
        if audioEngine.isRunning {
            audioEngine.stop()
            recognitionRequest?.endAudio()
            microphoneButton.isEnabled = false
            microphoneButton.setTitle("Start Recording", for: .normal)
        } else {
            startRecording()
            microphoneButton.setTitle("Stop Recording", for: .normal)
        }
    }

    func startRecording() {

        if recognitionTask != nil {  //1
            recognitionTask?.cancel()
            recognitionTask = nil
        }

        let audioSession = AVAudioSession.sharedInstance()  //2
        do {
            // Swift 4.2+ spelling of the category, mode, and activation calls.
            try audioSession.setCategory(.record, mode: .measurement, options: [])
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
        } catch {
            print("audioSession properties weren't set because of an error: \(error)")
        }

        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()  //3

        // inputNode is non-optional since the iOS 11 SDK, so no guard is needed.
        let inputNode = audioEngine.inputNode  //4

        guard let recognitionRequest = recognitionRequest else {
            fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
        } //5

        recognitionRequest.shouldReportPartialResults = true  //6

        recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in  //7

            var isFinal = false  //8

            // Unwrap the result once instead of force-unwrapping isFinal.
            if let result = result {
                self.textView.text = result.bestTranscription.formattedString  //9
                isFinal = result.isFinal
            }

            if error != nil || isFinal {  //10
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)

                self.recognitionRequest = nil
                self.recognitionTask = nil

                self.microphoneButton.isEnabled = true
            }
        })

        let recordingFormat = inputNode.outputFormat(forBus: 0)  //11
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
            self.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()  //12

        do {
            try audioEngine.start()
        } catch {
            print("audioEngine couldn't start because of an error.")
        }

        textView.text = "Say something, I'm listening!"

    }

    func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
        microphoneButton.isEnabled = available
    }
}
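
The code above starts and stops on a button tap; it does not stop on its own when the speaker goes quiet. One possible way to get the auto-stop behaviour asked for in the question is to watch how long the partial transcription has gone unchanged, with a hard cap on total time. The sketch below covers only that decision logic, in plain Swift with no Speech dependency. `ListeningSession`, `record(partial:at:)`, `shouldStop(at:)`, and the 1.5-second and 5-second defaults are hypothetical names and values chosen for illustration; they are not part of the Speech framework or the tutorial.

```swift
import Foundation

/// Hypothetical helper (not part of Speech): decides when a listening
/// session should end, based on how long the transcript has been unchanged.
struct ListeningSession {
    let startedAt: Date
    let silenceTimeout: TimeInterval   // assumed default: 1.5 s without new words
    let maxDuration: TimeInterval      // assumed default: 5 s, as the question allows
    private(set) var transcript = ""
    private(set) var lastChangeAt: Date?

    init(startedAt: Date = Date(),
         silenceTimeout: TimeInterval = 1.5,
         maxDuration: TimeInterval = 5.0) {
        self.startedAt = startedAt
        self.silenceTimeout = silenceTimeout
        self.maxDuration = maxDuration
    }

    /// Call with each partial transcription; unchanged text is ignored.
    mutating func record(partial text: String, at time: Date = Date()) {
        guard text != transcript else { return }
        transcript = text
        lastChangeAt = time
    }

    /// True once the hard cap is reached, or once something has been heard
    /// and the transcript has not changed for `silenceTimeout` seconds.
    func shouldStop(at time: Date = Date()) -> Bool {
        if time.timeIntervalSince(startedAt) >= maxDuration { return true }
        guard let lastChangeAt = lastChangeAt else { return false }
        return time.timeIntervalSince(lastChangeAt) >= silenceTimeout
    }
}

// Usage with fixed timestamps so the behaviour is easy to follow.
let start = Date(timeIntervalSince1970: 0)
var session = ListeningSession(startedAt: start)
session.record(partial: "Science", at: start.addingTimeInterval(0.8))
session.record(partial: "Science Library", at: start.addingTimeInterval(1.4))
print(session.shouldStop(at: start.addingTimeInterval(2.0)))  // false
print(session.shouldStop(at: start.addingTimeInterval(3.0)))  // true
print(session.transcript)                                     // Science Library
```

To wire this in, one option would be to call `record(partial:)` from the result handler where `textView.text` is set, poll `shouldStop()` from a repeating `Timer`, and, when it returns true, call `audioEngine.stop()` and `recognitionRequest?.endAudio()` as `microphoneTapped` does. Because recognition is asynchronous, the final string is usually handed back through a completion closure rather than a return value.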

Regarding "ios - Voice to string in Swift", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44681644/
