Is it possible to use the W3C Web Speech API in JavaScript to generate an audio file (wav, ogg, or mp3) of a given text spoken aloud? That is, I want to do something like:
window.speechSynthesis.speak(new SpeechSynthesisUtterance("0 1 2 3"))
but have the generated sound written to a file instead of played through the speakers.
Best Answer
This is not possible with the Web Speech API alone; see Re: MediaStream, ArrayBuffer, Blob audio result from speak() for recording? and How to implement option to return Blob, ArrayBuffer, or AudioBuffer from window.speechSynthesis.speak() call.
It can, however, be done with a library such as espeak or meSpeak; see How to create or convert text to audio at chromium browser?.
fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);
    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve);
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve);
      })
    ]);
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    console.log(meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7",
      rawdata: "mime"
    }));
  })
  .catch(err => console.log(err));
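With rawdata: "mime", meSpeak returns the synthesized audio as a data URL rather than playing it (check the meSpeak documentation for the exact rawdata variants your version supports). A minimal sketch, assuming such a "data:audio/x-wav;base64,..." string, of turning it into a Blob that can then be offered as a .wav download:

```javascript
// Convert a base64 data URL (as returned by meSpeak with rawdata: "mime",
// per the assumption above) into a Blob carrying the same bytes and MIME type.
function dataURLToBlob(dataURL) {
  const [header, base64] = dataURL.split(",");
  const mime = header.match(/data:([^;]+)/)[1]; // e.g. "audio/x-wav"
  const bytes = Uint8Array.from(atob(base64), c => c.charCodeAt(0));
  return new Blob([bytes], { type: mime });
}

// In a browser you could then trigger a download of the file:
// const blob = dataURLToBlob(wavDataURL);
// const a = document.createElement("a");
// a.href = URL.createObjectURL(blob);
// a.download = "speech.wav";
// a.click();
```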
There is also a workaround using MediaRecorder, which depends on the system hardware; see How to capture generated audio from window.speechSynthesis.speak() call?.
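A sketch of that MediaRecorder workaround: record a MediaStream to a Blob while speechSynthesis.speak() is playing. How you obtain a stream that actually carries the synthesized speech is platform-dependent (e.g. a loopback/"Stereo Mix" input exposed through getUserMedia, or tab capture) and is an assumption here, not something the Web Speech API guarantees; the output will be in the recorder's native format (typically webm or ogg), not wav.

```javascript
// Record the given MediaStream for a fixed duration and resolve with a
// Blob of the captured audio. The stream must already carry the audio
// you want (see the loopback assumption in the text above).
function recordStream(stream, durationMs) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const recorder = new MediaRecorder(stream);
    recorder.ondataavailable = e => chunks.push(e.data);
    recorder.onstop = () => resolve(new Blob(chunks, { type: recorder.mimeType }));
    recorder.onerror = reject;
    recorder.start();
    setTimeout(() => recorder.stop(), durationMs);
  });
}

// Usage (browser only), assuming the default audio input is a loopback device:
// const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// const u = new SpeechSynthesisUtterance("0 1 2 3");
// window.speechSynthesis.speak(u);
// const blob = await recordStream(stream, 5000);
```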
Regarding "javascript - Generating an audio file with the W3C Web Speech API", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38727696/