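Problem
I am using the Google Text to Speech API to get audio data for a given text. The audio plays, but it is completely unintelligible. It is definitely the wrong audio; it sounds like a bunch of random noise. Am I perhaps decoding the data incorrectly?
Can anyone help me get the correct audio and play it?
My code
Codable structs for the JSON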
struct TextToSpeechBody: Codable {
    var input: SynthesisInput
    var voice: VoiceSelectionParams
    var audioConfig: AudioConfig
}
struct SynthesisInput: Codable {
    var text: String
}
struct VoiceSelectionParams: Codable {
    var languageCode: String
    var name: String
    var ssmlGender: String
}
struct AudioConfig: Codable {
    var audioEncoding: String
}
struct ResponseBody: Codable {
    var audioContent: String
}
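As a quick sanity check of these structs, the sketch below encodes a request body and decodes a response without touching the network. The response payload is a hand-made, hypothetical example (four arbitrary bytes), not real API output, and the structs are repeated only so the snippet runs on its own.

```swift
import Foundation

// Same shapes as the structs above, repeated so the snippet is self-contained.
struct SynthesisInput: Codable { var text: String }
struct VoiceSelectionParams: Codable {
    var languageCode: String
    var name: String
    var ssmlGender: String
}
struct AudioConfig: Codable { var audioEncoding: String }
struct TextToSpeechBody: Codable {
    var input: SynthesisInput
    var voice: VoiceSelectionParams
    var audioConfig: AudioConfig
}
struct ResponseBody: Codable { var audioContent: String }

// Encode a request body and print the JSON that would be sent.
let body = TextToSpeechBody(
    input: SynthesisInput(text: "Hello"),
    voice: VoiceSelectionParams(languageCode: "en-US", name: "en-US-WaveNet-I", ssmlGender: "MALE"),
    audioConfig: AudioConfig(audioEncoding: "MP3")
)
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]  // stable key order, easier to compare
let requestJSON = String(decoding: try! encoder.encode(body), as: UTF8.self)
print(requestJSON)

// Decode a hand-made response (hypothetical payload) and undo the base64 step.
let sampleBytes = Data([0xFF, 0xFB, 0x90, 0x00])
let sampleResponse = "{\"audioContent\":\"\(sampleBytes.base64EncodedString())\"}"
let decoded = try! JSONDecoder().decode(ResponseBody.self, from: Data(sampleResponse.utf8))
let audioData = Data(base64Encoded: decoded.audioContent)
```

If the printed JSON matches what the API test page sends and the decoded bytes round-trip, the Codable layer is probably not where the audio gets corrupted.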
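Text to Speech manager class
playAudio(of text: String) takes the text we want spoken. I am using "Hello, what's up dude!".
It passes the text parameter to fetchAudio(from text: String, completion: @escaping (_ audioData: Data) -> Void), which handles all the networking.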
import AVFoundation
import Foundation

class TextToSpeechManager: NSObject {
    //MARK: - Variables
    static let shared = TextToSpeechManager()
    private let baseURL = "https://texttospeech.googleapis.com/v1/text:synthesize?key=MYAPIKEY"
    private let httpMethod = "POST"
    private let languageCode = "en-US"
    private let maleVoice = "en-US-WaveNet-I"
    private let femaleVoice = "en-US-WaveNet-H"
    private let male = "MALE"
    private let female = "FEMALE"
    private let audioEncoding = "MP3"
    private let encoder = JSONEncoder()
    var avPlayer: AVAudioPlayer?

    //MARK: - INIT
    override init() {
        super.init()
    }

    func fetchAudio(from text: String, completion: @escaping (_ audioData: Data) -> Void) {
        // Initialize URL
        guard let url = URL(string: baseURL) else {
            print("Unable to form url")
            return
        }
        // Setup Request
        var request = URLRequest(url: url)
        request.httpMethod = httpMethod
        request.addValue("application/json", forHTTPHeaderField: "Accept")
        request.addValue("application/json", forHTTPHeaderField: "Content-Type")
        // Encode http Body
        let synthesisInput = SynthesisInput(text: text)
        let voiceParams = VoiceSelectionParams(languageCode: languageCode, name: maleVoice, ssmlGender: male)
        let audioConfig = AudioConfig(audioEncoding: audioEncoding)
        let body = TextToSpeechBody(input: synthesisInput, voice: voiceParams, audioConfig: audioConfig)
        guard let encodedBody = try? encoder.encode(body) else {
            print("Error Encoding Body")
            return
        }
        request.httpBody = encodedBody
        let task = URLSession.shared.dataTask(with: request) { (data, response, error) in
            if let error = error {
                print("Error fetching audio: \(error.localizedDescription)")
                return
            }
            guard let data = data else {
                print("Data is nil")
                return
            }
            // Dump the raw JSON response for debugging
            print(String(decoding: data, as: UTF8.self))
            guard let response = try? JSONDecoder().decode(ResponseBody.self, from: data) else {
                print("Unable to decode data")
                return
            }
            // audioContent is base64 text; convert it back to raw bytes
            guard let audioData = Data(base64Encoded: response.audioContent) else {
                print("Unable to get audio data")
                return
            }
            completion(audioData)
        }
        task.resume()
    }

    func playAudio(of text: String) {
        fetchAudio(from: text) { (audioData) in
            // Create and start the player on the main queue
            DispatchQueue.main.async {
                do {
                    self.avPlayer = try AVAudioPlayer(data: audioData, fileTypeHint: AVFileType.mp3.rawValue)
                    self.avPlayer?.play()
                } catch let error {
                    print("Error occurred while playing audio: \(error.localizedDescription)")
                }
            }
        }
    }
}
Calling the function
TextToSpeechManager.shared.playAudio(of: "Hello, what's up dude!")
Google documentation
https://cloud.google.com/text-to-speech/docs/reference/rest/v1/text/synthesize#AudioEncoding
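The documentation describes the REST API and shows what the network request and response look like.
At the bottom of that page you can test the API by filling in the request fields, and it shows you the request output. The output in my program is the same as the output in the test GUI. That is why I suspect I am either feeding the wrong kind of data into avPlayer or doing something wrong when playing the audio.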
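One way to test the suspicion that the wrong kind of data reaches avPlayer is to look at the first bytes of the decoded audio. The helper below is a hypothetical sketch, not part of any Google or Apple API. It relies only on two properties of the format: an MP3 file starts either with an ID3v2 tag (the ASCII bytes "ID3") or with an MPEG audio frame whose first 11 bits are all set. It inspects only the start of the data, so a positive result does not prove the rest of the stream is intact.

```swift
import Foundation

/// Hypothetical helper: returns true when `data` begins like an MP3 stream,
/// meaning an ID3v2 tag ("ID3") or an MPEG frame sync (11 leading set bits).
func looksLikeMP3(_ data: Data) -> Bool {
    let bytes = [UInt8](data.prefix(3))
    // "ID3" in ASCII is 0x49 0x44 0x33
    if bytes == [0x49, 0x44, 0x33] { return true }
    guard bytes.count >= 2 else { return false }
    // Frame sync: first byte 0xFF, top three bits of the second byte set
    return bytes[0] == 0xFF && (bytes[1] & 0xE0) == 0xE0
}

// Example inputs: an ID3 header, a raw frame header, and a WAV ("RIFF") header.
print(looksLikeMP3(Data([0x49, 0x44, 0x33, 0x04])))
print(looksLikeMP3(Data([0xFF, 0xFB, 0x90, 0x00])))
print(looksLikeMP3(Data("RIFF".utf8)))
```

Calling looksLikeMP3(audioData) inside the fetchAudio completion, just before creating the AVAudioPlayer, would show whether the bytes match the AVFileType.mp3 hint being passed.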
Thanks for all your help!
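**Edit**
Solution
I was able to find an example published with a working project. Here is the link to that project: https://medium.com/google-cloud/how-to-integrate-google-cloud-text-to-speech-api-into-your-ios-app-140ab7be42ae
Based on his project, I used a semaphore to make the request synchronous, and that made it work. I'm not sure I fully understand why; if anyone can explain, I would love to hear it. For that reason I have not posted this as an answer yet, because I want to understand it before posting a solution.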
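For readers unfamiliar with the technique, here is a minimal sketch of how a semaphore turns a callback-based call into a blocking one. It illustrates the mechanism only and does not explain why the unintelligible audio went away. fakeAsyncFetch is a hypothetical stand-in for the network request, and the 0.1 second delay is an arbitrary assumption.

```swift
import Foundation

// Hypothetical stand-in for an asynchronous network call: it returns
// immediately and delivers its result later on a background queue.
func fakeAsyncFetch(completion: @escaping (Data) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
        completion(Data([1, 2, 3]))
    }
}

// Blocks the calling thread until the callback has run, then returns the result.
// In an app this should run off the main thread, because wait() blocks.
func fetchSynchronously() -> Data {
    var result = Data()
    let semaphore = DispatchSemaphore(value: 0)  // count starts at 0, so wait() blocks
    fakeAsyncFetch { data in
        result = data
        semaphore.signal()  // raises the count and wakes the waiting thread
    }
    semaphore.wait()        // parks here until signal() has been called
    return result
}

// Called from the top level only for demonstration purposes.
let fetched = fetchSynchronously()
print(fetched.count)
```

The key detail is the initial value of 0: wait() cannot proceed until signal() has run, so every path through the callback needs to signal, otherwise the waiting thread never resumes.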
Working code
func fetchAudio(from text: String, completion: @escaping (_ audioData: Data) -> Void) {
    var result = Data()
    DispatchQueue.global().async {
        // Initialize URL
        guard let url = URL(string: self.baseURL) else {
            print("Unable to form url")
            return
        }
        // Setup Request
        var request = URLRequest(url: url)
        request.httpMethod = self.httpMethod
        request.addValue("application/json", forHTTPHeaderField: "Accept")
        request.addValue("application/json", forHTTPHeaderField: "Content-Type")
        // Encode http Body
        let synthesisInput = SynthesisInput(text: text)
        let voiceParams = VoiceSelectionParams(languageCode: self.languageCode, name: self.maleVoice, ssmlGender: self.male)
        let audioConfig = AudioConfig(audioEncoding: self.audioEncoding)
        let body = TextToSpeechBody(input: synthesisInput, voice: voiceParams, audioConfig: audioConfig)
        guard let encodedBody = try? self.encoder.encode(body) else {
            print("Error Encoding Body")
            return
        }
        request.httpBody = encodedBody
        // Using semaphore to make request synchronous
        let semaphore = DispatchSemaphore(value: 0)
        let task = URLSession.shared.dataTask(with: request) { (data, _, error) in
            // Caveat: the early returns below never signal the semaphore,
            // so the wait() further down blocks indefinitely on failure.
            if let error = error {
                print("Error fetching audio: \(error.localizedDescription)")
                return
            }
            guard let data = data else {
                print("Data is nil")
                return
            }
            // print(String(decoding: data, as: UTF8.self))
            guard let response = try? JSONDecoder().decode(ResponseBody.self, from: data) else {
                print("Unable to decode data")
                return
            }
            guard let audioData = Data(base64Encoded: response.audioContent) else {
                print("Unable to get audio data")
                return
            }
            result = audioData
            semaphore.signal()
        }
        task.resume()
        // Block this background thread until the data task has signaled
        _ = semaphore.wait(timeout: .distantFuture)
        completion(result)
    }
}