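Problem description
I am trying to build a simple bot using Twilio Media Streams and Google Speech-to-Text: it plays a message, pauses to listen to the user's input, and replies with an answer that fits that input. However, I can't find a way to send an answer back after receiving the audio; that is, I don't know how to send a "message" from my server to Twilio itself.
Here is my code: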
require('dotenv').config();
const WebSocket = require("ws");
const express = require("express");
const app = express();
const server = require("http").createServer(app);
const wss = new WebSocket.Server({ server });
let myAction = '<Say>default</Say>';
// Include Google Speech to Text
const speech = require("@google-cloud/speech");
const client = new speech.SpeechClient();
// Configure Transcription Request
const request = {
  config: {
    encoding: "MULAW",
    sampleRateHertz: 8000,
    languageCode: "iw-IL",
  },
  interimResults: true,
};
// Handle Web Socket Connection
wss.on("connection", function connection(ws) {
  console.log("New Connection Initiated");
  let recognizeStream = null;
  ws.on("message", function incoming(message) {
    const msg = JSON.parse(message);
    switch (msg.event) {
      case "connected":
        console.log(`A new call has connected.`);
        // Create Stream to the Google Speech to Text API
        recognizeStream = client
          .streamingRecognize(request)
          .on("error", console.error)
          .on("data", data => {
            // Guard against responses that carry no results or alternatives
            const answer = data.results?.[0]?.alternatives?.[0]?.transcript;
            if (answer?.includes('כן')) {
              myAction = "<Play>https://cinnamon-cockroach-2574.twil.io/assets/saidYes.mp3</Play>";
            }
            if (answer?.includes('לא')) {
              myAction = "<Play>https://cinnamon-cockroach-2574.twil.io/assets/saidNo.mp3</Play>";
            }
          });
        break;
      case "start":
        console.log(`Starting Media Stream ${msg.streamSid}`);
        break;
      case "media":
        // Write Media Packets to the recognize stream
        if (recognizeStream) recognizeStream.write(msg.media.payload);
        break;
      case "stop":
        console.log(`Call Has Ended`);
        if (recognizeStream) recognizeStream.destroy();
        break;
    }
  });
});
// Handle HTTP Request
app.get("/", (req, res) => res.send("Hello World"));
app.post("/", (req, res) => {
  res.set("Content-Type", "text/xml");
  res.send(`
<Response>
<Start>
<Stream url="wss://${req.headers.host}/"/>
</Start>
<Play>https://cinnamon-cockroach-2584.twil.io/assets/opening.mp3</Play>
<Pause length="5" />
${myAction/* robot respond */}
<Play>https://cinnamon-cockroach-2574.twil.io/assets/closing.mp3</Play>
</Response>
`);
});
console.log("Listening at Port 8080");
server.listen(8080);
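As you can see, I tried to use myAction to set the answer for the user dynamically, but it doesn't work. I would be glad to get some help or a pointer to the right documentation.
Thanks.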
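Solution
With <Start><Stream .../></Start> you start the stream asynchronously; see the docs:
The <Start> verb starts the audio <Stream> asynchronously and immediately continues with the next TwiML instruction. If there is no instruction, the call will be disconnected.
Because the whole TwiML response is generated before any audio has been transcribed, ${myAction} is already fixed to its default value by the time the caller speaks. You probably want to use a synchronous bi-directional stream with <Connect><Stream .../></Connect>, as described in the Twilio Media Streams documentation.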
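As a rough sketch of what such a response might look like, the helper below builds TwiML that plays the opening message, connects the stream synchronously, and then hands the call to a second webhook. The function name buildConnectTwiml and the /answer route are assumptions made for illustration; they do not come from the question or from Twilio.

```javascript
// Hypothetical helper: builds the TwiML for the first webhook response.
// With <Connect>, Twilio does not move on to the next instruction until the
// stream ends, for example when the server closes the websocket.
function buildConnectTwiml(host, openingUrl) {
  return [
    "<Response>",
    `  <Play>${openingUrl}</Play>`,
    "  <Connect>",
    `    <Stream url="wss://${host}/"/>`,
    "  </Connect>",
    // "/answer" is an assumed route that would return the reply TwiML.
    '  <Redirect method="POST">/answer</Redirect>',
    "</Response>",
  ].join("\n");
}
```

In the Express handler from the question this could be used as res.send(buildConnectTwiml(req.headers.host, openingUrl)).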
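Basically, what your TwiML needs to do is:
- Say the opening message
- Connect the stream via <Connect><Stream .../></Connect>, listen, process the audio with the Google Speech to Text API, store the result in a database/cache keyed by the phone number, and close the websocket when you are done
- Retrieve the result from the database/cache
- Say something based on the result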
The Connect Basic Demo should get you started in the right direction.
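The store-and-retrieve steps above could be sketched roughly as follows. This is only an illustration: an in-memory Map stands in for the database/cache, and the names replies, pickReply, storeReply and buildAnswerTwiml, the callKey parameter and the assetBase parameter are all hypothetical. The key could be the caller's phone number, as suggested above, or the call SID.

```javascript
// In-memory stand-in for the database/cache (assumption; use a real store in production).
const replies = new Map();

// Map a transcript to the audio file to play, mirroring the checks in the question.
function pickReply(transcript) {
  if (transcript.includes("כן")) return "saidYes.mp3";
  if (transcript.includes("לא")) return "saidNo.mp3";
  return null;
}

// Store the reply for this call. Returns true once a reply is known,
// which is the point where the server could close the websocket.
function storeReply(callKey, transcript) {
  const reply = pickReply(transcript);
  if (reply) replies.set(callKey, reply);
  return reply !== null;
}

// Build the TwiML for the follow-up webhook from the stored result.
function buildAnswerTwiml(callKey, assetBase) {
  const reply = replies.get(callKey);
  replies.delete(callKey); // one-shot: do not leak state into later calls
  const body = reply
    ? `<Play>${assetBase}/${reply}</Play>`
    : "<Say>default</Say>";
  return `<Response>${body}<Play>${assetBase}/closing.mp3</Play></Response>`;
}
```

In the websocket handler, the "data" callback would call storeReply with the transcript and, when it returns true, call ws.close() so that Twilio continues to the instruction after <Connect>. The follow-up webhook would then send buildAnswerTwiml for the same key.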