"WAV header indicates an unsupported format" error in the Google Cloud Speech-to-Text API

Question

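I am trying to transcribe a WAV file with the Google Cloud Speech-to-Text API. I first upload the file to a Cloud Storage bucket, which succeeds, and then pass the resulting URI to the API. The request fails with the error below, which suggests that the config object I am providing may be wrong: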

(node:15728) UnhandledPromiseRejectionWarning: Error: 3 INVALID_ARGUMENT: WAV header indicates an unsupported format.
    at Object.callErrorFromStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\call.js:31:26)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client.js:176:52)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client-interceptors.js:342:141)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client-interceptors.js:305:181)
    at C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\call-stream.js:124:78
    at processTicksAndRejections (internal/process/task_queues.js:75:11)
(node:15728) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:15728) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

The code I wrote is:

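// Imports and client setup were omitted from the snippet; the lines below
// are the assumed setup for the identifiers used further down.
const path = require("path");
const _ = require("lodash");
const { Storage } = require("@google-cloud/storage");
const speech = require("@google-cloud/speech");

const filePath = "i_think_arthur.wav"; // WAV file

// Google Cloud Storage
const bucketName = "<bucket name>"; // Must exist in your Cloud Storage
const keyFilename = "<path to service account key>";

// Speech-to-Text client (assumed to use the same service account key)
const speechClient = new speech.SpeechClient({ keyFilename });

// Uploads the local file to the bucket and returns its gs:// URI
const uploadToGcs = async () => {
  const storage = new Storage({
    projectId: "<my project id>",
    keyFilename,
  });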

  const bucket = storage.bucket(bucketName);
  const fileName = path.basename(filePath);

  await bucket.upload(filePath);

  return `gs://${bucketName}/${fileName}`;
};

// Upload to Cloud Storage first, then detect speech in the audio file
uploadToGcs()
  .then(async (gcsUri) => {
    const audio = {
      uri: gcsUri,
    };

    const config = {
      encoding: "OGG_OPUS",
      sampleRateHertz: 48000,
      // encoding: "LINEAR16",
      languageCode: "en-US",
      audioChannelCount: 2,
      enableSeparateRecognitionPerChannel: true,
    };

    const request = {
      audio,
      config,
    };

    speechClient
      .longRunningRecognize(request)
      .then((data) => {
        const operation = data[0];

        // The following Promise represents the final result of the job
        return operation.promise();
      })
      .then((data) => {
        const results = _.get(data[0], "results", []);
        const transcription = results
          .map((result) => result.alternatives[0].transcript)
          .join("\n");
        console.log(`Transcription: ${transcription}`);
      });
  })
  .catch((err) => {
    console.error("ERROR:", err);
  });

Any help with this issue would be appreciated. Thanks.

Solution

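I ran into almost exactly the same thing.

I was mixing audio samples and trying to recognize specific sounds. I mixed the samples in OpenShot Video Editor and then used https://online-audio-converter.com/ to convert the resulting .mp4 file to WAV.

Specifically, the following settings on the converter site worked with the default settings in the Google Cloud scripts: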

Under Advanced settings:

  • Sample rate: 16000 Hz (16 kHz)
  • Channels: 1 (mono)
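
To confirm that a converted file really has these properties before uploading it, the WAV header can be inspected directly. The following is a minimal sketch, not part of the original workflow: `parseWavHeader` and `readWavHeader` are hypothetical helper names, and the code assumes a standard little-endian RIFF/WAVE file with a "fmt " chunk.

```javascript
const fs = require("fs");

// Hypothetical helper: extracts the basic format fields from a RIFF/WAVE buffer.
// Walks the chunk list because the "fmt " chunk is not guaranteed to come first.
function parseWavHeader(buffer) {
  if (
    buffer.length < 12 ||
    buffer.toString("ascii", 0, 4) !== "RIFF" ||
    buffer.toString("ascii", 8, 12) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }

  let offset = 12;
  while (offset + 8 <= buffer.length) {
    const chunkId = buffer.toString("ascii", offset, offset + 4);
    const chunkSize = buffer.readUInt32LE(offset + 4);

    if (chunkId === "fmt ") {
      if (offset + 8 + 16 > buffer.length) {
        throw new Error("Truncated fmt chunk");
      }
      return {
        audioFormat: buffer.readUInt16LE(offset + 8), // 1 = uncompressed PCM
        channels: buffer.readUInt16LE(offset + 10),
        sampleRate: buffer.readUInt32LE(offset + 12),
        bitsPerSample: buffer.readUInt16LE(offset + 22),
      };
    }

    // Skip to the next chunk; chunk data is padded to an even byte count.
    offset += 8 + chunkSize + (chunkSize % 2);
  }

  throw new Error("No fmt chunk found");
}

// Hypothetical convenience wrapper for a file on disk.
function readWavHeader(filePath) {
  return parseWavHeader(fs.readFileSync(filePath));
}
```

For a file converted with the settings above, `readWavHeader("<path to converted file>")` should report `sampleRate: 16000` and `channels: 1`. As far as I know, an `audioFormat` of 1 with 16 bits per sample corresponds to what the API calls LINEAR16.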

You now have an audio file that can be used with Google Cloud Speech-to-Text!
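
If you prefer to set the config explicitly rather than rely on defaults, the request for such a file might look like the sketch below. It assumes the converter produced 16-bit PCM (LINEAR16), which I have not verified for every converter setting. `buildRequest` is a hypothetical helper name, while the config field names are the ones already used in the question.

```javascript
// Hypothetical helper: builds a recognize request for a mono, 16 kHz,
// 16-bit PCM WAV file stored in Cloud Storage.
function buildRequest(gcsUri) {
  return {
    audio: { uri: gcsUri },
    config: {
      encoding: "LINEAR16", // assumption: the WAV holds 16-bit PCM samples
      sampleRateHertz: 16000, // must match the sample rate in the WAV header
      languageCode: "en-US",
      audioChannelCount: 1, // mono, so no per-channel recognition is needed
    },
  };
}
```

With the code from the question, this would be passed as `speechClient.longRunningRecognize(buildRequest(gcsUri))`. The point is that `encoding`, `sampleRateHertz`, and `audioChannelCount` describe the file actually uploaded, whereas the question's config declared `OGG_OPUS` at 48000 Hz for a WAV file.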