Speech recognition in a WeChat mini program with the iFLYTEK interface

I have seen several posts online that use iFLYTEK's interface in a WeChat mini program, but I ran into problems when I actually followed them. This post therefore summarizes how to use the iFLYTEK interface. I use the WebAPI here.

1. Apply for iFLYTEK's interface

iFLYTEK's official website

After entering the official website, log in to your account (register one if you do not have it yet).

After logging in, click the console in the upper right corner, then create a new application in the control center.

After filling in the information, you can create the application. Open it, and you will see three pieces of information in the upper left corner:

  • APPID
  • APISecret
  • APIKey

These three values verify our identity when the program calls iFLYTEK's interface, as their names suggest.

2. Create the WeChat mini program

I will not cover installing the WeChat developer tools or applying for a developer account. The files that make up a mini program are explained in the official documents, so I will also skip the basics of mini program development.

(1) Display page

After creating a new WeChat mini program, first write the front-end display page:

app.wxss file

page {
  height: 100%;
  background-color: #ffffff;
}

.container {
  height: 100%;
  display: flex;
  flex-direction: column;
}

index.wxml file

<view class="container">
    <view class="showContent">
        <view>{{searchKey}}</view>
    </view>

    <view class="content">
        <button class="btn" bindtouchstart='start' bindtouchend="stop">Click the button to speak</button>
    </view>
</view>

bindtouchstart runs the specified method when the button is pressed; likewise, bindtouchend runs the specified method when the button is released.

index.wxss file

.showContent {
    flex: 3 0 auto;
    text-align: center;

    padding: 100rpx;
    font-size: 40rpx;
    color: black;
}

.content {
    flex: 1 1 auto;
    display: flex;
    align-items: flex-end;
    justify-content: center;

    margin-bottom: 60rpx;
    width: 100%;
}

.content .btn {
    border-radius: 40rpx;
    width: 80%;
    letter-spacing: 20rpx;
}

Because it is only a simple template, the interface is not particularly good-looking.

(2) Process authentication string

The official documentation explains the Web API in detail, so I will not repeat it here.

Requests to iFLYTEK's API must pass interface authentication. Because I am familiar with Java, I use Java to generate the authenticated request URL.

Use Spring Boot as the back end

After creating a Spring Boot application, handle the request with the usual MVC pattern.

Controller

import com.example.template.service.UrlService;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequiredArgsConstructor
@RequestMapping(path = "/url")
public class UrlController {

    private final UrlService urlService;

    @GetMapping
    public String getUrl() {
        return urlService.getUrl();
    }
}

The service code needs the okhttp dependency:

<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.9.1</version>
</dependency>

Service

import okhttp3.HttpUrl;
import org.springframework.stereotype.Service;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.net.URL;
import java.nio.charset.Charset;
import java.text.SimpleDateFormat;
import java.util.Base64;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

@Service
public class UrlService {
    private static final String hostUrl = "https://iat-api.xfyun.cn/v2/iat";
    private static final String apiSecret = ""; // Obtained from the console: My Applications > Voice Dictation (Streaming)
    private static final String apiKey = ""; // Obtained from the console: My Applications > Voice Dictation (Streaming)

    public String getUrl() {
        try {
            String authUrl = getAuthUrl(hostUrl, apiKey, apiSecret);
            return authUrl.replace("http://", "ws://").replace("https://", "wss://");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    // The getAuthUrl() method is given on the official website and can be found in the official documents.
    private String getAuthUrl(String hostUrl, String apiKey, String apiSecret) throws Exception {
        URL url = new URL(hostUrl);
        SimpleDateFormat format = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        String date = format.format(new Date());
        StringBuilder builder = new StringBuilder("host: ").append(url.getHost()).append("\n").//
                append("date: ").append(date).append("\n").//
                append("GET ").append(url.getPath()).append(" HTTP/1.1");
        Charset charset = Charset.forName("UTF-8");
        Mac mac = Mac.getInstance("hmacsha256");
        SecretKeySpec spec = new SecretKeySpec(apiSecret.getBytes(charset), "hmacsha256");
        mac.init(spec);
        byte[] hexDigits = mac.doFinal(builder.toString().getBytes(charset));
        String sha = Base64.getEncoder().encodeToString(hexDigits);

        String authorization = String.format("api_key=\"%s\", algorithm=\"%s\", headers=\"%s\", signature=\"%s\"", apiKey, "hmac-sha256", "host date request-line", sha);
        HttpUrl httpUrl = HttpUrl.parse("https://" + url.getHost() + url.getPath()).newBuilder().//
                addQueryParameter("authorization", Base64.getEncoder().encodeToString(authorization.getBytes(charset))).//
                addQueryParameter("date", date).//
                addQueryParameter("host", url.getHost()).//
                build();
        return httpUrl.toString();
    }
}
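
For readers who would rather not run a Java back end, the same signing steps can be sketched in JavaScript for Node.js. This is my own transliteration of the Java method above, not code from iFLYTEK; buildAuthUrl is a hypothetical name, and it assumes Node's built-in crypto module is available (it is not available inside the mini program itself).

```javascript
const crypto = require('crypto');

// Build the authenticated wss:// URL from the host URL and the two credentials.
// The optional date argument exists only to make the result reproducible.
function buildAuthUrl(hostUrl, apiKey, apiSecret, date = new Date()) {
  const url = new URL(hostUrl);
  // RFC 1123 date in GMT, the same shape the Java SimpleDateFormat produces
  const dateStr = date.toUTCString();
  // The string to sign: host, date and request line, separated by newlines
  const signatureOrigin =
    `host: ${url.host}\ndate: ${dateStr}\nGET ${url.pathname} HTTP/1.1`;
  // HMAC-SHA256 with apiSecret as the key, Base64-encoded
  const signature = crypto
    .createHmac('sha256', apiSecret)
    .update(signatureOrigin)
    .digest('base64');
  const authorizationOrigin =
    `api_key="${apiKey}", algorithm="hmac-sha256", ` +
    `headers="host date request-line", signature="${signature}"`;
  const authorization = Buffer.from(authorizationOrigin).toString('base64');
  // Percent-encode each query value, as HttpUrl does in the Java version
  return `wss://${url.host}${url.pathname}` +
    `?authorization=${encodeURIComponent(authorization)}` +
    `&date=${encodeURIComponent(dateStr)}` +
    `&host=${encodeURIComponent(url.host)}`;
}
```

Calling buildAuthUrl('https://iat-api.xfyun.cn/v2/iat', apiKey, apiSecret) should yield a URL of the same form as the one returned by the Spring Boot service.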

Send a request in the js file of the mini program to obtain the authenticated link

Before the mini program can send requests to a local server, the configuration of the WeChat developer tools has to be changed: in the Details panel, under the local settings, check the option that skips verification of valid domain names, TLS versions, and HTTPS certificates.

Without this setting, the request to the local server that generates the authentication string will be blocked.

Here I write a separate function that makes the request and stores the result.

getUrl() {
    wx.request({
      url: 'http://localhost:8080/url',
      method: 'GET',
      header: {
        'content-type': 'application/json' // Default value
      },
      success(result) {
        apiUrl = result.data;
        console.log(apiUrl);
      }
    })
  },

apiUrl here is a global variable defined in the js file. WeChat sends requests with wx.request. The method accepts other parameters as well, but for now it is enough to make a request and read the returned result.

Get the recording in the js file

Before recording, we need to decide on an audio format that iFLYTEK accepts. Comparing iFLYTEK's requirements with the WeChat documents, the PCM format can be used.

The recording parameters are defined in the js file:
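
One thing the success callback does not check is whether the back end actually returned a usable link: the service above returns null when signing fails, which reaches the mini program as an empty body. A minimal sketch of such a check follows; extractAuthUrl is a hypothetical helper, and it assumes the callback argument carries the statusCode and data fields that wx.request provides.

```javascript
// Validate the object passed to the wx.request success callback and
// return the authenticated WebSocket URL, or throw if it is unusable.
function extractAuthUrl(result) {
  const { statusCode, data } = result;
  if (statusCode !== 200) {
    throw new Error('Back end returned HTTP ' + statusCode);
  }
  // The service rewrites https:// to wss://, so anything else is suspicious
  if (typeof data !== 'string' || !data.startsWith('wss://')) {
    throw new Error('Back end did not return a wss:// link');
  }
  return data;
}
```

Inside success(result), the assignment could then read apiUrl = extractAuthUrl(result), wrapped in a try/catch that shows the error to the user.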

const options = {
  duration: 60000, // Maximum recording length, in ms
  sampleRate: 8000, // Sampling rate, in Hz
  numberOfChannels: 1, // Number of recording channels
  encodeBitRate: 48000, // Encoding bit rate
  format: 'PCM', // Audio format
  frameSize: 5, // Frame size, in KB
}
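
To get a feel for what frameSize means in time, the frame length can be worked out from the other options. The sketch below assumes 16-bit (2-byte) PCM samples, which is my assumption rather than something stated in the post; frameDurationMs is a hypothetical helper.

```javascript
// Estimate how many milliseconds of audio one recorded frame holds.
// bytesPerSample defaults to 2, i.e. 16-bit PCM (an assumption).
function frameDurationMs(opts, bytesPerSample = 2) {
  const bytesPerSecond = opts.sampleRate * opts.numberOfChannels * bytesPerSample;
  const frameBytes = opts.frameSize * 1024; // frameSize is given in KB
  return (frameBytes * 1000) / bytesPerSecond;
}

// With the options above: 5 KB at 8000 Hz, mono
console.log(frameDurationMs({ sampleRate: 8000, numberOfChannels: 1, frameSize: 5 })); // 320
```

Under that assumption each frame carries about a third of a second of audio, so a smaller frameSize gives more frequent, smaller chunks.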

We first write functions that only record.

According to the official documents of the WeChat mini program, recording is managed as a whole through the RecorderManager object returned by wx.getRecorderManager(). So we first define a recorderManager globally.

const recorderManager = wx.getRecorderManager();

Then we need to write two methods: the method to start recording and the method to end recording.

/* Start recording */
start: function () {
    recorderManager.onStart(() => { // Register the listener for the start event
        console.log('Start recording');
    });
    recorderManager.start(options); // Start recording
},

/* End recording */
stop: function() {
    recorderManager.onStop((result) => { // Register the listener for the stop event
        // tempFilePath is the temporary storage path of the recording file
        console.log('End of recording: ' + result.tempFilePath);
    });
    recorderManager.stop(); // Stop recording
},

At this point you can test whether recording works. (On first use you may need to grant the recording permission. I will explain permissions later; for now the focus is on integrating iFLYTEK's interface with the mini program.)

Call the iFLYTEK interface

iFLYTEK's Web API is accessed over WebSocket, so the mini program needs to create a WebSocket connection.

We create the connection when recording starts, so this code goes in the start method. Once the connection request succeeds, recording starts.

start: function () {
    // wxst is a global variable that holds the returned SocketTask
    wxst = wx.connectSocket({ // Open the WebSocket connection
        url: apiUrl,
        method: 'GET',
        success: function (res) {
            recorderManager.start(options); // Start recording
        }
    });
},

The other WebSocket listeners are written in onLoad, and the recorder listeners are written in onShow. This part follows another author's work; here is a link to the blog.
Wechat applet foreground calls iFLYTEK speech recognition interface
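
The messages those listeners send are not shown in this post. As a rough sketch, assuming the JSON frame layout as I recall it from iFLYTEK's streaming dictation documentation (a first frame carrying common, business and data, later frames carrying only data, and a status of 0, 1 or 2 for first, middle and last), a frame could be built like this. buildFrame is a hypothetical helper, and the business values are assumptions for Mandarin dictation; check them against the official documents.

```javascript
// Frame status values, as assumed from the streaming dictation protocol
const STATUS_FIRST = 0;
const STATUS_CONTINUE = 1;
const STATUS_LAST = 2;

// Build one JSON frame. audioBase64 is a chunk of PCM audio, Base64-encoded.
function buildFrame(status, audioBase64, appId) {
  const data = {
    status,
    format: 'audio/L16;rate=8000', // matches sampleRate: 8000 in the options
    encoding: 'raw',               // uncompressed PCM
    audio: audioBase64,
  };
  if (status === STATUS_FIRST) {
    // Only the first frame carries the APPID and the recognition settings
    return JSON.stringify({
      common: { app_id: appId },
      business: { language: 'zh_cn', domain: 'iat', accent: 'mandarin' },
      data,
    });
  }
  return JSON.stringify({ data });
}
```

In the mini program, the chunk delivered to recorderManager.onFrameRecorded could be converted with wx.arrayBufferToBase64 and the resulting string passed to the send method of the SocketTask.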

With these steps, the feature is complete.

There is still one pitfall: when testing in the simulator on the computer, the value returned by the iFLYTEK interface is always empty, even though audio is recorded. The solution is to debug on a real device, which does return results. To make testing easier, you can also fetch the authenticated URL from the back end first and put it directly into the url of the socket request.
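
When chasing an empty result, it helps to separate "the service reported an error" from "the service recognized nothing". The sketch below assumes the response layout as I recall it from iFLYTEK's documentation (a code field, and the words under data.result.ws[].cw[].w); extractText is a hypothetical helper.

```javascript
// Parse one message received on the socket and return the recognized text.
// Throws when the service reports a non-zero code.
function extractText(message) {
  const response = JSON.parse(message);
  if (response.code !== 0) {
    throw new Error('iFLYTEK error ' + response.code + ': ' + response.message);
  }
  const words =
    (response.data && response.data.result && response.data.result.ws) || [];
  // Take the first candidate of every word and join them into one string
  return words
    .map((item) => (item.cw && item.cw[0] ? item.cw[0].w : ''))
    .join('');
}
```

An empty string from this function with code 0 points at the audio itself, which matches the simulator problem described above, while a thrown error points at authentication or parameters.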

Tags: Front-end Mini Program wechat

Posted on Sat, 04 Sep 2021 22:42:59 -0400 by aiwebs