Automatic Speech Recognition System

As a new development in modern conference solutions, the automatic speech recognition (ASR) system brings a more intelligent human-computer interaction experience. In traditional conferences, 70% of meeting information is received visually and only 30% by sound. Communication by sound and video alone can no longer satisfy modern conference needs. In addition, after a meeting, document processing, meeting minutes, and the legal procedures of certain users all require the proceedings in written form. The Gonsin Automatic Speech Recognition System transcribes speech into text in real time, completely and in order, and ensures that the text is matched to each delegate's speech. The transcribed text can be displayed in real time on the large screen as well as on the Gonsin paperless conference system.

The ASR system suits various application scenarios, including meeting minutes, training records, real-time speech subtitles, interview transcription, and real-time court trial records.

The Gonsin Automatic Speech Recognition System is developed on the Gonsin full digital conference technology platform. It connects the network audio data to the ASR back end and, with the support of the ASR engine and Gonsin application software, transcribes voice into text in real time.

The ASR back end is available in two modes, local server (LAN) and cloud platform, to meet different application requirements. Both modes must be used together with the automatic speech recognition module of the Gonsin intelligent conference management software.


Features
 
  • Works with the conference system, adapts to noisy environments, and provides clear sound pickup.
  • Recognizes the speech of each role in real time and generates a separate voice recording file for each role.
  • Transcribes the speech of each role into text in real time and generates a separate text file for each role.
  • Works with the Gonsin 20000S series or Leader series conference system and supports multiple microphones active at the same time. The voice of each microphone is recognized in real time, recorded to a separate voice file, and transcribed into text (the number of authorized voice transcription modules should match the number of simultaneously active microphones).
  • Works with the Gonsin Z4 series conference system with one active microphone. The voice of that microphone is recognized in real time, recorded to a separate voice file, and transcribed into text.
  • Merges the text and voice recordings of all roles to generate meeting minutes, and supports text export.
  • Intelligent semantic recognition and semantics-based sentence segmentation.
  • Voice recordings and transcribed text can be played back synchronously and displayed side by side for comparison, which supports document correction.
  • Keyword retrieval quickly locates the corresponding content and greatly improves the efficiency of content retrieval.
  • Supports main-screen and split-screen display: transcribed text is shown in real time on the main screen of the operating computer and can be sent to the large screen display system, with adaptive screen resolution.
  • Transcribed text can be displayed on Gonsin paperless terminals in real time.
  • Supports conference center cluster deployment or local conference room deployment, with artificial intelligence learning and continuous system optimization.

Automatic Speech Recognition Module V7.1.0 (ASR) is the voice transcription module of conference management software V7.1.0 and provides the voice-to-text function. Before the meeting, assign a role to each participant's conference unit. During the meeting, the speech recognition module recognizes the voice stream of each conference unit in real time, synchronously generates an independent voice recording file and transcribed text file for each role, and presents them on the operating computer and the large screen display. The results can also be saved as a text + voice meeting minutes file based on a preset template.
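
The workflow in the paragraph above (assign a role to each conference unit, transcribe each unit separately, then combine the results into minutes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Gonsin's implementation: the `Segment` structure, the `build_minutes` function, and the sample roles and sentences are hypothetical names and data invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized utterance from a single conference unit (hypothetical structure)."""
    unit_id: int    # ID of the conference unit (microphone) that produced the speech
    start_s: float  # start time in seconds from the beginning of the meeting
    text: str       # transcribed text of the utterance


def build_minutes(roles: dict[int, str], segments: list[Segment]) -> str:
    """Merge per-unit segments into one chronological transcript labeled by role."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        # Fall back to the raw unit ID if no role was assigned before the meeting.
        role = roles.get(seg.unit_id, f"Unit {seg.unit_id}")
        minutes, seconds = divmod(int(seg.start_s), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {role}: {seg.text}")
    return "\n".join(lines)


# Role assignment done before the meeting (illustrative values).
roles = {1: "Chairman", 2: "Delegate A"}

# Segments arrive per unit, not necessarily in time order.
segments = [
    Segment(2, 75.0, "I agree with the proposal."),
    Segment(1, 12.5, "The meeting is now open."),
]

minutes_text = build_minutes(roles, segments)
print(minutes_text)
```

Sorting by start time is what turns the separate per-role text files into a single ordered record; a real template would add a header, attendee list, and links to the matching voice recordings.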

Basic Functions
♦ Recognizes the speech of each role in real time and generates a separate voice recording file for each role.
♦ Transcribes the speech of each role into text in real time and generates a separate text file for each role.
♦ Works with the Gonsin 20000S series or Leader series conference system and supports multiple microphones active at the same time. The speech of each microphone is recognized in real time, recorded to a separate voice file, and transcribed into text (the number of authorized voice transcription modules should match the number of simultaneously active microphones).
♦ Works with the Gonsin Z4 series conference system with one active microphone. The voice of that microphone is recognized in real time, recorded to a separate voice file, and transcribed into text.
♦ Merges the text and voice recordings of all roles to generate meeting minutes, and supports text export.
♦ Intelligent semantic recognition and semantics-based sentence segmentation.
♦ Voice recordings and transcribed text can be played back synchronously and displayed side by side for comparison, which supports document correction.
♦ Keyword retrieval quickly locates the corresponding content and greatly improves the efficiency of content retrieval.
♦ Supports main-screen and split-screen display: transcribed text is shown in real time on the main screen of the operating computer and can be sent to the large screen display system, with adaptive screen resolution.
♦ Transcribed text can be displayed on Gonsin paperless terminals in real time.
♦ Conference system management and settings (e.g. equipment search, terminal ID, sensitivity).
♦ Conference information editing and management (conference content editing, personnel information settings, conference unit role settings, etc.).
♦ Compatible with different Gonsin conference system series.
♦ Supports screen customization and editing of the visual interface, e.g. text font, color, pictures, and associated data. Supports fast switching between multiple interface styles.
♦ The software supports secondary development; the interface protocol can be opened for customized development on project request.
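
As a rough illustration of the keyword retrieval function listed above, the sketch below searches a list of timestamped transcript entries and returns the matches, whose timestamps could then be used to jump to the corresponding point in the voice recording. The data layout and the `find_keyword` function are assumptions made for this example; the actual software's storage format and search method are not described here.

```python
# Each entry: (start time in seconds, role, transcribed text). Illustrative data only.
transcript = [
    (12.5, "Chairman", "The meeting is now open."),
    (75.0, "Delegate A", "I agree with the budget proposal."),
    (140.2, "Delegate B", "The budget needs one more review."),
]


def find_keyword(entries, keyword):
    """Return every entry whose text contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [entry for entry in entries if needle in entry[2].casefold()]


hits = find_keyword(transcript, "Budget")
for start_s, role, text in hits:
    # The start time is what a player would seek to for synchronized playback.
    print(f"{start_s:7.1f}s  {role}: {text}")
```

A plain substring match is the simplest possible choice; it is used here only to show how a hit ties the text back to a position in the recording.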

Technical Parameters

Built-in ASR Engine V3.0 and Voice Transcription Module Authorization V1.0

Basic Functions
♦ Installed with ASR engine V3.0 software.
♦ Industry-leading single-pass decoding technology with a large-scale language model.
♦ Recognition engines for English, Russian, and Thai can be customized.
♦ Industry-specific recognition engines can be customized for finance, politics and law, medical treatment, education, etc.
♦ High-efficiency CTC model supports up to 50 simultaneous speech recognition channels, subject to optional authorization.
♦ Supports centralized LAN deployment for multiple conference rooms in a conference center, enabling simultaneous voice transcription across multiple conference rooms.
♦ Used together with Gonsin management software, speaker roles can be separated and identified.
♦ Supports deployment in a conference center cluster or a local conference room.
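
When planning a centralized deployment, the channel limit and the rule that authorizations should match the number of simultaneously active microphones can be checked with simple arithmetic. The sketch below is a planning aid under stated assumptions: it treats the 50-channel figure as the limit of one engine instance, and the room names, microphone counts, and function names are hypothetical.

```python
MAX_ENGINE_CHANNELS = 50  # upper limit quoted in the parameters above (assumed per engine)


def required_channels(active_mics_per_room: dict[str, int]) -> int:
    """Total channels needed if every room uses its maximum active microphones at once."""
    return sum(active_mics_per_room.values())


def fits_authorization(active_mics_per_room: dict[str, int], authorized_channels: int) -> bool:
    """Check whether the authorized channel count covers all simultaneously active microphones."""
    if not 0 <= authorized_channels <= MAX_ENGINE_CHANNELS:
        raise ValueError(f"authorized channels must be between 0 and {MAX_ENGINE_CHANNELS}")
    return required_channels(active_mics_per_room) <= authorized_channels


# Maximum simultaneously active microphones per conference room (illustrative values).
rooms = {"Room A": 4, "Room B": 6, "Room C": 1}

print(required_channels(rooms))       # 11
print(fits_authorization(rooms, 16))  # True
print(fits_authorization(rooms, 8))   # False
```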

Technical Parameters

Basic Functions
♦ Industry-leading single-pass decoding technology with a large-scale language model.
♦ Optional English, Russian, and Thai speech recognition engines.
♦ Industry-specific recognition engines can be customized for finance, politics and law, medical treatment, education, etc.
♦ High-efficiency CTC model supports up to 50 simultaneous speech recognition channels, subject to optional authorization.
♦ Supports centralized LAN deployment for multiple conference rooms in a conference center, enabling simultaneous voice transcription across multiple conference rooms.
♦ Used together with Gonsin management software, speaker roles can be separated and identified.
♦ The ASR (Automatic Speech Recognition) engine V3.0 software is installed and runs on the intelligent speech recognition server.

ASR Conference Cluster Deployment Scheme Connection Diagram