-
Publication No.: US20240312477A1
Publication date: 2024-09-19
Application No.: US18396788
Filing date: 2023-12-27
Inventors: Oron NIR, Inbal SAGIV, Maayan YEDIDIA, Fardau VAN NEERDEN, Itai NORMAN
IPC classification: G10L25/69, G10L19/008, G10L19/16, G10L25/06, G10L25/21
CPC classification: G10L25/69, G10L19/008, G10L19/173, G10L25/06, G10L25/21
Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
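The windowed-power correlation described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the channels are assumed to be already transcoded to equal-length lists of float PCM samples, Pearson correlation is assumed as the correlation measure, and the direction of the threshold test (speech when the mean correlation falls below a negative threshold, on the idea that speakers take turns) is a guess, since the abstract does not specify it. The names `window_powers`, `pearson`, and `classify_speech` are hypothetical.

```python
import math


def window_powers(samples, window):
    """Average power (mean of squared samples) for each full window."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def classify_speech(channels, window=4, threshold=-0.5):
    """Return (is_speech, per-channel correlations).

    Assumption: in a conversation the speakers alternate, so each channel's
    power is anti-correlated with the combined power of the other channels.
    """
    powers = [window_powers(ch, window) for ch in channels]
    correlations = []
    for i, own in enumerate(powers):
        # Combined average power of every other channel, window by window.
        others = [
            sum(p[k] for j, p in enumerate(powers) if j != i)
            for k in range(len(own))
        ]
        correlations.append(pearson(own, others))
    mean_corr = sum(correlations) / len(correlations)
    return mean_corr < threshold, correlations
```

With two channels that alternate between loud and silent windows, every correlation is -1 and the signal is classified as speech; two identical channels (for example, music duplicated on both sides) give a correlation of +1 and are not.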
-
Publication No.: US12058225B2
Publication date: 2024-08-06
Application No.: US18096581
Filing date: 2023-01-13
Applicant: VoiceMe.AI, Inc.
Inventors: Francis G Lacson, Ahmed Dilsher, Kinuko Masaki
CPC classification: H04L67/565, G06F3/162, G10L13/02, G10L15/26, G10L15/30, G10L19/173
Abstract: Embodiments of the invention provide a system and method that enables multiple networks, themselves based on different underlying technologies, to be combined into a larger network that provides seamless communications between a plurality of communication networks. For example, embodiments of the invention enable users in a radio network (e.g., push-to-talk radio systems) to broadcast messages to users in a non-radio network (e.g., users on personal computers or mobile phones) with the radio message translated into text for transmission to the non-radio network. Similarly, text messages originating from users in a non-radio network (e.g., users on personal computers or mobile phones) may be communicated to users in radio networks with the text messages translated into audio for broadcast on the radio network.
-
Publication No.: US20240098159A1
Publication date: 2024-03-21
Application No.: US18096581
Filing date: 2023-01-13
Applicant: VoiceMe.AI, Inc.
Inventors: Francis G Lacson, Ahmed Dilsher, Kinuko Masaki
CPC classification: H04L67/565, G06F3/162, G10L13/02, G10L15/26, G10L15/30, G10L19/173
Abstract: Embodiments of the invention provide a system and method that enables multiple networks, themselves based on different underlying technologies, to be combined into a larger network that provides seamless communications between a plurality of communication networks. For example, embodiments of the invention enable users in a radio network (e.g., push-to-talk radio systems) to broadcast messages to users in a non-radio network (e.g., users on personal computers or mobile phones) with the radio message translated into text for transmission to the non-radio network. Similarly, text messages originating from users in a non-radio network (e.g., users on personal computers or mobile phones) may be communicated to users in radio networks with the text messages translated into audio for broadcast on the radio network.
-
Publication No.: US20230386505A1
Publication date: 2023-11-30
Application No.: US17804606
Filing date: 2022-05-31
Inventors: Oron NIR, Inbal SAGIV, Maayan YEDIDIA, Fardau VAN NEERDEN, Itai NORMAN
IPC classification: G10L25/69, G10L25/21, G10L19/008, G10L19/16, G10L25/06
CPC classification: G10L25/69, G10L25/21, G10L19/008, G10L19/173, G10L25/06
Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.
-
Publication No.: US11832087B2
Publication date: 2023-11-28
Application No.: US17504080
Filing date: 2021-10-18
IPC classification: H04S7/00, G10K15/08, G10L19/008, G10L19/16, G10L19/22
CPC classification: H04S7/305, G10K15/08, G10L19/008, G10L19/173, G10L19/22
Abstract: A multi-channel signal encoding method includes determining a downmixed signal of a first channel signal and a second channel signal in a multi-channel signal, and reverberation gain parameters corresponding to different subbands of the first channel signal and the second channel signal, where the obtained reverberation gain parameters belong to at least two reverberation gain parameter groups. The method further includes selecting, from the at least two reverberation gain parameter groups, a target reverberation gain parameter group. The method further includes generating parameter indication information, where the parameter indication information indicates the target reverberation gain parameter group. The method further includes encoding reverberation gain parameters corresponding to the target reverberation gain parameter group, the parameter indication information, and the downmixed signal to obtain a bitstream.
-
Publication No.: US20190172472A1
Publication date: 2019-06-06
Application No.: US16268448
Filing date: 2019-02-05
Inventors: Michael M. Truman, Mark S. Vinton
IPC classification: G10L19/02, G10L21/0388, G10L19/16, G10L19/26, G10L19/002, G10L21/00, G10L19/06, G10L19/028, G10L19/00, G10L19/03, G10L19/012, G10L21/038
CPC classification: G10L19/0208, G10L19/0017, G10L19/002, G10L19/012, G10L19/02, G10L19/0204, G10L19/0212, G10L19/028, G10L19/03, G10L19/06, G10L19/16, G10L19/167, G10L19/173, G10L19/26, G10L19/265, G10L21/00, G10L21/038, G10L21/0388
Abstract: According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal.
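The copy-up, envelope-adjustment, and noise steps named in this abstract can be illustrated with a toy Python sketch. It makes several assumptions that are not in the source: subbands are modeled as plain lists of real samples (no actual filterbank), the target envelope is given as one mean-power value per reconstructed band, the noise is uniform and scaled by a single `noise_level` parameter, and the phase adjustment and time-domain synthesis are omitted. The function names are hypothetical.

```python
import math
import random


def band_energy(band):
    """Mean power of one subband signal."""
    return sum(x * x for x in band) / len(band)


def reconstruct_highband(low_bands, start, count, target_energies,
                         noise_level=0.0, seed=0):
    """Build high-band subbands by copying consecutive low-band subbands.

    low_bands       -- list of subband signals from the decoded baseband
    start, count    -- which consecutive subbands to copy upward
    target_energies -- assumed spectral envelope (mean power per new band)
    noise_level     -- assumed noise parameter, relative to the band amplitude
    """
    rng = random.Random(seed)
    high_bands = []
    for source, target in zip(low_bands[start:start + count], target_energies):
        energy = band_energy(source)
        # Envelope adjustment: scale the copied band to the target mean power.
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        adjusted = [gain * x for x in source]
        # Noise component, scaled relative to the target amplitude.
        noise_amp = noise_level * math.sqrt(target)
        high_bands.append(
            [x + noise_amp * rng.uniform(-1.0, 1.0) for x in adjusted]
        )
    return high_bands
```

With `noise_level=0.0`, each reconstructed band has exactly the target mean power (up to rounding), which makes the envelope adjustment easy to check in isolation.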
-
Publication No.: US20180220255A1
Publication date: 2018-08-02
Application No.: US15420840
Filing date: 2017-01-31
CPC classification: H04S7/308, A63F13/25, A63F13/28, A63F13/355, A63F13/54, G10L19/008, G10L19/167, G10L19/173, H04N21/4781, H04S7/30, H04S2400/11, H04S2420/01, H04S2420/11
Abstract: A game engine may generate video and audio content on a per-frame basis. Audio data corresponding to a current frame may be generated to comprise sound-field information independent of a speaker configuration or spatialization technology that may be used to play the associated audio. The sound-field may be generated based on monaural audio data corresponding to a sound produced by an in-game object at the object's position as of the current frame. The sound-field information may be transmitted to a remote computing device for reproduction using a selected, available speaker configuration and spatialization technology.
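One well-known way to represent a sound-field independently of the speaker layout is first-order Ambisonics (B-format). The abstract does not say which representation is used, so the Python sketch below is only an assumed stand-in: it encodes one frame of mono samples from an object at a given direction into W/X/Y/Z channels (W scaled by 1/√2, the traditional B-format weighting), and then decodes to stereo with two virtual cardioid microphones, as one possible playback configuration chosen on the receiving side. All function names are hypothetical.

```python
import math


def encode_frame(mono, azimuth_deg, elevation_deg=0.0):
    """Encode one frame of mono samples into first-order B-format.

    Azimuth is measured counter-clockwise from straight ahead, so +90 is left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gx = math.cos(az) * math.cos(el)
    gy = math.sin(az) * math.cos(el)
    gz = math.sin(el)
    return {
        "W": [s / math.sqrt(2) for s in mono],  # omnidirectional component
        "X": [s * gx for s in mono],            # front-back
        "Y": [s * gy for s in mono],            # left-right
        "Z": [s * gz for s in mono],            # up-down
    }


def decode_stereo(field):
    """Decode B-format to (left, right) using virtual cardioids at +/-90 degrees."""
    def cardioid(theta_deg):
        t = math.radians(theta_deg)
        return [
            0.5 * (math.sqrt(2) * w + x * math.cos(t) + y * math.sin(t))
            for w, x, y in zip(field["W"], field["X"], field["Y"])
        ]
    return cardioid(90.0), cardioid(-90.0)
```

The same encoded frame could be decoded for a different layout (for example, a ring of loudspeakers) without re-rendering on the game side, which is the property the abstract emphasizes.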
-
Publication No.: US09947328B2
Publication date: 2018-04-17
Application No.: US15702451
Filing date: 2017-09-12
Inventors: Michael M. Truman, Mark S. Vinton
IPC classification: G10L21/038, G10L21/0388, G10L19/02, G10L19/028, G10L19/26
CPC classification: G10L19/0208, G10L19/0017, G10L19/002, G10L19/012, G10L19/02, G10L19/0204, G10L19/0212, G10L19/028, G10L19/03, G10L19/06, G10L19/16, G10L19/167, G10L19/173, G10L19/26, G10L19/265, G10L21/00, G10L21/038, G10L21/0388
Abstract: According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal.
-
Publication No.: US20170243595A1
Publication date: 2017-08-24
Application No.: US15519007
Filing date: 2015-10-23
IPC classification: G10L19/16, H04N21/234, H04N21/233, G10L19/022
CPC classification: G10L19/173, G10L19/022, G10L19/167, H04N21/2335, H04N21/23418
Abstract: An audio signal (X) is represented by a bitstream (B) segmented into frames. An audio processing system (500) comprises a buffer (510) and a decoding section (520). The buffer joins sets of audio data (D1, D2, ..., DN) carried by N respective frames (F1, F2, ..., FN) into one decodable set of audio data (D) corresponding to a first frame rate and to a first number of samples of the audio signal per frame. The frames have a second frame rate corresponding to a second number of samples of the audio signal per frame. The first number of samples is N times the second number of samples. The decoding section decodes the decodable set of audio data into a segment of the audio signal by at least employing signal synthesis, based on the decodable set of audio data, with a stride corresponding to the first number of samples of the audio signal.
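The buffering step in this abstract, where N short frames are joined into one decodable set, can be sketched as a small Python class. This is a minimal illustration, assuming each frame's audio data arrives as a `bytes` payload and that joining is plain concatenation in arrival order; the real bitstream layout is not described in the abstract. `FrameJoiner` and `samples_per_decodable_set` are hypothetical names.

```python
class FrameJoiner:
    """Collects N frame payloads and emits them as one decodable set."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self._pending = []

    def push(self, payload):
        """Add one frame's data; return the joined set once N frames are buffered."""
        self._pending.append(payload)
        if len(self._pending) < self.n:
            return None  # not enough frames yet to decode
        joined = b"".join(self._pending)
        self._pending = []
        return joined


def samples_per_decodable_set(samples_per_frame, n):
    """The first number of samples is N times the second number of samples."""
    return samples_per_frame * n
```

As an arithmetic example with illustrative numbers, joining N = 4 frames of 512 samples each gives a decodable set covering 2048 samples, so the decoder's synthesis stride would be 2048 samples.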
-
Publication No.: US20170235543A1
Publication date: 2017-08-17
Application No.: US15429140
Filing date: 2017-02-09
Applicant: Stéphanie England
Inventors: Stéphanie England
CPC classification: G06F3/165, G06F3/167, G10L19/0017, G10L19/173
Abstract: Disclosed herein is an audio transmitter receiver device. The device includes an audio interface providing an audio signal, the audio signal including at least one of an audio input signal and an audio output signal; a digital communications interface for at least communicating audio information; and an audio codec for transcoding the audio information such that the audio information includes at least a high-quality, distortion-free, lossless representation of the audio signal and the audio signal includes an audio representation of the audio information.