Patent search: ap:("KABUSHIKI KAISHA TOSHIBA") AND inv:"Daichi Hayakawa" (page 1)

1.

Granted invention patent
Acoustic signal processing with neural network using amplitude, phase, and frequency (in force)

Publication No.: US11282505B2

Publication date: 2022-03-22

Application No.: US16296282

Filing date: 2019-03-08

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Daichi Hayakawa, Takehiko Kagoshima, Hiroshi Fujimura

IPC classification: G10L21/02, G10L25/18, G10L25/30, G10L15/16, G10L15/22, G10L15/02

Abstract: According to one embodiment, a signal generation device includes one or more processors. The processors convert an acoustic signal and output amplitude and phase at a plurality of frequencies. The processors, for each of a plurality of nodes of a hidden layer included in a neural network that treats the amplitude and the phase as input, obtain frequency based on a plurality of weights used in arithmetic operation of the node. The processors generate an acoustic signal based on the plurality of obtained frequencies and based on amplitude and phase corresponding to each of the plurality of nodes.

2.

Granted invention patent
Signal processing apparatus, signal processing method, and computer program product (in force)

Publication No.: US10951982B2

Publication date: 2021-03-16

Application No.: US16544613

Filing date: 2019-08-19

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Daichi Hayakawa, Takehiko Kagoshima, Hiroshi Fujimura

IPC classification: H04R3/00, G06N3/04, H04R29/00, H04R1/40

Abstract: A signal processing apparatus includes one or more processors. The processors acquire a plurality of observed signals acquired from a plurality of microphone groups each including at least one microphone selected from a plurality of microphones. The microphone groups include respective microphone combinations each including at least one microphone, the combinations are different from each other, and at least one of the microphone groups includes a plurality of microphones. The processors estimate a mask indicating occupancy for each of time frequency points of a sound signal of a space corresponding to the observed signal in a plurality of spaces, for each of the observed signals. The processors integrate masks estimated for the observed signals to generate an integrated mask indicating occupancy for each of time frequency points of a sound signal in a space determined based on the spaces.
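The abstract does not state how the per-group masks are integrated. The following is only a minimal sketch of the idea, assuming each mask is a time-frequency grid of occupancy values in [0, 1] and that integration is element-wise. The function name `integrate_masks` and both combination rules (a product, which behaves like an intersection of the spaces, and a mean) are illustrative assumptions, not the patented method.

```python
from math import prod


def integrate_masks(masks, mode="product"):
    """Combine per-microphone-group time-frequency masks element-wise.

    masks: list of masks; each mask is a list of frames, and each frame is
    a list of occupancy values in [0, 1], one per frequency bin.
    mode:  "product" or "mean". Both rules are assumptions for illustration.
    """
    if not masks:
        raise ValueError("need at least one mask")
    if mode not in ("product", "mean"):
        raise ValueError("mode must be 'product' or 'mean'")

    n_frames, n_bins = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != n_frames or any(len(frame) != n_bins for frame in mask):
            raise ValueError("all masks must have the same shape")

    def combine(values):
        if mode == "product":
            return prod(values)
        return sum(values) / len(values)

    # Visit every time-frequency point and merge the values from all masks.
    return [
        [combine([mask[t][f] for mask in masks]) for f in range(n_bins)]
        for t in range(n_frames)
    ]


# Two hypothetical masks: 2 frames x 2 frequency bins each.
mask_a = [[1.0, 0.5], [0.0, 1.0]]
mask_b = [[0.5, 0.5], [1.0, 1.0]]
integrated = integrate_masks([mask_a, mask_b])
```

With the product rule, a time-frequency point stays occupied only if every group's mask marks it as occupied; the mean rule is a softer alternative.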

3.

Granted invention patent
Speech recognition apparatus, method and non-transitory computer-readable storage medium (in force)

Publication No.: US11978441B2

Publication date: 2024-05-07

Application No.: US17186806

Filing date: 2021-02-26

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Daichi Hayakawa, Takehiko Kagoshima, Kenji Iwata

IPC classification: G10L15/20, G10L15/06, G10L15/08

CPC classification: G10L15/20, G10L15/063, G10L15/08

Abstract: According to one embodiment, a speech recognition apparatus includes processing circuitry. The processing circuitry generates, based on sensor information, environmental information relating to an environment in which the sensor information has been acquired, generates, based on the environmental information and generic speech data, an adapted acoustic model obtained by adapting a base acoustic model to the environment, acquires speech uttered in the environment as input speech data, and subjects the input speech data to a speech recognition process using the adapted acoustic model.

4.

Invention patent application
SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT (under examination, published)

Publication No.: US20200296507A1

Publication date: 2020-09-17

Application No.: US16544613

Filing date: 2019-08-19

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Daichi Hayakawa, Takehiko Kagoshima, Hiroshi Fujimura

IPC classification: H04R3/00, H04R1/40, H04R29/00, G06N3/04

Abstract: A signal processing apparatus includes one or more processors. The processors acquire a plurality of observed signals acquired from a plurality of microphone groups each including at least one microphone selected from a plurality of microphones. The microphone groups include respective microphone combinations each including at least one microphone, the combinations are different from each other, and at least one of the microphone groups includes a plurality of microphones. The processors estimate a mask indicating occupancy for each of time frequency points of a sound signal of a space corresponding to the observed signal in a plurality of spaces, for each of the observed signals. The processors integrate masks estimated for the observed signals to generate an integrated mask indicating occupancy for each of time frequency points of a sound signal in a space determined based on the spaces.

5.

Granted invention patent
Signal processing apparatus and non-transitory computer readable medium (in force)

Publication No.: US11908487B2

Publication date: 2024-02-20

Application No.: US17187559

Filing date: 2021-02-26

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Takehiko Kagoshima, Daichi Hayakawa

IPC classification: G10L25/30, H04R1/32, G10L25/03

CPC classification: G10L25/30, G10L25/03, H04R1/326

Abstract: A signal processing apparatus according to an embodiment includes an acquisition unit and an application unit. The acquisition unit acquires M detection signals output from M detector devices having N-fold symmetry (M is an integer equal to or greater than 2, and N is an integer equal to or greater than 2). Each of the M detector devices detects original signals generated from K signal sources (K is an integer equal to or greater than 2) having the N-fold symmetry. The application unit applies a trained neural network to M input vectors corresponding to the M detection signals and outputs K output vectors. The same parameter is set to, of multiple weights included in a weight matrix of the trained neural network, weights that are commutative based on the N-fold symmetry.
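The weight sharing described in this abstract can be illustrated with a small sketch. It assumes the simplest case, M = K = N, with detectors and sources arranged on a ring with N-fold rotational symmetry and one scalar per detector instead of an input vector. Under those assumptions, tying the weights that a rotation maps onto each other yields a circulant weight matrix. The function names below are hypothetical, and this is not presented as the patent's implementation.

```python
def tied_weight_matrix(params):
    """Build an N x N weight matrix whose entry (k, m) depends only on
    (m - k) mod N, so weights related by a rotation share one parameter."""
    n = len(params)
    return [[params[(m - k) % n] for m in range(n)] for k in range(n)]


def apply_layer(weights, inputs):
    """Linear layer without bias: one output per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def rotate(values, shift=1):
    """Cyclically shift a list, modelling a rotation of the arrangement."""
    shift %= len(values)
    return values[-shift:] + values[:-shift]


params = [2, -1, 3, 1]   # N = 4 free parameters instead of 16 (assumed values)
W = tied_weight_matrix(params)
x = [1, 0, 2, 5]         # one scalar per detector (M = 4), assumed values
y = apply_layer(W, x)

# Rotating the detector signals rotates the outputs in the same way.
rotated_output = apply_layer(W, rotate(x))
```

Because of the tying, `rotated_output` equals `rotate(y)`: the layer treats every rotated copy of the arrangement identically, and it needs only N parameters rather than N squared.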

6.

Invention patent application
SPEECH RECOGNITION APPARATUS, METHOD AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM (in force)

Publication No.: US20220076667A1

Publication date: 2022-03-10

Application No.: US17186806

Filing date: 2021-02-26

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Daichi Hayakawa, Takehiko Kagoshima, Kenji Iwata

IPC classification: G10L15/20, G10L15/06, G10L15/08

Abstract: According to one embodiment, a speech recognition apparatus includes processing circuitry. The processing circuitry generates, based on sensor information, environmental information relating to an environment in which the sensor information has been acquired, generates, based on the environmental information and generic speech data, an adapted acoustic model obtained by adapting a base acoustic model to the environment, acquires speech uttered in the environment as input speech data, and subjects the input speech data to a speech recognition process using the adapted acoustic model.