-
1.
Publication No.: US11010124B2
Publication date: 2021-05-18
Application No.: US16703768
Filing date: 2019-12-04
Applicant: LG ELECTRONICS INC.
Inventor: Sang Ki Kim, Yongchul Park, Sungmin Han, Siyoung Yang, Juyeong Jang, Minook Kim
Abstract: Disclosed are a sound source focus method and device that, in a 5G communication environment, amplify and output a sound source signal of a user's object of interest extracted from an acoustic signal included in video content by executing a loaded artificial intelligence (AI) algorithm and/or machine learning algorithm. The sound source focus method includes playing video content including a video signal including at least one moving object and the acoustic signal in which sound sources output by the object are mixed, determining the user's object of interest from the video signal, acquiring unique sound source information about the user's object of interest, extracting an actual sound source for the user's object of interest corresponding to the unique sound source information from the acoustic signal, and outputting the actual sound source extracted for the user's object of interest.
-
2.
Publication No.: US11721319B2
Publication date: 2023-08-08
Application No.: US16803941
Filing date: 2020-02-27
Applicant: LG ELECTRONICS INC.
Inventor: Minook Kim, Yongchul Park, Sungmin Han, Siyoung Yang, Sangki Kim, Juyeong Jang
IPC: G10L13/10, G06N5/04, G10L13/047, G06N20/00
CPC classification number: G10L13/10, G06N5/04, G06N20/00, G10L13/047
Abstract: An artificial intelligence device includes a memory and a processor. The memory is configured to store audio data having a predetermined speech style. The processor is configured to generate a condition vector relating to a condition for determining the speech style of the audio data, reduce a dimension of the condition vector to a predetermined reduction dimension, acquire a sparse code vector based on a dictionary vector acquired through sparse dictionary coding with respect to the condition vector having the predetermined reduction dimension, and change a vector element value included in the sparse code vector.
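The pipeline in this abstract (reduce the condition vector, sparse-code it against a dictionary, then change one code element) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the projection `basis` and the dictionary `atoms` are assumed to be given rather than learned, greedy matching pursuit stands in for whatever sparse coding the patent uses, and all function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reduce_dimension(condition, basis):
    """Project the condition vector onto the rows of an assumed basis."""
    return [dot(condition, row) for row in basis]


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding against unit-norm dictionary atoms (assumed given)."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        scores = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        code[k] += scores[k]
        residual = [r - scores[k] * x for r, x in zip(residual, atoms[k])]
    return code


def decode(code, atoms):
    """Rebuild the reduced condition vector from a sparse code."""
    width = len(atoms[0])
    return [sum(c * atom[i] for c, atom in zip(code, atoms)) for i in range(width)]


def edit_style(condition, basis, atoms, index, new_value, n_nonzero=2):
    """Reduce, sparse-code, overwrite one code element, and decode the result."""
    reduced = reduce_dimension(condition, basis)
    code = matching_pursuit(reduced, atoms, n_nonzero)
    code[index] = new_value  # the "change a vector element value" step
    return code, decode(code, atoms)
```

Because each code element weights a single dictionary atom, overwriting one element changes one component of the style while leaving the others untouched, which is the apparent motivation for editing the sparse code rather than the raw condition vector.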
-
3.
Publication No.: US11636845B2
Publication date: 2023-04-25
Application No.: US16928815
Filing date: 2020-07-14
Applicant: LG ELECTRONICS INC.
Inventor: Siyoung Yang, Yongchul Park, Sungmin Han, Sangki Kim, Juyeong Jang, Minook Kim
IPC: G10L13/00, G10L13/08, G10L13/10, G10L13/033, G10L13/04
Abstract: A method includes generating first synthesized speech by using text and a first emotion information vector configured for the text, extracting a second emotion information vector included in the first synthesized speech, determining whether correction of the second emotion information vector is needed by comparing a loss value calculated by using the first emotion information vector and the second emotion information vector with a preconfigured threshold, re-performing speech synthesis by using a third emotion information vector generated by correcting the second emotion information vector, and outputting the generated synthesized speech, thereby configuring emotion information of speech in a more effective manner. A speech synthesis apparatus may be associated with an artificial intelligence module, drone (unmanned aerial vehicle, UAV), robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
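The synthesize, extract, compare, and re-synthesize loop in this abstract can be sketched as below. This is only an illustration: the abstract does not say which loss or correction rule is used, so mean squared error and a residual-based correction are assumptions, and `synthesize` and `extract` are hypothetical callables standing in for the speech synthesizer and the emotion extractor.

```python
def mse(a, b):
    """Mean squared error between two equal-length emotion vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def synthesize_with_check(text, target, synthesize, extract,
                          threshold=0.01, max_rounds=5):
    """Re-synthesize until the extracted emotion is close enough to the target.

    target     -- the first emotion vector configured for the text
    synthesize -- hypothetical callable (text, vector) -> speech
    extract    -- hypothetical callable speech -> emotion vector
    """
    vector = list(target)
    speech = synthesize(text, vector)
    for _ in range(max_rounds):
        observed = extract(speech)  # the second emotion vector
        if mse(target, observed) <= threshold:
            break  # loss within the preconfigured threshold: no correction
        # Assumed correction rule: push the input by the remaining residual.
        vector = [v + (t - o) for v, t, o in zip(vector, target, observed)]
        speech = synthesize(text, vector)  # re-perform speech synthesis
    return speech, vector
```

The `max_rounds` cap is an added safeguard so the loop terminates even if the synthesizer never reaches the threshold.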
-
4.
Publication No.: US11074904B2
Publication date: 2021-07-27
Application No.: US16593161
Filing date: 2019-10-04
Applicant: LG Electronics Inc.
Inventor: Siyoung Yang, Minook Kim, Sangki Kim, Yongchul Park, Juyeong Jang, Sungmin Han
Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. A speech synthesis method based on emotion information extracts speech synthesis target text from received data and determines whether the received data includes situation explanation information. First metadata corresponding to first emotion information is generated on the basis of the situation explanation information. When the extracted data does not include situation explanation information, second metadata corresponding to second emotion information generated on the basis of semantic analysis and context analysis is generated. One of the first metadata and the second metadata is added to the speech synthesis target text to synthesize speech corresponding to the extracted data. A speech synthesis apparatus of this disclosure may be associated with an artificial intelligence module, drone (unmanned aerial vehicle, UAV), robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
-
5.
Publication No.: US20210210067A1
Publication date: 2021-07-08
Application No.: US17088480
Filing date: 2020-11-03
Applicant: LG ELECTRONICS INC.
Inventor: Hwansik YUN, Wonho Shin, Yongchul Park, Sungmin Han, Siyoung Yang, Sangki Kim, Juyeong Jang, Minook Kim
Abstract: A voice recognition device and a method for learning voice data using the same are disclosed. The voice recognition device combines feature information for various speakers with a text-to-speech function to generate voice data recognized by a voice recognition unit, and can improve voice recognition efficiency by allowing the voice recognition unit itself to learn various voice data. The voice recognition device can be associated with an artificial intelligence module, a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
-
6.
Publication No.: US11721345B2
Publication date: 2023-08-08
Application No.: US16917784
Filing date: 2020-06-30
Applicant: LG ELECTRONICS INC.
Inventor: Siyoung Yang, Yongchul Park, Sungmin Han, Sangki Kim, Juyeong Jang, Minook Kim
CPC classification number: G10L15/32, G10L15/22, G10L15/30, G16Y40/35, G10L2015/228
Abstract: Disclosed is a device for controlling a plurality of voice recognition devices for determining and selecting a first voice recognition device that a user wants to use based on a point in time when the voice of the user is spoken or a place where the user spoke the voice. The device for controlling a plurality of voice recognition devices according to the present disclosure may be associated with an artificial intelligence module, a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to 5G service, etc.
-
7.
Publication No.: US11285611B2
Publication date: 2022-03-29
Application No.: US16420852
Filing date: 2019-05-23
Applicant: LG ELECTRONICS INC.
Inventor: Yongjin Park, Minook Kim, Jungkwan Son, Tacksung Choi, Sewan Gu, Jinho Sohn, Taegil Cho
Abstract: A robot includes a sensing unit including at least one sensor for detecting a user, a face detector configured to acquire an image including a face of the user detected by the sensing unit, a controller configured to detect an interaction intention of the user from the acquired image, and an output unit including at least one of a speaker or a display for outputting at least one of sound or a screen for inducing interaction of the user, when the interaction intention is detected.
-
8.
Publication No.: US11107456B2
Publication date: 2021-08-31
Application No.: US16561410
Filing date: 2019-09-05
Applicant: LG ELECTRONICS INC.
Inventor: Jonghoon Chae, Minook Kim, Sangki Kim, Yongchul Park, Siyoung Yang, Juyeong Jang, Sungmin Han
IPC: G10L15/02, G10L13/08, G10L13/033, G10L13/047
Abstract: Discussed is an artificial intelligence (AI)-based voice sampling apparatus for providing a speech style, including a rhyme encoder configured to receive a user's voice, extract a voice sample, and analyze a vocal feature included in the voice sample, a text encoder configured to receive text for reflecting the vocal feature, a processor configured to classify the vocal feature of the voice sample input to the rhyme encoder according to a label, extract an embedding vector representing the vocal feature from the label, and generate a speech style from the embedding vector and apply the generated speech style to the text, and a rhyme decoder configured to output synthesized voice data in which the speech style is applied to the text by the processor.
-
9.
Publication No.: US10943597B2
Publication date: 2021-03-09
Application No.: US16276371
Filing date: 2019-02-14
Applicant: LG ELECTRONICS INC.
Inventor: Minook Kim, Sewan Gu, Jinho Sohn, Tack Sung Choi
IPC: G10L21/00, G10L21/0208, G10L15/20, G10L21/034, H03G3/32, G10L13/033
Abstract: A method for controlling volume in an apparatus having a speaker and a microphone includes receiving, at the microphone, external noise and speech of a user, and calculating sound pressure of the noise received by the microphone. The method further includes performing exception processing of the sound pressure of some or all of the noise using the calculated sound pressure and one of a speech utterance state, a speech receiving state, or a temporal length state, of the noise, mapping volume of the speech in response to the sound pressure of the external noise, synthesizing speech guidance into a sound file, and outputting the sound file, via the speaker, according to the mapped volume.
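The noise-to-volume mapping described in this abstract can be sketched as below. This is a rough illustration under stated assumptions: the dB thresholds, the volume range, and the linear mapping are invented for the example, and the exception processing is reduced to a single hypothetical flag (`user_is_speaking`) rather than the utterance, receiving, and temporal-length states named in the abstract. Only the 20 µPa reference pressure is the standard dB SPL convention.

```python
import math

REF_PRESSURE_PA = 20e-6  # standard reference pressure for dB SPL in air


def sound_pressure_level_db(samples_pa):
    """Return the RMS sound pressure level (dB SPL) of samples given in pascals."""
    if not samples_pa:
        raise ValueError("samples_pa must not be empty")
    rms = math.sqrt(sum(s * s for s in samples_pa) / len(samples_pa))
    if rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(rms / REF_PRESSURE_PA)


def map_volume(noise_db, quiet_db=40.0, loud_db=80.0, min_vol=0, max_vol=10):
    """Linearly map a noise level onto a volume step, clamped at both ends."""
    if noise_db <= quiet_db:
        return min_vol
    if noise_db >= loud_db:
        return max_vol
    frac = (noise_db - quiet_db) / (loud_db - quiet_db)
    return min_vol + round(frac * (max_vol - min_vol))


def choose_volume(samples_pa, user_is_speaking, previous_volume):
    """Simplified exception processing: keep the old volume while the user talks."""
    if user_is_speaking:
        return previous_volume  # the frame is dominated by speech, not noise
    return map_volume(sound_pressure_level_db(samples_pa))
```

Skipping frames that contain the user's own speech keeps the device from raising its volume in response to the very voice it is listening to.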
-
10.
Publication No.: US11646021B2
Publication date: 2023-05-09
Application No.: US16851053
Filing date: 2020-04-16
Applicant: LG ELECTRONICS INC.
Inventor: Siyoung Yang, Yongchul Park, Sungmin Han, Sangki Kim, Juyeong Jang, Minook Kim
CPC classification number: G10L15/22, G06F3/14, G06N3/0454, G06N3/088, G10L15/063
Abstract: According to one embodiment, an apparatus for processing a voice signal includes a display configured to display an image of a user or a character corresponding to the user, a microphone, a speaker configured to output a voice signal of the user, a memory configured to store a trained voice age conversion model, and a processor configured to, based on changing an age of the user or the character displayed on the display, control the display such that the display displays the user or the character corresponding to the changed age. The processor is further configured to determine a first age that is a current age of the user or the character based on the voice signal of the user inputted through the microphone. Accordingly, convenience of a user may be enhanced.