-
Publication number: US12300267B2
Publication date: 2025-05-13
Application number: US18225406
Application date: 2023-07-24
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Woohyun Nam, Kyungrae Kim, Jungkyu Kim, Sangchul Ko, Yoonjae Son, Tammy Lee, Hyunkwon Chung, Sunghee Hwang
IPC: G10L25/57, G10L15/25, G11B27/031
Abstract: A method of matching a voice for each object included in a video, includes: separating a plurality of voices in a video; determining a dissimilarity between the plurality of voices; selecting a partial duration in an entire duration of the video as a matching duration, based on the dissimilarity between the plurality of voices; matching, within the matching duration, the plurality of voices with a plurality of objects in the video respectively, based on mouth movements of the plurality of objects; and matching the plurality of voices with the plurality of objects respectively in the entire duration of the video, based on results of the matching between the plurality of voices and the plurality of objects within the matching duration.
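Illustrative note (not part of the patent record): the step of selecting a matching duration from the dissimilarity between voices can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed method: it assumes exactly two separated voices, each given as a list of per-frame feature vectors, uses cosine distance as the dissimilarity measure, and picks the fixed-length window with the largest summed dissimilarity. The function names and the `window` parameter are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a silent frame does not help tell voices apart
    return 1.0 - dot / (norm_a * norm_b)


def select_matching_duration(voice_a, voice_b, window):
    """Return (start, end) frame indices of the most dissimilar window.

    voice_a, voice_b: per-frame feature vectors of two separated voices
    (hypothetical representation). window: window length in frames.
    """
    n = min(len(voice_a), len(voice_b))
    if not 0 < window <= n:
        raise ValueError("window must be between 1 and the number of frames")
    per_frame = [cosine_distance(voice_a[t], voice_b[t]) for t in range(n)]
    # Sliding-window sum of the per-frame dissimilarity.
    running = sum(per_frame[:window])
    best_start, best_score = 0, running
    for start in range(1, n - window + 1):
        running += per_frame[start + window - 1] - per_frame[start - 1]
        if running > best_score:
            best_start, best_score = start, running
    return best_start, best_start + window


# Toy input: the voices are identical in frames 0-1 and orthogonal in frames 2-3.
voice_a = [[1.0, 0.0]] * 4
voice_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(select_matching_duration(voice_a, voice_b, window=2))
```

In this toy input the selected window covers the frames where the two voices differ most, which is where matching by mouth movement would be least ambiguous.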
-
Publication number: US20230276070A1
Publication date: 2023-08-31
Application number: US18195221
Application date: 2023-05-09
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Heechul Yang, Hyunkwon Chung, Inhak Na
IPC: H04N19/59, H04N19/124, H04N19/169
CPC classification number: H04N19/59, H04N19/124, H04N19/188
Abstract: An artificial intelligence (AI) encoding apparatus includes a memory storing one or more instructions, and a processor configured to execute the one or more instructions stored in the memory to identify an object region of interest in an original image, obtain, from the original image, a plurality of original part images respectively including the object region of interest and a non-interest region, obtain a plurality of first images by performing AI scaling on the plurality of original part images through a scaling neural network (NN) that is configured to operate with NN setting information selected from among a plurality of pieces of NN setting information, at least based on whether the plurality of original part images include the object region of interest or the non-interest region, generate image data by encoding the plurality of first images, and transmit the image data, and AI data including information related to the AI scaling.
-
Publication number: US12062377B2
Publication date: 2024-08-13
Application number: US17722569
Application date: 2022-04-18
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Woohyun Nam, Sangchul Ko, Kyungrae Kim, Jungkyu Kim, Yoonjae Son, Tammy Lee, Hyunkwon Chung, Sunghee Hwang
IPC: G10L19/008, G06N3/08, H04S3/00
CPC classification number: G10L19/008, G06N3/08, H04S3/008, H04S2400/01
Abstract: An audio processing apparatus may obtain second audio signals corresponding to channels included in a second channel group from first audio signals corresponding to channels included in a first channel group, downsample at least one third audio signal corresponding to at least one channel identified based on a correlation with the second channel group from among the channels included in the first channel group, by using an artificial intelligence (AI) model, and generate a bitstream including the second audio signals corresponding to the channels included in the second channel group and the downsampled at least one third audio signal. The first channel group includes a channel group of an original audio signal, and the second channel group is constructed by combining at least two channels from among the channels included in the first channel group.
-
Publication number: US12266089B2
Publication date: 2025-04-01
Application number: US17566283
Application date: 2021-12-30
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Heechul Yang, Inhak Na, Hyunkwon Chung
Abstract: A method, performed by a terminal, of performing artificial intelligence (AI) decoding, including obtaining, based on a first image, image feature information of the first image, the image feature information being related to an image quality degradation; obtaining, based on the image feature information of the first image, neural network (NN) setting information of a first NN, from among NN setting information of a plurality of first NNs which are pre-stored and which correspond to a plurality of image quality degradation types; and obtaining, by using the first NN, a second image in which the image quality degradation is reduced.
-
Publication number: US12200464B2
Publication date: 2025-01-14
Application number: US17728037
Application date: 2022-04-25
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Tammy Lee, Sangchul Ko, Kyungrae Kim, Sunmin Kim, Jungkyu Kim, Woohyun Nam, Yoonjae Son, Hyunkwon Chung, Sunghee Hwang
IPC: H04S3/00, G10L19/008
Abstract: According to various embodiments of the disclosure, an audio processing apparatus includes at least one processor configured to execute one or more instructions to obtain a second audio signal down-mixed from at least one first audio signal, obtain information related to error removal for the at least one first audio signal, de-mix the at least one first audio signal from the down-mixed second audio signal, and reconstruct the at least one first audio signal by applying the information related to the error removal for the at least one first audio signal to the at least one first audio signal de-mixed from the second audio signal. The information related to the error removal having been generated using at least one of an original signal power of the at least one first audio signal or a second signal power of the at least one first audio signal after decoding.
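Illustrative note (not part of the patent record): the abstract states only that the error-removal information is generated from the original signal power and/or the signal power after decoding. One plausible reading, given here as a minimal sketch, is a gain equal to the square root of the power ratio, capped at 1 so that it only attenuates. The formula, the cap, and all function names are assumptions for illustration.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def error_removal_factor(original, decoded):
    """Hypothetical gain derived from original and decoded signal power.

    Assumption: factor = sqrt(P_original / P_decoded), capped at 1.0.
    """
    decoded_power = mean_power(decoded)
    if decoded_power == 0.0:
        return 1.0  # nothing to attenuate
    return min(1.0, math.sqrt(mean_power(original) / decoded_power))


def apply_error_removal(demixed, factor):
    """Scale the de-mixed signal by the error-removal factor."""
    return [factor * s for s in demixed]


# Toy input: the decoded signal carries four times the original power.
original = [1.0, -1.0, 1.0, -1.0]
decoded = [2.0, -2.0, 2.0, -2.0]
factor = error_removal_factor(original, decoded)
print(factor, apply_error_removal(decoded, factor))
```

Under this reading the encoder, which has access to the original signal, would compute the factor and send it as side information, and the decoder would apply it after de-mixing.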
-
Publication number: US20230239643A1
Publication date: 2023-07-27
Application number: US18126794
Application date: 2023-03-27
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Woohyun Nam, Yoonjae Son, Hyunkwon Chung, Sunghee Hwang
CPC classification number: H04S7/302, G06T7/246, G11B27/10, H04S7/307, H04S3/008, G06T2207/10016, G06T2207/20084, G06T2207/20081, H04S2420/11, H04S2400/15, H04S2400/11, H04S2400/01
Abstract: A video processing apparatus includes a memory storing instructions, and at least one processor configured to execute the instructions to generate a plurality of feature information by analyzing a video signal comprising a plurality of images based on a first DNN, extract a first altitude component and a first planar component corresponding to a movement of an object in a video from the video signal based on a second DNN, extract a second planar component corresponding to a movement of a sound source in audio from a first audio signal based on a third DNN, generate a second altitude component based on the first altitude component, the first planar component, and the second planar component, output a second audio signal comprising the second altitude component based on the feature information, and synchronize the second audio signal with the video signal and output the synchronized second audio signal and video signal.
-
Publication number: US20220172340A1
Publication date: 2022-06-02
Application number: US17566283
Application date: 2021-12-30
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Heechul Yang, Inhak Na, Hyunkwon Chung
Abstract: A method, performed by a terminal, of performing artificial intelligence (AI) decoding, including obtaining, based on a first image, image feature information of the first image, the image feature information being related to an image quality degradation; obtaining, based on the image feature information of the first image, neural network (NN) setting information of a first NN, from among NN setting information of a plurality of first NNs which are pre-stored and which correspond to a plurality of image quality degradation types; and obtaining, by using the first NN, a second image in which the image quality degradation is reduced.