-
Publication No.: US20220122594A1
Publication date: 2022-04-21
Application No.: US17506664
Filing date: 2021-10-20
Applicant: QUALCOMM Incorporated
Inventor: Simyung CHANG, Hyunsin PARK, Hyoungwoo PARK, Janghoon CHO, Sungrack YUN, Kyu Woong HWANG
Abstract: A computer-implemented method of operating an artificial neural network for processing data having a frequency dimension includes receiving an audio input. The audio input may be separated into one or more subgroups along the frequency dimension. A normalization may be performed on each subgroup. The normalization for a first subgroup is performed independently of the normalization for a second subgroup. An output, such as a keyword detection indication, is generated based on the normalized subgroups.
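Illustrative sketch (not taken from the patent): the per-subgroup normalization described in the abstract above could look roughly like the following. The function name `normalize_subgroups`, the `[frequency][time]` list layout, the equal-size split, and the zero-mean/unit-variance statistic are all assumptions made for this example; the abstract does not specify which normalization is used.

```python
import math


def normalize_subgroups(spectrogram, num_groups, eps=1e-5):
    """Normalize each frequency subgroup of a [frequency][time] feature map.

    Hypothetical helper: each subgroup gets its own mean and variance,
    so statistics from one frequency band never influence another band.
    """
    num_bins = len(spectrogram)
    if num_groups <= 0 or num_bins % num_groups != 0:
        raise ValueError("num_groups must evenly divide the number of frequency bins")
    group_size = num_bins // num_groups

    output = []
    for start in range(0, num_bins, group_size):
        group = spectrogram[start:start + group_size]
        values = [v for row in group for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        # Only this subgroup's statistics are used here.
        output.extend([[(v - mean) * scale for v in row] for row in group])
    return output


# Two subgroups with very different energy end up on a comparable scale.
features = [[1.0, 2.0], [3.0, 4.0], [100.0, 200.0], [300.0, 400.0]]
normalized = normalize_subgroups(features, num_groups=2)
```

Because the high-energy band is normalized with its own statistics, it does not dominate the low-energy band, which is the practical motivation for normalizing subgroups independently.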
-
Publication No.: US20220101087A1
Publication date: 2022-03-31
Application No.: US17405879
Filing date: 2021-08-18
Applicant: QUALCOMM Incorporated
Inventor: Juntae LEE, Mihir JAIN, Sungrack YUN, Hyoungwoo PARK, Kyu Woong HWANG
Abstract: A method performed by an artificial neural network (ANN) includes determining, at a first stage of a multi-stage cross-attention model of the ANN, a first cross-correlation between a first representation of each modality of a number of modalities associated with a sequence of inputs. The method also includes determining, at each second stage of one or more second stages of the multi-stage cross-attention model, a second cross-correlation between first attended representations of each modality. The method further includes generating a concatenated feature representation associated with a final second stage of the one or more second stages based on the second cross-correlation associated with the final second stage, the first attended representation of each modality, and the first representation of each modality. The method additionally includes determining a probability distribution between a set of background actions and a set of foreground actions from the concatenated feature representation. The method still further includes localizing an action in the sequence of inputs based on the probability distribution.
-
Publication No.: US20240282081A1
Publication date: 2024-08-22
Application No.: US18504968
Filing date: 2023-11-08
Applicant: QUALCOMM Incorporated
Inventor: Juntae LEE, Sungrack YUN
IPC: G06V10/764, G06V10/44, G06V10/62, G06V10/82
CPC classification number: G06V10/764, G06V10/44, G06V10/62, G06V10/82
Abstract: Systems and techniques are described herein for performing dynamic temporal fusion for video classification, such as recognition, detection, and/or another form of classification. For example, a computing device can generate, via a first network, frame-level features obtained from a set of input frames. The computing device can generate, via a first multi-scale temporal feature fusion engine, first local temporal context features from a first neighboring sub-sequence of the set of input frames. The computing device can generate, via a second multi-scale temporal feature fusion engine, second local temporal context features from a second neighboring sub-sequence of the set of input frames. The computing device can further classify the set of input frames based on the first local temporal context features and the second local temporal context features.
-
Publication No.: US20230376753A1
Publication date: 2023-11-23
Application No.: US18157723
Filing date: 2023-01-20
Applicant: QUALCOMM Incorporated
Inventor: Seokeon CHOI, Sungha CHOI, Seunghan YANG, Hyunsin PARK, Debasmit DAS, Sungrack YUN
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Systems and techniques are provided for training a neural network model or machine learning model. For example, a method of augmenting training data can include augmenting, based on a randomly initialized neural network, training data to generate augmented training data and aggregating data with a plurality of styles from the augmented training data to generate aggregated training data. The method can further include applying semantic-aware style fusion to the aggregated training data to generate fused training data and adding the fused training data as fictitious samples to the training data to generate updated training data for training the neural network model or machine learning model.
-
Publication No.: US20230297653A1
Publication date: 2023-09-21
Application No.: US17655506
Filing date: 2022-03-18
Applicant: QUALCOMM Incorporated
Inventor: Debasmit DAS, Sungrack YUN, Fatih Murat PORIKLI
Abstract: Certain aspects of the present disclosure provide techniques for improved domain adaptation in machine learning. A feature tensor is generated by processing input data using a feature extractor. A first set of logits is generated by processing the feature tensor using a domain-agnostic classifier, and a second set of logits is generated by processing the feature tensor using a domain-specific classifier. A loss is computed based at least in part on the first set of logits and the second set of logits, where the loss includes a divergence loss component. The feature extractor, the domain-agnostic classifier, and the domain-specific classifier are refined using the loss.
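Illustrative sketch (not taken from the patent): one way to combine two sets of logits into a loss with a divergence component is shown below. The abstract does not say which divergence is used, what sign or weight it carries, or how the classification terms are formed, so the KL divergence, the `divergence_weight` parameter, the cross-entropy terms, and all function names here are assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -log_softmax(logits)[label]


def kl_divergence(p_logits, q_logits):
    """KL(p || q) between the distributions implied by two logit vectors."""
    log_p = log_softmax(p_logits)
    log_q = log_softmax(q_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def combined_loss(agnostic_logits, specific_logits, label, divergence_weight=0.1):
    """Classification loss on both heads plus a weighted divergence term.

    Assumption: the divergence is added with a positive weight; a real
    system might instead subtract it or use a different divergence.
    """
    classification = (cross_entropy(agnostic_logits, label)
                      + cross_entropy(specific_logits, label))
    divergence = kl_divergence(agnostic_logits, specific_logits)
    return classification + divergence_weight * divergence


# Logits from a hypothetical domain-agnostic head and domain-specific head.
loss = combined_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.5], label=0)
```

In a training loop, the gradient of this scalar would be used to refine the feature extractor and both classifiers together.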
-
Publication No.: US20230281509A1
Publication date: 2023-09-07
Application No.: US18086586
Filing date: 2022-12-21
Applicant: QUALCOMM Incorporated
Inventor: Sungha CHOI, Seunghan YANG, Seokeon CHOI, Sungrack YUN
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A processor-implemented method includes training a machine learning model on a source domain. The method also includes testing the machine learning model on a target domain, after training. The method further includes training the machine learning model on the target domain by regularizing weights of the machine learning model such that shift-agnostic weights are subjected to a higher penalty than shift-biased weights.
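Illustrative sketch (not taken from the patent): a weight regularizer in which shift-agnostic weights are penalized more heavily than shift-biased weights might be written as below. The abstract does not state what quantity is penalized or how weights are labeled, so this example assumes a squared-distance penalty toward the source-trained weights and a precomputed boolean mask; `regularization_penalty`, `agnostic_coeff`, and `biased_coeff` are hypothetical names.

```python
def regularization_penalty(weights, source_weights, is_shift_agnostic,
                           agnostic_coeff=1.0, biased_coeff=0.01):
    """Penalize drift from source-trained weights, per weight.

    Assumption: shift-agnostic weights use the larger coefficient, so they
    stay close to their source values, while shift-biased weights are
    left freer to adapt to the target domain.
    """
    if not (len(weights) == len(source_weights) == len(is_shift_agnostic)):
        raise ValueError("all inputs must have the same length")
    if agnostic_coeff <= biased_coeff:
        raise ValueError("shift-agnostic weights must receive the higher penalty")

    total = 0.0
    for w, w_source, agnostic in zip(weights, source_weights, is_shift_agnostic):
        coeff = agnostic_coeff if agnostic else biased_coeff
        total += coeff * (w - w_source) ** 2
    return total


# The same drift costs far more on the shift-agnostic weight (index 0).
penalty = regularization_penalty(
    weights=[1.0, 1.0],
    source_weights=[0.0, 0.0],
    is_shift_agnostic=[True, False],
)
```

During adaptation on the target domain, this penalty would be added to the task loss so that the optimizer changes shift-biased weights more readily than shift-agnostic ones.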
-
Publication No.: US20230081012A1
Publication date: 2023-03-16
Application No.: US17474679
Filing date: 2021-09-14
Applicant: QUALCOMM Incorporated
Inventor: Kyu Woong HWANG, Sungrack YUN, Jaewon CHOI, Seunghan YANG, Janghoon CHO, Hyoungwoo PARK, Hanul KIM
Abstract: Embodiments include methods, executed by a processor of a mobile device, of assisting a user in locating the mobile device. Various embodiments may include a processor of a mobile device obtaining information useful for locating the mobile device from a sensor of the mobile device configured to obtain information regarding surroundings of the mobile device, anonymizing the obtained information to remove private information, and uploading the anonymized information to a remote server in response to determining that the mobile device may be misplaced. Anonymizing the obtained information may include removing speech from an audio input and compiling samples of ambient noise for inclusion in the anonymized information. Anonymizing the obtained information to remove private information may also include editing an image captured by the mobile device to make images of detected individuals unrecognizable.
-
Publication No.: US20210304734A1
Publication date: 2021-09-30
Application No.: US16830029
Filing date: 2020-03-25
Applicant: Qualcomm Incorporated
Inventor: Young Mo KANG, Sungrack YUN, Kyu Woong HWANG, Hye Jin JANG, Byeonggeun KIM
Abstract: In one embodiment, an electronic device includes an input device configured to provide an input stream, a first processing device, and a second processing device. The first processing device is configured to use a keyword-detection model to determine if the input stream comprises a keyword, wake up the second processing device in response to determining that a segment of the input stream comprises the keyword, and modify the keyword-detection model in response to a training input received from the second processing device. The second processing device is configured to use a first neural network to determine whether the segment of the input stream comprises the keyword and provide the training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword.
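Illustrative sketch (not taken from the patent): the two-stage flow in the abstract above, where a first device wakes a second device for verification and receives a training input after a false alarm, could be modeled as follows. A single score threshold stands in for the keyword-detection model, and a plain callable stands in for the second device's neural network; `FirstStageDetector`, `process_segment`, and the threshold update rule are hypothetical.

```python
class FirstStageDetector:
    """Stand-in for the always-on keyword-detection model (assumption:
    the model is reduced to one score threshold for illustration)."""

    def __init__(self, threshold=0.5, step=0.05):
        self.threshold = threshold
        self.step = step

    def detects(self, score):
        return score >= self.threshold

    def apply_training_input(self):
        # Assumed update rule: become stricter after a false alarm.
        self.threshold = min(1.0, self.threshold + self.step)


def process_segment(score, first_stage, second_stage_verifier):
    """Run one segment through both stages; return True if confirmed."""
    if not first_stage.detects(score):
        return False  # the second processing device is never woken up

    # First stage fired: wake the second device and re-check the segment.
    confirmed = second_stage_verifier(score)
    if not confirmed:
        # Second stage disagrees: send a training input to the first stage.
        first_stage.apply_training_input()
    return confirmed


detector = FirstStageDetector(threshold=0.5, step=0.1)
verifier = lambda score: score >= 0.8  # hypothetical second-stage network
results = [process_segment(s, detector, verifier) for s in (0.3, 0.6, 0.9)]
```

The design point this illustrates is that only first-stage detections that the second stage rejects feed back into the first-stage model, so the low-power detector is corrected exactly where it produced false alarms.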
-
Publication No.: US20200321022A1
Publication date: 2020-10-08
Application No.: US16657552
Filing date: 2019-10-18
Applicant: QUALCOMM Incorporated
Inventor: Hye Jin JANG, Kyu Woong HWANG, Sungrack YUN, Janghoon CHO
Abstract: A device to perform end-of-utterance detection includes a speaker vector extractor configured to receive a frame of an audio signal and to generate a speaker vector that corresponds to the frame. The device also includes an end-of-utterance detector configured to process the speaker vector and to generate an indicator that indicates whether the frame corresponds to an end of an utterance of a particular speaker.