Machine-learned hormone status prediction from image analysis

    Publication No.: US11508481B2

    Publication date: 2022-11-22

    Application No.: US16895983

    Filing date: 2020-06-08

    Abstract: An analytics system uses one or more machine-learned models to predict a hormone receptor status from an H&E stain image. The system partitions each H&E stain image into a plurality of image tiles. Bags of tiles are created through sampling of the image tiles. The analytics system trains one or more machine-learned models with training H&E stain images having a positive or negative receptor status. The analytics system generates, via a tile featurization model, a tile feature vector for each image tile in a test bag for a test H&E stain image. The analytics system generates, via an attention model, an aggregate feature vector for the test bag by aggregating the tile feature vectors of the test bag, wherein an attention weight is determined for each tile feature vector. The analytics system predicts a hormone receptor status by applying a prediction model to the aggregate feature vector for the test bag.
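
The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes a dot-product scoring vector in place of the learned attention model and a logistic head in place of the prediction model; `attention_pool`, `predict_status`, `attn_vector`, and `pred_weights` are hypothetical names, not taken from the patent.

```python
import math


def attention_pool(tile_features, attn_vector):
    """Aggregate tile feature vectors into one bag-level vector.

    tile_features: list of equal-length feature vectors, one per tile.
    attn_vector: hypothetical scoring vector (stand-in for a learned attention model).
    """
    # Score each tile with a dot product against the scoring vector.
    scores = [sum(f * w for f, w in zip(feat, attn_vector)) for feat in tile_features]
    # A softmax turns the scores into attention weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The aggregate vector is the attention-weighted sum of the tile vectors.
    dim = len(tile_features[0])
    aggregate = [
        sum(w * feat[d] for w, feat in zip(weights, tile_features)) for d in range(dim)
    ]
    return aggregate, weights


def predict_status(aggregate, pred_weights, bias=0.0):
    """Assumed logistic head: probability of a positive receptor status."""
    logit = sum(a * w for a, w in zip(aggregate, pred_weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

The returned attention weights show which tiles contributed most to the bag-level vector, which is one reason attention pooling is often preferred over a plain average.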

    SYSTEMS AND METHODS FOR FEW-SHOT PROTEIN FITNESS PREDICTION WITH GENERATIVE MODELS

    Publication No.: US20230110719A1

    Publication date: 2023-04-13

    Application No.: US17589623

    Filing date: 2022-01-31

    Abstract: Embodiments are directed to finetuning a pre-trained language model using generative fitness finetuning. The generative fitness finetuning reuses a probability distribution learned during unsupervised training of the pre-trained language model to finetune on assay-labeled data. The generative fitness finetuning trains the language model to classify the relative fitness of protein sequence pairs based on the corresponding probabilities of the protein sequences in the pairs. The generative fitness finetuning identifies the protein sequence in each pair with the higher probability as also having the higher fitness. The trained and finetuned language model identifies the fitness of a protein sequence.
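
A minimal sketch of the pairwise idea in this abstract follows. It assumes the pair classifier is a sigmoid of the difference between the two sequence log-probabilities and that training uses binary cross-entropy; both choices, and the names `pair_probability` and `pairwise_loss`, are illustrative assumptions rather than details stated in the abstract.

```python
import math


def pair_probability(logp_a, logp_b):
    """Assumed classifier: P(a is fitter than b) = sigmoid(logp_a - logp_b)."""
    diff = logp_a - logp_b
    # Numerically stable sigmoid for large positive or negative differences.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def pairwise_loss(logp_a, logp_b, a_is_fitter, eps=1e-12):
    """Binary cross-entropy on one labeled pair of sequences."""
    p = pair_probability(logp_a, logp_b)
    target = 1.0 if a_is_fitter else 0.0
    return -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))
```

Here `logp_a` and `logp_b` stand for the log-probabilities the language model assigns to each sequence. Minimizing the loss pushes the model to give the fitter sequence the higher probability.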

    MACHINE-LEARNED HORMONE STATUS PREDICTION FROM IMAGE ANALYSIS

    Publication No.: US20210280311A1

    Publication date: 2021-09-09

    Application No.: US16895983

    Filing date: 2020-06-08

    Abstract: An analytics system uses one or more machine-learned models to predict a hormone receptor status from an H&E stain image. The system partitions each H&E stain image into a plurality of image tiles. Bags of tiles are created through sampling of the image tiles. The analytics system trains one or more machine-learned models with training H&E stain images having a positive or negative receptor status. The analytics system generates, via a tile featurization model, a tile feature vector for each image tile in a test bag for a test H&E stain image. The analytics system generates, via an attention model, an aggregate feature vector for the test bag by aggregating the tile feature vectors of the test bag, wherein an attention weight is determined for each tile feature vector. The analytics system predicts a hormone receptor status by applying a prediction model to the aggregate feature vector for the test bag.
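
The partitioning and bag-creation steps in this abstract can be sketched as below. The sketch assumes non-overlapping square tiles and uniform sampling without replacement within each bag; the abstract does not specify either, and `tile_coordinates` and `sample_bags` are hypothetical helper names.

```python
import random


def tile_coordinates(width, height, tile_size):
    """Return (left, top) corners of non-overlapping tiles that fit fully inside the image."""
    return [
        (x, y)
        for y in range(0, height - tile_size + 1, tile_size)
        for x in range(0, width - tile_size + 1, tile_size)
    ]


def sample_bags(tiles, bag_size, num_bags, seed=0):
    """Create bags by sampling tiles uniformly, without replacement inside a bag."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = min(bag_size, len(tiles))
    return [rng.sample(tiles, k) for _ in range(num_bags)]
```

For example, `sample_bags(tile_coordinates(1024, 512, 256), bag_size=3, num_bags=5)` yields five bags of three tile corners each, drawn from the eight tiles of a 1024 by 512 image.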

    SYSTEMS AND METHODS FOR LANGUAGE MODELING OF PROTEIN ENGINEERING

    Publication No.: US20210249105A1

    Publication date: 2021-08-12

    Application No.: US17001045

    Filing date: 2020-08-24

    Abstract: The present disclosure provides systems and methods for controllable protein generation. According to some embodiments, the systems and methods leverage neural network models and techniques that have been developed for other fields, in particular, natural language processing (NLP). In some embodiments, the systems and methods use or employ models implemented with transformer architectures developed for language modeling and apply the same to generative modeling for protein engineering.

    SYSTEMS AND METHODS FOR LANGUAGE MODELING OF PROTEIN ENGINEERING

    Publication No.: US20210249100A1

    Publication date: 2021-08-12

    Application No.: US17001068

    Filing date: 2020-08-24

    Abstract: The present disclosure provides systems and methods for controllable protein generation. According to some embodiments, the systems and methods leverage neural network models and techniques that have been developed for other fields, in particular, natural language processing (NLP). In some embodiments, the systems and methods use or employ models implemented with transformer architectures developed for language modeling and apply the same to generative modeling for protein engineering.

    SYSTEMS AND METHODS FOR ALIGNMENT-BASED PRE-TRAINING OF PROTEIN PREDICTION MODELS

    Publication No.: US20220122689A1

    Publication date: 2022-04-21

    Application No.: US17153164

    Filing date: 2021-01-20

    Abstract: Embodiments described herein provide an alignment-based pre-training mechanism for protein prediction. Specifically, the protein prediction model takes as input features derived from multiple sequence alignments (MSAs), which cluster proteins with related sequences. Features derived from MSAs, such as position-specific scoring matrices and hidden Markov model (HMM) profiles, have long been known to be useful for predicting the structure of a protein. Thus, in order to predict profiles derived from MSAs from a single protein in the alignment, the neural network learns information about that protein's structure using HMM profiles derived from MSAs as labels during pre-training (rather than as input features in a downstream task).
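
Using profiles as pre-training labels can be sketched as a per-position loss between the profile distribution and the model's predicted distribution over amino acids. The sketch below assumes a KL-divergence loss averaged over sequence positions; the abstract does not name the loss, and `kl_divergence` and `profile_loss` are hypothetical names.

```python
import math


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions of equal length."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


def profile_loss(profile_labels, predicted_profiles):
    """Mean per-position KL divergence between label profiles and predictions.

    profile_labels: one distribution over amino acids per sequence position,
        derived from an MSA (for example an HMM profile).
    predicted_profiles: the model's predicted distribution at each position.
    """
    per_position = [
        kl_divergence(t, p) for t, p in zip(profile_labels, predicted_profiles)
    ]
    return sum(per_position) / len(per_position)
```

The model sees only the single input sequence, so lowering this loss requires it to infer which substitutions its homologs tolerate at each position.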

    SYSTEMS AND METHODS FOR LANGUAGE MODELING OF PROTEIN ENGINEERING

    Publication No.: US20210249104A1

    Publication date: 2021-08-12

    Application No.: US17001090

    Filing date: 2020-08-24

    Abstract: The present disclosure provides systems and methods for controllable protein generation. According to some embodiments, the systems and methods leverage neural network models and techniques that have been developed for other fields, in particular, natural language processing (NLP). In some embodiments, the systems and methods use or employ models implemented with transformer architectures developed for language modeling and apply the same to generative modeling for protein engineering.

    METHODS AND SYSTEM FOR DEEP LEARNING MODEL GENERATION OF SAMPLES WITH ENHANCED ATTRIBUTES

    Publication No.: US20220383070A1

    Publication date: 2022-12-01

    Application No.: US17353691

    Filing date: 2021-06-21

    Abstract: Embodiments described herein provide methods and systems for generating data samples with enhanced attribute values. Some embodiments of the disclosure disclose a deep neural network framework with an encoder, a decoder, and a latent space therebetween, that is configured to extrapolate beyond the attributes of samples in a training distribution to generate data samples with enhanced attribute values by learning the latent space using a combination of a contrastive objective, a smoothing objective, a cycle consistency objective, and a reconstruction loss.
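
One common way to combine the four terms named in this abstract is a weighted sum. The form below is an assumption for illustration: the abstract says only that the terms are combined, and the weights $\lambda_{\text{con}}$, $\lambda_{\text{smooth}}$, and $\lambda_{\text{cyc}}$ are hypothetical hyperparameters.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{rec}}
  + \lambda_{\text{con}} \, \mathcal{L}_{\text{con}}
  + \lambda_{\text{smooth}} \, \mathcal{L}_{\text{smooth}}
  + \lambda_{\text{cyc}} \, \mathcal{L}_{\text{cyc}}
```

Here $\mathcal{L}_{\text{rec}}$ is the reconstruction loss, and $\mathcal{L}_{\text{con}}$, $\mathcal{L}_{\text{smooth}}$, and $\mathcal{L}_{\text{cyc}}$ are the contrastive, smoothing, and cycle consistency objectives.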
