SYSTEMS AND METHODS FOR IDENTIFYING NOVEL AND DIVERGENT VIRUSES IN TRANSCRIPTOMES

    Publication number: US20240221942A1

    Publication date: 2024-07-04

    Application number: US18392646

    Filing date: 2023-12-21

    CPC classification number: G16H50/20 G16B30/10

    Abstract: Systems and methods for identifying viral sequences in a subject of a species are provided. Sequence reads not associated with a reference genome of the species are obtained from a biological sample from the subject. At least a portion of each respective sequence read is encoded into a corresponding vector representing all or a portion of the sequence of the respective sequence read, thereby obtaining a plurality of vectors. Each sequence read is assigned a corresponding scalar model score by inputting a vector, in the plurality of vectors, corresponding to the sequence read into a model. Those sequence reads having a corresponding scalar model score that satisfies a first threshold score are selected as contig seeds. The plurality of sequence reads are aligned to these contig seeds through common k-mer sequences thereby forming a plurality of contigs which, in turn, are used to identify viral sequences in the subject.
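The abstract's pipeline (encode reads as vectors, score each with a model, keep reads above a threshold as contig seeds, then recruit reads that share k-mers with a seed) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the one-hot encoding, the GC-fraction `toy_model` standing in for the trained model, the `>=` reading of "satisfies a first threshold score", and the grouping step, which only collects reads per seed and does not perform real contig assembly.

```python
from collections import defaultdict

BASES = "ACGT"  # column order of the one-hot encoding (assumption)


def one_hot(read, length=12):
    """Encode the first `length` bases as a flat one-hot vector.

    Unknown bases and padding past the end of the read become all zeros.
    """
    vec = []
    for i in range(length):
        base = read[i] if i < len(read) else "N"
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def toy_model(vec):
    """Hypothetical stand-in for the trained model: returns a scalar score.

    The score is simply the fraction of encoded positions that are C or G.
    """
    positions = len(vec) // 4
    gc = sum(vec[4 * i + 1] + vec[4 * i + 2] for i in range(positions))
    return gc / positions


def kmers(seq, k):
    """Return the set of all k-mers in `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def select_seeds(reads, threshold, length=12):
    """Keep reads whose model score is at least `threshold` (assumed test)."""
    return [r for r in reads if toy_model(one_hot(r, length)) >= threshold]


def group_reads_by_seed(reads, seeds, k=5):
    """Assign each read to every seed it shares at least one k-mer with."""
    index = defaultdict(set)  # k-mer -> seeds containing it
    for seed in seeds:
        for km in kmers(seed, k):
            index[km].add(seed)
    groups = {seed: [] for seed in seeds}
    for read in reads:
        hits = set()
        for km in kmers(read, k):
            hits |= index.get(km, set())
        for seed in hits:
            groups[seed].append(read)
    return groups
```

Calling `select_seeds(reads, threshold=0.8)` and then `group_reads_by_seed(reads, seeds, k=5)` yields, for each seed, the reads that would be candidates for extending it into a contig; a real implementation would follow this with overlap-based assembly.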

    PROCESSES FOR PREDICTING THERAPY BENEFITS
    Document type: Published invention application

    Publication number: US20240105303A1

    Publication date: 2024-03-28

    Application number: US18469813

    Filing date: 2023-09-19

    CPC classification number: G16H20/10

    Abstract: Described are systems and methods of predicting a response to a medical treatment in a subject. The systems and methods include the steps of selecting a set of mutations within at least one biological process, training a set of classifiers from the set of selected mutations via a training dataset, determining the performance level of each classifier via a validation dataset, applying a subset of high-performance level classifiers from the validation dataset via a test dataset, and predicting the response to the medical treatment based on the test dataset.
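The train, validate, select, and test sequence in this abstract can be sketched in a few lines of Python. This is an illustrative reading under stated assumptions, not the patented method: each classifier is a hypothetical one-feature decision stump over the count of mutated genes in one biological process, performance is measured as plain accuracy, the cutoff `min_accuracy` is arbitrary, and the retained classifiers are combined by strict majority vote. Samples are represented as sets of mutated gene names, and labels are booleans (responder or not).

```python
def train_stump(samples, labels, genes):
    """Fit a decision stump on the number of mutated genes within `genes`.

    Tries every threshold and both directions, keeping the first setting
    with the highest training accuracy.
    """
    counts = [len(s & genes) for s in samples]
    best = None
    for threshold in range(len(genes) + 2):
        for sign in (1, -1):
            preds = [(c >= threshold) if sign == 1 else (c < threshold)
                     for c in counts]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if best is None or acc > best[0]:
                best = (acc, threshold, sign)
    _, threshold, sign = best
    return {"genes": genes, "threshold": threshold, "sign": sign}


def predict(stump, sample):
    """Predict response (True/False) for one sample with one stump."""
    count = len(sample & stump["genes"])
    if stump["sign"] == 1:
        return count >= stump["threshold"]
    return count < stump["threshold"]


def accuracy(stump, samples, labels):
    """Fraction of samples the stump labels correctly."""
    hits = sum(predict(stump, s) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def predict_response(processes, train, valid, test_samples, min_accuracy=0.75):
    """Train one stump per process, keep those that validate well, then vote.

    `processes` maps a process name to its gene set; `train` and `valid`
    are (samples, labels) pairs.
    """
    stumps = [train_stump(train[0], train[1], genes)
              for genes in processes.values()]
    kept = [s for s in stumps if accuracy(s, *valid) >= min_accuracy]
    if not kept:
        raise ValueError("no classifier reached the validation cutoff")
    # Strict majority vote of the retained classifiers.
    return [2 * sum(predict(s, x) for s in kept) > len(kept)
            for x in test_samples]
```

Keeping the validation and test sets separate, as the abstract describes, means the classifiers chosen on validation accuracy are evaluated on data that played no part in their selection.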
