-
Publication No.: US09014987B2
Publication date: 2015-04-21
Application No.: US13331557
Filing date: 2011-12-20
Applicant: Sharmila Shekhar Mande, Kuntal Kumar Bhusan, Tarini Shankar Ghosh
Inventor: Sharmila Shekhar Mande, Kuntal Kumar Bhusan, Tarini Shankar Ghosh
CPC classification number: G06F19/26, G06F19/14, G06F19/20, G06F19/28, G06T11/206
Abstract: Systems and methods for analyzing community structures within a plurality of environmental samples are described herein. The method includes obtaining taxa data corresponding to taxonomic groups within the plurality of the environmental samples. Based on the taxa data, an abundance value for each of the taxonomic groups with respect to each of the plurality of environmental samples is determined. Further, based on abundance values, an interaction factor for each pair of the taxonomic groups in the plurality of environmental samples is computed. The interaction factor is indicative of a degree of interaction between a pair of taxonomic groups from among the taxonomic groups. Based in part on interaction factors and abundance values, the plurality of the environmental samples is clustered.
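The first two steps of the abstract above (per-sample abundance values, then a pairwise interaction factor) can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not give the formulas, so relative abundance is assumed to be a taxon's share of the reads in a sample, and the Pearson correlation of two taxa's abundances across samples is used as a stand-in interaction factor. The function names and the input layout are hypothetical, and the clustering step is left out.

```python
from itertools import combinations
from math import sqrt


def relative_abundance(counts):
    """counts: {sample: {taxon: read_count}} -> {sample: {taxon: fraction}}.

    Assumption: abundance is the taxon's share of all reads in the sample.
    """
    taxa = sorted({t for c in counts.values() for t in c})
    result = {}
    for sample, c in counts.items():
        total = sum(c.values())
        result[sample] = {t: (c.get(t, 0) / total if total else 0.0) for t in taxa}
    return result


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def interaction_factors(abundance):
    """Stand-in interaction factor for every pair of taxa (assumed: Pearson r)."""
    samples = sorted(abundance)
    taxa = sorted(next(iter(abundance.values())))
    return {
        (a, b): pearson(
            [abundance[s][a] for s in samples],
            [abundance[s][b] for s in samples],
        )
        for a, b in combinations(taxa, 2)
    }


counts = {
    "s1": {"A": 10, "B": 30},
    "s2": {"A": 20, "B": 20},
    "s3": {"A": 30, "B": 10},
}
abundance = relative_abundance(counts)
factors = interaction_factors(abundance)
```

In this toy input, taxon A rises exactly as taxon B falls, so the stand-in factor for the pair is strongly negative. A real analysis would feed the factors and abundances into a clustering step, which the abstract mentions but does not specify.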
-
Publication No.: US20130226465A1
Publication date: 2013-08-29
Application No.: US13472737
Filing date: 2012-05-16
Applicant: Sharmila Shekhar Mande, Varun Mehra, Tarini Shankar Ghosh
Inventor: Sharmila Shekhar Mande, Varun Mehra, Tarini Shankar Ghosh
IPC: G06F19/18
CPC classification number: G06F19/14, G06F19/24, G06K9/622, G06K9/6268
Abstract: Method(s) and system(s) for identifying horizontally transferred genes are described herein. The method includes defining a cuboid in a three-dimensional space, wherein the cuboid includes fragment points corresponding to the genomic fragments belonging to a plurality of sequenced microbial genomes, and dividing the cuboid into a plurality of grids. The method further includes selecting one or more grids corresponding to a selected genome and classifying each of the selected grids as one of majority, minority, and mixed grids, based on the number of fragment points corresponding to the selected genome in each of the selected grids. Further, at least one genomic fragment from the minority and the mixed grids is identified as the horizontally transferred gene based on a distance ratio assessment.
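The grid step described in the abstract above can be sketched as follows: bound all fragment points with a cuboid, split it into n x n x n grids, and label each grid that contains points of the selected genome. This is only an illustration under stated assumptions. The abstract does not say how fragments are mapped to 3D points or which cut-offs separate majority, minority, and mixed grids, so the points are taken as given and the thresholds `major_frac` and `minor_frac` are hypothetical. The distance ratio assessment is not shown.

```python
from collections import defaultdict


def grid_index(point, mins, maxs, n):
    """Map a 3D point to the (i, j, k) index of its grid inside the cuboid."""
    index = []
    for value, low, high in zip(point, mins, maxs):
        span = (high - low) or 1.0  # avoid division by zero on a flat axis
        i = int((value - low) / span * n)
        index.append(min(i, n - 1))  # points on the upper face go in the last grid
    return tuple(index)


def classify_grids(points, selected, n=2, major_frac=0.7, minor_frac=0.3):
    """points: list of ((x, y, z), genome_id).

    Returns {grid_index: label} for grids holding at least one point of the
    selected genome. The thresholds are assumptions, not values from the source.
    """
    coords = [p for p, _ in points]
    mins = [min(c[d] for c in coords) for d in range(3)]
    maxs = [max(c[d] for c in coords) for d in range(3)]

    cells = defaultdict(list)
    for p, genome in points:
        cells[grid_index(p, mins, maxs, n)].append(genome)

    labels = {}
    for cell, genomes in cells.items():
        hits = genomes.count(selected)
        if hits == 0:
            continue  # this grid does not correspond to the selected genome
        fraction = hits / len(genomes)
        if fraction >= major_frac:
            labels[cell] = "majority"
        elif fraction <= minor_frac:
            labels[cell] = "minority"
        else:
            labels[cell] = "mixed"
    return labels


points = [
    ((0.0, 0.0, 0.0), "g1"), ((0.1, 0.1, 0.1), "g1"), ((0.2, 0.2, 0.2), "g1"),
    ((1.0, 1.0, 1.0), "g1"), ((0.9, 0.9, 0.9), "g2"),
    ((0.8, 0.8, 0.8), "g2"), ((0.95, 0.95, 0.95), "g2"),
    ((0.9, 0.1, 0.1), "g1"), ((0.8, 0.2, 0.2), "g2"),
]
labels = classify_grids(points, selected="g1")
```

Fragments of the selected genome that fall in minority or mixed grids would then be the candidates passed to the distance ratio assessment.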
-
Publication No.: US09372959B2
Publication date: 2016-06-21
Application No.: US13484885
Filing date: 2012-05-31
Applicant: Sharmila Shekhar Mande, Tarini Shankar Ghosh, Varun Mehra
Inventor: Sharmila Shekhar Mande, Tarini Shankar Ghosh, Varun Mehra
Abstract: Systems and methods for assembly of metagenomic sequences are described herein. In one embodiment, a plurality of metagenomic sequences is represented in three-dimensional space to obtain a plurality of sequence vectors. Based on the plurality of sequence vectors, a cuboid having a plurality of grids is defined in the three-dimensional space such that it encompasses the plurality of metagenomic sequences. Further, the plurality of metagenomic sequences is assembled into one or more contigs based on traversal of the plurality of grids. In one implementation, the one or more contigs are assembled such that a contig includes metagenomic sequences probably originating from the same genome.
-
Publication No.: US09342653B2
Publication date: 2016-05-17
Application No.: US13115553
Filing date: 2011-05-25
Applicant: Sharmila S. Mande, Mohammed Monzoorul Haque, Tarini Shankar Ghosh, Sudha Chadaram, Venkata Siva Kumar Reddy Chennareddy
Inventor: Sharmila S. Mande, Mohammed Monzoorul Haque, Tarini Shankar Ghosh, Sudha Chadaram, Venkata Siva Kumar Reddy Chennareddy
Abstract: Method(s) for identifying rDNA sequences from a sample containing a plurality of unknown DNA sequences are described herein. The method includes selecting one or more target clusters, from a plurality of reference clusters, corresponding to a query sequence. The target clusters are selected based on a composition-based analysis. A proportion of probable rDNA clusters from the target clusters is identified. Based on the proportion of the probable rDNA clusters, the query sequence is identified as an rDNA.
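The decision rule in the abstract above (pick target clusters by composition, then call the query rDNA if enough of them are probable rDNA clusters) can be sketched as follows. This is an illustration under assumptions, not the patented procedure. The composition-based analysis is approximated by Euclidean distance between k-mer frequency profiles, each reference cluster is represented by one sequence plus a pre-assigned rDNA flag, and `n_targets`, `min_fraction`, and `k` are hypothetical parameters. The query is assumed to be at least k bases long and the cluster list non-empty.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Normalised k-mer frequencies of a sequence (assumes len(seq) >= k)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


def is_probable_rdna(query, clusters, n_targets=3, min_fraction=0.5, k=2):
    """clusters: list of (representative_sequence, is_rdna_cluster).

    Selects the n_targets clusters closest in composition to the query and
    returns (decision, fraction of those targets flagged as rDNA).
    """
    query_profile = kmer_profile(query, k)
    ranked = sorted(
        clusters,
        key=lambda c: profile_distance(query_profile, kmer_profile(c[0], k)),
    )
    targets = ranked[:n_targets]
    fraction = sum(1 for _, flagged in targets if flagged) / len(targets)
    return fraction >= min_fraction, fraction


clusters = [
    ("ACGTACGTACGT", True),
    ("ACGTACGAACGT", True),
    ("TTTTTTTTTTTT", False),
    ("GGGGGGGGGGGG", False),
]
decision, fraction = is_probable_rdna("ACGTACGTACGA", clusters, n_targets=2)
```

With the toy clusters above, the two clusters nearest to the query in composition are both flagged as rDNA, so the query is accepted.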
-
Publication No.: US09116839B2
Publication date: 2015-08-25
Application No.: US13472737
Filing date: 2012-05-16
Applicant: Sharmila Shekhar Mande, Varun Mehra, Tarini Shankar Ghosh
Inventor: Sharmila Shekhar Mande, Varun Mehra, Tarini Shankar Ghosh
CPC classification number: G06F19/14, G06F19/24, G06K9/622, G06K9/6268
Abstract: Method(s) and system(s) for identifying horizontally transferred genes are described herein. The method includes defining a cuboid in a three-dimensional space, wherein the cuboid includes fragment points corresponding to the genomic fragments belonging to a plurality of sequenced microbial genomes, and dividing the cuboid into a plurality of grids. The method further includes selecting one or more grids corresponding to a selected genome and classifying each of the selected grids as one of majority, minority, and mixed grids, based on the number of fragment points corresponding to the selected genome in each of the selected grids. Further, at least one genomic fragment from the minority and the mixed grids is identified as the horizontally transferred gene based on a distance ratio assessment.
-
Publication No.: US20110295902A1
Publication date: 2011-12-01
Application No.: US13115597
Filing date: 2011-05-25
Applicant: Sharmila S. Mande, Mohammed Monzoorul Haque, Tarini Shankar Ghosh, Nitin Kumar Singh
IPC: G06F17/30
Abstract: Method(s) for identifying a taxon corresponding to a query sequence are described herein. The method includes selecting a target cluster, from amongst a plurality of reference clusters, corresponding to the query sequence. The target cluster may be selected based on a composition-based analysis. A similarity-based analysis of the query sequence is performed with respect to the target cluster. From the target cluster, the taxon corresponding to the query sequence is identified based on the similarity-based analysis.
-
Publication No.: US20130325428A1
Publication date: 2013-12-05
Application No.: US13484885
Filing date: 2012-05-31
Applicant: Sharmila Shekhar Mande, Tarini Shankar Ghosh, Varun Mehra
Inventor: Sharmila Shekhar Mande, Tarini Shankar Ghosh, Varun Mehra
IPC: G06F19/10
Abstract: Systems and methods for assembly of metagenomic sequences are described herein. In one embodiment, a plurality of metagenomic sequences is represented in three-dimensional space to obtain a plurality of sequence vectors. Based on the plurality of sequence vectors, a cuboid having a plurality of grids is defined in the three-dimensional space such that it encompasses the plurality of metagenomic sequences. Further, the plurality of metagenomic sequences is assembled into one or more contigs based on traversal of the plurality of grids. In one implementation, the one or more contigs are assembled such that a contig includes metagenomic sequences probably originating from the same genome.
-
Publication No.: US20130158880A1
Publication date: 2013-06-20
Application No.: US13331557
Filing date: 2011-12-20
Applicant: Sharmila Shekhar Mande, Kuntal Kumar Bhusan, Tarini Shankar Ghosh
Inventor: Sharmila Shekhar Mande, Kuntal Kumar Bhusan, Tarini Shankar Ghosh
IPC: G06F19/00
CPC classification number: G06F19/26, G06F19/14, G06F19/20, G06F19/28, G06T11/206
Abstract: Systems and methods for analyzing community structures within a plurality of environmental samples are described herein. The method includes obtaining taxa data corresponding to taxonomic groups within the plurality of the environmental samples. Based on the taxa data, an abundance value for each of the taxonomic groups with respect to each of the plurality of environmental samples is determined. Further, based on abundance values, an interaction factor for each pair of the taxonomic groups in the plurality of environmental samples is computed. The interaction factor is indicative of a degree of interaction between a pair of taxonomic groups from among the taxonomic groups. Based in part on interaction factors and abundance values, the plurality of the environmental samples is clustered.
-
Publication No.: US20110295519A1
Publication date: 2011-12-01
Application No.: US13115553
Filing date: 2011-05-25
Applicant: Sharmila S. Mande, Mohammed Monzoorul Haque, Tarini Shankar Ghosh, Sudha Chadaram, Venkata Siva Kumar Reddy Chennareddy
Inventor: Sharmila S. Mande, Mohammed Monzoorul Haque, Tarini Shankar Ghosh, Sudha Chadaram, Venkata Siva Kumar Reddy Chennareddy
IPC: G06F19/00
Abstract: Method(s) for identifying rDNA sequences from a sample containing a plurality of unknown DNA sequences are described herein. The method includes selecting one or more target clusters, from a plurality of reference clusters, corresponding to a query sequence. The target clusters are selected based on a composition-based analysis. A proportion of probable rDNA clusters from the target clusters is identified. Based on the proportion of the probable rDNA clusters, the query sequence is identified as an rDNA.