Techniques and architectures for providing continuous integration (CI) and/or continuous delivery (CD) in a data lake environment

    Publication No.: US11275569B1

    Publication Date: 2022-03-15

    Application No.: US17086247

    Application Date: 2020-10-30

    Abstract: Mechanisms and techniques for providing continuous integration and continuous deployment (CI/CD) for data lake applications are disclosed. Assembly of code for an app is managed with a CI platform to create a container within a shared environment within which the app runs. The container is isolated from other containers and bundles software, libraries and configuration files and can communicate with other containers through defined channels. The shared environment provides a platform for running the app. The app writes to one or more tables maintained in the shared environment. Assembly of subsequent versions of code for the app is managed by the CI platform. Deployment of the assembled subsequent version of the code to the container is managed by the CI platform. Integration tests are run on the deployed subsequent version of the code with the CI platform. The subsequent version of the code replaces the app in the shared environment when integration testing is complete.

    TRANSFER OF DATA STREAMING SERVICES TO PROVIDE CONTINUOUS DATA FLOW

    Publication No.: US20190238604A1

    Publication Date: 2019-08-01

    Application No.: US15881665

    Application Date: 2018-01-26

    Abstract: Embodiments regard transfer of data streaming services to provide continuous data flow. An embodiment of an apparatus includes a processor to process data for streaming to one or more organizations; and a memory to store data for streaming to the one or more organizations, wherein the apparatus is to provide a centralized work distribution service to track status of each of a plurality of data streams to the one or more organizations, and a plurality of nodes, each node being a virtual machine to stream one or more data streams to the one or more organizations, each node including a first daemon service to monitor connectivity of the node to dependency services for the node and, upon detecting a loss of connection to one or more of the dependency services, the node to discontinue ownership of the one or more data streams of the node and a second daemon service to poll the centralized work distribution service for data streams that are not assigned.
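The two daemon roles in this abstract can be illustrated with a minimal in-memory sketch. Everything named here (`WorkDistributionService`, `Node`, `monitor_once`, `poll_once`, the `dependencies_ok` callback, and the stream ids) is hypothetical and chosen only for illustration. The sketch also assumes a node polls for work only while its dependencies are reachable, which the abstract does not state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


class WorkDistributionService:
    """Hypothetical stand-in for the centralized work distribution service."""

    def __init__(self, stream_ids: List[str]) -> None:
        # Maps each stream id to its owning node id, or None when unassigned.
        self.owners: Dict[str, Optional[str]] = {s: None for s in stream_ids}

    def unassigned(self) -> List[str]:
        return sorted(s for s, owner in self.owners.items() if owner is None)

    def claim(self, stream_id: str, node_id: str) -> bool:
        # Only a known, currently unowned stream can be claimed.
        if stream_id in self.owners and self.owners[stream_id] is None:
            self.owners[stream_id] = node_id
            return True
        return False

    def release(self, stream_id: str, node_id: str) -> None:
        # A node can only release a stream it currently owns.
        if self.owners.get(stream_id) == node_id:
            self.owners[stream_id] = None


@dataclass
class Node:
    node_id: str
    service: WorkDistributionService
    dependencies_ok: Callable[[], bool]  # hypothetical connectivity check
    owned: Set[str] = field(default_factory=set)

    def monitor_once(self) -> None:
        """First daemon: give up owned streams when a dependency is unreachable."""
        if not self.dependencies_ok():
            for stream_id in sorted(self.owned):
                self.service.release(stream_id, self.node_id)
            self.owned.clear()

    def poll_once(self) -> None:
        """Second daemon: claim streams that are not assigned.

        Assumption: polling is skipped while dependencies are down.
        """
        if not self.dependencies_ok():
            return
        for stream_id in self.service.unassigned():
            if self.service.claim(stream_id, self.node_id):
                self.owned.add(stream_id)


# Usage: n1 loses a dependency, releases its streams, and n2 picks them up.
service = WorkDistributionService(["org-a", "org-b"])
health = {"n1": True}
n1 = Node("n1", service, lambda: health["n1"])
n2 = Node("n2", service, lambda: True)

n1.poll_once()        # n1 claims both unassigned streams
health["n1"] = False  # simulate loss of a dependency service
n1.monitor_once()     # n1 discontinues ownership
n2.poll_once()        # n2 takes over the unassigned streams
```

In a real deployment each daemon would presumably run on a timer and the service would be a shared store rather than a Python object; the single-step methods here only make the hand-off easy to follow.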

    SYSTEM AND METHOD FOR AUGMENTING SYNCED DATA ACROSS MULTIPLE SYSTEMS TO FACILITATE DATA CLEANSING

    Publication No.: US20220245170A1

    Publication Date: 2022-08-04

    Application No.: US17248574

    Application Date: 2021-01-29

    Abstract: A method of syncing data across multiple systems includes: receiving a plurality of calendar events from a plurality of independent calendar systems that use different calendar system specific schemas; aggregating the calendar events at a unifying communication system; converting the calendar events from a calendar system specific schema to a unifying communication system specific schema; storing the plurality of calendar events in the unifying communication system specific schema; converting a calendar event received from a non-master calendar system to the master calendar system specific schema; and sending the converted calendar event to the master calendar system; wherein copies of the received calendar events that are formatted according to the calendar system specific schema of the master calendar system are stored with the master calendar system, and copies of the calendar events that are formatted according to the unifying communication system specific schema are stored with the unifying communication system.
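The schema conversions described in this abstract can be sketched as simple field-name mappings. The system names (`"master"`, `"other"`), the field names, and the two in-memory stores below are all hypothetical. Appending to `master_store` stands in for sending the converted event to the master calendar system.

```python
from typing import Dict, List

# Hypothetical mappings: system-specific field name -> unified field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "master": {"title": "summary", "start": "starts_at", "end": "ends_at"},
    "other": {"subject": "summary", "begin": "starts_at", "finish": "ends_at"},
}

unified_store: List[dict] = []  # copies in the unifying system's schema
master_store: List[dict] = []   # copies in the master system's schema


def to_unified(event: dict, system: str) -> dict:
    """Convert a system-specific event to the unified schema."""
    mapping = FIELD_MAPS[system]
    unified = {mapping[k]: v for k, v in event.items() if k in mapping}
    unified["source"] = system  # remember which system the event came from
    return unified


def from_unified(event: dict, system: str) -> dict:
    """Convert a unified event to the given system's schema."""
    reverse = {u: s for s, u in FIELD_MAPS[system].items()}
    return {reverse[k]: v for k, v in event.items() if k in reverse}


def ingest(event: dict, system: str, master: str = "master") -> dict:
    """Store the unified copy and forward non-master events to the master."""
    unified = to_unified(event, system)
    unified_store.append(unified)
    if system != master:
        master_store.append(from_unified(unified, master))
    return unified


# Usage: an event from a non-master system ends up in both stores.
ingest(
    {"subject": "Standup", "begin": "2021-01-29T09:00", "finish": "2021-01-29T09:15"},
    "other",
)
```

Routing every conversion through the unified schema means each calendar system needs only one mapping, rather than one per pair of systems.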

    BULK DATA EXTRACTION SYSTEM
    Type: Invention application

    Publication No.: US20190238918A1

    Publication Date: 2019-08-01

    Application No.: US15885065

    Application Date: 2018-01-31

    Abstract: Techniques are disclosed relating to bulk data extraction systems. In some embodiments, a streaming server system may receive a first request, from a data storage system, that is sent prior to initiation of a bulk data extraction for a first group of users. In response to the first request, the streaming server system may receive, from the data storage system, a first notification message that includes a particular event identifier for a most recent data event generated at the data storage system. The streaming server system may receive, from the data storage system, those messages associated with the bulk data extraction for the first group. Subsequent to completion of the bulk data extraction, the streaming server system may send, to the data storage system, a request to subscribe to notification messages for data events associated with the first group.
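The ordering of steps in this abstract (record the most recent event identifier, run the bulk extraction, then subscribe to notifications) can be sketched as follows. The `FakeDataStorage` class and `extract_then_subscribe` function are hypothetical. The sketch further assumes that the recorded event identifier is used as the starting point for the later subscription, so that writes made during the extraction are not missed; the abstract implies this but does not say it.

```python
from typing import Callable, Dict, List, Optional, Tuple


class FakeDataStorage:
    """Hypothetical stand-in for the data storage system."""

    def __init__(self) -> None:
        self.records: Dict[Tuple[str, str], str] = {}      # (group, key) -> value
        self.events: List[Tuple[int, str, str, str]] = []  # (event_id, group, key, value)

    def write(self, group: str, key: str, value: str) -> None:
        # Every write updates current state and appends a numbered data event.
        self.records[(group, key)] = value
        self.events.append((len(self.events) + 1, group, key, value))

    def latest_event_id(self) -> int:
        return len(self.events)

    def bulk_extract(self, group: str) -> Dict[str, str]:
        # Snapshot of the group's current records.
        return {k: v for (g, k), v in self.records.items() if g == group}

    def events_after(self, group: str, event_id: int) -> List[Tuple[int, str, str, str]]:
        return [e for e in self.events if e[1] == group and e[0] > event_id]


def extract_then_subscribe(
    storage: FakeDataStorage,
    group: str,
    during_extraction: Optional[Callable[[], None]] = None,
) -> Tuple[int, Dict[str, str]]:
    # 1. Before extraction starts, record the most recent event id.
    checkpoint = storage.latest_event_id()
    # 2. Run the bulk extraction for the group.
    local = storage.bulk_extract(group)
    if during_extraction is not None:
        during_extraction()  # simulate writes arriving mid-extraction
    # 3. After extraction, apply events newer than the checkpoint
    #    (assumed subscription starting point).
    for _event_id, _group, key, value in storage.events_after(group, checkpoint):
        local[key] = value
    return checkpoint, local


# Usage: a write that lands during the extraction is still picked up.
storage = FakeDataStorage()
storage.write("group-1", "u1", "v1")
storage.write("group-2", "u9", "x")
checkpoint, local = extract_then_subscribe(
    storage,
    "group-1",
    during_extraction=lambda: storage.write("group-1", "u2", "v2"),
)
```

Taking the checkpoint before the extraction rather than after it is what closes the gap: any event generated while the extraction runs has an identifier greater than the checkpoint and is replayed afterwards.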
