-
Publication No.: US20210019194A1
Publication Date: 2021-01-21
Application No.: US16513510
Filing Date: 2019-07-16
Applicant: Cisco Technology, Inc.
Inventor: Rohit Bahl, Paul Clyde Sherrill, Stephen Joseph Williams
Abstract: A multi-cloud service mesh orchestration platform can receive a request to deploy an application as a service mesh application. The platform can tag the application with governance information (e.g., TCO, SLA, provisioning, deployment, and operational criteria). The platform can partition the application into its constituent components, and tag each component with individual governance information. For first time steps, the platform can select and perform a first set of actions for deploying each component to obtain individual rewards, state transitions, and expected returns. The platform can determine a reinforcement learning policy for each component that maximizes a total reward for the application based on the individual rewards, state transitions, and expected returns of each first set of actions selected and performed for each component. For second time steps, the platform can select and perform a second set of actions for each component based on the reinforcement learning policy for the component.
-
Publication No.: US11635995B2
Publication Date: 2023-04-25
Application No.: US16513510
Filing Date: 2019-07-16
Applicant: Cisco Technology, Inc.
Inventor: Rohit Bahl, Paul Clyde Sherrill, Stephen Joseph Williams
Abstract: A multi-cloud service mesh orchestration platform can receive a request to deploy an application as a service mesh application. The platform can tag the application with governance information (e.g., TCO, SLA, provisioning, deployment, and operational criteria). The platform can partition the application into its constituent components, and tag each component with individual governance information. For first time steps, the platform can select and perform a first set of actions for deploying each component to obtain individual rewards, state transitions, and expected returns. The platform can determine a reinforcement learning policy for each component that maximizes a total reward for the application based on the individual rewards, state transitions, and expected returns of each first set of actions selected and performed for each component. For second time steps, the platform can select and perform a second set of actions for each component based on the reinforcement learning policy for the component.
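
The two-phase flow described in the abstract (try actions during the first time steps, derive a per-component policy, then follow that policy during the second time steps) can be illustrated with a minimal sketch. This is not the patented method: it is a stateless, bandit-style simplification that ignores state transitions, and it assumes component rewards are additive and independent, so choosing the best action per component also maximizes the application total. The component names, the `ACTIONS` list, the reward table, and the functions `simulated_reward`, `learn_policies`, and `total_reward` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical deployment targets; not taken from the patent.
ACTIONS = ["cloud_a", "cloud_b", "on_prem"]

# Hypothetical, deterministic rewards standing in for how well a
# deployment meets a component's governance criteria (TCO, SLA, ...).
REWARDS = {
    ("frontend", "cloud_a"): 1.0,
    ("frontend", "cloud_b"): 0.6,
    ("frontend", "on_prem"): 0.2,
    ("database", "cloud_a"): 0.3,
    ("database", "cloud_b"): 0.5,
    ("database", "on_prem"): 0.9,
}


def simulated_reward(component, action):
    """Toy stand-in for the reward observed after a deployment action."""
    return REWARDS[(component, action)]


def learn_policies(components, reward_fn, explore_steps):
    """First time steps: try actions round-robin and record rewards.

    Returns a greedy policy mapping each component to the action with
    the highest average observed reward.
    """
    if explore_steps < 1:
        raise ValueError("explore_steps must be at least 1")
    totals = {c: defaultdict(float) for c in components}
    counts = {c: defaultdict(int) for c in components}
    for step in range(explore_steps):
        action = ACTIONS[step % len(ACTIONS)]
        for c in components:
            totals[c][action] += reward_fn(c, action)
            counts[c][action] += 1
    policy = {}
    for c in components:
        tried = [a for a in ACTIONS if counts[c][a] > 0]
        policy[c] = max(tried, key=lambda a: totals[c][a] / counts[c][a])
    return policy


def total_reward(policy, reward_fn):
    """Second time steps: act on the learned policy and sum the rewards."""
    return sum(reward_fn(c, a) for c, a in policy.items())


components = ["frontend", "database"]
policy = learn_policies(components, simulated_reward, explore_steps=6)
app_reward = total_reward(policy, simulated_reward)
print(policy, app_reward)
```

A fuller treatment would track per-component state and estimate expected returns over state transitions (for example with tabular Q-learning); the round-robin exploration here is chosen only to keep the sketch deterministic.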
-