---
title: Charter
toc_hide: false
list_pages: true
weight: 1
---

Introduction

AI is poised to reshape the dynamics of cloud software architecture, cloud resource orchestration, and cloud native services management in ways we are still exploring. Deployment, management, and monitoring of AI workloads in the cloud are likely the initial undertakings many cloud stakeholders engage with, while other moving targets are identified and evaluated for the future roadmap.

It is also recognized that cloud native infrastructure is profoundly influenced by the constraints and rapid emergence of AI workloads and technologies within cloud native architectures. It is the responsibility of the WG to identify the functionality, performance, security, and sustainability gaps that AI workloads bring to the cloud, and to construct mechanisms to judiciously assess and guide this progress, focusing on enabling cloud native projects and infrastructure that support AI workloads and services.

Mission Statement

This WG is committed to advocating for, developing, supporting, and assessing initiatives focused on addressing the new needs AI brings to the cloud. These include cloud resource orchestration and life cycle management of AI workloads, enabling cloud native infrastructure that serves AI services, and enabling AI in CNCF communities and projects such as Kubernetes. The WG also fosters rich community knowledge to facilitate innovation and development.

Responsibilities & Deliverables

Background

We recognize that:

  • There is no singular barometer or test for the ethical completeness of AI. Moreover, AI is a broad domain that covers the entire life cycle of AI models.
  • Little existing documentation is relevant to the emergent AI technologies of the past two years, specifically general-purpose large language models; more is needed.
  • The rising volume of community-generated data, especially data utilized by generative AI, necessitates enhanced management. AI is a multifaceted domain, encompassing diverse forms of training, reinforcement, and modeling.
  • CNCF communities and projects are witnessing a growing demand for enabling AI capabilities, notably in areas such as MLOps.

Alignment to TAGs Runtime and Observability

To focus the efforts of the WG, TAGs Runtime and Observability, as supporting bodies, bring natural alignment with projects in the groups' ordinary scope. The principal examples of this include the fusion of AI workloads with Kubernetes and data-driven architectures for AI-induced actions. This integration also extends to the observable nature of both AI training and AI workloads within Kubernetes, bringing the WG directly into the focus of TAG Observability.

In-Scope

  • Identify current end user requirements, restrictions, and gaps in deploying and running AI workflows in the cloud native space.
  • Help the community identify the data needed to create relevant model outputs for the cloud native ecosystem. For example, a model output for a service that helps migrate workloads to Kubernetes, or a model output for a service that helps troubleshoot security gaps in a cloud native environment.
  • Community outreach and engagement with AI developers and cloud providers, focusing on the integration and management of AI & Data in CNCF communities and projects.
  • Fostering collaboration with various AI entities, initiatives, activities, and endeavors, including AI & Data within the Linux Foundation and those external to the CNCF, concentrating on the challenges AI brings to the cloud native field and solutions to address them; evaluating AI usage within CNCF projects and the community.
  • AI monitoring/observability: collecting AI workload runtime metrics and feeding them back to the cloud native orchestrator for GPU/CPU and hardware accelerator resource adjustment, scale up/down, scale out/in, etc. (a minimal sketch follows this list).
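
As an illustration of this metrics-feedback idea, a minimal sketch in Go follows. It assumes a hypothetical queryGPUUtilization metric source (in practice supplied by something like DCGM, Prometheus, or a custom metrics adapter) and an inference-server Deployment; the names and thresholds are assumptions for the sketch, not a prescribed implementation.

```go
// Sketch: poll a GPU utilization metric and feed it back to the orchestrator
// by adjusting the replica count of an inference Deployment via the scale
// subresource. Names and thresholds below are illustrative assumptions.
package main

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// queryGPUUtilization is a placeholder for a real metrics query
// (e.g. DCGM exporter or Prometheus).
func queryGPUUtilization() float64 { return 0.9 }

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	const ns, name = "default", "inference-server"
	for {
		util := queryGPUUtilization()

		scale, err := client.AppsV1().Deployments(ns).GetScale(context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			log.Fatal(err)
		}

		// Naive feedback: scale out when accelerators are saturated, scale in when idle.
		switch {
		case util > 0.8:
			scale.Spec.Replicas++
		case util < 0.2 && scale.Spec.Replicas > 1:
			scale.Spec.Replicas--
		}

		if _, err := client.AppsV1().Deployments(ns).UpdateScale(context.TODO(), name, scale, metav1.UpdateOptions{}); err != nil {
			log.Fatal(err)
		}
		time.Sleep(30 * time.Second)
	}
}
```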

Out of Scope

  • Providing any form of legal advice in any jurisdiction.
  • Forming an umbrella organization beyond the CNCF.
  • Establishing a compliance and standards body beyond the CNCF space.
  • Focusing outside of cloud native technologies, according to the CNCF Cloud Native definition.
  • Developing new AI algorithms or models.

Example Deliverables to Community

Educate and Inform: End Users and Projects

  • Whitepaper of current end user requirements, restrictions, and gaps in deploying and running AI in the cloud native space.
  • Current landscape of cloud native solutions for AI workflows
  • Surveys on the evolving landscape of managing community-generated data within the realm of AI, particularly focusing on building cloud native infrastructures serving AI services.
  • Reports on new trends in the AI industry and how cloud native is influenced or being influenced by them. For example, the explosion of LLM and Generative AI usage in 2023.
  • Curate a "Distinguished Speaker" series of talks and interviews that may inform the working group's collaborative workstreams and provide accessible educational materials for the larger CNCF community, generating interest and growing working group membership.

Tooling Support & Evaluation: AI-ready cloud native solutions, ontology, and integrations

  • Reviews, inputs, and recommendations for proposed initiatives aimed at enabling AI in CNCF communities and projects, notably in areas like MLOps.
  • API specifications or guideline solutions for gaps identified in the Cloud Native AI whitepaper, for example a common inference API or guidance for large-scale AI/ML deployments (a hypothetical sketch follows this list).
  • Ontology of Kubernetes resources that can be used for AI/ML models.
  • Literature review of cloud native data security and privacy implications.
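
As an illustration only, a hypothetical "common inference API" guideline could start from a small, transport-agnostic interface such as the sketch below; the type and method names are assumptions for illustration, not an agreed specification or an existing CNCF deliverable.

```go
// Hypothetical sketch of a common inference interface that different serving
// backends (e.g. model servers running on Kubernetes) could implement.
// All names here are illustrative, not a proposed standard.
package inference

import "context"

// PredictRequest carries the model identifier and raw input payloads.
type PredictRequest struct {
	ModelName    string
	ModelVersion string
	Inputs       map[string][]byte
}

// PredictResponse returns the model outputs keyed by output name.
type PredictResponse struct {
	Outputs map[string][]byte
}

// Predictor is the minimal surface a common inference API might expose.
type Predictor interface {
	// Predict runs a single inference call against the named model.
	Predict(ctx context.Context, req PredictRequest) (PredictResponse, error)
	// Ready reports whether the named model is loaded and able to serve.
	Ready(ctx context.Context, modelName string) (bool, error)
}
```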

Audiences

  • Education - the audience is end users, developers, and stakeholders
  • Project intelligence - the audience is TOC/CNCF Community
  • External collaboration - the audience is organizations, initiatives, activities, and efforts outside of the Cloud Native Computing Foundation (CNCF) (e.g., OpenSSF, academic and other research groups, etc.).

TOC Liaisons

TAG Liaisons

TAGs Runtime and Observability are the current hosts of this WG, and a future transition to its own TAG is on the roadmap.

WG Leadership

Tech leads

Communications

Operations

AI WG operations are consistent with the standard WG operating guidelines provided by the CNCF Technical Oversight Committee (TOC).