Platform Engineering Deep Dive

December 4, 2023
10 min read

Platform engineering is a modern software development discipline that facilitates seamless integration between complex cloud-native technologies and engineering workflows. It's centered around creating robust internal developer platforms (IDPs), enabling self-service capabilities, and fostering efficient processes. IDPs and ephemeral environments have emerged as key enablers of effective platform engineering that enhance agility and operational efficiency.

This article will explore platform engineering in detail with a spotlight on IDPs and ephemeral environments, including its evolution, principles, real-world use cases, and future trends.

Summary of key platform engineering concepts

The table below summarizes the key platform engineering concepts this article will explore in more detail.

| Concept | Description |
| --- | --- |
| Origins and evolution of platform engineering | Overview of the historical transition from traditional software development practices to modern platform engineering, underscored by the adoption of cloud-native technologies. |
| The importance of Golden Paths | Platform teams integrate toolchains and workflows into easy-to-navigate "golden paths," resulting in an IDP that covers the operational necessities of the entire application lifecycle and makes it self-service and customized for each team. |
| Internal Developer Platform | An IDP is a comprehensive, integrated set of tools, services, and infrastructure purpose-built to streamline and accelerate software development. The core components of an IDP are application configuration management, infrastructure orchestration, deployment management, role-based access control, and environment management. |
| Future trends in platform engineering | Ephemeral environments and Kubernetes will play a crucial role in the future of platform engineering. The goal is to enhance developer productivity by improving the developer experience, where cognitive load is identified as a significant barrier. |

The evolution of platform engineering

Platform engineering has become a vital component in the landscape of contemporary software development. At its heart, platform engineering focuses on creating Golden Paths (GPs) for software delivery. These are well-defined, standardized workflows that developers follow within an IDP, giving them a seamless developer experience and giving the organization efficient operations. The Golden Path represents the most efficient process for application deployment, minimizing variability and maximizing reliability. We will explore Golden Paths in more detail later in this article.

What is platform engineering?

Platform engineering encapsulates the methodologies and tools essential for building and maintaining environments that enable efficient software development and deployment. It bridges the gap between complex infrastructure setups and the operational necessities of developers, DevOps engineers, Kubernetes administrators, and platform engineers. By promoting a self-service and continuous delivery culture, platform engineering significantly reduces operational overhead and lets teams focus on core development activities.

Diagram of platform engineering. (Source: Gartner)
  • Product and Service Teams consist of developers and operations professionals who create and maintain software products and services. These teams typically take on the most cognitive load from infrastructure complexity.
  • Platform Teams are specialized groups with infrastructure, development, and product management expertise responsible for building and maintaining the internal developer platform. Their goal is to simplify the development process by managing underlying infrastructure complexities.
  • Digital Platform refers to an IDP, a unified suite of tools and services designed to streamline the software development process. The IDP offers developers a simplified, automated, and self-service environment for building, testing, and deploying applications.

    • Platform Services encompass various functions delivered through an 'X-as-a-Service' model. Later in the article, we will refer to them as the five core components of an IDP:
      • Application configuration management
      • Infrastructure orchestration
      • Deployment management
      • Role-based access control (RBAC)
      • Environment management
    • Platform Tools implement those services and often expose them in a Developer Portal like Spotify Backstage.
    • Reusable Components are standardized, modular elements that teams can repeatedly use across different projects and applications. Later in this article, we will cover a few examples of software templates and how they relate to Golden Paths.
    • Knowledge Management is crucial in organizing and centralizing technical information, best practices, and documentation. This enables developers and teams to easily access and share knowledge and discover and use internal services and products.

The value of platform engineering

Let's delve into some historical context to better understand the emergence and evolution of platform engineering and why it is valuable.

In 2019, Manuel Pais and Matthew Skelton identified cognitive load as a significant concern in DevOps environments. They introduced Team Topologies to address this. The approach uses product teams to create platforms, offering layers of abstraction to protect developers from underlying technical complexity and reduce their cognitive burden. Team Topologies is crafted to counteract common antipatterns in DevOps.

Numerous organizations have begun forming platform engineering teams in response to these needs. Tasked with accelerating software delivery to production, these teams develop and maintain platforms designed to optimize the software development lifecycle. They strive to make this process efficient, stable, and consistently effective.

As we delve into the challenges that platform engineering aims to resolve, it becomes evident that this discipline is not just about technical mastery but also about crafting a seamless and productive development environment.

Understanding common cloud-native technologies and challenges

Cloud-native technologies have transformed how organizations build and deploy applications, presenting both benefits and challenges. While these technologies can provide scalability, resilience, and rapid deployment, they can also bring complexity, creating bottlenecks in infrastructure management and software delivery.

Infrastructure challenges

For infrastructure teams, the challenges are multifaceted. They face a constant demand for infrastructure from development teams while maintaining a growing microservices ecosystem, complex service mesh architectures, and Kubernetes orchestration. The responsibility to ensure access, seamless integration, consistent networking, and robust security across disparate services can be daunting. Moreover, dynamic cloud-native infrastructure requires thorough monitoring to manage failures and performance issues proactively.

Development challenges

On the other hand, developers face significant cognitive load from the need to navigate many tools within the CNCF landscape. The learning curve can be steep, with a wide array of tools for containerization (like Docker), CI/CD pipelines, Helm charts, and infrastructure as code tools like Terraform. The more tools they have to learn, the less they can focus on their core tasks. The complexity deepens as developers must keep applications compliant with evolving cloud-native standards and integrate these practices into their workflows.

The importance of Golden Paths

The CNCF defines a Golden Path as a "templated composition of well-integrated code and capabilities for rapid project development." In essence, it acts as a self-service template for routine tasks. 

Let's look at a practical example to understand why Golden Paths are so powerful. Consider a developer who has just started at a new company and is tasked with creating a small Python service. In a DevOps environment, the shared responsibilities between development and infrastructure teams could mean the developer is expected to write both the Python code for the service and the infrastructure code for build pipelines and monitoring instrumentation. On top of that, the developer’s new employer might use a tech stack or cloud platform they are unfamiliar with.

The developer can quickly become overwhelmed with YAML files, and what should have been a straightforward Python service project becomes a significant challenge. They are confronted with numerous decisions, such as: how to structure the Python code, which tools to use for continuous delivery, how to implement monitoring, and more. Each decision point adds a cognitive load that takes away the energy the developer could have spent improving core business functions.

Golden Path example diagram. (Source: Google)

Golden Paths offer significant advantages to various roles within an organization. Golden Paths accelerate onboarding and development for application developers by providing a well-defined route to endorsed tools, reducing the need for custom infrastructure code. For platform engineers, they provide a way to implement a standardized set of principles across various projects.

9 essential Golden Path principles

The unique nature of Golden Paths means that no two will be identical, and this diversity is both expected and beneficial. Each Golden Path is tailored to specific organizational needs and workflows, reflecting different teams' distinct challenges and goals. Despite these variations, Golden Paths typically adhere to a shared framework of fundamental principles. These guiding principles ensure that, regardless of their individual characteristics, all Golden Paths maintain consistent quality and effectiveness.

  1. Offer a singular, definitive method for executing a particular task.
  2. Reduce cognitive load on users through simplification and introductory guides.
  3. Ensure compatibility with existing IDPs. For example, when a developer uses a Golden Path to create a new service, it should automatically generate a new entry in the software catalog. Similarly, if a Golden Path includes a Kubernetes Deployment template, it should be deployable within a Kubernetes cluster managed by the platform team.
  4. Provide comprehensive guidance from development through to production, encompassing local development instructions, a pre-designed CI/CD pipeline, and infrastructure-as-code templates for different environments.
  5. Facilitate a self-service approach, enabling any developer within the organization to access and utilize Golden Paths without needing a ticketing system.
  6. Maintain transparency in abstraction, ensuring the infrastructure underneath remains visible. This is crucial in a DevOps shared responsibility model, as developers must fully grasp their service stack for maintenance, optimization, and troubleshooting.
  7. Tailor to the specific requirements of your organization. If the focus is on migrating existing systems ("brownfield") rather than creating new ones ("greenfield"), develop Golden Paths that support these migration strategies.
  8. Allow adaptability to meet diverse needs. For instance, a Golden Path could default to a SQL database like MySQL but offer the flexibility to switch to a NoSQL option like MongoDB.
  9. Keep the adoption of Golden Paths voluntary, so developers are free to choose alternative methods if Golden Paths do not meet their specific requirements.
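To make a few of these principles concrete, here is a minimal Python sketch of a Golden Path modeled as a set of opinionated defaults. It is an illustration only, not how any particular IDP implements Golden Paths: the `GoldenPath` class, the `create_service` function, and the in-memory `SOFTWARE_CATALOG` are hypothetical names invented for this example. The sketch shows a default SQL database that a team can override (principle 8) and an automatic catalog entry for each new service (principle 3).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GoldenPath:
    """Hypothetical Golden Path: opinionated defaults that a team may override."""
    name: str
    language: str = "python"
    database: str = "mysql"          # default SQL option (principle 8)
    ci_pipeline: str = "default-ci"  # pre-designed pipeline (principle 4)


# Stand-in for a real software catalog; here it is just an in-memory list.
SOFTWARE_CATALOG: list[dict] = []


def create_service(path: GoldenPath, service_name: str, **overrides) -> dict:
    """Create a service from a Golden Path and register it in the catalog.

    Unknown override names raise TypeError, so only the fields that the
    Golden Path exposes can be customized.
    """
    chosen = replace(path, **overrides)
    entry = {
        "service": service_name,
        "golden_path": chosen.name,
        "language": chosen.language,
        "database": chosen.database,
        "ci_pipeline": chosen.ci_pipeline,
    }
    SOFTWARE_CATALOG.append(entry)  # automatic catalog entry (principle 3)
    return entry


web_service = GoldenPath(name="python-web-service")
billing = create_service(web_service, "billing")                    # keeps the defaults
events = create_service(web_service, "events", database="mongodb")  # opts into NoSQL
```

The Golden Path itself stays immutable: each service gets its own copy of the defaults, so one team's override never changes what the next team receives.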

Platform engineering teams meticulously apply these foundational principles within an IDP framework, embedding them into the platform's structure and operations.

Internal developer platform

The internal developer platform (IDP) represents a significant advancement in platform engineering. It embodies the principles of self-service and customization, serving as a hub for managing the entire application lifecycle. An IDP is the sum of the tech and tools that an operations, DevOps, or platform engineering team glues together to build Golden Paths for developers.

IDPs streamline operational workflows by providing standardized application configuration templates and facilitating automated resource provisioning. They encapsulate best practices and operational standards, ensuring a consistent and efficient approach to software delivery across the organization. This automation streamlines recurrent tasks like environment and resource deployments by enforcing standardized practices while providing developer teams flexibility and autonomy.

Templating and customization

Customization in platform engineering is achieved through templating, which allows teams to define and reuse configurations across different projects. Templating ensures consistency in configurations while enabling adaptability to project-specific requirements. Through a standardized yet flexible template system, platform engineering promotes a balanced approach to maintaining operational standards and catering to unique project needs.

A prime example of such templating is the software templates feature in Spotify's Backstage, a widely used open-source developer portal and software catalog. By default, it can load code skeletons, fill in template variables, and publish the result to platforms like GitHub or GitLab, which lets the platform team prepare templates for typical software projects. Backstage also provides example templates.
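The general idea of loading a skeleton and filling in variables can be sketched with Python's standard `string.Template`. This is not Backstage's template format or API. The `SKELETON` contents, the file names, and the `render_skeleton` helper are assumptions made up for this illustration.

```python
from string import Template

# Hypothetical project skeleton: file path -> file content with $placeholders.
SKELETON = {
    "README.md": Template("# $service_name\n\nOwned by $owner.\n"),
    "service.yaml": Template("name: $service_name\nowner: $owner\n"),
}


def render_skeleton(skeleton: dict, values: dict) -> dict:
    """Return the rendered files for one project.

    Template.substitute raises KeyError when a placeholder has no value,
    so an incomplete request fails early instead of producing broken files.
    """
    return {path: tmpl.substitute(values) for path, tmpl in skeleton.items()}


files = render_skeleton(
    SKELETON, {"service_name": "billing", "owner": "team-payments"}
)
```

A real template system would then publish the rendered files to a repository. That step is omitted here because it depends on the hosting platform.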

5 core components of an IDP

An IDP is typically composed of five key components. These components are essential in developing a comprehensive set of templates that cover the Golden Paths of various development teams.

Application configuration management seamlessly orchestrates resources within container platforms like Kubernetes and external resources like databases, adapting to different environments. This solves the challenge of managing complex configuration files, like scripts or YAML, which often become intricate due to versioning and environment-specific differences, usually necessitating DevOps team involvement in setting up environments.
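One common way to tame environment-specific differences is to keep a single base configuration and layer small per-environment overrides on top of it. The sketch below assumes that approach. The `BASE` and `OVERRIDES` values and the `config_for` helper are hypothetical and exist only to illustrate the idea.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where values in `override` win over `base`.

    Nested dicts are merged key by key; all other values are replaced.
    Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base configuration shared by every environment.
BASE = {"replicas": 1, "database": {"engine": "mysql", "size": "small"}}

# Hypothetical per-environment differences, kept as small as possible.
OVERRIDES = {
    "staging": {"replicas": 2},
    "production": {"replicas": 4, "database": {"size": "large"}},
}


def config_for(environment: str) -> dict:
    """Build the effective configuration for one environment."""
    return deep_merge(BASE, OVERRIDES.get(environment, {}))


production = config_for("production")
```

Because each environment only records what differs from the base, a change to a shared setting is made once instead of in every environment's file.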

Infrastructure orchestration ensures that the IDP integrates cohesively with the entire infrastructure and permits custom integrations as needed. It should include CI/CD pipelines, clusters, DNS and service discovery providers, and infrastructure as code (IaC) tools.

Deployment management implements a streamlined developer experience by automating the deployment process. It includes handling continuous integration after a developer pushes code to Git, executing deployment stages, and providing extended debugging support with versioning.

Role-based access control (RBAC) implements a structured approach to managing user permissions within a system based on their roles in an organization. It assigns predefined permissions to roles, such as administrator, developer, or tester. Each role has a different level of access to and control over resources and operations. This method simplifies permission management, enhances security by minimizing unnecessary access, and ensures that individuals have appropriate access levels based on their job functions and responsibilities.
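At its simplest, RBAC is a mapping from roles to permissions plus a check against that mapping. The following sketch uses the three roles named above. The permission strings and the `is_allowed` helper are hypothetical, and a real IDP would typically delegate this to its identity provider or to Kubernetes RBAC.

```python
# Hypothetical role -> permission mapping for an IDP.
ROLE_PERMISSIONS: dict[str, frozenset[str]] = {
    "administrator": frozenset({"env:create", "env:delete", "deploy", "logs:read"}),
    "developer": frozenset({"env:create", "deploy", "logs:read"}),
    "tester": frozenset({"logs:read"}),
}


def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role explicitly grants the permission.

    Unknown roles get an empty permission set, so access is denied by default.
    """
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


can_developer_deploy = is_allowed("developer", "deploy")
can_tester_delete_env = is_allowed("tester", "env:delete")
```

Denying by default is the important design choice here: a typo in a role name results in no access rather than accidental access.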

Environment management is one of an IDP's most interesting and challenging aspects. Traditional setups often require waiting for infrastructure teams to set up and configure new environments, a time-consuming and inefficient process. IDPs change this by automating the creation and configuration of environments and exposing them as a self-service capability. Developers or team members can establish new environments as needed, without filing a request. Additionally, IDPs empower engineering organizations to decommission environments efficiently when they're no longer necessary. An environment that is created on demand and removed after use is known as an ephemeral environment.

The emergence of ephemeral environments

The inception of ephemeral environments can be traced back to the need for more efficient testing and development workflows. Initially, software teams relied on resource-intensive static environments that often led to bottlenecks. The shift towards ephemeral environments marked a significant leap, offering a more flexible and resource-efficient solution. The ability to quickly provision and dispose of environments as needed proved to be a game-changer, facilitating continuous integration and continuous delivery (CI/CD) pipelines.

Ephemeral environments are temporary, on-demand environments that mirror production configurations, enabling developers and testers to work in a controlled, isolated setting. The primary benefit of ephemeral environments is their ability to significantly cut down setup and teardown times, accelerating the development and testing phases. They also contribute to resource optimization by existing only for the duration necessary, thereby reducing resource waste.

The cross-departmental feedback loop enabled by ephemeral environments. (Source: Uffizzi)

Creating an ephemeral environment

Our platform implements ephemeral environments using virtual Kubernetes clusters. A virtual cluster is an abstraction layer on top of a physical, or host, Kubernetes cluster. The virtual cluster has its own control plane (including an API server and etcd storage) into which users deploy objects directly. This control plane provides users with a complete environment capable of running any workload compatible with physical Kubernetes clusters, with few exceptions.

Creating a virtual cluster with Uffizzi can be as easy as following a few steps from our Quickstart guide and GitHub Actions integration guide.

Future trends in platform engineering

As the software development landscape evolves, platform engineering is at the cusp of numerous transformative trends. These trends are poised to further refine and enhance the discipline, ensuring it remains at the forefront of delivering optimized, efficient, and scalable solutions.

Significant increases in business agility

Ephemeral environments and Kubernetes are set to play pivotal roles in the evolving narrative of platform engineering. Ephemeral environments, with their on-demand, temporary nature, are aligned with the industry’s shift towards more agile, flexible, and resource-efficient operational models. Similarly, Kubernetes continues to be a cornerstone for orchestrating containerized applications, with ongoing enhancements aimed at addressing the challenges of scalability, management, and security in microservices architectures.

Ephemeral environments will increase developer productivity and reduce cognitive load

The adoption of ephemeral environments significantly augments developer productivity. By automating the provisioning and teardown of environments, developers can focus on coding rather than operational overhead. The reduced cognitive load from simplified environment management enables developers to work more efficiently and effectively. This shift accelerates the development lifecycle and fosters a culture of innovation and continuous improvement.

AI and machine learning will supercharge platform engineering progress

The horizon of platform engineering is teeming with potential advancements. The continuous exploration of more efficient orchestration models, enhanced security protocols, and the integration of AI and machine learning are some of the promising trends. These advancements aim to further automate and optimize operational workflows, ensuring platform engineering remains a dynamic and vital discipline in the software development ecosystem.

As an example, machine learning models could analyze patterns in application development workflows and propose tailored Golden Paths for different teams.

Conclusion

Platform engineering is a crucial discipline that can drastically reduce complexity and tightly integrate complex cloud-native technologies with day-to-day software engineering workflows. The introduction of ephemeral environments, particularly through Kubernetes and virtual clusters, represents a significant step forward in platform engineering. Ephemeral environments offer engineers a blend of agility, efficiency, and self-service capabilities. The nuanced customization and self-service capabilities in platform engineering frameworks foster a culture of autonomy and continuous innovation, essential in navigating the multifaceted challenges of modern software delivery.