How Pulumi Works

Pulumi orchestrates and manages infrastructure using a desired state (declarative) model, while letting you write infrastructure code in programming languages you already know, such as TypeScript, JavaScript, Python, Go, C#, and Java. This gives you programming constructs such as loops, conditionals, and functions, as well as your IDE’s autocomplete, type checking, and documentation. When you run a Pulumi program, the end result is the state you declared, regardless of the current state of your infrastructure.

A language host executes a Pulumi program to compute the desired state of a stack’s infrastructure. The deployment engine compares this desired state with the stack’s existing state to determine which resources should be created, updated, or deleted. To manage individual resources, the engine relies on a set of resource providers (including AWS, Azure, Kubernetes, and over 150 more). As it runs, the engine records the state of your infrastructure, including all provisioned resources and any pending operations.

The graphic below depicts the interplay between various system components:

1. Language hosts

The language host is responsible for running a Pulumi program and setting up an environment in which resources can be registered with the deployment engine. The language host consists of two separate parts:

A language executor, a binary named pulumi-language-<language-name>, launches the runtime for the program’s language (e.g., Node or Python). This binary is distributed alongside the Pulumi CLI.

A language SDK is responsible for preparing your program for execution and observing its execution to detect resource registrations. When a resource is registered (with new Resource() in JavaScript or Resource(...) in Python), the language SDK sends the registration request to the deployment engine. The language SDK is distributed as a standard package, just like any other package your program depends on. For example, the Node SDK is contained in the @pulumi/pulumi package available on npm, whereas the Python SDK is contained in the pulumi package available on PyPI.

2. The deployment engine

The deployment engine is in charge of computing the set of operations required to move your infrastructure from its current state to the desired state specified by your program. When the engine receives a resource registration from the language host, it checks the existing state to see whether the resource has been created before. If it does not already exist, the engine creates it via a resource provider. If it already exists, the engine works with the resource provider to determine what, if anything, has changed by comparing the resource’s previous state with the new desired state specified by the program. If there are changes, the engine decides whether it can update the resource in place or must replace it by creating a new version and deleting the old one. This decision depends on which properties of the resource are changing and on the type of the resource itself. When the language host informs the engine that the Pulumi program has finished executing, the engine looks for any existing resources that did not receive a new resource registration and schedules them for deletion.
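The registration flow described above can be sketched as a small toy model. This is only an illustration under simplifying assumptions, not Pulumi's actual engine: the ToyEngine class, its method names, and the bucket names are all hypothetical, and the state is reduced to a dictionary of resource names and properties.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Toy model of the registration flow; NOT Pulumi's real engine."""

    state: dict  # last deployed state: resource name -> properties
    seen: set = field(default_factory=set)
    steps: list = field(default_factory=list)

    def register(self, name, props):
        """Handle one resource registration sent by the language host."""
        self.seen.add(name)
        if name not in self.state:
            self.steps.append(("create", name))
        elif self.state[name] != props:
            self.steps.append(("update", name))
        else:
            self.steps.append(("same", name))

    def finish(self):
        """Program finished: schedule deletion of anything not re-registered."""
        for name in sorted(self.state.keys() - self.seen):
            self.steps.append(("delete", name))
        return self.steps


# Hypothetical previous deployment with two buckets.
engine = ToyEngine(state={"web-bucket": {"acl": "public-read"}, "app-bucket": {}})
engine.register("web-bucket", {"acl": "public-read"})  # unchanged
engine.register("data-bucket", {})                     # not in state yet
plan = engine.finish()                                 # app-bucket was never registered
print(plan)
```

The real engine also has to decide between an in-place update and a replacement, which this sketch collapses into a single "update" step.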

The deployment engine is built into the Pulumi CLI itself.

3. Resource providers

A resource provider is made up of two distinct pieces:

A resource plugin is the binary that the deployment engine uses to manage a resource. These plugins are stored in the plugin cache (located in ~/.pulumi/plugins) and can be managed with the pulumi plugin commands.

An SDK that provides bindings for each type of resource the provider can manage.

The SDKs, like the language SDK itself, are available as standard packages. For example, the @pulumi/aws package for Node is available on npm, while the pulumi-aws package for Python is available on PyPI. When you add these packages to your project, they run pulumi plugin install in the background to download the resource plugin from Pulumi.com.

Putting everything together

Let us walk through a basic example. Assume we have the following Pulumi program, which creates two S3 buckets.

resources:
    webBucket:
        type: aws:s3:Bucket
    appBucket:
        type: aws:s3:Bucket

Now we’ll run pulumi stack init mystack. Because mystack is a new stack, its “last deployed state” contains no resources.

Next, we run pulumi up. Because this program is written in YAML, the Pulumi CLI launches the YAML language host and asks it to run the program. When the first aws:s3:Bucket resource is declared, the language host sends a resource registration request to the deployment engine and then continues running the program. This is subtle but significant: registering the resource does not mean that the S3 bucket has been created in AWS; it means only that the language host has declared this bucket to be part of the desired state of your infrastructure. The language host continues to execute your program while the engine processes the request.
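The difference between registering a resource and the resource actually existing can be illustrated with a background worker. This is a minimal sketch, not how Pulumi is implemented: the thread pool stands in for the engine, and create_in_cloud is a hypothetical placeholder for a slow provider call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_in_cloud(name):
    """Hypothetical stand-in for a slow provider call."""
    time.sleep(0.1)
    return f"{name} created"


events = []
with ThreadPoolExecutor(max_workers=1) as engine:
    # "Registering" only hands the request to the engine; it does not wait.
    pending = engine.submit(create_in_cloud, "web-bucket")
    events.append("registration returned")
    # The program is free to keep running here while the engine works.
    events.append(pending.result())  # block only when the result is needed

print(events)
```

The registration call returns immediately, and the creation result only becomes available later, once the background work has finished.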

In this scenario, because the last deployed state contains no resources, the engine determines that it must create the web-bucket resource. It asks the AWS resource plugin to create the resource, and the plugin in turn uses the AWS SDK to do so. Note that the engine does not communicate with AWS directly; instead, it asks the AWS resource plugin to create a Bucket. As a result, when new resource types become available, you can upgrade a resource provider to gain access to them without having to update the CLI. When the operation to create this bucket completes, the engine records information about the newly created resource in its state file.

While the engine was creating the web-bucket bucket, the language host continued to run the Pulumi program. This resulted in another resource registration (for app-bucket). Because there is no dependency between these two buckets, the engine can process this request in parallel with the creation of web-bucket.
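One way to picture why independent resources can be handled in parallel is to group them by dependency order. The sketch below uses Python's standard graphlib module; the parallel_batches helper and the bucket-policy resource are hypothetical and only serve to contrast the independent case with a dependent one. It is not a description of Pulumi's scheduler.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies):
    """Group resources into batches whose members could run concurrently.

    `dependencies` maps each resource name to the set of names it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Two unrelated buckets land in the same batch.
independent = parallel_batches({"web-bucket": set(), "app-bucket": set()})

# A hypothetical policy that depends on web-bucket must wait for it.
dependent = parallel_batches({"web-bucket": set(), "bucket-policy": {"web-bucket"}})

print(independent)
print(dependent)
```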

After both operations have completed, the language host exits, indicating that the program has finished executing. The engine and resource providers then shut down. The state of mystack now looks like this:

stack mystack
   - aws.s3.Bucket "web-bucket719a9"
   - aws.s3.Bucket "app-bucket394cf"

Note the extra suffixes appended to these bucket names. Pulumi uses auto-naming by default, which prevents resource name collisions when you deploy multiple copies of your infrastructure. This behavior can be turned off if you prefer.
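The idea behind auto-naming can be shown with a few lines of code. This is an illustrative sketch only: the auto_name helper, the suffix length, and the exact format are assumptions chosen to resemble the names shown above, not Pulumi's actual naming algorithm.

```python
import secrets


def auto_name(logical_name, random_hex_chars=5):
    """Append a random hex suffix to a logical name (illustrative format only)."""
    suffix = secrets.token_hex(4)[:random_hex_chars]
    return f"{logical_name}{suffix}"


# Two deployments of the same program get different physical names,
# so they can coexist without colliding (barring an unlikely random clash).
dev_name = auto_name("web-bucket")
staging_name = auto_name("web-bucket")
print(dev_name, staging_name)
```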

Let’s now modify one of the resources and run pulumi up once more. Because Pulumi is based on a desired state model, it uses the last deployed state to compute the minimal set of changes required to update your deployed infrastructure. Suppose we want to make the web-bucket S3 bucket publicly readable. We modify our program to reflect this new desired state:

resources:
  webBucket:
    type: aws:s3:Bucket
    properties:
      acl: public-read # add acl
  appBucket:
    type: aws:s3:Bucket

Run pulumi preview or pulumi up to start the whole process again. The language host starts running your program, and the aws:s3:Bucket declaration for web-bucket causes a new resource registration request to be sent to the engine. This time, however, our state already contains a resource named web-bucket, so the engine asks the resource provider to compare the existing state from our last pulumi up with the desired state specified by the program. The provider detects that the acl property has changed from its default value of private to public-read. By consulting the resource provider again, the engine confirms that this property can be updated without creating a new bucket. The engine therefore instructs the provider to change the acl property to public-read. Once this operation completes, the current state is updated to reflect the change.
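The comparison and the update-or-replace decision can be sketched as follows. This is a toy model under stated assumptions: the diff and decide helpers and the REPLACE_ON_CHANGE set are hypothetical, and in practice the resource provider, not a hard-coded table, determines which property changes require a replacement.

```python
# Hypothetical: properties that cannot be changed in place for this toy resource.
REPLACE_ON_CHANGE = {"bucket"}


def diff(old, new):
    """Return the names of properties whose values differ between two states."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


def decide(old, new):
    """Classify a registration as 'same', 'update' (in place), or 'replace'."""
    changed = diff(old, new)
    if not changed:
        return "same", changed
    if REPLACE_ON_CHANGE & set(changed):
        return "replace", changed
    return "update", changed


old_state = {"bucket": "web-bucket719a9", "acl": "private"}

acl_change = decide(old_state, {"bucket": "web-bucket719a9", "acl": "public-read"})
name_change = decide(old_state, {"bucket": "some-other-name", "acl": "private"})
no_change = decide(old_state, dict(old_state))

print(acl_change, name_change, no_change)
```

Here the acl change is classified as an in-place update, matching the walkthrough above, while a change to the assumed replace-only property would force a new bucket.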

The engine also receives a resource registration request for app-bucket. Because there is no difference between its desired state and its current state, the engine does not need to change this resource.

Now suppose that app-bucket is renamed to data-bucket.

resources:
  webBucket:
    type: aws:s3:Bucket
    properties:
      acl: public-read
  dataBucket:
    type: aws:s3:Bucket

This time, web-bucket’s desired state matches its current state, so the engine does not need to modify it. However, when the engine processes the resource registration for data-bucket, it finds no existing resource named data-bucket in the current state, so it must create a new S3 bucket. After that operation completes and the language host has shut down, the engine looks for any resources in the current state that did not receive a resource registration. In this case, because we removed the registration of app-bucket from our program, the engine calls the resource provider to delete the existing app-bucket bucket.

Infrastructure As Code Using Pulumi

Infrastructure as code (IaC) is a method for automating the provisioning and administration of infrastructure. Infrastructure as code is fundamentally about applying software engineering principles, techniques, and tools to cloud infrastructure.

Prior to infrastructure as code, infrastructure was (and still is!) provisioned in a variety of ways, including pointing and clicking in a user interface (UI), running commands via a command-line interface (CLI), running batch scripts, and using configuration management tools that were not designed for cloud infrastructure. Each of these approaches has limitations: interactive methods such as a UI or a CLI frequently cause problems with repeatability and consistency, whereas batch scripts or configuration management systems may be unable to manage infrastructure declaratively. Modern approaches use platforms like Pulumi to support the entire software engineering lifecycle.