Getting Started with Argo Workflows

December 27, 2022

Argo Workflows is a K8s native workflow engine that allows you to run all kinds of workflows in Kubernetes by leveraging native resources, such as K8s Pods, to execute the individual steps of the workflow. Workflows can be kicked off in all kinds of ways, which – along with the enormous range of customizations possible – is what makes Argo Workflows such a versatile tool for running cloud-native workflows.

Use cases for Argo Workflows range from MLOps to the CI processes associated with GitOps (which you would likely run with the related tool Argo CD) to any data-intensive processing you need to carry out in advance of an application’s deployment, during runtime, or for cleanup after a given resource has been torn down.

How does Argo Workflows work?

Argo Workflows works through a K8s operator, a Kubernetes-native application that is deployed in your K8s cluster. This application extends the native behavior of your cluster by watching the Kubernetes API server (which persists its state in Etcd, the central datastore associated with a cluster) for Argo-specific objects. These objects are custom resources, whose types are registered with the cluster through Custom Resource Definitions, and they define the workflow process to be carried out.

When the operator detects a new workflow definition, it creates the required K8s objects and executes the workflow as it has been defined. For example, it may run all of the steps in parallel or, if the workflow definition requires it, complete certain steps first because other steps depend on their results.

The order in which steps run can be controlled in several ways. A step can declare dependencies on other steps in the workflow, in which case all of its dependencies are completed before it starts. Steps can also be made to run sequentially or, if they are independent of each other, in parallel to speed up the execution of the larger workflow.
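As a rough sketch of dependency-based ordering, the manifest below declares four tasks in a DAG template. The workflow name, task names, image, and messages are hypothetical and chosen only for illustration. The manifest format itself is covered in the next section, so the part to focus on here is the dependencies field on each task.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-sketch-        # hypothetical name prefix
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: fetch                # no dependencies, so it runs first
        template: echo
        arguments:
          parameters:
          - name: message
            value: "fetch"
      - name: resize               # waits for fetch
        dependencies: [fetch]
        template: echo
        arguments:
          parameters:
          - name: message
            value: "resize"
      - name: caption              # also waits for fetch; can run in parallel with resize
        dependencies: [fetch]
        template: echo
        arguments:
          parameters:
          - name: message
            value: "caption"
      - name: publish              # waits for both resize and caption
        dependencies: [resize, caption]
        template: echo
        arguments:
          parameters:
          - name: message
            value: "publish"
  - name: echo                     # reusable template that each task invokes
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:3.18           # assumed image; any image with echo would do
      command: [echo]
      args: ["{{inputs.parameters.message}}"]
```

With this layout, resize and caption are independent of each other and only share a dependency on fetch, so the controller is free to schedule them at the same time.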

Custom Resource Definitions (CRDs)

Custom Resource Definitions register new resource types with the cluster, and the custom resources built on them are defined in YAML manifests much like native K8s resources. They are different in that they extend the capabilities of K8s. Below, we have an example of a custom resource of the kind Workflow. Native K8s objects such as Pods, ReplicaSets, Deployments, and others are defined with this same kind field, but in this case, we have a custom resource that is specific to Argo Workflows called, appropriately enough, a workflow.

Once the Argo operator has been installed in your cluster, which we’ll look at below, this manifest can be submitted to your cluster much as a native K8s object’s manifest would be. Because the manifest uses generateName rather than a fixed name, create it with kubectl create (kubectl apply rejects manifests that rely on generateName):

kubectl create -f <workflow-definition-filename.yaml>

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]

Each step included in the workflow is containerized, allowing Argo to run each step in a K8s Pod. More complex Pod definitions are also possible. A Pod can include sidecar containers that run alongside the main containerized step, or multiple containers can run within a single Pod to simplify passing output from one step to the next, as is often required when a given step depends on another.
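As a minimal sketch of the sidecar case, the workflow below pairs a main container with an nginx sidecar in the same Pod. The workflow and template names, the images, and the polling loop are illustrative assumptions rather than part of the example used in this post.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: sidecar-sketch-    # hypothetical name prefix
spec:
  entrypoint: fetch-from-sidecar
  templates:
  - name: fetch-from-sidecar
    # Main container: polls the sidecar until it answers, then prints the page.
    container:
      image: alpine:3.18           # assumed image; its busybox wget is enough here
      command: [sh, -c]
      args: ["until wget -qO- http://127.0.0.1/; do sleep 1; done"]
    # Sidecar container: runs alongside the main container in the same Pod.
    sidecars:
    - name: nginx
      image: nginx:1.25            # assumed image tag
```

Because both containers share the Pod's network namespace, the main container can reach the sidecar on 127.0.0.1. The step counts as finished when the main container exits, at which point Argo stops the sidecar.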

The workflow definition above contains a single step, which is defined by a template that runs a simple containerized process called “whalesay” provided by Docker.

Above the templates field is an entrypoint field defining the first template to be run within the workflow. In this case, there is only one template to be run, but templates can also be combined into more complex workflows.
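To illustrate how several templates can be combined, here is a hedged sketch that wraps the whalesay container in a steps template. The workflow name, step names, and messages are hypothetical; the whalesay image and command are the same ones used in the single-step example.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-sketch-      # hypothetical name prefix
spec:
  entrypoint: main                 # the workflow starts at the "main" template
  templates:
  - name: main
    steps:
    # Each outer list item is a group; groups run one after another.
    - - name: first
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "step one"
    # Steps inside the same group run in parallel.
    - - name: second-a
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "step two, branch a"
      - name: second-b
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "step two, branch b"
  - name: whalesay                 # same container as above, now parameterized
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
```

Here the entrypoint points at main rather than at whalesay directly, so whalesay becomes a reusable building block that main invokes three times with different messages.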

Argo Workflows Example

Now let’s run the above example workflow locally in minikube. To do so, we’ll first have to complete the following prerequisites:

Install Helm:

brew install helm

Install minikube:

brew install minikube

Start the minikube cluster:

minikube start

Install the Argo CLI:

brew install argo

First, we'll create a namespace for the Argo Workflows controller:

kubectl create namespace argo

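Next, we'll add the Argo Helm chart repository, so that Helm can resolve the argo/argo-workflows chart, and then install with Helm, like so:

helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
helm install my-argo-workflows -n argo argo/argo-workflows --version 0.22.4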

Next, we can review the created resources via the minikube dashboard by running the following:

minikube dashboard

Notice that the resources that make up the Argo operator have been created in the namespace “argo.”

Run an example workflow

With the operator installed, we can move on to the example workflow included above. The same manifest is published in the Argo Workflows repository, so we can submit it directly:

argo submit https://raw.githubusercontent.com/argoproj/argo-workflows/master/examples/hello-world.yaml --watch

Because we included the --watch flag, the terminal shows the workflow's status until it completes, ending with output like the following:

Name:                hello-world-rfkcz
Namespace:           default
ServiceAccount:      unset (will run with the default ServiceAccount)
Status:              Succeeded
Conditions:
 PodRunning          False
 Completed           True
Created:             Tue Dec 27 13:31:33 -0800 (11 seconds ago)
Started:             Tue Dec 27 13:31:33 -0800 (11 seconds ago)
Finished:            Tue Dec 27 13:31:43 -0800 (1 second ago)
Duration:            10 seconds
Progress:            1/1
ResourcesDuration:   4s*(1 cpu),4s*(100Mi memory)

STEP                  TEMPLATE  PODNAME                               DURATION  MESSAGE
 ✔ hello-world-rfkcz  whalesay  hello-world-rfkcz-whalesay-102532164  6s

Finally, let’s go back to the minikube dashboard and see what resources were created.

Notice that, because we didn’t specify a namespace, the workflow ran in the default namespace. By default, Argo can carry out workflows in any namespace, not just the argo namespace to which it has been deployed.

And we can see that the workflow (as described above) has run in a K8s Pod that was created by Argo Workflows specifically for this workflow.

Conclusion

Argo Workflows is an extremely flexible workflow engine that runs directly in Kubernetes. It works by watching the Kubernetes API server, which is backed by Etcd, the central K8s datastore, for custom resources of the kind Workflow, which are created and deployed much like native Kubernetes objects. Above, we walked through the process of installing the Argo operator and running a sample workflow in minikube.

Join the discussion!

Have any questions or comments about this post? Maybe you have a similar project or an extension to this one that you'd like to showcase? Join the Velocity Discord server to ask away, or just stop by to talk K8s development with the community.
