Notes on my learning path around Kubernetes
- Learning Kubernetes
- Learning Platforms
- Blog etc.
- Container Basics
- Kubernetes Basics
- Key Kubernetes Terminology
- Cluster Components
- S3 Storage
- Kubernetes Application Backup
- Kasten K10
- Kasten K10 Actions
- Kanister-Enabled Applications
- Use Case Testing
- Basic K10 Setup process
- Application State and Configuration Data
- Application Security for Cloud-Native Environments
- Kasten K10
- Kubernetes Blog
- CKA-Study Guide by David-VTUK
- Intro to Kubernetes by @MichaelCade1
- 90DaysOfDevOps by @MichaelCade1
Key Container Terminology
The docker build command builds Docker images from a Dockerfile.
Compose is a tool for defining and running multi-container Docker applications.
Key Kubernetes Terminology
Helm is the package manager for Kubernetes that allows you to package, share and manage the lifecycle of your Kubernetes containerized applications.
Helm uses Charts to pack all the required K8S components for an application to deploy, run and scale. A chart is a collection of files that describe a related set of Kubernetes resources.
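As a sketch of what chart metadata looks like, a chart's top-level Chart.yaml might read as follows (the chart name and versions are hypothetical; apiVersion v2 targets Helm 3):

```yaml
# Chart.yaml -- minimal chart metadata
apiVersion: v2
name: myapp                  # hypothetical chart name
description: A sample web application chart
type: application
version: 0.1.0               # version of the chart itself
appVersion: "1.0.0"          # version of the packaged application
```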
Helm Templates is a subdirectory in a chart that contains the K8S component definitions for the application, e.g. Service, ReplicaSet, Deployment, Ingress etc.
Helm Values are defined in the values.yaml file, which allows users to configure and deploy their containerized applications dynamically.
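For example, a hypothetical values.yaml could expose a replica count and image settings that users override at install time (e.g. helm install myapp ./myapp --set replicaCount=3):

```yaml
# values.yaml -- hypothetical user-tunable defaults
replicaCount: 2
image:
  repository: nginx
  tag: "1.25"
```

Templates in the chart then reference these values as `{{ .Values.replicaCount }}` and `{{ .Values.image.repository }}`.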
Stateless applications: short-term apps that do not retain data about a transaction (e.g. print services, microservices).
Stateful applications: applications that typically use a database (e.g. MySQL) and process reads/writes, thus retaining information about each transaction involved.
A Pod is the basic building block of Kubernetes: the smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents a running process on your cluster.
A Pod encapsulates an application container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.
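A minimal Pod manifest illustrating the above (the name and image are just examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod            # example name
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.25      # a single application container
      ports:
        - containerPort: 80
```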
A ReplicaSet controller ensures that a specified number of Pod replicas are running at any given time.
While ReplicaSets can be used independently, today it’s mainly used by Deployments as a mechanism to orchestrate Pod creation, deletion and updates. When you use Deployments you don’t have to worry about managing the ReplicaSets that they create. Deployments own and manage their ReplicaSets.
A Deployment controller provides declarative updates for Pods and ReplicaSets.
You describe a desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
Deployment manages the ReplicaSet to orchestrate Pod lifecycles. This includes Pod creation, upgrade and deletion, and scaling.
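A sketch of a Deployment that declares the desired state of three replicas (names and image are illustrative); the Deployment creates and manages the ReplicaSet for you:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3                # desired state: three Pod replicas
  selector:
    matchLabels:
      app: nginx
  template:                  # Pod template used by the underlying ReplicaSet
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
```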
A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them – sometimes called a micro-service. The set of Pods targeted by a Service is (usually) determined by a Label Selector.
As an example, consider an image-processing backend which is running with 3 replicas. Those replicas are fungible – frontends do not care which backend they use. While the actual Pods that compose the backend set may change, the frontend clients should not need to be aware of that or keep track of the list of backends themselves. The Service abstraction enables this decoupling.
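The image-processing example above could be sketched as a Service like this (names and ports are hypothetical); the Label Selector picks out whichever backend Pods currently carry the label:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: image-backend        # hypothetical name for the backend Service
spec:
  selector:
    app: image-backend       # Label Selector: targets any Pod with this label
  ports:
    - port: 80               # port the Service exposes to frontends
      targetPort: 8080       # port the backend containers listen on
```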
Namespaces are intended for use in environments with many users spread across multiple teams, or projects.
Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces.
Namespaces are a way to divide cluster resources between multiple users (via resource quota).
It is not necessary to use multiple namespaces just to separate slightly different resources, such as different versions of the same software: use labels to distinguish resources within the same namespace.
On-disk files in a Container are ephemeral, which presents some problems for non-trivial applications when running in Containers. First, when a Container crashes, kubelet will restart it, but the files will be lost – the Container starts with a clean state. Second, when running Containers together in a Pod it is often necessary to share files between those Containers. The Kubernetes Volume abstraction solves both of these problems.
At its core, a Volume is just a directory, possibly with some data in it, which is accessible to the Containers in a Pod. How that directory comes to be, the medium that backs it, and the contents of it are determined by the particular volume type used.
A Kubernetes Volume has an explicit lifetime – the same as the Pod that encloses it. Consequently, a volume outlives any Containers that run within the Pod, and data is preserved across Container restarts. When a Pod ceases to exist, the volume will cease to exist too. Kubernetes supports many types of Volumes, and a Pod can use any number of them simultaneously.
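Both properties, data surviving container restarts within the Pod and file sharing between a Pod's containers, can be sketched with an emptyDir volume (images, commands and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  volumes:
    - name: shared-data
      emptyDir: {}           # lives as long as the Pod, not the containers
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "echo hello > /data/msg && sleep 3600"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]   # can read /data/msg written by the other container
      volumeMounts:
        - name: shared-data
          mountPath: /data
```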
Kubernetes users can define StorageClasses and assign PVs to them. Each StorageClass represents a type of storage and uses a provisioner that determines which volume plugin is used for provisioning PVs.
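For example, the local-path class referenced in the PV/PVC examples below could be defined roughly like this (the exact provisioner and binding mode depend on what is actually installed in your cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
provisioner: rancher.io/local-path        # the provisioner decides which volume plugin creates PVs
volumeBindingMode: WaitForFirstConsumer   # delay binding until a Pod is scheduled
reclaimPolicy: Delete
```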
Persistent Volumes (PV)
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV.
# cat pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: myvolume
spec:
  storageClassName: local-path
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
    - ReadWriteMany
  hostPath:
    path: /etc/foo
kubectl apply -f pv.yaml
kubectl get pv
Persistent Volume Claims
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory); Claims can request a specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany).
# cat pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mypvc
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
kubectl apply -f pvc.yaml
kubectl get pvc
A Job creates one or more Pods and ensures that a specified number of them successfully terminate.
As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the Job itself is complete. Deleting a Job will clean up the Pods it created.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).
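The simple one-Pod case can be sketched like this, adapted from the upstream docs example that computes pi to 2000 digits:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  backoffLimit: 4            # retry a failed Pod up to 4 times
  template:
    spec:
      restartPolicy: Never   # let the Job controller, not kubelet, handle retries
      containers:
        - name: pi
          image: perl:5.34
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
```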
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod.
As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.
Some typical uses of a DaemonSet are:
- running a cluster storage daemon, such as glusterd and ceph on each node.
- running a logs collection daemon on every node, such as fluentd or logstash.
- running a node monitoring daemon on every node, such as Prometheus Node Exporter, collectd, Datadog agent, New Relic agent, or Ganglia gmond.
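The log-collection use case above could be sketched as a DaemonSet like this (the image tag is illustrative and should be checked against the registry):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:                  # one copy of this Pod runs on every (eligible) node
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd:v1.16-1
```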
A StatefulSet manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
StatefulSets are valuable for applications that require one or more of the following.
- Stable, unique network identifiers.
- Stable, persistent storage.
- Ordered, graceful deployment and scaling.
- Ordered, graceful deletion and termination.
- Ordered, automated rolling updates.
Use cases for StatefulSets are:
- Deploying a clustered resource (e.g. Cassandra, Elasticsearch)
- Applications that somehow depend on each other
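A minimal StatefulSet sketch showing stable identity and per-Pod persistent storage (names and image are examples; it assumes a headless Service named web exists to provide the stable network identifiers):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web           # headless Service giving each Pod a stable DNS name (web-0, web-1, ...)
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:      # each replica gets its own PVC, kept across rescheduling
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```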
A worker machine that runs Kubernetes workloads. It can be a physical (bare metal) machine or a virtual machine (VM). Each node can host one or more pods. Kubernetes nodes are managed by a control plane.
Each individual non-Control Plane node in your cluster runs two processes:
- kubelet, which communicates with the Kubernetes Control Plane.
- kube-proxy, a network proxy which reflects Kubernetes networking services on each node.
The Kubernetes Control Plane is a collection of three processes that run on a single node in your cluster, which is designated as the Control Plane node. Those processes are: kube-apiserver, kube-controller-manager and kube-scheduler.
Control Plane Components
The Kubernetes API server validates and configures data for the API objects which include pods, services, replication controllers, and others. The API Server services REST operations and provides the frontend to the cluster’s shared state through which all other components interact.
The Kubernetes scheduler is a control plane process that assigns Pods to Nodes. The scheduler determines which Nodes are valid placements for each Pod in the scheduling queue according to constraints and available resources. The scheduler then ranks each valid Node and binds the Pod to a suitable Node.
The Kubernetes controller manager is a daemon that embeds the core control loops shipped with Kubernetes. In applications of robotics and automation, a control loop is a non-terminating loop that regulates the state of the system. In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the API server and makes changes attempting to move the current state towards the desired state.
Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.
The kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.
The container runtime is the software that is responsible for running containers.
MinIO is a High Performance Object Storage released under Apache License v2.0. It is API compatible with Amazon S3 cloud storage service.
More details: https://min.io/
kubectl create namespace minio
helm repo add minio https://helm.min.io
helm install minio minio/minio --namespace=minio --version=8.0.0 \
  --set accessKey="AKIAIOSFODNN7EXAMPLE",secretKey="1234/asdf",defaultBucket.enabled=true,defaultBucket.name=backup \
  --wait --timeout 5m
Kubernetes Application Backup
A data management solution for Kubernetes needs to understand this cloud-native architectural pattern, be able to work with a lack of IP address stability, and deal with continuous change.
Purpose-built for Kubernetes, Kasten K10 provides enterprise operations teams an easy-to-use, scalable, and secure system for backup/restore, disaster recovery, and mobility of Kubernetes applications.
For stateful, cloud-native applications, data operations must often be performed by tools with a semantic understanding of the data. The volume-level primitives provided by orchestrators are not sufficient to support data workflows like backup/recovery of complex, distributed databases.
K10 supports a number of different community databases that includes MySQL, PostgreSQL, MongoDB, and Cassandra.
Kasten’s K10 platform uses the Kubernetes custom resources provided by Kanister for data management. This enables domain experts to capture application specific data management tasks in blueprints which can be easily shared and extended. The framework takes care of the tedious details around execution on Kubernetes and presents a homogeneous operational experience across applications at scale. Further, it gives the user a natural mechanism to extend the K10 platform by adding personal code to modify any desired step performed for data lifecycle management.
Kasten K10 Actions
An Action API resource is used to initiate Kasten K10 data management operations. The actions can either be associated with a Policy or be stand-alone on-demand actions. Actions also allow for tracking the execution status of the requested operations.
The Kasten K10 Platform exposes a number of different action types.
See also: https://docs.kasten.io/latest/api/actions.html
Backup actions are used to initiate backup operations on applications. A backup action can be submitted as part of a policy or as a standalone action.
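As a hedged sketch, a standalone backup action is submitted as a custom resource roughly like the one below; the API group and field names should be verified against the linked Kasten docs for your K10 version, and the application name/namespace are hypothetical:

```yaml
apiVersion: actions.kio.kasten.io/v1alpha1   # verify against docs.kasten.io
kind: BackupAction
metadata:
  generateName: backup-myapp-   # K10 generates a unique action name
  namespace: myapp              # hypothetical application namespace
spec:
  subject:
    name: myapp                 # the application to back up
    namespace: myapp
```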
Restore actions are used to restore applications to a known-good state from a restore point.
Export actions are used to initiate an export of an application to external data storage, such as S3-compatible object stores, using an existing restore point.
Backup cluster actions are used to initiate backup operations on cluster-scoped resources. A backup cluster action can be submitted as part of a policy or as a standalone action.
Restore cluster actions are used to restore cluster-scoped resources from a ClusterRestorePoint. A restore cluster action can be submitted as part of a policy or as a standalone action.
RunActions are used for manual execution and monitoring of actions related to policy runs.
CancelActions are created to halt progress of another action and prevent any remaining retries. Cancellation is best effort and not every phase of an Action may be cancellable. When an action is cancelled, its state becomes Cancelled.
A ReportAction resource is created to generate a K10 Report and provide insights into system performance and status. A successful ReportAction produces a K10 Report that contains information gathered at the time of the ReportAction.
Policies are used to automate your data management workflows. A Policy custom resource (CR) is used to perform operations on K10 Policies.
K10 Policies allow you to manage application protection and migration at scale.
See also: https://docs.kasten.io/latest/api/policies.html
A Profile custom resource (CR) is used to perform operations on K10 Profiles.
Location profiles are used to create backups from snapshots, move applications and their data across clusters and potentially across different clouds, and to subsequently import these backups or exports into another cluster.
K10 defaults to volume snapshot operations when capturing data, but there are situations where customization is required. For example, the best way to protect your application’s data may be to take a logical dump of the database. This requires using tools specific to that database.
Use Case Testing
Once you have installed your application, you will be able to use the K10 Dashboard to bring the application in compliance and protect it by creating one or more policies. You can also subsequently restore the application and its data to a previous version.
Basic K10 Setup process
Step 1: Add the Kasten K10 Helm repository
helm repo add kasten https://charts.kasten.io/
Step 2: Install K10
helm install k10 kasten/k10 --namespace=kasten-io --create-namespace
watch -n 2 "kubectl -n kasten-io get pods"
Step 3: Configure the Local Storage System
kubectl annotate volumesnapshotclass csi-hostpath-snapclass k10.kasten.io/is-snapshot-class=true
Step 4: Expose the K10 dashboard
# cat k10-nodeport-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: gateway-nodeport
  namespace: kasten-io
spec:
  selector:
    service: gateway
  ports:
    - name: http
      port: 8000
      nodePort: 32000
  type: NodePort
kubectl apply -f k10-nodeport-svc.yaml
Step 5: Backup Policy Creation
Application State and Configuration Data
Each application must include the state that spans across storage volumes and databases (NoSQL/relational), as well as configuration data included in Kubernetes objects such as configmaps and secrets.
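Configuration data of this kind can be sketched with a ConfigMap for non-sensitive settings and a Secret for credentials (all names and values below are hypothetical examples):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_HOST: mysql.default.svc.cluster.local   # non-sensitive configuration
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:
  DB_PASSWORD: changeme                      # example only; never commit real credentials
```

A backup that captures only the volumes would miss these objects, which is why application-level backup tools include them.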
Application Security for Cloud-Native Environments
Cloud-native environments have shifted the importance and function of application security. End-to-end encryption and customer-owned management capabilities are paramount and should include integrated authentication and role-based access control (RBAC).
Lastly, application security must allow for a quick recovery from ransomware attacks.
Weave Net creates a virtual network that connects Docker containers across multiple hosts and enables their automatic discovery. With Weave Net, portable microservices-based applications consisting of multiple containers can run anywhere: on one host, multiple hosts, or even across cloud providers and data centers. Applications use the network just as if the containers were all plugged into the same network switch, without having to configure port mappings, ambassadors, or links.
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
Antrea is a Kubernetes networking solution intended to be Kubernetes native. It operates at Layer3/4 to provide networking and security services for a Kubernetes cluster, leveraging Open vSwitch as the networking data plane.
Basic Tools on Windows
All prerequisites on Windows are available as Chocolatey packages.
choco install kubernetes-helm
choco install kubernetes-cli
MiniKube Setup on Windows:
choco install minikube
MiniKube Start on Windows with VMware Workstation:
$Env:Path += ";C:\Program Files (x86)\VMware\VMware Workstation"
minikube start --driver vmware --addons volumesnapshots,csi-hostpath-driver
Kubestr is a collection of tools to discover, validate and evaluate your Kubernetes storage options.
kubectl get storageclass
kubectl get VolumeSnapshotClass
Run an FIO test:
./kubestr fio -s <storage class>
Check a CSI drivers snapshot and restore capabilities:
./kubestr csicheck -s <storage class> -v <volume snapshot class>
Extended FIO run:
kubestr fio -f ssd-test.fio -s local-path
# cat ssd-test.fio
[global]
bs=4k
ioengine=libaio
iodepth=1
size=1g
direct=1
runtime=10
directory=/
filename=ssd.test.file

[seq-read]
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[seq-write]
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.