Reference: Microsoft documentation
In Kubernetes, the Control Plane is responsible for managing the state of the cluster, including scheduling, scaling, and deploying applications. Control Plane logs are a set of logs that provide insight into the operation of the Kubernetes Control Plane components. These logs can be used to diagnose issues and monitor the health of the Control Plane.
The Control Plane logs are generated by various Kubernetes components such as API server, audit, authenticator, controller manager, and scheduler. Each log type corresponds to a component of the Kubernetes Control Plane. For example, API server logs contain information about the API server component that exposes the Kubernetes API. Audit logs provide a record of the individual users, administrators, or system components that have affected your cluster.
As a managed service, AKS operates the control plane for customers. Customers can configure Control Plane Logging through Diagnostic settings to collect these logs.
Categories of Control Plane Logs in AKS
As of this writing, the official Microsoft documentation lists around 10 different log categories. A screenshot of the categories from that documentation is below.
Configure Control Plane Logs in AKS
Here are the steps to configure control plane logs in Azure Kubernetes Service (AKS):
Open the Azure portal and navigate to your AKS cluster.
Click on the Monitoring tab on the Overview page.
Click on Diagnostic settings.
Click on Add diagnostic setting.
In the Name field, enter a name for your diagnostic setting.
Under Destination details, select Log Analytics as the destination type. This is by far the most common option and is also used by Container Insights. You can also send the logs to a third-party tool (such as Splunk) using the Send to Partner solution option.
Choose an existing Log Analytics workspace or create a new one.
Under Logs, select the control plane logs you want to collect.
NOTE: Under the destination table for the Log Analytics workspace, there are two options: Azure Diagnostics and Resource Specific. Later in the article, we will discuss the implications of choosing one over the other.
Click on Save to save your diagnostic setting.
This can also be configured using the Azure CLI, as shown below. More details are available in the documentation.
az monitor diagnostic-settings create --name AKS-Diagnostics \
  --resource /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/myresourcegroup/providers/Microsoft.ContainerService/managedClusters/my-cluster \
  --logs '[{"category": "kube-audit", "enabled": true}, {"category": "kube-audit-admin", "enabled": true}, {"category": "kube-apiserver", "enabled": true}, {"category": "kube-controller-manager", "enabled": true}, {"category": "kube-scheduler", "enabled": true}, {"category": "cluster-autoscaler", "enabled": true}, {"category": "cloud-controller-manager", "enabled": true}, {"category": "guard", "enabled": true}, {"category": "csi-azuredisk-controller", "enabled": true}, {"category": "csi-azurefile-controller", "enabled": true}, {"category": "csi-snapshot-controller", "enabled": true}]' \
  --workspace /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourcegroups/myresourcegroup/providers/microsoft.operationalinsights/workspaces/myworkspace
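The long --logs argument in the command above is easy to mistype by hand. As a minimal sketch, the JSON array can be generated from a plain list of category names. The variable names (categories, logs_json, AKS_RESOURCE_ID, WORKSPACE_RESOURCE_ID) are made up for this example, and the final az call is left commented out so the script only prints the JSON.

```shell
#!/bin/sh
# Category names copied from the az command above; trim the list to collect fewer logs.
categories="kube-audit kube-audit-admin kube-apiserver kube-controller-manager kube-scheduler cluster-autoscaler cloud-controller-manager guard csi-azuredisk-controller csi-azurefile-controller csi-snapshot-controller"

# Build the JSON array expected by --logs, one object per category.
logs_json=""
for category in $categories; do
  logs_json="${logs_json}{\"category\": \"${category}\", \"enabled\": true}, "
done
# Drop the trailing ", " and wrap the objects in brackets.
logs_json="[${logs_json%, }]"

printf '%s\n' "$logs_json"

# To apply it, set the two (hypothetical) variables to your own resource IDs and uncomment:
# az monitor diagnostic-settings create --name AKS-Diagnostics \
#   --resource "$AKS_RESOURCE_ID" \
#   --workspace "$WORKSPACE_RESOURCE_ID" \
#   --logs "$logs_json"
```

Keeping the categories in one list makes it simpler to drop a noisy category such as kube-audit later without editing the JSON by hand.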
Querying Control Plane Logs in AKS
Depending on the destination table chosen when configuring the diagnostic setting, the Control Plane logs are stored either in the AzureDiagnostics table (Azure Diagnostics mode) or in the AKSControlPlane, AKSAudit, and AKSAuditAdmin tables (Resource Specific mode). The benefit of the Resource Specific mode is that it can be used with Basic logs, which provides significant cost benefits. Within a table, the logs are split based on the Category field.
To query Control Plane logs in Azure Kubernetes Service (AKS) on the portal, you can follow these steps:
Open the Azure portal and navigate to your AKS cluster.
Click on the Monitoring tab on the Overview page.
Click on Logs
💡 This works only if Container Insights is enabled on the AKS cluster. Otherwise, you can navigate to the Log Analytics workspace (configured earlier in Diagnostic settings) in the portal to query the same information.
Choose the Audit/Diagnostic option under All Queries. Alternatively, you can use Find in table to query individual tables. The former has some sample queries to get started.
In this case, I chose to write my own query for API server logs:
AzureDiagnostics
| where Category == 'kube-apiserver'
| take 100
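The query above targets the AzureDiagnostics table. If the Resource Specific destination was chosen instead, the same filter would presumably run against AKSControlPlane. The sketch below picks the table name based on a hypothetical AKS_LOG_TABLE_MODE variable (invented for this example, not an Azure setting) and prints the resulting query. It assumes AKSControlPlane exposes the same Category values, and the commented az monitor log-analytics query call assumes a WORKSPACE_GUID variable holding the workspace ID.

```shell
#!/bin/sh
# AKS_LOG_TABLE_MODE is a hypothetical variable for this sketch:
# "azure-diagnostics" (default) or "resource-specific".
mode="${AKS_LOG_TABLE_MODE:-azure-diagnostics}"

# Pick the table that matches the destination table mode of the diagnostic setting.
if [ "$mode" = "resource-specific" ]; then
  table="AKSControlPlane"
else
  table="AzureDiagnostics"
fi

# Same filter as the portal query above, applied to the selected table.
query="$table
| where Category == 'kube-apiserver'
| take 100"

printf '%s\n' "$query"

# To run it outside the portal (assumes WORKSPACE_GUID is set to the workspace ID):
# az monitor log-analytics query --workspace "$WORKSPACE_GUID" --analytics-query "$query"
```

The printed query can be pasted into the Logs blade as-is.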
Using Control Plane Logs for Diagnosis in AKS
Some common use cases for Control Plane logs:
Latency issues with the API server: In this case, metrics alone are not sufficient because they do not include the user agent, which is critical for the investigation.
Scheduling issues with the scheduler or cluster-autoscaler.
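For the API server latency case, a rough starting point is to see which user agents generate the most requests. The sketch below stores such a query in a shell variable and prints it. It assumes the Azure Diagnostics mode, that kube-audit is enabled, and that each audit event is stored as JSON in the log_s column with a userAgent field. The column aliases (event, requests) are arbitrary names chosen for this example.

```shell
#!/bin/sh
# Store the KQL in a variable; the quoted heredoc keeps the text literal.
query=$(cat <<'EOF'
AzureDiagnostics
| where Category == 'kube-audit'
| extend event = parse_json(log_s)
| summarize requests = count() by userAgent = tostring(event.userAgent)
| top 10 by requests
EOF
)

# Print the query so it can be pasted into the Logs blade.
printf '%s\n' "$query"
```

A single user agent dominating the list is often a useful first lead, although the request count alone does not prove that the client is the cause of the latency.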
Links for documentation