In a Kubernetes cluster, a kubelet service runs on each Node (formerly also called a Minion). This process handles the tasks that the Master Node assigns to the Node and manages the Pods and the containers inside them. Each kubelet registers its Node with the API Server, periodically reports the Node's resource usage to the Master Node, and monitors container and Node resources through cAdvisor.
This source code analysis is based on K8s 1.14.6
Kubelet architecture
To understand the Kubelet source code, we need to understand the main architecture and components of Kubelet.
Start entry
The main function entry of kubelet is in cmd/kubelet/kubelet.go. The startup code is very simple:
    func main() {
        rand.Seed(time.Now().UnixNano())

        // Build the kubelet command; NewKubeletCommand returns a *cobra.Command
        command := app.NewKubeletCommand(server.SetupSignalHandler())
        logs.InitLogs()
        defer logs.FlushLogs()

        if err := command.Execute(); err != nil {
            fmt.Fprintf(os.Stderr, "%v\n", err)
            os.Exit(1)
        }
    }
The app.NewKubeletCommand function parses the kubelet's parameters and builds the command that starts the kubelet. Running that command creates the kubelet object and the services the kubelet needs to maintain pods. The main steps are as follows:
- Parse the parameters: load the flags and validate them. There are two kinds of configuration:
  - kubeletFlags: parameters passed on the kubelet command line
  - kubeletConfig: obtained by parsing a configuration file
- Construct the cobra.Command object, which handles the command line entered by the user. Its structure is:
    {
        Use:  componentKubelet,
        Long: ...,
        // The Kubelet has special flag parsing requirements to enforce flag precedence rules,
        // so we do all our parsing manually in Run, below.
        // DisableFlagParsing=true provides the full set of flags passed to the kubelet in the
        // `args` arg to Run, without Cobra's interference.
        DisableFlagParsing: true,
        Run: func(cmd *cobra.Command, args []string) {... },
    }
The Run function executes the user's command. Its flow is the main flow of the kubelet: it creates the kubelet object and the services the kubelet depends on.
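The flag precedence idea mentioned in the Cobra comment above can be illustrated with the standard library alone. This is only a minimal sketch, not the kubelet's code: Config, parseWithPrecedence, and the two flags are hypothetical, and it assumes the rule is that a flag explicitly set on the command line overrides the value loaded from a configuration file.

```go
package main

import (
	"flag"
	"fmt"
)

// Config is a hypothetical stand-in for a kubelet-style configuration.
type Config struct {
	Port    int
	RootDir string
}

// parseWithPrecedence parses args manually and merges them with values that
// were (hypothetically) loaded from a config file. Only flags the user set
// explicitly override the file values.
func parseWithPrecedence(args []string, fromFile Config) (Config, error) {
	parsed := Config{Port: 10250, RootDir: "/var/lib/kubelet"} // flag defaults
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	fs.IntVar(&parsed.Port, "port", parsed.Port, "port to serve on")
	fs.StringVar(&parsed.RootDir, "root-dir", parsed.RootDir, "root directory")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}

	merged := fromFile
	// Visit walks only the flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		switch f.Name {
		case "port":
			merged.Port = parsed.Port
		case "root-dir":
			merged.RootDir = parsed.RootDir
		}
	})
	return merged, nil
}

func main() {
	file := Config{Port: 10255, RootDir: "/data/kubelet"}
	cfg, err := parseWithPrecedence([]string{"--port=10260"}, file)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg) // prints {Port:10260 RootDir:/data/kubelet}
}
```

Here the port comes from the command line because it was set explicitly, while the root directory keeps the file value because its flag was left untouched.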
Startup process
This part analyzes the Run function. First, we need to understand two important configuration structures: KubeletFlags and kubeletconfig.KubeletConfiguration.
The startup process is as follows:
- Create a watcher that checks whether the kubelet configuration has changed. If it has, the kubelet configuration is reloaded. This uses the controller pattern common in Kubernetes, that is, the Informer architecture, and watches a ConfigMap object
- Construct the KubeletServer object.
The KubeletServer object is composed of the kubelet's configurations. Its structure is as follows:
    type KubeletServer struct {
        KubeletFlags
        kubeletconfig.KubeletConfiguration
    }
The kubeDeps object (a kubelet.Dependencies) is then created from the server. This step does not start any background process; it only returns the dependencies needed to run, or an error. This is an important part. Its structure is:
    kubelet.Dependencies{
        Auth:                nil, // default does not enforce auth[nz]
        CAdvisorInterface:   nil, // cadvisor.New launches background processes (bg http.ListenAndServe, and some bg cleaners), not set here
        Cloud:               nil, // cloud provider might start background processes
        ContainerManager:    nil,
        DockerClientConfig:  dockerClientConfig,
        KubeClient:          nil,
        HeartbeatClient:     nil,
        EventClient:         nil,
        Mounter:             mounter,
        Subpather:           subpather,
        OOMAdjuster:         oom.NewOOMAdjuster(),
        OSInterface:         kubecontainer.RealOS{},
        VolumePlugins:       ProbeVolumePlugins(),
        DynamicPluginProber: GetDynamicPluginProber(s.VolumePluginDir, pluginRunner),
        TLSOptions:          tlsOptions}, nil
As we can see, the returned dependencies include the underlying Docker client configuration, the mounter, and so on. In effect, it is a toolkit that the server can use.
- Execute the Run function to start the kubelet. Note that this Run function is not the Run field of the Command above:
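The "toolkit" idea is plain dependency injection: collaborators are gathered in one struct, some are left nil on purpose, and later startup code fills them in. The sketch below illustrates the pattern only; Mounter, APIClient, fakeMounter, agent, and newDependencies are hypothetical names and are not the kubelet's real types.

```go
package main

import (
	"errors"
	"fmt"
)

// Mounter is a hypothetical, trimmed-down interface for a mount helper.
type Mounter interface {
	Mount(source, target string) error
}

// APIClient is a hypothetical client interface that is injected later.
type APIClient interface {
	Healthy() bool
}

// fakeMounter records mounts in memory instead of touching the system.
type fakeMounter struct {
	mounted map[string]string
}

func (m *fakeMounter) Mount(source, target string) error {
	if source == "" || target == "" {
		return errors.New("source and target must be non-empty")
	}
	m.mounted[target] = source
	return nil
}

// Dependencies bundles everything the agent needs from the outside world.
type Dependencies struct {
	Mounter Mounter
	Client  APIClient
}

// newDependencies starts nothing in the background; it only assembles values.
func newDependencies() (*Dependencies, error) {
	return &Dependencies{
		Mounter: &fakeMounter{mounted: map[string]string{}},
		Client:  nil, // deliberately unset; a later startup step would fill it in
	}, nil
}

// agent consumes the injected dependencies.
type agent struct {
	deps *Dependencies
}

func (a *agent) prepareVolume(source, target string) error {
	if a.deps.Mounter == nil {
		return errors.New("no mounter injected")
	}
	return a.deps.Mounter.Mount(source, target)
}

func main() {
	deps, err := newDependencies()
	if err != nil {
		fmt.Println("building dependencies failed:", err)
		return
	}
	a := &agent{deps: deps}
	fmt.Println("client set:", deps.Client != nil)
	fmt.Println("mount error:", a.prepareVolume("/dev/sdb1", "/mnt/data"))
}
```

Because the agent only sees interfaces, a test can swap in fakes, which is one plausible reason for collecting dependencies this way.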
    func Run(s *options.KubeletServer, kubeDeps *kubelet.Dependencies, stopCh <-chan struct{}) error {
        // To help debugging, immediately log version
        klog.Infof("Version: %+v", version.Get())
        // OS-specific initialization (for example, running as a Windows service)
        if err := initForOS(s.KubeletFlags.WindowsService); err != nil {
            return fmt.Errorf("failed OS init: %v", err)
        }
        // Start the kubelet here
        if err := run(s, kubeDeps, stopCh); err != nil {
            return fmt.Errorf("failed to run Kubelet: %v", err)
        }
        return nil
    }
The run function called here performs the following series of operations:
- Set the kubelet's feature gates through SetFromMap
- Validate the initialized server
- Register the /configz endpoint
- Obtain and configure the various clients, including:
  - kubeClient
  - Event client: configure the EventRecordQPS and EventBurst parameters, then call NewForConfig in k8s.io/client-go/kubernetes/typed/core/v1
  - Heartbeat client: configure QPS and timeout (if the NodeLease feature is enabled, the timeout is limited by NodeLeaseDurationSeconds)
- Build the authenticator's AuthInterface by calling BuildAuth:
    func BuildAuth(nodeName types.NodeName, client clientset.Interface, config kubeletconfig.KubeletConfiguration) (server.AuthInterface, error) {
        // Get clients, if provided
        var (
            tokenClient authenticationclient.TokenReviewInterface
            sarClient   authorizationclient.SubjectAccessReviewInterface
        )
        if client != nil && !reflect.ValueOf(client).IsNil() {
            tokenClient = client.AuthenticationV1beta1().TokenReviews()
            sarClient = client.AuthorizationV1beta1().SubjectAccessReviews()
        }

        authenticator, err := BuildAuthn(tokenClient, config.Authentication)
        if err != nil {
            return nil, err
        }

        attributes := server.NewNodeAuthorizerAttributesGetter(nodeName)

        authorizer, err := BuildAuthz(sarClient, config.Authorization)
        if err != nil {
            return nil, err
        }

        return server.NewKubeletAuth(authenticator, attributes, authorizer), nil
    }
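The guard `client != nil && !reflect.ValueOf(client).IsNil()` in BuildAuth may look redundant, but it protects against a well-known Go pitfall: an interface value that wraps a nil pointer is itself not nil. The following sketch reproduces the situation with hypothetical types (Client, realClient, isUsable); it is not kubelet code.

```go
package main

import (
	"fmt"
	"reflect"
)

// Client is a hypothetical interface standing in for a clientset.
type Client interface {
	Name() string
}

type realClient struct {
	name string
}

func (c *realClient) Name() string { return c.name }

// isUsable applies the same two-step check as BuildAuth. The reflect call is
// safe here because every implementation in this sketch is a pointer type;
// IsNil panics for kinds that cannot be nil, such as plain structs.
func isUsable(c Client) bool {
	return c != nil && !reflect.ValueOf(c).IsNil()
}

func main() {
	var typedNil *realClient      // a nil pointer
	var wrapped Client = typedNil // interface holding (type *realClient, value nil)

	fmt.Println(wrapped != nil)                      // true: the interface has a type
	fmt.Println(isUsable(wrapped))                   // false: the wrapped pointer is nil
	fmt.Println(isUsable(nil))                       // false: the interface itself is nil
	fmt.Println(isUsable(&realClient{name: "kube"})) // true
}
```

Without the reflect check, a caller passing a nil concrete pointer would slip past `client != nil`, and the later method calls would run on a nil receiver.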
- Initialize the cAdvisor interface, which is mainly used for monitoring. The interface includes:
    type Interface interface {
        Start() error
        DockerContainer(name string, req *cadvisorapi.ContainerInfoRequest) (cadvisorapi.ContainerInfo, error)
        ContainerInfo(name string, req *cadvisorapi.ContainerInfoRequest) (*cadvisorapi.ContainerInfo, error)
        ContainerInfoV2(name string, options cadvisorapiv2.RequestOptions) (map[string]cadvisorapiv2.ContainerInfo, error)
        SubcontainerInfo(name string, req *cadvisorapi.ContainerInfoRequest) (map[string]*cadvisorapi.ContainerInfo, error)
        MachineInfo() (*cadvisorapi.MachineInfo, error)

        VersionInfo() (*cadvisorapi.VersionInfo, error)

        // Returns usage information about the filesystem holding container images.
        ImagesFsInfo() (cadvisorapiv2.FsInfo, error)

        // Returns usage information about the root filesystem.
        RootFsInfo() (cadvisorapiv2.FsInfo, error)

        // Get events streamed through passedChannel that fit the request.
        WatchEvents(request *events.Request) (*events.EventChannel, error)

        // Get filesystem information for the filesystem that contains the given file.
        GetDirFsInfo(path string) (cadvisorapiv2.FsInfo, error)
    }
- Initialize the ContainerManager, which manages the containers running on the node. The related inputs are:
  - kubeReserved: resources on the node reserved for Kubernetes components, including cpu, memory, pid count, etc.
  - systemReserved: resources on the node reserved for the system, supporting cpu and memory
  - experimentalQOSReserved: the --qos-reserved parameter
  - devicePluginEnabled: the DevicePlugins feature gate
The final call is as follows:
    kubeDeps.ContainerManager, err = cm.NewContainerManager(
        kubeDeps.Mounter,
        kubeDeps.CAdvisorInterface,
        cm.NodeConfig{
            RuntimeCgroupsName:    s.RuntimeCgroups,
            SystemCgroupsName:     s.SystemCgroups,
            KubeletCgroupsName:    s.KubeletCgroups,
            ContainerRuntime:      s.ContainerRuntime,
            CgroupsPerQOS:         s.CgroupsPerQOS,
            CgroupRoot:            s.CgroupRoot,
            CgroupDriver:          s.CgroupDriver,
            KubeletRootDir:        s.RootDirectory,
            ProtectKernelDefaults: s.ProtectKernelDefaults,
            NodeAllocatableConfig: cm.NodeAllocatableConfig{
                KubeReservedCgroupName:   s.KubeReservedCgroup,
                SystemReservedCgroupName: s.SystemReservedCgroup,
                EnforceNodeAllocatable:   sets.NewString(s.EnforceNodeAllocatable...),
                KubeReserved:             kubeReserved,
                SystemReserved:           systemReserved,
                HardEvictionThresholds:   hardEvictionThresholds,
            },
            QOSReserved:                           *experimentalQOSReserved,
            ExperimentalCPUManagerPolicy:          s.CPUManagerPolicy,
            ExperimentalCPUManagerReconcilePeriod: s.CPUManagerReconcilePeriod.Duration,
            ExperimentalPodPidsLimit:              s.PodPidsLimit,
            EnforceCPULimits:                      s.CPUCFSQuota,
            CPUCFSQuotaPeriod:                     s.CPUCFSQuotaPeriod.Duration,
        },
        s.FailSwapOn,
        devicePluginEnabled,
        kubeDeps.Recorder)
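The reservation values passed in NodeAllocatableConfig determine how much of the node is left for pods: allocatable resources are the node capacity minus kubeReserved, systemReserved, and the hard eviction thresholds. The sketch below works through that arithmetic with plain integers. Resources, clampSub, and allocatable are hypothetical helpers, and clamping at zero is an assumption of this sketch; the real code works with resource quantities rather than int64 values.

```go
package main

import "fmt"

// MiB is one mebibyte in bytes.
const MiB = int64(1) << 20

// Resources holds amounts in base units: CPU in millicores, memory in bytes.
type Resources struct {
	CPUMilli    int64
	MemoryBytes int64
}

// clampSub returns a-b, but never less than zero.
func clampSub(a, b int64) int64 {
	if b >= a {
		return 0
	}
	return a - b
}

// allocatable subtracts every reservation from the capacity.
func allocatable(capacity Resources, reservations ...Resources) Resources {
	out := capacity
	for _, r := range reservations {
		out.CPUMilli = clampSub(out.CPUMilli, r.CPUMilli)
		out.MemoryBytes = clampSub(out.MemoryBytes, r.MemoryBytes)
	}
	return out
}

func main() {
	capacity := Resources{CPUMilli: 4000, MemoryBytes: 8192 * MiB}
	kubeReserved := Resources{CPUMilli: 100, MemoryBytes: 512 * MiB}
	systemReserved := Resources{CPUMilli: 100, MemoryBytes: 512 * MiB}
	hardEviction := Resources{MemoryBytes: 100 * MiB}

	got := allocatable(capacity, kubeReserved, systemReserved, hardEviction)
	fmt.Printf("cpu=%dm memory=%dMiB\n", got.CPUMilli, got.MemoryBytes/MiB)
	// prints cpu=3800m memory=7068MiB
}
```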
- Check permissions: the kubelet must be started by the user with uid 0 (root).
- Call RunKubelet to start the kubelet
The main flow of RunKubelet:
- Get the host name
- Create and initialize the event recorder
- Obtain the following source lists (all read from KubeletFlags):
  - hostNetworkSources: the pod sources (file, http, api, or * for all) whose pods the kubelet allows to use the host network
  - hostPIDSources: the pod sources whose pods the kubelet allows to use the host PID namespace
  - hostIPCSources: the pod sources whose pods the kubelet allows to use the host IPC namespace
  - privilegedSources: composed of the three lists above
- If the runonce parameter is set, the kubelet pulls the pod configuration once and exits after the pods have been started. Otherwise, it keeps running as a server
  - For runonce, first create the required directories, listen for pod updates, get the pod information, create the pods, and return their status
  - In server mode, call startKubelet. The flow is as follows:
    - Set up the log server if needed and check whether an API server client is available
    - If a cloud provider is configured, start the cloudResourceSyncManager, which sends requests to the cloud provider
    - Start the VolumeManager. The VolumeManager runs a set of asynchronous loops that determine which volumes need to be attached, mounted, unmounted, or detached based on the Pods scheduled on this node
    - Call kubelet.syncNodeStatus to synchronize the node status. If anything has changed, or enough time has passed since the last synchronization, it syncs the node status to the master and registers the kubelet if necessary
    - Call kubelet.updateRuntimeUp, which calls the container runtime's status callback, initializes the runtime-dependent modules the first time the container runtime comes up, and returns an error if the status check fails. If the status check passes, the container runtime's up time is updated in the kubelet's runtimeState
    - Start the loop that synchronizes the iptables rules (although the source code performs no operation there)
    - Start a goroutine for killing pods. The podKiller receives a pod from the channel (podKillingCh) and, if no other goroutine is already killing that pod, starts a goroutine to kill it
    - Start the statusManager and the probeManager (both are endless synchronization loops). The statusManager synchronizes pod status with the apiserver; the probeManager manages container probes and receives their results
    - Start the runtimeClass manager (this relates to swapping the underlying container runtime). RuntimeClass is a K8s API object; by defining a RuntimeClass, K8s can work with different container runtimes
    - Start the PLEG (pod lifecycle event generator), which generates pod-related events
At this point, the kubelet's startup process is complete. It enters an infinite loop that keeps the state of its components synchronized in real time, and it also listens on its ports to respond to HTTP requests.
How the kubelet interacts with pods (creation, deletion, state synchronization) and how it calls the CRI are described in the next part.
If you found this useful, follow my public account or visit my blog at http://packyzbq.coding.me. I post my study notes from time to time, so that we can learn from and exchange ideas with each other.