Advances, Systems and Applications

  • Open access
  • Published: 13 June 2023

A survey of Kubernetes scheduling algorithms

  • Khaldoun Senjab 1 ,
  • Sohail Abbas 1 ,
  • Naveed Ahmed 1 &
  • Atta ur Rehman Khan 2  

Journal of Cloud Computing, volume 12, Article number: 87 (2023)


As cloud services expand, the need to improve the performance of data center infrastructure becomes more important. High-performance computing, advanced networking solutions, and resource optimization strategies can help data centers maintain the speed and efficiency necessary to provide high-quality cloud services. Running containerized applications is one such optimization strategy, offering benefits such as improved portability, enhanced security, better resource utilization, faster deployment and scaling, and improved integration and interoperability. These benefits can help organizations improve their application deployment and management, enabling them to respond more quickly and effectively to dynamic business needs. Kubernetes is a container orchestration system designed to automate the deployment, scaling, and management of containerized applications. One of its key features is the ability to schedule the deployment and execution of containers across a cluster of nodes using a scheduling algorithm. This algorithm determines the best placement of containers on the available nodes in the cluster. In this paper, we provide a comprehensive review of various scheduling algorithms in the context of Kubernetes. We characterize and group them into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling enabled scheduling, and identify gaps and issues that require further research.


Kubernetes is an open-source platform for automating the deployment, scaling, and management of containerized applications. It allows developers to focus on building and deploying their applications without worrying about the underlying infrastructure. Kubernetes uses a declarative approach to managing applications, where users specify desired application states, and the system maintains them. It also provides robust tools for monitoring and managing applications, including self-healing mechanisms for automatic failure detection and recovery. Overall, Kubernetes offers a powerful and flexible solution for managing containerized applications in production environments.

Kubernetes is well-suited for microservice-based web applications, where each component can be run in its own container. Containers are lightweight and can be easily created and destroyed, which allows faster deployment and more efficient resource utilization than virtual machines, as shown in Fig. 1. Kubernetes automates the deployment, scaling, and management of containers across a cluster of machines, making resource utilization more efficient and flexible. This simplifies the process of building and maintaining complex applications.

Fig. 1. Comparison between different types of application deployments

Microservice-based architecture involves dividing an application into small, independent modules called microservices (Fig. 2). Each microservice is responsible for a specific aspect of the application, and the microservices communicate through a message bus. This architecture offers several benefits, such as the ability to automate deployment, scaling, and management. Because each microservice is independent and can be managed and updated separately, it is easier to make changes without affecting the entire system. Additionally, microservices can be written in different languages and can run on different servers, providing greater flexibility in the development process.

Fig. 2. Comparison between different application architectures

Kubernetes can quickly adapt to various types of demand intensities. For example, if a web application has few visitors at a given time, it can be scaled down to a few pods using minimal resources to reduce costs. However, if the application becomes extremely popular and receives a large number of visitors simultaneously, it can be scaled up to be serviced by a large number of pods, making it capable of handling almost any level of demand.
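The scaling behavior described above can be sketched with the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, where the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. This is only a minimal sketch: the function name and the default replica bounds are our own assumptions, and the tolerance band and stabilization behavior of the real autoscaler are deliberately left out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """Proportional scaling rule: ceil(current * observed / target), then clamp.

    The bounds (min_replicas, max_replicas) are illustrative defaults,
    not values taken from the surveyed paper.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Keep the result inside the configured replica range.
    return max(min_replicas, min(max_replicas, raw))
```

For instance, four pods observing 90% average CPU against a 60% target would be scaled to six pods, while a near-idle application is scaled down to the configured minimum.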

Kubernetes has been employed by many organizations across a diverse range of applications and has gained trust as a preferred option for the management and deployment of containerized applications. In recent applications, Kubernetes is proving to be an invaluable resource for IT infrastructure, as it provides a sustainable path towards serverless computing that eases challenges in IT administration [ 1 ]. Serverless computing provides end-to-end security enhancements but also introduces new infrastructure and security challenges, as discussed in [ 1 ].

As the computing paradigm moves towards edge and fog computing, Kubernetes is proving to be a versatile solution that provides seamless network management between cloud and edge nodes [ 2 , 3 , 4 ]. Kubernetes faces multiple challenges when deployed in an IoT environment. These challenges range from optimizing network traffic distribution [ 2 ] and flow routing policies [ 3 ] to distributing the computational resources of edge devices [ 4 ].

As can be seen from the diverse range of applications, and challenges associated with Kubernetes, it is imperative to study proposed algorithms in the related area to identify the state-of-the-art and future research directions. Numerous studies have focused on the development of new algorithms for Kubernetes. The main motivation for this survey is to provide a comprehensive overview of the state-of-the-art in the field of Kubernetes scheduling algorithms. By reviewing the existing literature and identifying the key theories, methods, and findings from previous studies, we aim to provide a critical evaluation of the strengths and limitations of existing approaches. We also hope to identify gaps and open questions in the existing literature, and to offer suggestions for future research directions. Overall, our goal is to contribute to the advancement of knowledge in the field and to provide a useful resource for researchers and practitioners working with Kubernetes scheduling algorithms.

To the best of the authors' knowledge, most existing surveys do not specifically address the topic at hand. They mostly target container orchestration in general (including Kubernetes), such as [ 5 , 6 , 7 , 8 ]. These surveys cover Kubernetes broadly without examining scheduling in depth, and some do not focus on Kubernetes at all. For example, some concentrate on scheduling in the cloud [ 9 ] and its associated concerns [ 10 ]. Others target big data applications in data center networks [ 11 ] or fog computing environments [ 12 ]. The authors found two closely related and well-organized surveys, [ 13 ] and [ 14 ], that target Kubernetes scheduling in depth. However, our work differs from these two surveys in its taxonomy: they target different aspects and objectives in scheduling, whereas we categorize the literature into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling enabled scheduling. We thereby focus specifically on a wide range of schemes related to multi-objective optimization and AI, in addition to the main scheduling with autoscaling. We believe our categorization is more fine-grained and novel than that of the existing surveys.

In this paper, the literature has been divided into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling enabled scheduling. The literature pertaining to each sub-category is analyzed and summarized based on six parameters outlined in the Literature review section.

Our main contributions are as follows:

  • A comprehensive review of the literature on Kubernetes scheduling algorithms targeting four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling enabled scheduling.
  • A critical evaluation of the strengths and limitations of existing approaches.
  • Identification of gaps and open questions in the existing literature.

The remainder of this paper is organized as follows. In the Search methodology section, we describe the methodology used to conduct the survey. In the Literature review section, we present the literature review along with the results of our survey, including a critical evaluation of the strengths and limitations of existing approaches. A taxonomy of the identified research papers based on the literature review is presented as well. In the Discussion, challenges & future suggestions section, we discuss the implications of our findings and suggest future research directions. Finally, in the Conclusions section, we summarize the key contributions of the survey and provide our conclusions.

Search methodology

This section presents our search methodology for identifying relevant studies that are included in this review.

To identify relevant studies for our review, we conducted a comprehensive search of the literature using the following databases: IEEE, ACM, Elsevier, Springer, and Google Scholar. We used the following search terms: "Kubernetes," "scheduling algorithms," and "scheduling optimizing." We limited our search to studies published in the last 5 years and written in English.

We initially identified a total of 124 studies from the database searches (see Fig. 3). We then reviewed the abstracts of these studies to identify those that were relevant to our review. We excluded studies that did not focus on Kubernetes scheduling algorithms, as well as those that were not original research or review articles. After this initial screening, we were left with 67 studies (see Fig. 4).

Fig. 3. Inclusion criteria

Fig. 4. Exclusion criteria

We then reviewed the full texts of the remaining studies to determine their eligibility for inclusion in our review. We excluded studies that did not meet our inclusion criteria, which were: (1) focus on optimizing Kubernetes scheduling algorithms, (2) provide original research or a critical evaluation of existing approaches, and (3) be written in English and published in the last 5 years. After this final screening, we included 47 studies in our review (see Fig. 4). The yearly distribution of the papers is shown in Fig. 5.

Fig. 5. Detailed statistics showing the yearly breakdown of analyzed studies

We also searched the reference lists of the included studies to identify any additional relevant studies that were not captured in our database searches. We did not identify any additional studies through this process. Therefore, our review includes 47 studies on Kubernetes scheduling algorithms published in the last 5 years. These studies represent a diverse range of research methods, including surveys, experiments, and simulations.

Literature review

This section is organized into four sub-categories: generic scheduling, multi-objective optimization-based scheduling, AI-focused scheduling, and autoscaling enabled scheduling. The distribution of analyzed research papers across the categories is shown in Fig. 6. The literature in each sub-category is analyzed and then summarized based on six parameters given below:





Fig. 6. Detailed statistics for each category in terms of analyzed studies

Scheduling in Kubernetes

The field of Kubernetes scheduling algorithms has attracted significant attention from researchers and practitioners in recent years. A growing body of literature has explored the potential benefits and challenges of using different scheduling algorithms to optimize the performance of a Kubernetes cluster. In this section, we present a review of the key theories, methods, and findings from previous studies in this area.

One key theme in the literature is the need for efficient and effective scheduling of workloads in a Kubernetes environment. Many studies have emphasized the limitations of traditional scheduling approaches, which often struggle to handle the complex and dynamic nature of workloads in a Kubernetes cluster. As a result, there has been increasing interest in the use of advanced scheduling algorithms to enable efficient and effective allocation of computing resources within the cluster.

Another key theme in the literature is the potential benefits of advanced scheduling algorithms for Kubernetes. Many studies have highlighted the potential for these algorithms to improve resource utilization, reduce latency, and enhance the overall performance of the cluster. Additionally, advanced scheduling algorithms have the potential to support the development of new applications and services within the Kubernetes environment, such as real-time analytics, machine learning, and deep learning (see the AI focused scheduling section).

Despite these potential benefits, the literature also identifies several challenges and limitations of Kubernetes scheduling algorithms. One key challenge is the need to address the evolving nature of workloads and applications within the cluster. Therefore, various authors have focused on improving the autoscaling feature in Kubernetes scheduling to allow automatic adjustment of the resources allocated to pods based on current demand; a more detailed discussion can be found in the Autoscaling-enabled Scheduling section. Other challenges include the need to manage and coordinate multiple scheduling algorithms, and to ensure the stability and performance of the overall system.

Overall, the literature suggests that advanced scheduling algorithms offer a promising solution to the challenges posed by the complex and dynamic nature of workloads in a Kubernetes cluster. However, further research is needed to address the limitations and challenges of these algorithms, and to explore their potential applications and benefits.
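For context, the default Kubernetes scheduler, which many of the works reviewed below extend or replace, selects a node for each pod in two phases: filtering out nodes that cannot host the pod, then scoring the remaining nodes and picking the best one. The following is a minimal sketch of that two-phase idea only. The `Node` and `Pod` classes, the function names, and the scoring rule (mean free fraction after placement) are our own illustrative assumptions and do not reproduce the actual scheduler plugins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpu_capacity: float  # cores
    mem_capacity: float  # GiB
    cpu_used: float = 0.0
    mem_used: float = 0.0


@dataclass
class Pod:
    name: str
    cpu_request: float
    mem_request: float


def fits(node: Node, pod: Pod) -> bool:
    """Filtering phase: keep only nodes with enough free CPU and memory."""
    return (node.cpu_capacity - node.cpu_used >= pod.cpu_request
            and node.mem_capacity - node.mem_used >= pod.mem_request)


def score(node: Node, pod: Pod) -> float:
    """Scoring phase (illustrative rule): mean free fraction left after placement."""
    cpu_left = (node.cpu_capacity - node.cpu_used - pod.cpu_request) / node.cpu_capacity
    mem_left = (node.mem_capacity - node.mem_used - pod.mem_request) / node.mem_capacity
    return (cpu_left + mem_left) / 2


def pick_node(pod: Pod, nodes: List[Node]) -> Optional[Node]:
    """Return the highest-scoring feasible node, or None if the pod is unschedulable."""
    feasible = [n for n in nodes if fits(n, pod)]
    return max(feasible, key=lambda n: score(n, pod)) if feasible else None
```

Most of the proposals surveyed in this section can be read as changing one of these two phases, for example by adding network latency, GPU topology, or disk I/O load to the scoring step.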

In Santos et al. [ 15 ], the authors propose a network-aware scheduling method for container-based applications in smart city deployments. Their strategy is implemented as an extension to the default scheduler of Kubernetes, an open-source orchestrator for the automatic management and deployment of microservices. Using container-based smart city applications, the authors evaluate the proposed scheduling approach and compare it with the default Kubernetes scheduling mechanism. They find that the proposed solution reduces network latency by almost 80% compared to the default technique.

In Chung et al. [ 16 ], the authors propose a new cluster scheduler called Stratus that is specialized for orchestrating batch job execution on virtual clusters in public Infrastructure-as-a-Service (IaaS) platforms. Stratus focuses on minimizing dollar costs by aggressively packing tasks onto machines based on runtime estimates: to save money, it keeps allocated machines either mostly full or empty, so that empty ones can be released. Using workload traces from TwoSigma and Google, the authors evaluate Stratus and report that it reduces cost by 17–44% compared to benchmark virtual cluster scheduling approaches.

In Le et al. [ 17 ], the authors propose a new scheduling algorithm called AlloX for optimizing job performance in shared clusters that use interchangeable resources such as CPUs, GPUs, and other accelerators. AlloX transforms the scheduling problem into a min-cost bipartite matching problem and provides dynamic fair allocation over time. The authors demonstrate theoretically and empirically that AlloX performs better than existing solutions in the presence of interchangeable resources, and they show that it can reduce the average job completion time significantly while providing fairness and preventing starvation.
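To make the notion of a min-cost bipartite matching concrete, the sketch below assigns jobs to resource slots so that the total cost is minimal. It is not the AlloX formulation: the cost matrix is hypothetical, and the exhaustive search over permutations is chosen only for readability and is practical for a handful of jobs. A real system would use a polynomial-time method such as the Hungarian algorithm.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def min_cost_assignment(cost: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Brute-force min-cost perfect matching on a square cost matrix.

    cost[j][s] is the (hypothetical) cost of running job j on slot s,
    e.g. its estimated completion time on a CPU or GPU slot.
    Returns (total_cost, assignment) where assignment[j] is the slot of job j.
    """
    n = len(cost)
    best_total = float("inf")
    best_perm: Tuple[int, ...] = ()
    for perm in permutations(range(n)):
        total = sum(cost[j][perm[j]] for j in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)
```

With the cost matrix `[[4, 1, 3], [2, 0, 5], [3, 2, 2]]`, the cheapest matching places job 0 on slot 1, job 1 on slot 0, and job 2 on slot 2, for a total cost of 5.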

In Zhong et al. [ 18 ], the authors propose a heterogeneous task allocation strategy for cost-efficient container orchestration in Kubernetes-based cloud computing infrastructures with elastic compute resources. The proposed strategy has three main features: support for heterogeneous job configurations, cluster size adjustment through autoscaling algorithms, and a rescheduling mechanism to shut down underutilized VM instances and reallocate relevant jobs without losing task progress. The authors evaluate their approach using the Australian National Cloud Infrastructure (Nectar) and show that it can reduce overall cost by 23–32% compared to the default Kubernetes framework.

In Thinakaran et al. [ 19 ], the authors create Kube-Knots by combining their proposed GPU-aware resource orchestration layer, Knots, with the Kubernetes container orchestrator. Through dynamic container orchestration, Kube-Knots harvests unused computing cycles, enabling the co-location of batch and latency-critical applications and increasing overall resource utilization. When scheduling datacenter-scale workloads with Kube-Knots on a ten-node GPU cluster, the authors demonstrate that the proposed scheduling strategies increase average and 99th percentile cluster-wide GPU usage by up to 80% for HPC workloads. In addition, the proposed schedulers reduce energy consumption throughout the cluster by an average of 33% for three separate workloads and improve the average task completion times of deep learning workloads by up to 36% compared to modern schedulers.

In Townend et al. [ 20 ], the authors propose a holistic scheduling system for Kubernetes that replaces the default scheduler and considers both software and hardware models to improve data center efficiency. The authors claim that by introducing hardware modeling into a software-based solution, an intelligent scheduler can make significant improvements in data center efficiency. In their initial deployment, the authors observed power consumption reductions of 10–20%.

In the work by Menouer [ 21 ], the author describes KCSS, a new Kubernetes container scheduling strategy. The purpose of KCSS is to improve performance in terms of makespan and power consumption by scheduling user-submitted containers as efficiently as possible. For each newly submitted container, KCSS chooses the best node using a multi-criteria decision analysis technique that considers a number of factors linked to the cloud infrastructure and the user's requirements. The author implements KCSS in the Go programming language and shows that it outperforms alternative container scheduling methods in a variety of situations.
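The idea of ranking nodes on several criteria at once can be illustrated with a simple weighted sum over normalized criteria. The summary above does not state which multi-criteria technique KCSS applies, so the weighted sum here is a stand-in chosen for brevity, and the criterion names and weights are hypothetical.

```python
from typing import Dict


def best_node(candidates: Dict[str, Dict[str, float]],
              weights: Dict[str, float]) -> str:
    """Pick the node with the highest weighted sum of normalized criteria.

    candidates maps node name -> {criterion: value}, where higher is better.
    weights maps criterion -> importance. Both are illustrative assumptions.
    """
    criteria = list(weights)
    # Normalize each criterion by its maximum so that units do not dominate.
    maxima = {c: max(v[c] for v in candidates.values()) or 1.0 for c in criteria}

    def total(values: Dict[str, float]) -> float:
        return sum(weights[c] * values[c] / maxima[c] for c in criteria)

    return max(candidates, key=lambda name: total(candidates[name]))
```

Changing the weights changes the winner: a user who cares mostly about free memory may be sent to a different node than one who prioritizes power efficiency.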

In Song et al. [ 22 ], the authors present a topology-based GPU scheduling framework for Kubernetes. The framework is based on the traditional Kubernetes GPU scheduling algorithm, but introduces the concept of a GPU cluster topology, which is restored in a GPU cluster resource access cost tree. This allows for more efficient scheduling of different GPU resource application scenarios. The proposed framework has been used in the production practice of Tencent and has reportedly improved the resource utilization of GPU clusters by about 10%.

In Ogbuachi et al. [ 23 ], the authors propose an improved design for Kubernetes scheduling that takes into account physical, operational, and network parameters in addition to software states in order to enable better orchestration and management of edge computing applications. They compare the proposed design to the default Kubernetes scheduler and show that it offers improved fault tolerance and dynamic orchestration capabilities.

In the work by Beltre et al. [ 24 ], the authors outline a scheduling policy for Kubernetes clusters that uses fairness measures including dominant resource fairness, resource demand, and average waiting time. KubeSphere, a policy-driven meta-scheduler created by the authors, enables tasks to be scheduled according to each user's overall resource requirements and current consumption. According to the experimental findings, the proposed policy increased fairness in a multi-tenant cluster.
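Dominant resource fairness, one of the fairness measures named above, can be illustrated in a few lines. A user's dominant share is the largest fraction of any single cluster resource that the user currently holds, and a fair scheduler serves the user with the smallest dominant share next. The sketch below shows only this core rule; the function names and the example data are our own assumptions and are not taken from the surveyed policy.

```python
from typing import Dict


def dominant_share(allocation: Dict[str, float],
                   capacity: Dict[str, float]) -> float:
    """Largest fraction of any one cluster resource held by a user."""
    return max(allocation.get(r, 0.0) / capacity[r] for r in capacity)


def next_user(allocations: Dict[str, Dict[str, float]],
              capacity: Dict[str, float]) -> str:
    """Serve the user whose dominant share is currently the smallest."""
    return min(allocations,
               key=lambda user: dominant_share(allocations[user], capacity))
```

In a cluster with 9 CPUs and 18 GiB of memory, a user holding 2 CPUs and 8 GiB has a dominant share of 8/18 (memory), while a user holding 6 CPUs and 2 GiB has a dominant share of 6/9 (CPU), so the first user would be served next.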

In Haja et al. [ 25 ], the authors propose a custom Kubernetes scheduler that takes into account delay constraints and edge reliability when making scheduling decisions. The authors argue that this type of scheduler is necessary for edge infrastructure, where applications are often delay-sensitive, and the infrastructure is prone to failures. The authors demonstrate their Kubernetes extension and release the solution as open source.

In Wojciechowski et al. [ 26 ], the authors propose a unique method for scheduling Kubernetes pods that takes advantage of dynamic network measurements gathered by the Istio service mesh. According to the authors, this fully automated approach can save up to 50% of inter-node bandwidth and up to 37% of application response time, which is crucial for the adoption of Kubernetes in 5G use cases.

In Cai et al. [ 27 ], the authors propose a feedback control method for elastic container provisioning in Kubernetes-based systems. The method uses a combination of a varying-processing-rate queuing model and a linear model to improve the accuracy of output errors. The authors compare their approach with several existing algorithms on a real Kubernetes cluster and find that it obtains the lowest percentage of service level agreement (SLA) violation and the second lowest cost.

In Ahmed et al. [ 28 ], the authors present a dynamic scheduling framework for Kubernetes that manages the deployment of Docker containers in a heterogeneous cluster with CPU and GPU resources. The platform, known as KubCG, takes into account the Kubernetes Pod timeline and historical data about container execution to optimize the deployment of new containers. In the experiments the authors conducted to validate their algorithm, KubCG reduced job completion time by up to 64%.

In Ungureanu et al. [ 29 ], the authors propose a hybrid shared-state scheduling framework for Kubernetes that combines the advantages of centralized and distributed scheduling. The framework uses distributed scheduling agents to delegate most tasks, and a scheduling correction function to process unprioritized and unscheduled tasks. Scheduling decisions are made based on the entire cluster state and are then synchronized and updated by the master-state agent. The authors performed experiments to test the behavior of their proposed scheduler and found that it performed well in different scenarios, including failover and recovery. They also found that other centralized scheduling frameworks may not perform well in situations like collocation interference or priority preemption.

In Yang et al. [ 30 ], the authors present the design and implementation of KubeHICE, a performance-aware container orchestrator for heterogeneous-ISA architectures in cloud-edge platforms. KubeHICE extends Kubernetes with two functional approaches, AIM (Automatic Instruction Set Architecture Matching) and PAS (Performance-Aware Scheduling), to handle heterogeneous ISA and schedule containers according to the computing capabilities of cluster nodes. The authors performed experiments to evaluate KubeHICE and found that it added no additional overhead to container orchestration and was effective in performance estimation and resource scheduling. They also demonstrated the advantages of KubeHICE in several real-world scenarios, showing for example a 40% increase in CPU utilization when eliminating heterogeneity.

In Li et al. [ 31 ], the authors propose two dynamic scheduling algorithms, Balanced-Disk-IO-Priority (BDI) and Balanced-CPU-Disk-IO-Priority (BCDI), to address the issue of Kubernetes' scheduler not taking the disk I/O load of nodes into account. BDI is designed to improve the disk I/O balance between nodes, while BCDI is designed to solve the issue of load imbalance of CPU and disk I/O on a single node. The authors perform experiments to evaluate the algorithms and find that they are more effective than the Kubernetes default scheduling algorithms.

In Fan et al. [ 32 ], the authors propose an algorithm for optimizing the scheduling of pods in the Serverless framework on the Kubernetes platform. The authors argue that the default Kubernetes scheduler, which operates on a pod-by-pod basis, is not well-suited for the rapid deployment and running of pods in the Serverless framework. To address this issue, the authors propose an algorithm that uses simultaneous scheduling of pods to improve the efficiency of resource scheduling in the Serverless framework. Through preliminary testing, the authors found that their algorithm was able to greatly reduce the delay in pod startup while maintaining a balanced use of node resources.

In Bestari et al. [ 33 ], the authors propose a scheduler for distributed deep learning training in Kubeflow that combines features from existing works, including autoscaling and gang scheduling. The proposed scheduler includes modifications to increase the efficiency of the training process, and weights are used to determine the priority of jobs. The authors evaluate the proposed scheduler using a set of TensorFlow jobs and find that it improves training speed by over 26% compared to the default Kubernetes scheduler.
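Gang scheduling, mentioned above, means that all workers of a distributed job are placed together or not at all, so that a partially placed training job never holds resources while waiting for its remaining workers. The sketch below shows this all-or-nothing behavior with a simple first-fit placement. It is an illustration under our own assumptions (function name, GPU-count resource model) and not the scheduler proposed by the authors; first-fit may also reject a job for which a smarter packing exists.

```python
from typing import Dict, List, Optional


def gang_schedule(requests: List[int],
                  free: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Place every worker of a job or none of them.

    requests[i] is the number of GPUs worker i needs; free maps
    node name -> free GPUs. On success, free is updated in place and
    a worker-index -> node-name mapping is returned. On failure,
    free is left untouched and None is returned.
    """
    tentative = dict(free)  # work on a copy so a failed attempt has no effect
    placement: Dict[int, str] = {}
    for i, need in enumerate(requests):
        node = next((n for n, cap in tentative.items() if cap >= need), None)
        if node is None:
            return None  # one worker does not fit: reject the whole gang
        tentative[node] -= need
        placement[i] = node
    free.update(tentative)  # commit only when every worker has a node
    return placement
```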

In Dua et al. [ 34 ], the authors present an alternative algorithm for load balancing in distributed computing environments. The algorithm uses task migration to balance the workload among processors of different capabilities and configurations. The authors define labels to classify tasks into different categories and configure clusters dedicated to specific types of tasks.

The above-mentioned schemes are summarized in Table 1 .

Scheduling using multi-objective optimization

Multi-objective optimization scheduling takes into account multiple objectives or criteria when deciding how to allocate resources and schedule containers on nodes in the cluster. This approach is particularly useful in complex distributed systems where there are multiple competing objectives that need to be balanced to achieve the best overall performance. In a multi-objective optimization scheduling approach, the scheduler considers multiple objectives simultaneously, such as minimizing response time, maximizing resource utilization, and reducing energy consumption. The scheduler uses optimization algorithms to find the optimal solution that balances these objectives.

Multi-objective optimization scheduling can help improve the overall performance and efficiency of Kubernetes clusters by taking into account multiple objectives when allocating resources and scheduling containers. This approach can result in better resource utilization, improved application performance, reduced energy consumption, and lower costs.

Some examples of multi-objective optimization scheduling algorithms used in Kubernetes include genetic algorithms, ant colony optimization, and particle swarm optimization. These algorithms can help optimize different objectives, such as response time, resource utilization, energy consumption, and other factors, to achieve the best overall performance and efficiency in the Kubernetes cluster.
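When objectives compete, there is usually no single best placement but a set of non-dominated trade-offs, often called the Pareto front. The sketch below filters candidate placements down to that front, assuming every objective is to be minimized. The candidate names and objective values are hypothetical, and the metaheuristics listed above would typically be used to generate the candidates in the first place.

```python
from typing import Dict, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(options: Dict[str, Tuple[float, ...]]) -> Dict[str, Tuple[float, ...]]:
    """Keep only the options that no other option dominates (minimization)."""
    return {
        name: objectives
        for name, objectives in options.items()
        if not any(dominates(other, objectives)
                   for other_name, other in options.items()
                   if other_name != name)
    }
```

For example, with objectives (response time in ms, power in W), the candidates (120, 300) and (100, 350) are both kept because each is better in one objective, while (130, 320) is discarded because (120, 300) is better in both.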

In this section, multi-objective scheduling proposals are discussed.

In Kaur et al. [ 35 ], the authors propose a new controller for managing containers on edge-cloud nodes in Industrial Internet of Things (IIoT) systems. The controller, called Kubernetes-based energy and interference driven scheduler (KEIDS), is based on Google Kubernetes and is designed to minimize energy utilization and interference in IIoT systems. KEIDS uses integer linear programming to formulate the task scheduling problem as a multi-objective optimization problem, taking into account factors such as energy consumption, carbon emissions, and interference from other applications. The authors evaluate KEIDS using real-time data from Google compute clusters and find that it outperforms existing state-of-the-art schemes.

In Lin et al. [ 36 ], the authors propose a multi-objective optimization model for container-based microservice scheduling in cloud architectures. They present an ant colony algorithm for solving the scheduling problem, which takes into account factors such as computing and storage resource utilization, the number of microservice requests, and the failure rate of physical nodes. The authors evaluate the proposed algorithm using experiments and compare its performance to other related algorithms. They find that the proposed algorithm achieves better results in terms of cluster service reliability, cluster load balancing, and network transmission overhead.

In Wei-guo et al. [ 37 ], the authors propose an improved scheduling algorithm for Kubernetes by combining ant colony optimization and particle swarm optimization to better balance task assignments and reduce resource costs. The authors implemented the algorithm in Java and tested it using the CloudSim tool, showing that it outperformed the original scheduling algorithm.

In the work by Oleghe [ 38 ], the author discusses the idea of container placement and migration in edge servers, as well as the scheduling models created for this purpose. According to the author, most scheduling models are based mainly on heuristic algorithms and use multi-objective optimization models or graph network models. The study also points out the lack of studies on container scheduling models that take distributed edge computing activities into account and predicts that future studies in this field will concentrate on scheduling containers for mobile edge nodes.

In Carvalho et al. [ 39 ], the authors propose an extension to the Kubernetes scheduler that uses Quality of Experience (QoE) measurements to make cloud management Service Level Objectives (SLOs) more accurate. In the context of video streaming services that are co-located with other services, the authors evaluate the proposed architecture using the QoE metric from the ITU P.1203 standard. According to the findings, the proposed scheduler increases average QoE by 50% compared to other schedulers, while resource rescheduling increases it by 135%.

The above-mentioned schemes are summarized in Table 2 .

AI focused scheduling

Many large companies have recently started to provide AI-based services. For this purpose, they have installed machine learning and deep learning clusters composed of tens to thousands of CPUs and GPUs for training their deep learning models in a distributed manner. Different machine learning frameworks are used, such as MXNet [ 40 ], TensorFlow [ 41 ], and Petuum [ 42 ]. Training a deep learning model is usually very resource-hungry and time-consuming. In such a setting, efficient scheduling is crucial to fully utilize the expensive deep learning cluster and expedite the model training process. Different strategies have been used to schedule tasks in this arena. For example, general-purpose schedulers have been customized to handle distributed deep learning tasks, as in [ 43 ] and [ 44 ]; however, they allocate resources statically and do not adjust them under different load conditions, which leads to poor resource utilization. Others have proposed dynamic allocation of resources after carefully analyzing the workloads, as in [ 45 ] and [ 46 ].

In this section, deep learning focused schedulers are surveyed.

In Peng et al. [ 46 ], the authors propose a customized job scheduler for deep learning clusters called Optimus. The goal of Optimus is to minimize the time required for deep learning training jobs, which are resource-intensive and time-consuming. Optimus employs performance models to precisely estimate training speed as a function of resource allocation, and online fitting to anticipate model convergence during training. These models inform how Optimus dynamically organizes tasks and distributes resources to reduce job completion time. The authors implement Optimus on a deep learning cluster and evaluate its efficiency in comparison to other cluster schedulers. They find that Optimus outperforms conventional schedulers by roughly 139% in job completion time and 63% in makespan.

In Mao et al. [ 47 ], the authors propose using modern machine learning techniques to develop highly efficient policies for scheduling data processing jobs on distributed compute clusters. They present their system, called Decima, which uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms. Decima is designed to be scalable and able to handle complex job dependency graphs. The authors report that their prototype integration with Spark on a 25-node cluster improved average job completion time by at least 21% over existing hand-tuned scheduling heuristics, with up to 2 × improvement during periods of high cluster load.

In Chaudhary et al. [ 48 ], the authors present Gandivafair, a distributed fair-share scheduler for GPU clusters used for deep learning training. The system offers performance isolation between users and is designed to strike a balance between the competing demands of fairness and efficiency. According to the authors, Gandivafair is the first scheduler to fairly distribute GPU time among all active users despite cluster heterogeneity. They demonstrate that Gandivafair delivers both fairness and efficiency under realistic multi-user workloads by evaluating a prototype implementation on a heterogeneous 200-GPU cluster.
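
To make the notion of a fair share concrete, the following sketch computes a generic max-min fair split of a GPU pool among users. It is a textbook technique chosen for illustration under the assumption of a single homogeneous pool; the summary above does not describe Gandivafair's own mechanism, and the function name and inputs are hypothetical.

```python
def max_min_fair(demands: dict[str, float], capacity: float) -> dict[str, float]:
    """Max-min fair split: small demands are met fully, the rest share equally."""
    alloc = {user: 0.0 for user in demands}
    remaining = capacity
    active = {user for user, demand in demands.items() if demand > 0}
    while active and remaining > 1e-9:
        share = remaining / len(active)
        # Users whose outstanding demand fits within an equal share.
        satisfied = {u for u in active if demands[u] - alloc[u] <= share}
        if not satisfied:
            # Nobody can be fully satisfied: split what is left equally.
            for u in active:
                alloc[u] += share
            remaining = 0.0
            break
        for u in satisfied:
            remaining -= demands[u] - alloc[u]
            alloc[u] = demands[u]
        active -= satisfied
    return alloc


# Three users ask for 2, 8, and 10 GPUs out of a pool of 12.
print(max_min_fair({"a": 2, "b": 8, "c": 10}, 12))
```

The user with the small request is fully served, and the two larger requests split the remainder evenly.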

In Fu et al. [ 49 ], the authors propose a new container placement scheme called ProCon for scheduling jobs in a Kubernetes cluster. ProCon uses an estimation of future resource usage to balance resource contentions across the cluster and reduce the completion time and makespan of jobs. The authors demonstrate through experiments that ProCon decreases completion time by up to 53.3% for a specific job and enhances general performance by 23.0%. In addition, ProCon shows a makespan improvement of up to 37.4% in comparison to Kubernetes' built-in default scheduler.

In Peng et al. [ 50 ], the authors propose DL2, a deep learning-based scheduler for deep learning clusters that aims to improve global training job expedition by dynamically resizing resources allocated to jobs. The authors implement DL2 on Kubernetes and evaluate its performance against a fairness scheduler and an expert heuristic scheduler. The results show that DL2 outperforms the other schedulers in terms of average job completion time.

In Mao et al. [ 51 ], the authors propose a new container scheduler called SpeCon optimized for short-lived deep learning applications. SpeCon is designed to improve resource utilization and job completion times in a Kubernetes cluster by analyzing the progress of deep learning training processes and speculatively migrating slow-growing models to release resources for faster-growing ones. The authors conduct experiments that demonstrate that SpeCon improves individual job completion times by up to 41.5%, improves system-wide performance by 14.8%, and reduces makespan by 24.7%.
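
The progress-based selection described above can be sketched as follows, assuming progress is measured as the average relative drop in training loss over the last few steps. The window size, the threshold, and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
def progress_rate(losses: list[float], window: int = 3) -> float:
    """Average relative loss decrease per step over the most recent window."""
    recent = losses[-(window + 1):]
    if len(recent) < 2:
        return float("inf")  # too little history: do not flag the job as slow
    drops = [(prev - cur) / prev for prev, cur in zip(recent, recent[1:]) if prev > 0]
    return sum(drops) / len(drops) if drops else 0.0


def migration_candidates(jobs: dict[str, list[float]], threshold: float = 0.01) -> list[str]:
    """Names of jobs whose loss improves more slowly than the assumed threshold."""
    return sorted(name for name, losses in jobs.items()
                  if progress_rate(losses) < threshold)


jobs = {
    "fast": [1.0, 0.8, 0.6, 0.45],          # still improving quickly
    "slow": [0.30, 0.299, 0.298, 0.2975],   # nearly converged
}
print(migration_candidates(jobs))
```

A scheduler built on this idea would treat the returned jobs as candidates to move away from contended nodes, freeing resources for the jobs that are still improving quickly.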

In Huang et al. [ 52 ], the authors propose RLSK, a deep reinforcement learning-based job scheduler for scheduling independent batch jobs across multiple federated cloud computing clusters. The authors implemented RLSK on Kubernetes and tested its performance through simulations, demonstrating that it can outperform conventional scheduling methods.

The work by Wang et al. [ 53 ] describes MLFS, a feature-based task scheduling system for machine learning clusters that can handle both data- and model-parallel jobs. To determine task priority for work queue ordering, MLFS uses a heuristic scheduling method. The data from this method is then used to train a deep reinforcement learning model for job scheduling. In comparison to existing schedulers, the proposed system is shown to reduce job completion time by up to 53%, reduce makespan by up to 52%, and increase accuracy by up to 64%. The system is evaluated using real experiments and large-scale simulations based on real traces.

In Han et al. [ 54 ], the authors present KaiS, a learning-based scheduling framework for edge-cloud Kubernetes systems. KaiS models system state data using graph neural networks and uses a coordinated multi-agent actor-critic method for decentralized request dispatch. The authors report that, compared to baselines, KaiS can increase the average system throughput rate by 14.3% and decrease scheduling cost by 34.7%.

In Casquero et al. [ 55 ], the authors propose a custom scheduler that uses a Multi-Agent System (MAS) to distribute the Kubernetes orchestrator's scheduling task among processing nodes. According to the authors, this method is quicker than the centralized scheduling strategy employed by the default Kubernetes scheduler.

In Yang et al. [ 56 ], the authors propose a method for optimizing Kubernetes' container scheduling algorithm by combining the grey system theory with the LSTM (Long Short-Term Memory) neural network prediction method. They perform experiments to evaluate their approach and find that it can reduce the resource fragmentation problem of working nodes in the cluster and increase the utilization of cluster resources.
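
The grey-system half of such a predictor can be illustrated with a GM(1,1) forecast, the most common grey forecasting model. The summary above does not state which grey model the authors use, so GM(1,1) is an assumption here, and the LSTM stage is omitted entirely. The function name and sample data are hypothetical.

```python
import math


def gm11_forecast(series: list[float], steps: int = 1) -> list[float]:
    """Forecast the next values of a positive series with a GM(1,1) grey model."""
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least four points")
    # Accumulated generating operation: running sum of the raw series.
    acc, total = [], 0.0
    for value in series:
        total += value
        acc.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (acc[k] + acc[k - 1]) for k in range(1, n)]
    y = series[1:]
    # Least-squares fit of y = m*z + c, where m = -a and c = b in grey notation.
    mean_z, mean_y = sum(z) / len(z), sum(y) / len(y)
    var_z = sum((zi - mean_z) ** 2 for zi in z)
    m = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y)) / var_z
    a, b = -m, mean_y - m * mean_z
    if abs(a) < 1e-12:
        raise ValueError("development coefficient is zero; series has no trend")

    def acc_hat(k: int) -> float:
        # Time response of the fitted model for the accumulated series.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Differencing the accumulated forecast restores raw-series forecasts.
    return [acc_hat(k) - acc_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical CPU usage samples growing by 10% per interval.
print(gm11_forecast([10.0, 11.0, 12.1, 13.31]))
```

On this short, roughly exponential usage trace the forecast lands close to the true next value of 14.641, which is the kind of short-horizon estimate a scheduler could use when scoring nodes.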

In Zhang et al. [ 57 ], the authors propose Zeus, a highly scalable cluster scheduling system for Kubernetes. The main feature of Zeus is that it schedules best-effort jobs based on actual server utilization, and it can adaptively divide resources between the two workload classes. Zeus is meant to enable the safe colocation of best-effort jobs and latency-sensitive services. The authors test Zeus in a real-world setting and find that it can raise average CPU utilization from 15% to 60% without violating Service Level Objectives (SLOs).

In Liu et al. [ 58 ], the authors suggest a scheduling strategy for deep learning tasks on Kubernetes that takes into account the tasks' resource usage characteristics. To increase task execution efficiency and load balancing, the suggested paradigm, dubbed FBSM, has modules for a GPU sniffer and a balance-aware scheduler. The execution of deep learning tasks is sped up by the suggested system, known as KubFBS, according to the authors' evaluation, which also reveals improved load balancing capabilities for the cluster.

In Rahali et al. [ 59 ], the authors propose a solution for resource allocation in a Kubernetes infrastructure hosting network services. The proposed solution aims to avoid resource shortages and protect the most critical functions. Given the random nature of the information involved, the authors use a statistical approach to model and solve the problem.

The above-mentioned schemes are summarized in Table 3 .

Autoscaling-enabled scheduling

Autoscaling is an important feature in Kubernetes scheduling because it allows for automatic adjustment of the resources allocated to pods based on the current demand. It enables efficient resource utilization, improved performance, cost savings, and high availability of the application. Autoscaling and scheduling are related in that autoscaling can ensure that there are always enough resources available to handle the tasks that are scheduled. For example, if the scheduler assigns a new task to a worker node, but that node does not have enough resources to execute the task, the autoscaler can add more resources to that node or spin up a new node to handle the task. In this way, autoscaling and scheduling work together to ensure that a distributed system is able to handle changing workloads and optimize resource utilization. Some of the schemes related to this category are surveyed below.
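
As a concrete reference point, the default Kubernetes Horizontal Pod Autoscaler derives its replica count from the ratio of the observed metric to the target metric. The sketch below reproduces that documented rule in a simplified form; the function name, the replica bounds, and the sample values are assumptions for illustration, and details such as pod readiness and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA rule: ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # Within the tolerance band around 1.0 no scaling action is taken.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Four pods average 90% CPU against a 60% target.
print(desired_replicas(4, 90.0, 60.0))
```

Several of the schemes surveyed below replace the observed metric in this rule with a forecast, or combine the replica count with per-pod resource changes.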

In Taherizadeh et al. [ 60 ], the authors propose a new dynamic multi-level (DM) autoscaling method for container-based cloud applications. The DM method uses both infrastructure- and application-level monitoring data to determine when to scale up or down, and its thresholds are dynamically adjusted based on workload conditions. The authors compare the performance of the DM method to seven existing autoscaling methods using synthetic and real-world workloads. They find that the DM method has better overall performance than the other methods, particularly in terms of response time and the number of instantiated containers. The DM method was implemented in the SWITCH system for time-critical cloud applications.

In Rattihalli et al. [ 61 ], the authors propose a new resource management system called RUBAS that can dynamically adjust the allocation of containers running in a Kubernetes cluster. RUBAS incorporates container migration to improve upon the Kubernetes Vertical Pod Autoscaler (VPA) system non-disruptively. The authors evaluate RUBAS using multiple scientific benchmarks and compare its performance to Kubernetes VPA. They find that RUBAS improves CPU and memory utilization by 10% and reduces runtime by 15% with an overhead for each application ranging from 5–20%.

In Toka et al. [ 62 ], the authors present a Kubernetes scaling engine that uses machine learning forecast methods to make better autoscaling decisions for cloud-based applications. The engine's short-term evaluation loop allows it to adapt to changing request dynamics, and the authors introduce a compact management parameter for cloud tenants to easily set their desired trade-off between resource over-provisioning and service level agreement (SLA) violations. The proposed engine is evaluated in simulations and with measurements on Web trace data, and the results show that it leads to fewer lost requests and slightly more provisioned resources compared to the default Kubernetes baseline.

In Balla et al. [ 63 ], the authors propose an adaptive autoscaler called Libra, which automatically detects the optimal resource set for a single pod and manages the horizontal scaling process. Libra is also able to adapt the resource definition for the pod and adjust the horizontal scaling process if the load or underlying virtualized environment changes. The authors evaluate Libra in simulations and show that it can reduce the average CPU and memory utilization by up to 48% and 39%, respectively, compared to the default Kubernetes autoscaler.

In another work by Toka et al. [ 64 ], the authors propose a Kubernetes scaling engine that uses multiple AI-based forecast methods to make autoscaling decisions that are better suited to handle the variability of incoming requests. The authors also introduce a compact management parameter to help application providers easily set their desired resource over-provisioning and SLA violation trade-off. The proposed engine is evaluated in simulations and with measurements on web traces, showing improved fitting of provisioned resources to service demand.

In Wu et al., the authors propose a new active Kubernetes auto scaling device based on prediction of pod replicas. They demonstrate that their proposed autoscaler has a faster response speed compared to existing scaling strategies in Kubernetes.

In Wang et al. [ 65 ], the authors propose an improved automatic scaling scheme for Kubernetes that combines the advantages of different types of nodes in the scaling process. They find that their scheme improves the performance of the system under rapid load pressure and reduces instability within running clusters compared to the default autoscaler.

In Kang et al. [ 66 ], the authors propose a method for improving the reliability of virtual networks by using optimization models and heuristic algorithms to allocate virtual network functions (VNFs) to suitable locations. The authors also develop function scheduler plugins for the Kubernetes system, which allows for the automatic deployment and management of containerized applications. The proposed method is demonstrated to be effective in allocating functions and running service functions correctly. This work was published in the 2021 edition of the IEEE Conference on Decision and Control.

In Vu et al. [ 67 ], the authors propose a hybrid autoscaling method for containerized applications that combines vertical and horizontal scaling capabilities to optimize resource utilization and meet quality of service (QoS) requirements. The proposed method uses a predictive approach based on machine learning to forecast future demand and a burst identification module to make scaling decisions. The authors evaluate the proposed method and find that it improves response time and resource utilization compared to existing methods that only use a single scaling mode.

The above-mentioned schemes are summarized in Table 4 .

Discussion, challenges & future suggestions

In the Literature review section, a comprehensive review was presented covering four sub-categories in the area of Kubernetes scheduling. This section provides a brief discussion of the categorized literature.

In the area of multi-objective optimization-based scheduling in Kubernetes, several research studies have been conducted to optimize various objectives such as minimizing energy consumption and cost while maximizing resource utilization and meeting application performance requirements. These studies employ different optimization techniques such as genetic algorithms, particle swarm optimization, and ant colony optimization. Some studies also incorporate machine learning-based approaches to predict workload patterns and make scheduling decisions. However, there are still several challenges that need to be addressed. First, the multi-objective nature of the problem poses a significant challenge in finding optimal solutions that balance conflicting objectives. Second, the dynamic nature of the cloud environment requires real-time adaptation of scheduling decisions to changing conditions. Overall, the research in multi-objective optimization-based scheduling in Kubernetes shows great potential in achieving efficient and effective resource management. Still, further work is needed to address the challenges and validate the effectiveness of these approaches in real-world scenarios.
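
The difficulty of balancing conflicting objectives can be made concrete with a small Pareto-dominance filter. The sketch assumes each candidate node is scored on objectives that are all minimized (here hypothetical energy and cost values); it is a generic illustration and does not reproduce any of the surveyed algorithms.

```python
def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: dict[str, tuple[float, ...]]) -> list[str]:
    """Names of candidates that no other candidate dominates (all objectives minimized)."""
    return sorted(
        name for name, score in candidates.items()
        if not any(dominates(other_score, score)
                   for other, other_score in candidates.items() if other != name)
    )


# Hypothetical (energy, cost) scores for placing a pod on three nodes.
nodes = {"n1": (10.0, 5.0), "n2": (8.0, 7.0), "n3": (12.0, 6.0)}
print(pareto_front(nodes))
```

Node n3 is dropped because n1 is better on both objectives, but n1 and n2 remain incomparable: choosing between them requires an explicit preference, which is exactly where the surveyed approaches differ.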

On the other hand, AI-based scheduling in Kubernetes has been a popular area of research in recent years. Many studies have proposed different approaches to optimize scheduling decisions using machine learning and other AI techniques. One of the key accomplishments in this area is the development of scheduling algorithms that can handle complex workloads in a dynamic environment. These algorithms can consider various factors, such as resource availability, task dependencies, and application requirements, to make optimal scheduling decisions. Some studies have proposed reinforcement learning-based scheduling algorithms, which can adapt to changing workload patterns and learn from experience to improve scheduling decisions. Other studies have proposed deep learning-based approaches, which can capture complex patterns in the workload data and make accurate predictions. Overall, these studies have demonstrated that AI-based scheduling can improve the efficiency and performance of Kubernetes clusters. However, there are still some challenges that need to be addressed in this area. One of the main challenges is the lack of real-world datasets for training and evaluation of AI-based scheduling algorithms. Most studies use synthetic or simulated datasets, which may not reflect the complexities of real-world workloads. Another challenge is the trade-off between accuracy and computational complexity. Future research in this area could focus on developing more efficient and scalable AI-based scheduling algorithms that can handle large-scale, real-world workloads. This could involve exploring new machine learning and optimization techniques that can improve scheduling accuracy while reducing computational complexity.

Lastly, autoscaling-enabled scheduling is an emerging research area that aims to optimize resource utilization and improve application performance by combining autoscaling and scheduling techniques. Several research studies have been published in this area in recent years. The analysis of these studies reveals that autoscaling-enabled scheduling can lead to significant improvements in resource utilization and application performance. The studies show that such techniques can help reduce resource wastage, minimize the risk of under-provisioning, and improve application response times. However, despite these promising results, there are still some challenges that need to be addressed in this area. One of the main challenges is the complexity of designing effective autoscaling-enabled scheduling algorithms. Developing algorithms that can adapt to dynamic workload changes and optimize resource utilization while maintaining application performance is a non-trivial task. Furthermore, most of the existing studies have been conducted in controlled experimental settings, so there is a need to evaluate the effectiveness of autoscaling-enabled scheduling in real-world applications. In summary, the open challenges include algorithm design, standardization, and practical implementation. Future research in this area should focus on addressing these challenges and developing more effective and practical autoscaling-enabled scheduling techniques.

The surveyed papers use diverse algorithms to enhance Kubernetes scheduling. These algorithms are tested on various platforms and environments, such as Spark, MXNet, Kubernetes, Google and TwoSigma's GPU cluster, workloads, Google compute, CPU-GPU, the National Cloud Infrastructure, benchmarks, ProCon, DL2, DRF, Optimus, CBP, PP, scaling, data centers, schedulers, CloudSim and Java, scenarios, cloud infrastructure, user need, RLSK, real trace, GaiaGPU and Tencent, real workload traces, simulations and web traces, a new algorithm, Kubernetes failover and recovery, KubeHICE, real-world scenarios, BDI, BCDI, a proposed algorithm, autoscalers, default autoscalers, video streaming, TensorFlow, Zeus, and latency-sensitive services. Some papers did not specify the details of the algorithms they used or the platforms and environments they tested on.

As can be seen in the previous sections, the survey extensively analyzes the current literature, and composes a taxonomy to not only effectively analyze the current state-of-the-art but also identify the challenges and future directions. Based on the analysis, the following areas have been identified as potential future research in the field:

As Kubernetes becomes more popular, there will be a growing need for advanced computation optimization techniques. In the future, Kubernetes may benefit from the development of more sophisticated algorithms for workload scheduling and resource allocation, potentially using AI or machine learning. Additionally, integrating Kubernetes with emerging technologies like serverless computing could lead to even more efficient resource usage by enabling dynamic scaling without pre-provisioned infrastructure. Ultimately, the future of computation optimization in Kubernetes is likely to involve a combination of cutting-edge algorithms, innovative technologies, and ongoing advancements in cloud computing.

Testing and implementation are needed to reveal the limitations of current learning algorithms for scheduling and potential improvements on large-scale clusters. One important focus is on improving the tooling and automation around testing and deployment, including the development of new testing frameworks and the integration of existing tools into the Kubernetes ecosystem. Another key area is the ongoing refinement of Kubernetes' implementation and development process, with a focus on streamlining workflows, improving documentation, and fostering greater collaboration within the open-source community. Additionally, there is a growing emphasis on developing more comprehensive testing and validation strategies for Kubernetes clusters, including the use of advanced techniques like chaos engineering to simulate real-world failure scenarios. Overall, the future of testing and implementation in Kubernetes is likely to involve continued innovation, collaboration, and a commitment to driving the platform forward.

A number of methods employ learning algorithms for resource balancing inside and outside the cluster. Even though these methods have given encouraging results, new learning algorithms can be found to improve the scheduler, especially on large-scale clusters.

Limitations and potential improvements in specific contexts, e.g., Green Computing. Minimizing the carbon footprint of a cluster is an ongoing challenge. Advanced schedulers are needed to reduce the energy consumption and carbon footprint of clusters in IIoT setups. There is a huge opportunity for improving the existing methods and proposing new methods in this area.

Future research in Kubernetes resource management. Kubernetes resource management mostly relies on optimization modelling frameworks and heuristic-based algorithms. Improving these algorithms and proposing new ones is a very promising area of research. Future research in Kubernetes resource management may focus on addressing the challenges of managing complex, dynamic workloads across distributed, heterogeneous environments. This may involve developing more sophisticated algorithms and techniques for workload placement, resource allocation, and load balancing, as well as exploring new approaches to containerization and virtualization. Additionally, there may be opportunities to leverage emerging technologies like edge computing and 5G networks to enable more efficient and scalable resource management in Kubernetes.

Most of the work done in the area of Kubernetes scheduling has been evaluated on small clusters. However, results obtained on small clusters may not always carry over to larger deployments. One future research direction in Kubernetes scheduling is to use larger cluster sizes for algorithm evaluation. While Kubernetes has been shown to be effective in managing clusters of up to several thousand nodes, there is a need to evaluate its performance in even larger cluster sizes. This includes evaluating the scalability of the Kubernetes scheduler, identifying potential bottlenecks, and proposing solutions to address them. Additionally, there is a need to evaluate the impact of larger cluster sizes on application performance and resource utilization. This research could lead to the development of more efficient scheduling algorithms and better management strategies for large-scale Kubernetes deployments.

Scheduling should not only be considered from the static infrastructure point of view. Advanced context-aware scheduling algorithms may be proposed that take into account a broader range of contextual factors in resource allocation and scheduling, such as user preferences, application dependencies, and environmental conditions. This may involve exploring new machine learning techniques and optimization algorithms that can dynamically adapt to changing conditions and prioritize resources based on real-time feedback and analysis. Other potential areas of research may include developing new models and frameworks for managing resources in Kubernetes clusters, improving container orchestration and load balancing, and enhancing monitoring and analytics capabilities to enable more effective use of context-aware scheduling algorithms.

As can be seen from the diversity of future directions, Kubernetes research offers challenges of many levels of difficulty and effort. It provides future researchers with exciting opportunities to pursue and problems to tackle. We hope that this survey will help future researchers in selecting a suitable challenge and solving new problems to expand the state-of-the-art in the area of Kubernetes.


In conclusion, the survey on Kubernetes scheduling provides a comprehensive overview of the current state of the field. It covers the objectives, methodologies, algorithms, experiments, and results of various research efforts in this area. The survey highlights the importance of scheduling in Kubernetes and the need for efficient and effective scheduling algorithms. The results of the experiments show that there is still room for improvement in this area, and future work should focus on developing new algorithms and improving existing ones. Overall, the survey provides valuable insight into the current state of Kubernetes scheduling and points to promising directions for future research.

Availability of data and materials

The corresponding author may provide the supporting data on request.

Mondal SK, Pan R, Kabir HMD, Tian T, Dai HN (2022) Kubernetes in IT administration and serverless computing: an empirical study and research challenges. J Supercomput 78(2):2937–2987

Phuc LH, Phan LA, Kim T (2022) Traffic-Aware horizontal pod autoscaler in kubernetes-based edge computing infrastructure. IEEE Access 10:18966–18977

Zhang M, Cao J, Yang L, Zhang L, Sahni Y, Jiang S (2022) ENTS: An Edge-native Task Scheduling System for Collaborative Edge Computing. IEEE/ACM 7th Symposium on Edge Computing, SEC. pp 149–161

Kim SH, Kim T (2023) Local scheduling in kubeedge-based edge computing environment. Sensors 23(3):1522

Casalicchio E (2019) Container orchestration: a survey. Syst Model 221–235

Pahl C, Brogi A, Soldani J, Jamshidi P (2017) Cloud container technologies: a state-of-the-art review. IEEE Transact Cloud Comput 7(3):677–692

Rodriguez MA, Buyya R (2019) Container-based cluster orchestration systems: A taxonomy and future directions. Software Pract Experience 49(5):698–719

Truyen E, Van Landuyt D, Preuveneers D, Lagaisse B, Joosen W (2019) A comprehensive feature comparison study of open-source container orchestration frameworks. Appl Sciences (Switzerland) 9(5):931

Arunarani AR, Manjula D, Sugumaran V (2019) Task scheduling techniques in cloud computing: a literature survey. Futur Gener Comput Syst 91:407–415

Vijindra, Shenai S (2012) Survey on scheduling issues in cloud computing. Procedia Eng 38:2881–2888

Wang K, Zhou Q, Guo S, Luo J (2018) Cluster frameworks for efficient scheduling and resource allocation in data center networks: a survey. IEEE Commun Surveys Tutor 20(4):3560–3580

Hosseinioun P, Kheirabadi M, Kamel Tabbakh SR, Ghaemi R (2022) A task scheduling approaches in fog computing: a survey. Transact Emerg Telecommun Technol 33(3):e3792

Rejiba Z, Chamanara J (2022) Custom scheduling in Kubernetes: a survey on common problems and solution approaches. ACM Comput Surv 55(7):1–37

Carrión C (2022) Kubernetes scheduling: taxonomy, ongoing issues and challenges. ACM Comput Surv 55(7):1–37

Santos J, Wauters T, Volckaert B, De Turck F (2019) Towards network-Aware resource provisioning in kubernetes for fog computing applications. Proceedings of the IEEE Conference on Network Softwarization: Unleashing the Power of Network Softwarization. pp 351–359

Chung A, Park JW, Ganger GR (2018) Stratus: Cost-aware container scheduling in the public cloud. Proceedings of the ACM Symposium on Cloud Computing. pp 121–134

Le TN, Sun X, Chowdhury M, Liu Z (2020) AlloX: Compute allocation in hybrid clusters. Proceedings of the 15th European Conference on Computer Systems, EuroSys

Zhong Z, Buyya R (2020) A Cost-Efficient Container Orchestration Strategy in Kubernetes-Based Cloud Computing Infrastructures with Heterogeneous Resources. ACM Trans Internet Technol 20(2):1–24

Thinakaran P, Gunasekaran JR, Sharma B, Kandemir MT, Das CR (2019) Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters. Proceedings - IEEE International Conference on Cluster Computing, ICCC

Townend P et al (2019) Invited paper: Improving data center efficiency through holistic scheduling in kubernetes. Proceedings - 13th IEEE International Conference on Service-Oriented System Engineering, 10th International Workshop on Joint Cloud Computing, and IEEE International Workshop on Cloud Computing in Robotic Systems, CCRS. pp 156–166

Menouer T (2021) KCSS: Kubernetes container scheduling strategy. J Supercomput 77(5):4267–4293

Song S, Deng L, Gong J, Luo H (2019) Gaia scheduler: A kubernetes-based scheduler framework. 16th IEEE International Symposium on Parallel and Distributed Processing with Applications, 17th IEEE International Conference on Ubiquitous Computing and Communications, 8th IEEE International Conference on Big Data and Cloud Computing. pp 252–259

Ogbuachi MC, Gore C, Reale A, Suskovics P, Kovacs B (2019) Context-aware K8S scheduler for real time distributed 5G edge computing applications. 27th International Conference on Software, Telecommunications and Computer Networks, SoftCOM

Beltre A, Saha P, Govindaraju M (2019) KubeSphere: An approach to multi-tenant fair scheduling for kubernetes clusters. 3rd IEEE International Conference on Cloud and Fog Computing Technologies and Applications, Cloud Summit. pp 14–20

Haja D, Szalay M, Sonkoly B, Pongracz G, Toka L (2019) Sharpening Kubernetes for the Edge. ACM SIGCOMM Conference Posters and Demos, Part of SIGCOMM. pp 136–137

Wojciechowski L et al (2021) NetMARKS: Network metrics-AwaRe kubernetes scheduler powered by service mesh. Proceedings - IEEE INFOCOM

Cai Z, Buyya R (2022) Inverse Queuing Model-Based Feedback Control for Elastic Container Provisioning of Web Systems in Kubernetes. IEEE Trans Comput 71(2):337–348

El Haj Ahmed G, Gil-Castiñeira F, Costa-Montenegro E (2021) KubCG: A dynamic Kubernetes scheduler for heterogeneous clusters. Software Pract Experience 51(2):213–234

Ungureanu OM, Vlădeanu C, Kooij R (2019) Kubernetes cluster optimization using hybrid shared-state scheduling framework. ACM International Conference Proceeding Series

Yang S, Ren Y, Zhang J, Guan J, Li B (2021) KubeHICE: Performance-aware Container Orchestration on Heterogeneous-ISA Architectures in Cloud-Edge Platforms. 19th IEEE International Symposium on Parallel and Distributed Processing with Applications, 11th IEEE International Conference on Big Data and Cloud Computing, 14th IEEE International Conference on Social Computing and Networking and 11th IEEE Internation. pp 81–91

Li D, Wei Y, Zeng B (2020) A Dynamic I/O Sensing Scheduling Scheme in Kubernetes. ACM International Conference Proceeding Series. pp 14–19

Fan D, He D (2020) A Scheduler for Serverless Framework base on Kubernetes. ACM International Conference Proceeding Series. pp 229–232

Bestari MF, Kistijantoro AI, Sasmita AB (2020) Dynamic Resource Scheduler for Distributed Deep Learning Training in Kubernetes. 7th International Conference on Advanced Informatics: Concepts, Theory and Applications, ICAICTA

Dua A, Randive S, Agarwal A, Kumar N (2020) Efficient Load balancing to serve Heterogeneous Requests in Clustered Systems using Kubernetes. IEEE 17th Annual Consumer Communications and Networking Conference, CCNC

Kaur K, Garg S, Kaddoum G, Ahmed SH, Atiquzzaman M (2020) KEIDS: Kubernetes-Based Energy and Interference Driven Scheduler for Industrial IoT in Edge-Cloud Ecosystem. IEEE Internet Things J 7(5):4228–4237

Lin M, Xi J, Bai W, Wu J (2019) Ant colony algorithm for multi-objective optimization of container-based microservice scheduling in cloud. IEEE Access 7:83088–83100

Wei-guo Z, Xi-lin M, Jin-zhong Z (2018) Research on kubernetes’ resource scheduling scheme. ACM International Conference Proceeding Series

Oleghe O (2021) Container placement and migration in edge computing: concept and scheduling models. IEEE Access 9:68028–68043

Carvalho M, MacEdo DF (2021) QoE-Aware Container Scheduler for Co-located Cloud Environments. Faculdades Catolicas

Chen T, Li M, Li Y, Lin M, Wang N, Wang M, Xiao T, Xu B, Zhang C, Zhang Z (2015) Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274

Abadi M et al (2016) Tensorflow: a system for large-scale machine learning. Osdi 2016(16):265–283

Xing EP et al (2015) Petuum: A new platform for distributed machine learning on big data. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp 1335–1344

Verma A, Pedrosa L, Korupolu M, Oppenheimer D, Tune E, Wilkes J (2015) Large-scale cluster management at Google with Borg. 10th European Conference on Computer Systems, EuroSys. pp 1–15

Vavilapalli VK et al (2013) Apache hadoop YARN: Yet another resource negotiator. 4th Annual Symposium on Cloud Computing, SoCC. pp 1–16

Bao Y, Peng Y, Wu C, Li Z (2018) Online Job Scheduling in Distributed Machine Learning Clusters. Proceedings - IEEE INFOCOM. pp 495–503

Peng Y, Bao Y, Chen Y, Wu C, Guo C (2018) Optimus: An Efficient Dynamic Resource Scheduler for Deep Learning Clusters. Proceedings of the 13th EuroSys Conference, EuroSys

Mao H, Schwarzkopf M, Venkatakrishnan SB, Meng Z, Alizadeh M (2019) Learning scheduling algorithms for data processing clusters. SIGCOMM Conference of the ACM Special Interest Group on Data Communication. pp 270–288

Chaudhary S, Ramjee R, Sivathanu M, Kwatra N, Viswanatha S (2020) Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning. Proceedings of the 15th European Conference on Computer Systems, EuroSys

Fu Y et al (2019) Progress-based Container Scheduling for Short-lived Applications in a Kubernetes Cluster. IEEE International Conference on Big Data, Big Data. pp 278–287

Peng Y, Bao Y, Chen Y, Wu C, Meng C, Lin W (2021) DL2: A Deep Learning-Driven Scheduler for Deep Learning Clusters. IEEE Trans Parallel Distrib Syst 32(8):1947–1960

Mao Y, Fu Y, Zheng W, Cheng L, Liu Q, Tao D (2022) Speculative Container Scheduling for Deep Learning Applications in a Kubernetes Cluster. IEEE Syst J 16(3):3770–3781

Huang J, Xiao C, Wu W (2020) RLSK: A Job Scheduler for Federated Kubernetes Clusters based on Reinforcement Learning. IEEE International Conference on Cloud Engineering, IC2E. pp 116–123

Wang H, Liu Z, Shen H (2020) Job scheduling for large-scale machine learning clusters. Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies. pp 108–120

Han Y, Shen S, Wang X, Wang S, Leung VCM (2021) Tailored learning-based scheduling for kubernetes-oriented edge-cloud system. Proceedings - IEEE INFOCOM

Casquero O, Armentia A, Sarachaga I, Pérez F, Orive D, Marcos M (2019) Distributed scheduling in Kubernetes based on MAS for Fog-in-the-loop applications. IEEE International Conference on Emerging Technologies and Factory Automation, ETFA. pp 1213–1217

Yang Y, Chen L (2019) Design of Kubernetes Scheduling Strategy Based on LSTM and Grey Model. Proceedings of IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering, ISKE. pp 701–707

Zhang X, Li L, Wang Y, Chen E, Shou L (2021) Zeus: Improving Resource Efficiency via Workload Colocation for Massive Kubernetes Clusters. IEEE Access 9:105192–105204

Liu Z, Chen C, Li J, Cheng Y, Kou Y, Zhang D (2022) KubFBS: A fine-grained and balance-aware scheduling system for deep learning tasks based on kubernetes. Concurrency Computat Pract Exper 34(11):e6836.

Rahali M, Phan CT, Rubino G (2021) KRS: Kubernetes Resource Scheduler for resilient NFV networks. IEEE Global Communications Conference

Taherizadeh S, Stankovski V (2019) Dynamic multi-level auto-scaling rules for containerized applications. Computer J 62(2):174–197

Rattihalli G, Govindaraju M, Lu H, Tiwari D (2019) Exploring potential for non-disruptive vertical auto scaling and resource estimation in kubernetes. IEEE International Conference on Cloud Computing, CLOUD. pp 33–40

Toka L, Dobreff G, Fodor B, Sonkoly B (2021) Machine Learning-Based Scaling Management for Kubernetes Edge Clusters. IEEE Trans Netw Serv Manage 18(1):958–972

Balla D, Simon C, Maliosz M (2020) Adaptive scaling of Kubernetes pods. IEEE/IFIP Network Operations and Management Symposium 2020: Management in the Age of Softwarization and Artificial Intelligence, NOMS

Toka L, Dobreff G, Fodor B, Sonkoly B (2020) Adaptive AI-based auto-scaling for Kubernetes. IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID. pp 599–608

Wang M, Zhang D, Wu B (2020) A Cluster Autoscaler Based on Multiple Node Types in Kubernetes. IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference, ITNEC. pp 575–579

Kang R, Zhu M, He F, Sato T, Oki E (2021) Design of Scheduler Plugins for Reliable Function Allocation in Kubernetes. 17th International Conference on the Design of Reliable Communication Networks, DRCN

Vu DD, Tran MN, Kim Y (2022) Predictive hybrid autoscaling for containerized applications. IEEE Access 10:109768–109778

Download references


The author(s) received no financial support for the research and publication of this article.

Author information

Authors and affiliations.

Department of Computer Science, College of Computing and Informatics, University of Sharjah, Sharjah, UAE

Khaldoun Senjab, Sohail Abbas & Naveed Ahmed

College of Engineering and Information Technology, Ajman University, Ajman, UAE

Atta ur Rehman Khan



Research was supervised by Sohail Abbas and Naveed Ahmed. Data collection, material preparation, and analysis were performed by Khaldoun Senjab. Conceptualization and revisions were done by Sohail Abbas, Naveed Ahmed, and Atta ur Rehman Khan. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sohail Abbas .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Authors provide consent for publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .


About this article

Cite this article.

Senjab, K., Abbas, S., Ahmed, N. et al. A survey of Kubernetes scheduling algorithms. J Cloud Comp 12 , 87 (2023).


Received : 27 January 2023

Accepted : 01 June 2023

Published : 13 June 2023



  • Cloud services
  • Data center infrastructure
  • Resource optimization
  • Containerized applications
  • Container orchestration
  • Scheduling algorithm


An Efficient Scheduling Strategy for Containers Based on Kubernetes

  • Conference paper
  • First Online: 25 January 2023

  • Xurong Zhang 19 ,
  • Xiaofeng Wang 19 , 20 ,
  • Yuan Liu 19 &
  • Zhaohong Deng 19  

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 460)

Included in the following conference series:

  • International Conference on Collaborative Computing: Networking, Applications and Worksharing

852 Accesses

Container clouds are an important supporting technology for collaborative edge computing, and Kubernetes has become the de facto standard for container orchestration. The default Kubernetes scheduling mechanism relies on a single resource index and cannot adapt to the fine-grained resource scheduling requirements of collaborative edge computing. To address this, this paper proposes an efficient multicriteria online container scheduling strategy based on Kubernetes, named E-KCSS. To improve the resource utilization of the cluster, E-KCSS takes into account the global view of edge nodes and containers. An adaptive weight mechanism based on real-time utilization is proposed because preset Kubernetes weighting coefficients do not meet the individual resource requirements of applications. The experimental results show that, compared with the default Kubernetes scheduling mechanism, E-KCSS improves deployment efficiency by 35.22%, raises the upper limit of container application deployment by 29.82%, and reduces cluster resource imbalance by 6.87%, making the multi-dimensional resource utilization of the cluster more balanced.
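The idea of weighting several resource dimensions by real-time utilization can be illustrated with a minimal sketch. This is not the E-KCSS algorithm, whose formulas are not given in this abstract; the function names, the two resource dimensions, and the weighting rule (each resource weighted by its share of current cluster-wide utilization) are all assumptions made for illustration.

```python
from typing import Dict

# Utilization per resource as a fraction in [0, 1], e.g. {"cpu": 0.4, "mem": 0.7}.
Resources = Dict[str, float]


def adaptive_weights(cluster_utilization: Resources) -> Resources:
    """Weight each resource by its share of total utilization (hypothetical rule)."""
    total = sum(cluster_utilization.values())
    if total == 0:
        # Idle cluster: fall back to equal weights.
        n = len(cluster_utilization)
        return {name: 1.0 / n for name in cluster_utilization}
    return {name: used / total for name, used in cluster_utilization.items()}


def score_node(node_utilization: Resources, weights: Resources) -> float:
    """Higher score means more free capacity in the resources that are scarce."""
    return sum(weights[name] * (1.0 - used) for name, used in node_utilization.items())


def pick_node(nodes: Dict[str, Resources], cluster_utilization: Resources) -> str:
    """Return the name of the node with the highest weighted free capacity."""
    weights = adaptive_weights(cluster_utilization)
    return max(nodes, key=lambda name: score_node(nodes[name], weights))


# Hypothetical nodes: "a" has spare CPU, "b" has spare memory.
nodes = {"a": {"cpu": 0.2, "mem": 0.9}, "b": {"cpu": 0.8, "mem": 0.3}}
memory_bound_choice = pick_node(nodes, {"cpu": 0.2, "mem": 0.8})
cpu_bound_choice = pick_node(nodes, {"cpu": 0.8, "mem": 0.2})
```

With fixed weights the same node would win in both situations; letting the weights follow utilization shifts the choice toward the node that has spare capacity in whichever resource is currently under pressure.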


This research was funded by the National Natural Science Foundation of China (grant nos. 62172191 and 61972182), the National Key R&D Program of China (grant no. 2016YFB0800803), and the Peng Cheng Laboratory Project (grant no. PCL2021A02).

Author information

Authors and affiliations.

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China

Xurong Zhang, Xiaofeng Wang, Yuan Liu & Zhaohong Deng

Peng Cheng Laboratory, Shenzhen, China

Xiaofeng Wang


Corresponding author

Correspondence to Xiaofeng Wang .

Editor information

Editors and affiliations.

Shanghai University, Shanghai, China

Honghao Gao

Xi’an Jiaotong-Liverpool University, Suzhou, China

Xinheng Wang

Zhejiang University City College, Hangzhou, China

London South Bank University, London, UK

Tasos Dagiuklas

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper.

Zhang, X., Wang, X., Liu, Y., Deng, Z. (2022). An Efficient Scheduling Strategy for Containers Based on Kubernetes. In: Gao, H., Wang, X., Wei, W., Dagiuklas, T. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 460 . Springer, Cham.

Published : 25 January 2023

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-24382-0

Online ISBN : 978-3-031-24383-7

eBook Packages : Computer Science Computer Science (R0)

Kubernetes Cluster for Automating Software Production Environment


1. Introduction

  • creating, destroying, replicating containers,
  • rolling updates of containers,
  • built-in health checks (liveness and readiness probes),
  • autoscaling,
  • redundancy and failover—to make the applications deployed on top of Kubernetes more resilient and reliable,
  • being provider-agnostic—Kubernetes can be deployed on-premises and in the cloud and in both cases it will provide the same set of features, thus, it may be said that Kubernetes unifies the underlying infrastructure,
  • utilizing provider-specific (sometimes referred to as: vendor-specific) features, e.g., AWS load balancer or Google Cloud load balancer,
  • self-healing,
  • service discovery,
  • load balancing,
  • storage orchestration.

2. Production Deployment Requirements—Background and Related Works

  • Central Monitoring: this is helpful when troubleshooting a cluster [ 5 , 8 , 15 , 16 , 17 ].
  • Central Logging: this is a fundamental requirement for any cluster with more than a couple of nodes, pods, or containers [ 8 , 16 , 18 ].
  • Audit: to show who was responsible for which action [ 17 ].
  • High Availability: the authors of [ 8 ] go even further and state that the cluster should be tested for being reliable and highly available before it is deployed into production [ 5 , 8 ].
  • Live cluster upgrades: large Kubernetes clusters with many users cannot afford to be offline for maintenance [ 8 ].
  • Backup, Disaster Recovery: cluster state is represented by deployed containerized applications and workloads, together with their associated network and disk resources; it is recommended to have a backup plan for this data [ 5 , 8 , 17 ].
  • Security, secrets management, image scanning: security at many levels is needed (node, image, pod and container, etc.) [ 5 , 8 , 16 , 17 ].
  • Passing tests, a healthy cluster: ‘if you don’t test it, assume it doesn’t work’ [ 5 , 8 ].
  • Automation and Infrastructure as Code: in a production environment, a versioned, auditable and repeatable way to manage the infrastructure is needed [ 8 , 17 ].
  • Autoscaling: if an application deployed on Kubernetes demands more resources, then a new Kubernetes node should be automatically created and added to the cluster [ 5 ].

2.1. Monitoring as Production Environment Requirement

  • configuring the infrastructure in such a way that it is possible to collect the data,
  • storing the data,
  • providing dashboards, so that data is presented in a clear way,
  • setting up notifications or alarms to let users know about certain events.

2.2. Logging as Production Environment Requirement

2.3. High Availability as Production Environment Requirement

  • Redundancy: means having a spare copy of something. Kubernetes uses Replica Sets or Replication Controllers to provide redundancy for applications deployed on Kubernetes. Five redundancy models were summarized in [ 27 ]. Some of them require an active (running) replica and others a passive (standby) one.
  • Hot Swapping: can be explained as replacing a failed component on the fly, with minimal or ideally zero downtime. Hot swapping is quite easy to implement for stateless applications. For stateful applications, one has to keep a replica of a component (see redundancy).
  • Leader election: a pattern used in distributed systems when many servers fulfil the same purpose to share the load. One of the servers must be elected leader, and certain operations must go through it. When the leader fails, another server can be elected as the new leader. This is a combination of redundancy and hot swapping.
  • Smart load balancing: used to share and distribute the load.
  • Idempotency: means that handling the same request (or operation) more than once has the same effect as handling it exactly once.
  • Self-healing: means that whenever a component fails, the failure is automatically detected and steps are taken (also automatically) to recover from it.
  • Deploying in a cloud: a goal is to be able to physically remove or replace a piece of hardware, either because of some issues or because of preventative maintenance or horizontal growth. Often this is too expensive or even impossible to achieve [ 26 ]. Traditional on-premises deployments forced administrators to do capacity planning (to predict the amount of computing resources needed). Thanks to the on-demand and elastic nature of clouds, the infrastructure can be closely aligned with the actual demand. It is also easy to scale applications deployed in a cloud, because of the fundamental property of the cloud: elasticity [ 28 ].
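The leader election pattern from the list above can be sketched in a few lines. This is a deliberately simplified illustration: the server names and the "smallest healthy name wins" rule are assumptions, and real systems typically rely on a consensus protocol or on leases with timeouts rather than a shared health table.

```python
from typing import Dict, Optional


def elect_leader(healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy server with the smallest name, or None if all are down."""
    candidates = sorted(name for name, ok in healthy.items() if ok)
    return candidates[0] if candidates else None


# Hypothetical pool of redundant servers, all healthy at first.
servers = {"server-a": True, "server-b": True, "server-c": True}
leader = elect_leader(servers)

# The current leader fails; a standby replica is elected in its place.
servers["server-a"] = False
new_leader = elect_leader(servers)
```

The second call shows the combination of redundancy and hot swapping: because spare replicas already exist, a new leader can be chosen without redeploying anything.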

2.4. Automation as Production Environment Requirements

  • Every Change Should Trigger the Feedback Process: means that every change in code should trigger some pipeline and should be tested (including unit tests, functional acceptance tests, non-functional tests). The tests should happen in an environment which is as similar as possible to production. Some tests may run in the production environment too [ 12 , 13 ].
  • Feedback Must Be Received as Soon as Possible: this also involves another rule: fail fast. This guideline suggests that faster tests (or less resource-intensive tests) should run first. If these tests fail, the code does not get promoted to the next pipeline stages, which ensures optimal use of resources [ 13 ].
  • Automate Almost Everything: generally, the build process should be automated up to the point where specific human intervention or a decision is needed. However, there is no need to automate everything at once [ 12 , 13 ].
  • Keep Everything in Version Control: this means that not only application source code but also tests, documentation, database configuration, deployment scripts, etc. should be kept in version control and that it should be possible to identify the relevant version. Furthermore, any person with access to the source code should be able to invoke a single command in order to build and deploy the application to any accessible environment. Apart from that, it should also be clear which version in the version control system was deployed into each environment [ 13 ].
  • If It Hurts, Do It More Frequently, and Bring the Pain Forward: if some part of the application lifecycle is painful, it should be done more often, and certainly not left until the end of the project [ 13 ].
  • Idempotency: the tools used for automation should be idempotent, which means that no matter how many times the tool is invoked, the result should stay the same [ 12 ].
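Idempotency of an automation step is easiest to see in code. The sketch below is a hypothetical "ensure" operation on an in-memory configuration dictionary; the function name and the `replicas` key are illustrative assumptions, not taken from any particular tool.

```python
from typing import Dict


def ensure_setting(config: Dict[str, str], key: str, value: str) -> bool:
    """Bring `config` to the desired state; return True only if a change was made."""
    if config.get(key) == value:
        return False  # already in the desired state, nothing to do
    config[key] = value
    return True


config: Dict[str, str] = {}
first_run = ensure_setting(config, "replicas", "3")   # changes the state
second_run = ensure_setting(config, "replicas", "3")  # repeated call is a no-op
```

The operation describes the desired end state rather than an action to perform, so invoking it any number of times leaves the configuration identical to a single invocation.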

2.5. Security as Production Environment Requirement

  • Ensuring that data is encrypted in transit by using secure API server protocol (HTTPS instead of HTTP) [ 8 ].
  • Ensuring proper user and permissions management by configuring authentication, authorization, security accounts and admission control in API server [ 8 ]. When setting up authorization, it is wise to apply the principle of least privilege. This principle recommends that only the needed resources or permissions should be granted [ 5 ].
  • Using Role-Based Access Control (RBAC) to manage access to a cluster [ 5 , 34 , 35 ].
  • Ensuring security keys management and exchange by implementing for example automated key rotation.
  • Ensuring that used Docker images are neither malicious (deliberately causing some harm) nor vulnerable (allowing some attacker to take control) by keeping them up-to-date and maintaining them instead of using the publicly available ones or by using a private Docker registry [ 8 ].
  • Using minimal Docker images because the fewer programs there are installed in an image, the fewer potential vulnerabilities there are [ 5 ].
  • Maintaining a log or audit system [ 8 ].
  • Using network policies which act in a white list fashion and can open certain protocols and ports [ 8 ].
  • Using secrets. Kubernetes has a resource called: secret, but the problem is that Kubernetes stores secrets as plaintext in etcd. This, in turn, means that steps should be taken in order to limit direct access to etcd [ 8 ].
  • Preferring managed services, because they will have many security measures already implemented [ 5 ].
  • Avoiding running processes as the root user in Docker containers [ 5 ].
  • Using available programs for security scanning [ 5 ].
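The remark about secrets in the list above is worth making concrete. Values in the `data` field of a Kubernetes Secret are base64-encoded, and base64 is an encoding, not encryption. The short sketch below uses a made-up secret value to show that anyone who can read the stored value can recover the original without any key.

```python
import base64

# Hypothetical secret value, encoded the way it would appear in a Secret's `data` field.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Decoding needs no key: read access to the stored value is enough.
decoded = base64.b64decode(encoded).decode("utf-8")
```

This is why limiting direct access to etcd matters: the stored form offers no confidentiality by itself.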

2.6. Disaster Recovery as Production Environment Requirement

  • Recovery Point Objective (RPO),
  • Recovery Time Objective (RTO).
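The two objectives can be checked with simple arithmetic. RPO bounds how much recent data may be lost, measured in time, and RTO bounds how long the service may stay down. The sketch below assumes periodic backups and a measured restore duration; the function names and all figures are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure hits just before the next backup and loses one full interval."""
    return backup_interval <= rpo


def meets_rto(restore_duration: timedelta, rto: timedelta) -> bool:
    """The service must be restored within the recovery time objective."""
    return restore_duration <= rto


# Hypothetical targets: lose at most 4 hours of data, be back within 1 hour.
rpo = timedelta(hours=4)
rto = timedelta(hours=1)

daily_backups_ok = meets_rpo(timedelta(hours=24), rpo)
hourly_backups_ok = meets_rpo(timedelta(hours=1), rpo)
restore_ok = meets_rto(timedelta(minutes=30), rto)
```

In this example daily backups would violate the RPO, while hourly backups and a 30-minute restore satisfy both objectives.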

2.7. Testing as Production Environment Requirement

3. Kubernetes Cluster Production Deployment Requirements and Methods

  • healthy cluster,
  • automated operations,
  • central logging,
  • central monitoring,
  • central audit,
  • high availability,

3.1. Healthy and Usable Cluster

  • the cluster should be tested with Bats-core (i.e., a particular number of worker nodes should be deployed and a particular Kubernetes version should be used),
  • the Kubernetes API Server endpoint should be reachable for the end users under a domain name (i.e., an end user can connect with the cluster using kubectl),
  • it should be tested that an application can be deployed on top of a Kubernetes cluster (i.e., an Apache server will be deployed and tested).
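A check on the number of worker nodes, as in the first requirement above, can be sketched as follows. The source uses Bats-core for this; the Python function below is a stand-in that parses text in the column layout printed by `kubectl get nodes --no-headers` (NAME, STATUS, ROLES, AGE, VERSION). The sample output is made up for illustration.

```python
def count_ready_nodes(kubectl_output: str) -> int:
    """Count nodes whose STATUS column contains the Ready condition."""
    ready = 0
    for line in kubectl_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # STATUS may be compound, e.g. "Ready,SchedulingDisabled".
        if "Ready" in fields[1].split(","):
            ready += 1
    return ready


# Hypothetical output of `kubectl get nodes --no-headers`.
SAMPLE_OUTPUT = """\
ip-10-0-1-10.ec2.internal   Ready                      <none>   5m   v1.16.8
ip-10-0-1-11.ec2.internal   NotReady                   <none>   5m   v1.16.8
ip-10-0-1-12.ec2.internal   Ready,SchedulingDisabled   <none>   5m   v1.16.8
"""

ready_nodes = count_ready_nodes(SAMPLE_OUTPUT)
```

A test would then compare `ready_nodes` with the number of worker nodes requested in the cluster configuration.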

3.2. Automated Operations

  • all the code and configuration needed to deploy a cluster will be stored in a Git repository,
  • the cluster will be configured with a YAML file,
  • a Bash script will be used to create, test and delete a cluster,
  • the Helm tool will be used to automate a test application deployment,
  • it will be possible to choose between two deployment environments: testing and production (templating mechanism with Bash variables will be used).
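Choosing between environments through templating can be sketched as below. The source uses Bash variables for this; the Python `string.Template` version is a swapped-in equivalent, and the template keys, cluster names, and tag values are hypothetical rather than the schema of any real tool.

```python
from string import Template

# Hypothetical cluster configuration template; ${...} placeholders are filled per environment.
CLUSTER_TEMPLATE = Template(
    "metadata:\n"
    "  name: ${cluster_name}\n"
    "  tags:\n"
    "    deployment: ${deployment_tag}\n"
)

# Hypothetical per-environment values.
ENVIRONMENTS = {
    "testing": {"cluster_name": "demo-testing", "deployment_tag": "demo-testing"},
    "production": {"cluster_name": "demo-production", "deployment_tag": "demo-production"},
}


def render_cluster_config(environment: str) -> str:
    """Render the cluster configuration for one of the known environments."""
    return CLUSTER_TEMPLATE.substitute(ENVIRONMENTS[environment])


testing_config = render_cluster_config("testing")
production_config = render_cluster_config("production")
```

Keeping one template and a small table of per-environment values means both environments are built from the same versioned source, which supports the automation requirements listed above.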

3.3. Central Logging, Monitoring and Audit

3.4. Backup

3.5. Capacity Planning and High Availability

3.6. Autoscaling

3.7. Security

4. Available Kubernetes Cluster Deployment Methods

  • self-hosted solutions, on-premises,
  • deployment in a cloud, but not using Managed Services,
  • deployment in a cloud, using Managed Services.
  • using web interface of a particular cloud, e.g., AWS Management Console (supported by AWS),
  • using command-line tools officially supported by a particular cloud, e.g., awscli or eksctl (supported by AWS),
  • using command-line tools designed exactly to deploy a Kubernetes cluster, but not limited to one particular cloud, e.g., kops,
  • using command-line tools, designed for managing computer infrastructure resources, e.g., Terraform, SaltStack.
  • deploying on AWS, using AWS Managed Service (AWS EKS), using eksctl, which is an official AWS-supported tool,
  • deploying on AWS, not using any Managed Service, using kops which is a command-line tool, not officially supported by any cloud, but designed exactly to deploy a Kubernetes cluster.

4.1. Deployment Method: On AWS EKS Managed Service Using Eksctl

4.2. Deployment Method: On AWS Using Kops

4.3. Troubleshooting Any Kubernetes Cluster

  • logs from kube-apiserver, which is responsible for serving the API,
  • logs from kube-scheduler, which is responsible for making scheduling decisions,
  • logs from kube-controller-manager, which manages replication controllers,
  • logs from kubelet, which is responsible for running containers on the node,
  • logs from kube-proxy, which is responsible for service load balancing.

5. Comparison of the Used Methods of Kubernetes Cluster Deployment

5.1. Time of Chosen Kubernetes Cluster Operations

5.2. Additional Steps Needed to Create a Kubernetes Cluster

5.3. Minimal EC2 Instance Type Needed for a Kubernetes Worker Node

  • three pods for a cluster created with eksctl (one pod for testing, one pod for backup and one pod for autoscaler),
  • five pods for a cluster created with kops (one pod for testing, one pod for backup, one pod for autoscaler, two pods for logging).

5.4. Easiness of Configuration

5.5. Meeting the Automation Requirement

5.6. Cost of Chosen Kubernetes Cluster Operations

  • A cluster was created with kops, using the src/kops/cluster.yaml file and central logging was deployed.
  • All the AWS resources decorated with the tag deployment: kops-testing were listed with the following command: aws resourcegroupstaggingapi get-resources --tag-filters Key=deployment,Values=kops-testing
  • The kops cluster was deleted.
  • A cluster was created with eksctl, using the src/eks/cluster.yaml file.
  • All the AWS resources decorated with the tag deployment: eks-testing were listed with the following command: aws resourcegroupstaggingapi get-resources --tag-filters Key=deployment,Values=eks-testing
  • The AWS EKS cluster was deleted.
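The two listing commands in the steps above differ only in the tag value, so they can be generated from one helper. The sketch below only builds the argument list and a printable command string; the helper name is hypothetical, and actually running the command would additionally require the AWS CLI and valid credentials.

```python
import shlex
from typing import List


def tagged_resources_command(tag_key: str, tag_value: str) -> List[str]:
    """Build the argument list for listing AWS resources that carry a given tag."""
    return [
        "aws", "resourcegroupstaggingapi", "get-resources",
        "--tag-filters", f"Key={tag_key},Values={tag_value}",
    ]


kops_cmd = tagged_resources_command("deployment", "kops-testing")
eks_cmd = tagged_resources_command("deployment", "eks-testing")

# Render the command as a single shell-safe string, e.g. for logging.
kops_cmd_text = shlex.join(kops_cmd)
```

Passing the list form to a process runner avoids shell quoting problems, and the tag filter is kept as a single argument with no space after the comma.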

6. Discussion and Conclusions

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Conflicts of Interest

  • Vayghan, L.A.; Saied, M.A.; Toeroe, M.; Khendek, F. Kubernetes as an Availability Manager for Microservice Applications. arXiv 2019, arXiv:1901.04946.
  • The Kubernetes Authors. Kubernetes API Server—Security. 2020. Available online: (accessed on 16 May 2020).
  • The Kubernetes Authors. Kubernetes Official Website. 2020. Available online: (accessed on 16 May 2020).
  • Diouf, G.M.; Elbiaze, H.; Jaafar, W. On Byzantine Fault Tolerance in Multi-Master Kubernetes Clusters. arXiv 2019, arXiv:1904.06206.
  • Arundel, J.; Domingus, J. Cloud Native DevOps with Kubernetes: Building, Deploying, and Scaling Modern Applications in the Cloud; O’Reilly Media: Newton, MA, USA, 2019; ISBN 978-1492040767.
  • Pitchumani, R.; Kee, Y.-S. Hybrid Data Reliability for Emerging Key-Value Storage Devices. In Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST 20), Santa Clara, CA, USA, 24–27 February 2020; USENIX Association: Berkeley, CA, USA, 2020; pp. 309–322, ISBN 978-1-939133-12-0.
  • Netto, H.V.; Lung, L.C.; Correia, M.; Luiz, A.F.; de Souza, L.M.S. State machine replication in containers managed by Kubernetes. J. Syst. Archit. 2016, 73, 53–59.
  • Sayfan, G. Mastering Kubernetes, 2nd ed.; Packt Publishing: Birmingham, UK, 2018; ISBN 978-1788999786.
  • Saito, H.; Lee, H.-C.; Wu, C.-Y. DevOps with Kubernetes; Packt Publishing: Birmingham, UK, 2017; ISBN 978-1-78839-664-6.
  • Mai, K. Building High Availability Infrastructure in Cloud. Bachelor’s Thesis, Metropolia University of Applied Sciences, Helsinki, Finland, 2017.
  • Alshammari, M.M.; Alwan, A.A.; Nordin, A.; Al-Shaikhli, I.F. Disaster Recovery in Single-Cloud and Multi-Cloud Environments: Issues and Challenges. In Proceedings of the 4th IEEE International Conference on Engineering Technologies and Applied Sciences (ICETAS 2017), Salmabad, Bahrain, 29 November–1 December 2017.
  • Morris, K. Infrastructure as Code, 1st ed.; O’Reilly Media: Newton, MA, USA, 2016; ISBN 978-1491924358.
  • Humble, J.; Farley, D. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, 1st ed.; Addison-Wesley Professional: Boston, MA, USA, 2010; ISBN 978-0321601919.
  • Khan, A. Key Characteristics of a Container Orchestration Platform to Enable a Modern Application. IEEE Cloud Comput. 2017, 4, 42–48.
  • Turol, S.; Gutierrez, C.; Matykevich, S. A Multitude of Kubernetes Deployment Tools: Kubespray, Kops, and Kubeadm. Available online: (accessed on 20 May 2020).
  • WeaveWorks. Production Ready Checklists for Kubernetes. Available online: (accessed on 15 April 2020).
  • WeaveWorks. Your Guide to a Production Ready Kubernetes Cluster. Available online: (accessed on 15 April 2020).
  • Uphill, T. DevOps: Puppet, Docker and Kubernetes ; Packt Publishing: Birmingham, UK, 2017; ISBN 978-1788297615. [ Google Scholar ]
  • Milenovi, M. How to Monitor Kubernetes Cluster with Prometheus and Grafana. Available online: (accessed on 15 April 2020).
  • Kubernetes Community. Web UI (Dashboard). Available online: (accessed on 12 June 2020).
  • Netto, H.; Oliveira, C.P.; Rech, L.; Alchieri, E. Incorporating the Raft consensus protocol in containers managed by Kubernetes: An evaluation. Int. J. Parallel Emergent Distrib. Syst. 2020 , 35 , 433–453. [ Google Scholar ] [ CrossRef ]
  • Bakker, P. One Year Using Kubernetes in Production: Lessons Learned. Available online: (accessed on 15 April 2020).
  • Graylog. Graylog Offcial Website. Available online: (accessed on 15 April 2020).
  • Graylog. Improving Kubernetes Clusters’ Efficiency with Log Management. Available online: (accessed on 15 April 2020).
  • Amazon. AWS CloudTrail. Available online: (accessed on 15 April 2020).
  • Cristian, F. Understanding Fault-Tolerant Distributed Systems. Commun. ACM 1993 , 34 , 56–78. [ Google Scholar ] [ CrossRef ]
  • Kanso, A.; Toeroe, M.; Khendek, F. Comparing redundancy models for high availability middleware. Computing 2014 , 96 , 975–993. [ Google Scholar ] [ CrossRef ]
  • Varia, J. Architecting for the Cloud: Best Practices. In AWS Whitepapers ; Amazon Web Services: Seattle, WA, USA, 2010. [ Google Scholar ]
  • Gravier, T.W. What Is RAID and Why Should You Want It? Available online: (accessed on 12 June 2020).
  • Chef Software Inc. Offcial Chef website. Available online: (accessed on 16 April 2020).
  • Red Hat, Inc. Offcial Ansible Website. Available online: (accessed on 16 April 2020).
  • SaltStack, Inc. Offcial SaltStack Website. Available online: (accessed on 16 April 2020).
  • Hashicorp. Terraform Offcial Website. Available online: (accessed on 18 May 2020).
  • Poniszewska-Maranda, A. Modeling and design of role engineering in development of access control for dynamic information systems. Bull. Pol. Acad. Sci. Tech. Sci. 2013 , 61 , 569–580. [ Google Scholar ] [ CrossRef ]
  • Majchrzycka, A.; Poniszewska-Marańda, A. Secure Development Model for mobile applications. Bull. Pol. Acad. Sci. Tech. Sci. 2016 , 64 , 495–503. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Alhazmi, O.H.; Malaiya, Y.K. Evaluating Disaster Recovery Plans Using the Cloud ; Taibah University: Medina, Saudi Arabia, 2013. [ Google Scholar ]
  • Microsoft. Health Endpoint Monitoring Pattern. Available online: (accessed on 17 April 2020).
  • Kubernetes Community. Configure Liveness, Readiness and Startup Probes. Available online: (accessed on 17 April 2020).
  • Bitnami. Apache Helm Chart. Available online: (accessed on 30 May 2020).
  • WeaveWorks. CloudWatch Logging. Available online: (accessed on 27 May 2020).
  • Kops Authors. High Availability (HA). Available online: (accessed on 30 May 2020).
  • Susmel, A. Create a High-Availability Kubernetes Cluster on AWS with Kops. Available online: (accessed on 29 May 2020).
  • Amazon. VPCs and Subnets. Available online: (accessed on 25 May 2020).
  • Kops Authors. Description of Keys in Config and Cluster.spec. Available online: (accessed on 2 June 2020).
  • Amazon. Linux Bastion Hosts on the AWS Cloud. Available online: (accessed on 25 May 2020).
  • Amazon. Amazon EKS Kubernetes Pricing. Available online: (accessed on 18 May 2020).
  • Amazon. What is Amazon EKS? Available online: (accessed on 18 May 2020).
  • Amazon. Amazon EKS Clusters. Available online: (accessed on 19 May 2020).
  • Kops Authors. Kops Project Page on Available online: (accessed on 18 May 2020).
  • Kops Authors. Installing. Available online: (accessed on 18 May 2020).
  • Kops Authors. Getting Started with kops on AWS. Available online: (accessed on 20 May 2020).
  • Kops Authors. Kubernetes Addons and Addon Manager. Available online: (accessed on 20 May 2020).
  • Kubernetes Community. Troubleshoot Applications. Available online: (accessed on 3 June 2020).
  • Kubernetes Community. Using Minikube to Create a Cluster. Available online: (accessed on 10 April 2020).
  • Google. Troubleshooting. Available online: (accessed on 6 June 2020).
  • Kubernetes Community. Debug Clusters. Available online: (accessed on 6 June 2020).
  • Amazon. Amazon EKS Troubleshooting. Available online: (accessed on 6 June 2020).
  • WeaveWorks. Troubleshooting. Available online: (accessed on 6 June 2020).
  • Kubernetes Community. Troubleshoot Clusters. Available online: (accessed on 6 June 2020).
  • Liran Polak. EKS Done Right—From Control Plane to Worker Nodes. Available online: (accessed on 4 June 2020).
  • Zhang, H. Learning Kubernetes on EKS by Doing Part 1—Setting Up EKS. Available online: (accessed on 4 June 2020).
  • WeaveWorks. VPC Networking. Available online: (accessed on 24 May 2020).
  • Amazon. File Containing Hard Limits Set on EKS, Limiting the Number of Pods Allowed for an EC2 Instance Type. Available online: (accessed on 22 May 2020).
  • Kubernetes Community. Advanced Kubernetes Scheduling. Available online: (accessed on 20 June 2020).
  • Gruntwork. Comprehensive Guide to EKSWorker Nodes. Available online:\behavior-when-there-are-node-problems-alpha-feature (accessed on 20 June 2020).
  • Kops Authors. Kops Golang Package Documentation. Available online: (accessed on 20 May 2020).
  • Github User: Schollii. Not All Resources Get Tags from CloudLabels in kops 1.13. Available online: (accessed on 7 June 2020).
  • Github User: Rifelpet. Add CloudLabels Tags to Additional AWS Resources. Available online: (accessed on 7 June 2020).
  • Amazon. Amazon VPC pricing. Available online: (accessed on 8 June 2020).
  • Amazon. Elastic Load Balancing Pricing. Available online: (accessed on 8 June 2020).
  • Amazon. AWS CloudFormation Pricing. Available online: (accessed on 8 June 2020).
  • Amazon. Amazon EC2 Pricing. Available online: (accessed on 21 May 2020).
  • Amazon. Amazon S3 Pricing. Available online: (accessed on 8 June 2020).
  • Amazon. Amazon EBS Pricing. Available online: (accessed on 8 June 2020).


| Operation | Using Kops Method | Using Eksctl Method |
|---|---|---|
| Create a minimal cluster | 6 min 12 s | 19 min 26 s |
| Create a production-grade cluster | 6 min 35 s | 25 min 40 s |
| Test a cluster | 6 min 33 s | 6 min 40 s |
| Delete a cluster | 2 min 23 s | 13 min 25 s |

| Which Cluster | Using Kops Method | Using Eksctl Method |
|---|---|---|
| A minimal cluster | src/kops/cluster-minimal.yaml | src/eks/cluster-minimal.yaml |
| A production cluster | src/kops/cluster.yaml and central logging deployed | src/eks/cluster-minimal.yaml |

| AWS Resource | Cost Using Eksctl Method |
|---|---|
| 1 NAT Gateway | 720 × $0.048 = $34.56 |
| Classic Load Balancer | 720 × $0.028 = $20.16 |
| 2 CloudFormation stacks | 88 × $0.0009 = $0.0792 |
| AWS EKS cluster | 720 × $0.10 = $72 |
| 2 EC2 instances of type: t2.small | 2 × 720 × $0.025 = $36 |
| Keeping files on S3 bucket | $0.023 |

| AWS Resource | Cost Using Kops Method |
|---|---|
| 3 EC2 instances of type: t2.micro | 3 × 720 × $0.0126 = $27.216 |
| 2 EC2 instances of type: t2.small | 2 × 720 × $0.025 = $36 |
| Classic Load Balancer | 720 × $0.028 = $20.16 |
| 6 EBS volumes of type gp2 and 20 GB each | 6 × 20 × $0.11 = $13.2 |
| Keeping files on S3 bucket | $0.023 |
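
The line items in the two cost tables can be summed to compare the methods directly. The following is a minimal sketch of that arithmetic only: the class and method names are hypothetical, the unit prices are copied from the tables as they stand, and the factor 720 is assumed to be the number of hours in a 30-day month.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that re-adds the line items from the two cost tables.
final class ClusterCostTotals {

    /** Sums the line items of one cost table, in US dollars. */
    static double total(Map<String, Double> items) {
        double sum = 0.0;
        for (double cost : items.values()) {
            sum += cost;
        }
        return sum;
    }

    /** Line items of the eksctl table (720 assumed to be hours per month). */
    static Map<String, Double> eksctlItems() {
        Map<String, Double> items = new LinkedHashMap<>();
        items.put("1 NAT Gateway", 720 * 0.048);
        items.put("Classic Load Balancer", 720 * 0.028);
        items.put("2 CloudFormation stacks", 88 * 0.0009);
        items.put("AWS EKS cluster", 720 * 0.10);
        items.put("2 EC2 t2.small instances", 2 * 720 * 0.025);
        items.put("Files on S3 bucket", 0.023);
        return items;
    }

    /** Line items of the kops table. */
    static Map<String, Double> kopsItems() {
        Map<String, Double> items = new LinkedHashMap<>();
        items.put("3 EC2 t2.micro instances", 3 * 720 * 0.0126);
        items.put("2 EC2 t2.small instances", 2 * 720 * 0.025);
        items.put("Classic Load Balancer", 720 * 0.028);
        items.put("6 EBS gp2 volumes, 20 GB each", 6 * 20 * 0.11);
        items.put("Files on S3 bucket", 0.023);
        return items;
    }

    public static void main(String[] args) {
        double eksctl = total(eksctlItems());
        double kops = total(kopsItems());
        System.out.println("eksctl total (USD): " + eksctl);
        System.out.println("kops total (USD):   " + kops);
        System.out.println("difference (USD):   " + (eksctl - kops));
    }
}
```

With the listed prices, the eksctl items add up to roughly $162.82 and the kops items to roughly $96.60, so most of the gap comes from the EKS cluster fee and the NAT Gateway.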

Share and Cite

Poniszewska-Marańda, A.; Czechowska, E. Kubernetes Cluster for Automating Software Production Environment. Sensors 2021 , 21 , 1910.


Research on Kubernetes' Resource Scheduling Scheme


Index Terms

Theory of computation

Theory and algorithms for application domains

Theory of randomized search heuristics




Published in

ACM Other conferences

Association for Computing Machinery

New York, NY, United States

Author Tags

  • Ant colony algorithm
  • Cloud computing
  • Particle swarm algorithm
  • resource scheduling
  • Research-article
  • Refereed limited



Kubernetes research

Research documents on node instance types, managed services, ingress controllers, CNIs, etc.


Comparison of Kubernetes Ingress controllers

The research compares several Ingress controllers for Kubernetes.

Choosing an instance type for a Kubernetes cluster

Research on the trade-offs when choosing an instance type for a Kubernetes cluster

Comparison of Kubernetes managed services

The research compares Kubernetes managed services such as Google Kubernetes Engine (GKE), Elastic Kubernetes Service (EKS) and Azure Kubernetes (AKS).

Comparison of service meshes for Kubernetes

The research compares service meshes for Kubernetes such as Istio, Linkerd and Kuma.


If you spot a typo or an out-of-date spec, leave a comment on the spreadsheet or get in touch at [email protected] .

Are you interested in authoring open-source research?

If you wish to contribute with new comparison, charts, or any other research join the #research channel on Slack .

If you need some ideas, here's a shortlist:

  • How does DigitalOcean Kubernetes compare with the rest of the Kubernetes managed services?
  • How does IBM Cloud Kubernetes Service compare with the rest of the Kubernetes managed services?
  • How does Alibaba's Container Service for Kubernetes compare with the rest of the Kubernetes managed services?
  • What's the average CPU workload for Kubernetes Pods? How does it affect instance types?

What about code contributions?

More research can be unlocked if we can provision several clusters in different cloud providers and run tests.

If you have an idea on how to do that and want to contribute, let's chat on Slack or drop us an email .

The following topics of research are next:

  • Analysis of container runtime interfaces (CRIs)
  • Analysis of container networking interfaces (CNIs)
  • Analysis of ingress controllers
  • Analysis of API gateways
  • Analysis of CI/CD tools

If you have an idea for a research topic or you want to sponsor the research get in touch!




How To Change PDF Paper Sizes With an API in Java

Discover an efficient web API solution Java developers can use to quickly adjust PDFs between common ISO 216 A-Series paper sizes (A0 to A7).

Brian O'Neill

Almost every business around the world works with PDF documents daily in some capacity, and that alone establishes the value of leveraging niche technologies to automate unique PDF workflows. The purpose of this article is to demonstrate an efficient web API solution Java developers can use to quickly adjust PDFs between common ISO 216 A-Series paper sizes (A0 to A7).

Before we get to our demonstration, however, we’ll first take a moment to understand the ISO 216 standard, and we’ll briefly review how PDF file structure handles page sizing to make programmatic adjustments possible.

ISO 216 PDF Paper Size Definition

PDF is the standard digital publishing format for content originally created in dozens of other applications, and it’s also frequently used to format and physically print workplace IDs, advertising pamphlets, and many other materials.

There’s a science to ensuring PDF content corresponds with standard physical printing materials, and that science is laid out by the International Organization for Standardization (ISO). ISO 216 defines a family of paper sizes that share a common aspect ratio of 1 to the square root of 2, so cutting a sheet in half across its long side yields the next size down with the same proportions. The related A-Series, B-Series, and C-Series sizes (the C-Series envelope sizes are specified separately, in ISO 269) make it easy to scale documents without altering the layout, and, most importantly, they help ensure compatibility between different devices (e.g., printers and copiers) across the world. A-Series, broken down here into A0 (largest) to A7 (smallest), is by far the most used paper size series across the world. A4 (210 x 297 mm) is the default page size we’ll find in use for most PDF documents.
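
To make the shared aspect ratio concrete, here is a minimal sketch that derives the A1 to A7 dimensions by repeatedly halving A0. The class name is hypothetical, and the A0 starting size of 841 x 1189 mm is a value brought in from the standard rather than from this article; the rounding-down step is what keeps the results on whole millimeters.

```java
// Hypothetical helper: derives A-Series sizes by halving the long side of A0.
final class ASeriesSizes {

    /** Returns {width, height} in millimeters for A0 (n = 0) through A7 (n = 7). */
    static int[] sizeMm(int n) {
        if (n < 0 || n > 7) {
            throw new IllegalArgumentException("expected 0..7, got " + n);
        }
        int width = 841;   // A0 short side in mm (assumed from ISO 216)
        int height = 1189; // A0 long side in mm (assumed from ISO 216)
        for (int i = 0; i < n; i++) {
            int halved = height / 2; // integer division rounds down to whole mm
            height = width;          // the old short side becomes the new long side
            width = halved;          // half of the old long side becomes the new short side
        }
        return new int[] {width, height};
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 7; n++) {
            int[] size = sizeMm(n);
            System.out.println("A" + n + ": " + size[0] + " x " + size[1] + " mm");
        }
    }
}
```

Calling `ASeriesSizes.sizeMm(4)` gives the 210 x 297 mm quoted above for A4, and `sizeMm(5)` gives 148 x 210 mm for A5.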

PDF File Structure: Defining Page Size

While it’s easier for us humans to think of paper sizes as a named range between A0 and A7, computers don’t need to see it that way. As far as our PDF documents are concerned, the size of each page is simply specified in the MediaBox entry of the page object dictionary. This “box” is an array that defines the boundaries of the physical medium, rather than the digital medium, on which the page is intended to be displayed or printed. The array holds four numbers: conventionally the x and y coordinates of the lower-left corner followed by the x and y coordinates of the upper-right corner, all expressed in points (each point = 1/72 of an inch). To put this in context, the A4 size (210 x 297 mm) mentioned earlier is defined in the MediaBox as `[0, 0, 595, 842]` because 210 mm ≈ 595 points and 297 mm ≈ 842 points.

So, when we make programmatic paper size changes to PDF documents, we need to navigate the PDF file structure to the MediaBox array, and from there, we need to determine the target ISO size by converting the A-Series millimeter definitions into the points-based coordinates the document is prepared to understand.
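
The millimeter-to-point conversion described above can be sketched in a few lines. This is an illustration only: the class and method names are hypothetical, and rounding to whole points is a simplification chosen to match the `[0, 0, 595, 842]` figure quoted earlier (a real PDF library may keep fractional points).

```java
// Hypothetical helper: converts A-Series millimeter sizes into MediaBox values.
final class MediaBoxPoints {

    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    /** Converts millimeters to PDF points, rounded to the nearest whole point. */
    static long mmToPoints(double mm) {
        return Math.round(mm * POINTS_PER_INCH / MM_PER_INCH);
    }

    /** Builds {lowerLeftX, lowerLeftY, upperRightX, upperRightY} with the origin at 0, 0. */
    static long[] mediaBox(double widthMm, double heightMm) {
        return new long[] {0, 0, mmToPoints(widthMm), mmToPoints(heightMm)};
    }

    public static void main(String[] args) {
        System.out.println("A4: " + java.util.Arrays.toString(mediaBox(210, 297)));
        System.out.println("A5: " + java.util.Arrays.toString(mediaBox(148, 210)));
    }
}
```

For A4 this reproduces the 595 and 842 point values from the MediaBox example, and for A5 (148 x 210 mm) it yields 420 and 595 points.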

Open-Source Solution: Change Paper Size

Of course, as usual, we (thankfully) don’t have to write super complex programs from scratch to handle these steps. If we want to go the open-source route, we can use something like Apache PDFBox —  a popular library for manipulating PDFs in a variety of ways, including changing PDF paper sizes — and leverage the PDRectangle class to interact with the MediaBox and thereby define the size of our PDF pages. PDRectangle lets us handle this whole operation with a fairly minimal amount of code: the only hangup we might encounter is how memory is managed in that process. PDFs we’re planning to use for physical printing tend to be large files, and we might find that processing such files at scale in local memory buffers burns up more of our resources than we’re willing to commit. This is where a web API can step in and offer some additional flexibility; it can both simplify the operation and reduce the local processing power required to get the job done.

Web API Solution: Change Paper Size

By using a web API to handle our PDF paper size adjustments, we can offload the bulk of our burdensome PDF file processing to an external cloud-hosted endpoint, and we can simply download the result of the operation when it’s finished. We can also avoid invoking independent classes from a library altogether, instead leveraging simple, readable, and intuitively defined variables limited specifically to handling paper-size operations.

In the below demonstration, we’ll walk through each step required to call a specialized web API that lets us adjust paper size between A0 and A7 with simple string inputs (e.g., entering “A5” changes all page sizes in the document to ISO standard 148 x 210 mm). This is a free solution, and it only requires a free API key to use in perpetuity (800 API calls per month).
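
Because the paper size is passed as a plain string, it can be worth checking the value locally before spending one of the monthly API calls on a request that cannot succeed. The sketch below is a hypothetical client-side helper and not part of the API: it assumes only the A0 through A7 range stated above, and trimming and upper-casing the input is our own convenience, not documented API behavior.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical pre-flight check for the paper size string; not part of any SDK.
final class PaperSizeInput {

    // Accepts exactly "A0" through "A7".
    private static final Pattern A_SERIES = Pattern.compile("A[0-7]");

    /** Returns the trimmed, upper-cased size, or throws if it is outside A0..A7. */
    static String normalize(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("paper size is required");
        }
        String value = raw.trim().toUpperCase(Locale.ROOT);
        if (!A_SERIES.matcher(value).matches()) {
            throw new IllegalArgumentException("unsupported paper size: " + raw);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(normalize(" a5 "));
        try {
            normalize("A8");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```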

Step 1: Install the Maven SDK

To begin structuring our API call, we’ll first need to add repository and dependency information to our pom.xml file (JitPack is used to dynamically compile the library).

Let’s add the following repository reference:

And then let’s add the following dependency reference:

Step 2: Add the Import Statements

We’ll now add the following imports:

Step 3: Configure API Key Authorization

With the below snippet, we’ll set up the API client and configure our API key:

Step 4: Instance the API

In our final step, we’ll create an instance of the API, define our input file and desired paper size (remember, these are values A0 through A7), and call the API:

The try-catch block lets our program handle potential errors gracefully: if the call fails, we get an informative message and a stack trace to help diagnose and resolve the issue.

In this article, we reviewed the relevance of ISO 216 paper size definitions, discussed how PDF file structure stores and represents paper size information independently of ISO A, B, and C Series definitions, and then looked at two solutions (one open-source and one independent web API) for programmatically adjusting PDF A-Series paper sizes.



  1. (PDF) Kubernetes and Docker Load Balancing: State-of-the ...

    Container-. based virtualization has revolutionized cloud computing, and Kubernetes, as a key orchestrator, plays a central role. in optimizing resource allocation, scalability, and. application ...

  2. Kubernetes as an Availability Manager for Microservice Applications

    Kubernetes is a platform for automating the deployment and scaling of containerized applications across a cluster [7]. The Kubernetes cluster has a master-slave architecture. The nodes in a Kubernetes cluster can be either virtual or physical machines. The master node hosts a collection of processes to maintain the desired state of the cluster.

  3. Kubernetes Container Orchestration as a Framework for Flexible and

    Abstract: In this paper we present the design and deployment details for Single Particle Imaging (SPI) experiments data analysis pipeline in Kubernetes infrastructure. We have analyzed various software usage patterns for different payloads. Components of the pipeline software include traditional HPC (MPI-based) applications, applications which require GPU computations, GUI-based software and ...

  4. A survey of Kubernetes scheduling algorithms

    The research papers use diverse algorithms to enhance Kubernetes scheduling. These algorithms are tested on various platforms and environments, such as Spark, MXNet, Kubernetes, Google and TwoSigma's GPU cluster, workloads, Google compute, CPU-GPU, the National Cloud Infrastructure, benchmarks, ProCon, DL2, DRF, Optimus, CBP, PP, scaling, data ...

  5. Benefits, Challenges, and Research Topics: A Multi-vocal Literature

    Kubernetes becomes a necessity so that the software engineering community is well-equipped to support practitioners who use Kubernetes. Objective: The goal of this paper is to inform practitioners and researchers on benefits and challenges of Kubernetes usage by conducting a multi-vocal literature review of Kubernetes.

  6. Kubernetes Architecture, Best Practices, and Patterns

    The field of software engineering, especially the domain of software deployment, has received a strong boost with the arrival of Kubernetes. The aspect of cluster computing, operations, management, networking, governance, resource efficiency, workload consolidation, and deployment gets greatly simplified through Kubernetes. The concept of containerization is being widely accepted and adopted ...

  7. Kubernetes as a Standard Container Orchestrator

    Nowadays, Kubernetes is a leading open-source container orchestration platform that has become the de facto standard. The aim of this paper is to provide a comprehensive overview of the Kubernetes orchestrator and grasp the current research emphasis by using a bibliometric analysis.

  8. Efficient Resource Utilization in Kubernetes: A Review of Load

    open-source container orchestration engine, plays a. pivotal role in automating deployment, scaling, and. management of co ntainerized app lications. This paper. explores the landscape of load ...

  9. Horizontal Pod Autoscaling in Kubernetes for Elastic Container ...

    Kubernetes, an open-source container orchestration platform, enables high availability and scalability through diverse autoscaling mechanisms such as Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler and Cluster Autoscaler. Amongst them, HPA helps provide seamless service by dynamically scaling up and down the number of resource units, called pods, without having to restart the whole ...

  10. Docker and Kubernetes

    The Docker containers allow the developers to package their applications with the required dependencies, such as configurations, frameworks, libraries, and runtimes, into them. This chapter introduce the fundamentals of Docker and Kubernetes to serve as the foundation, where Docker, Kubernetes, and various containerized applications are mentioned. It describes essential concepts related to ...

  11. Modelling performance & resource management in kubernetes

    In this paper, we analyse the performance of Kubernetes achieved through a Petri net-based performance model. Kubernetes is a container management system for a distributed cluster environment.

  12. Kubernetes in IT administration and serverless computing: An ...

    In this paper, a rigorous study on Kubernetes from an administrator's perspective is conducted. In a later stage, serverless computing paradigm was redefined and integrated with Kubernetes to accelerate the development of software applications. Theoretical knowledge and experimental evaluation show that this novel approach can be accommodated ...

  13. An Efficient Scheduling Strategy for Containers Based on Kubernetes

    Based on research on the Kubernetes scheduling policy, this paper proposes a Kubernetes container scheduling policy for collaborative edge computing to address the shortcomings of its single-criteria scheduling mechanism. This strategy comprehensively considers the CPU, memory, bandwidth, disk, and number of pods, automatically calculates multi ...

  14. Kubernetes Cluster for Automating Software Production Environment

    The paper presents the determination and analysis of such requirements and their evaluation in the case of Kubernetes cluster. Next, the paper compares two methods of deploying a Kubernetes cluster: kops and eksctl. ... provides an outlook for future research directions and describes possible research applications. Feature papers are submitted ...

  15. Borg, Omega, and Kubernetes

  16. PDF Research Report: A Formal Model of the Kubernetes Container Framework

    In this paper, we develop a formal model of resource consumption and scaling for containerized microservices deployed and managed by Kubernetes. Although this model abstracts from many aspects of Kubernetes (e.g., self-healing, roll-outs, rollbacks, and storage orchestration), it already allows system deployment

  17. Kubernetes-Oriented Microservice Placement With Dynamic Resource

    Microservices and Kubernetes are widely used in the development and operations of cloud-native applications. By providing automated placement and scaling, Kubernetes has become the main tool for managing microservices. However, existing work and Kubernetes fail to consider the dynamic competition and availability of microservices as well as the problem of shared dependency libraries among ...

  18. Research on Kubernetes' Resource Scheduling Scheme

    Secondly, a large number of papers on cloud computing resource scheduling are reviewed. In this paper, the K8s scheduling model is improved by combining the ant colony algorithm and the particle swarm optimization algorithm. Finally, it is scored, and the node with the smallest objective function is selected to deploy the Pod.

  19. PDF TEXT ONLY Borg, Omega, and Kubernetes

    ... through a monolithic, centralized master. Many of Omega's innovations (including multiple schedulers) have since been folded into Borg. The third container-management system developed at Google was Kubernetes. It was conceived of and developed in a world where external developers were becoming interested in Linux containers, and Google had ...

  20. Kubernetes research

    Comparison of Kubernetes managed services. The research compares managed Kubernetes services such as Google Kubernetes Engine (GKE), Elastic Kubernetes Service (EKS) and Azure Kubernetes Service (AKS).

  21. A Performance Evaluation of Containers Running on Managed Kubernetes

    To free users of the burden of having to configure and maintain complex Kubernetes infrastructures, but still make use of its functionalities, all major Cloud providers are now offering cloud-native managed Kubernetes alternatives. The goal of this paper is to investigate the performance of containers running in such hosted services.

  22. Git-Syncing into Trouble: Exploring Command Injection Flaws in Kubernetes

    Akamai researcher Tomer Peled found a design flaw in Kubernetes' sidecar project git-sync that allows for potential command injection. He'll present these findings at DEF CON 2024. This design flaw can cause either data exfiltration of any file in the pod (including service account tokens) or command execution with the git_sync user privileges. To exploit the flaw, all an attacker needs to ...

  23. Announcing mandatory multi-factor authentication for Azure sign-in

    As recent research by Microsoft shows that multifactor authentication (MFA) can block more than 99.2% of account compromise attacks, making it one of the most effective security measures available, today's announcement brings us all one step closer toward a more secure future.

  24. How To Change PDF Paper Sizes With an API in Java

    These different paper sizes — categorized as A-Series, B-Series, and C-Series — make it easy to scale documents without altering the layout, and, most importantly, they help ensure ...

  25. Containerization: Cloud Computing based Inspiration Technology for

    In this paper, various aspects of containerization are explored and highlighted. The container runtime environment Docker and the container orchestration tool Kubernetes are the focus, and both are deployed to explore the possibilities of containerization adoption, which automates container deployment, scaling and load balancing. ...

  26. Building Modern Clouds: Using Docker, Kubernetes & Google Cloud

    For developing and building a modern cloud infrastructure or DevOps implementation, both Docker and Kubernetes have revolutionized the era of software development and operations. Although the two are different, they unify the process of development and integration, and it is now possible to build any architecture by using these technologies. Docker is used to build, ship and run any application anywhere ...