This dissertation studies load balancing algorithms for many-server systems (with N servers) and focuses on the steady-state performance of load balancing algorithms in the heavy traffic regime. The framework of Stein’s method and (iterative) state-space collapse (SSC) are used to analyze three load balancing systems: 1) load balancing in the Sub-Halfin-Whitt regime with exponential service time; 2) load balancing in the Beyond-Halfin-Whitt regime with exponential service time; 3) load balancing in the Sub-Halfin-Whitt regime with Coxian-2 service time.
When in the Sub-Halfin-Whitt regime, the sufficient conditions are established such that any load balancing algorithm that satisfies the conditions have both asymptotic zero waiting time and zero waiting probability. Furthermore, the number of servers with more than one jobs is o(1), in other words, the system collapses to a one-dimensional space. The result is proven using Stein’s method and state space collapse (SSC), which are powerful mathematical tools for steady-state analysis of load balancing algorithms. The second system is in even “heavier” traffic regime, and an iterative refined procedure is proposed to obtain the steady-state metrics. Again, asymptotic zero delay and waiting are established for a set of load balancing algorithms. Different from the first system, the system collapses to a two-dimensional state-space instead of one-dimensional state-space. The third system is more challenging because of “non-monotonicity” with Coxian-2 service time, and an iterative state space collapse is proposed to tackle the “non-monotonicity” challenge. For these three systems, a set of load balancing algorithms is established, respectively, under which the probability that an incoming job is routed to an idle server is one asymptotically at steady-state. The set of load balancing algorithms includes join-the-shortest-queue (JSQ), idle-one-first(I1F), join-the-idle-queue (JIQ), and power-of-d-choices (Pod) with a carefully-chosen d.