TY - GEN
T1 - Parallelizing packet processing in container overlay networks
AU - Lei, Jiaxin
AU - Munikar, Manish
AU - Suo, Kun
AU - Lu, Hui
AU - Rao, Jia
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/4/21
Y1 - 2021/4/21
N2 - Container networking, which provides connectivity among containers on multiple hosts, is crucial to building and scaling container-based microservices. While overlay networks are widely adopted in production systems, they cause significant performance degradation in both throughput and latency compared to physical networks. This paper seeks to understand the bottlenecks of in-kernel networking when running container overlay networks. Through profiling and code analysis, we find that a prolonged data path, due to packet transformation in overlay networks, is the culprit of performance loss. Furthermore, existing scaling techniques in the Linux network stack are ineffective for parallelizing the prolonged data path of a single network flow. We propose Falcon, a fast and balanced container networking approach to scale the packet processing pipeline in overlay networks. Falcon pipelines software interrupts associated with different network devices of a single flow on multiple cores, thereby preventing execution serialization of excessive software interrupts from overloading a single core. Falcon further supports multiple network flows by effectively multiplexing and balancing software interrupts of different flows among available cores. We have developed a prototype of Falcon in Linux. Our evaluation with both micro-benchmarks and real-world applications demonstrates the effectiveness of Falcon, with significantly improved performance (by 300% for web serving) and reduced tail latency (by 53% for data caching).
UR - http://www.scopus.com/inward/record.url?scp=85105264529&partnerID=8YFLogxK
U2 - 10.1145/3447786.3456241
DO - 10.1145/3447786.3456241
M3 - Conference contribution
AN - SCOPUS:85105264529
T3 - EuroSys 2021 - Proceedings of the 16th European Conference on Computer Systems
SP - 261
EP - 276
BT - EuroSys 2021 - Proceedings of the 16th European Conference on Computer Systems
PB - Association for Computing Machinery, Inc
T2 - 16th European Conference on Computer Systems, EuroSys 2021
Y2 - 26 April 2021 through 28 April 2021
ER -