What is the real-time cloud and how do we get there?
This post was originally published on the Ericsson blog.
Emerging real-time systems such as smart manufacturing demand millisecond latency, something which cannot be guaranteed by today’s cloud services. But what if we could replace these models with a new computing paradigm – ingrained within the network, offering real-time compute? Welcome to the real-time cloud.
The demands on next-generation networks are evolving owing to technology advances in many fields, such as novel forms of human-machine interaction and advanced intelligent algorithms that adapt network behavior and performance to the requirements of the applications. In our 2020 Technology Trends review, we introduce the concept of the network compute fabric. This new computing paradigm exposes the network, the edge computing paradigm and the devices as a single unified integrated execution environment for distributed services. It thus provides an attractive platform for services with extremely low latency requirements.
Real-time cloud: How does it work?
The digitalization of critical systems has the potential to improve efficiency and resource utilization by leveraging compute and data services in the cloud. For example, it would enable the real-time use of data-driven analytics to optimize workflows in manufacturing and process industries, improve efficiency and enable early anomaly detection in air and rail traffic management, as well as accelerate technology adoption in general.
However, the adoption of cloud technologies for time-sensitive and critical services is hampered by performance uncertainties that are inherent in today’s cloud models. While the deterministic behavior of the network is one of its unique selling points, cloud services typically do not provide any real-time guarantees and exhibit non-deterministic performance due to shared compute and network resources. A key property of the network compute fabric is its real-time behavior, made possible by tight integration of network and compute. Consequently, the network compute fabric provides the perfect platform for predictable services through integration of real-time cloud capabilities.
Figure 1. A depiction of how a mission-critical distributed application would look, deployed in the network compute fabric, seamlessly using functionality across device, edge and cloud. It is managed by the underlying real-time compute and networking layer, which ensures that the real-time requirements are satisfied.
We first encountered real-time requirements for cloud services in our work with the 5G Enabled Manufacturing project. Combining 5G and manufacturing enables new use cases and new possibilities, for example monitoring of sensor readings for anomaly detection, manual control and fine-tuning of operations. However, it quickly became evident to us that the current best-effort practices in the cloud were ill-suited for these new use cases. After all, it is on the factory floor that real-time requirements are the most critical, obvious and tangible: materials and components on the assembly line must flow like clockwork and always arrive at a certain point, at a certain time. In addition, safety sensors must trigger within a given time so as to prevent physical harm, and mobile robots and collaborating manipulators must be able to synchronize with high precision.
Requirements for real-time systems
Formally, a real-time system is a computer system that must respond to externally generated events or inputs within a finite, specified time period. This raises other, equally important requirements. For example, such systems must have high availability; otherwise, they are likely to miss their deadlines. It is also crucial that the response is reliable; an incorrect response could trigger the wrong action and, again, cause a deadline to be missed.
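To make the definition concrete, a minimal sketch of deadline monitoring is shown below. The handler and the 10 ms deadline are hypothetical illustrations, not part of any Ericsson system: the point is simply that a real-time system must check elapsed time against a specified bound, not just produce a correct result.

```python
import time

DEADLINE_MS = 10.0  # hypothetical response deadline for an external event


def handle_event(event, handler):
    """Run the handler for an external event and report whether the
    response was produced within the specified deadline."""
    start = time.monotonic()
    result = handler(event)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    met_deadline = elapsed_ms <= DEADLINE_MS
    return result, elapsed_ms, met_deadline
```

In a real deployment this measurement would live in the execution platform rather than in application code, so that deadline misses are observable to the infrastructure as well as to the application.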
The vision for the network compute fabric is to provide real-time services in telecom networks, from sensing to actuation, by providing end-to-end communication between two entities in a deterministic manner. Compare this with today’s cloud models, where it is fairly common to receive a “service unavailable” response due to maintenance or to unexpectedly high demand. This can usually be handled by retrying after a period of time, but for hard real-time systems, such as those found in manufacturing or process industries, a delay of several seconds can be the difference between “situation normal” and catastrophic failure.
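The difference between best-effort retries and a hard deadline can be sketched as follows. This is an illustrative pattern, not an API from any real platform: the caller retries only while the remaining time budget allows, and fails explicitly once the deadline would be exceeded rather than retrying indefinitely.

```python
import time


def call_with_deadline(fn, deadline_s, retry_delay_s=0.05):
    """Retry a flaky call only while the hard deadline budget allows.

    An open-ended retry loop is acceptable for best-effort services,
    but a hard real-time system must fail fast once the budget is
    exhausted so that a fallback (e.g. a local safe state) can act.
    """
    end = time.monotonic() + deadline_s
    while True:
        try:
            return fn()
        except Exception:
            # Give up if another retry could not complete in time.
            if time.monotonic() + retry_delay_s >= end:
                raise TimeoutError("deadline budget exhausted")
            time.sleep(retry_delay_s)
```

Raising a timeout instead of silently retrying is what lets the surrounding system react deterministically, for instance by switching a machine into a safe state.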
Therefore, when designing future real-time cloud services, the deployment and operation of critical cloud services must be transparent to the developer and operations teams, meaning that timing and synchronization must be observable through monitoring tools. The real-time cloud orchestrator manages federated resources, from the edge nodes and the wireless and wired networks to the centralized data center. In addition, a real-time orchestrator must provide timing guarantees on its cloud configuration actions, such as allocation and scheduling of virtual resources, and thus be predictable in itself.
Plotting a course to the real-time cloud
Standard cloud virtualization technologies are a poor fit for real-time services, especially in the context of today’s 5G networks and future 6G networks.
Yet we still face many challenges before we can begin to deploy real-time cloud models. These challenges lie partly in combining high utilization in a shared environment with timing guarantees and low interference between users, and partly in the need to cope with dynamic workloads and changes in infrastructure availability. The communication network and the compute nodes must be tightly integrated, and feedback control is instrumental. Right now, we are investigating new, efficient, and predictable network and compute virtualization technologies, designed to be closely integrated with the communication infrastructure.
The challenges of real-time performance thus require attention to the full software stack, from top-level application models for defining time-sensitive services, down to low-level execution mechanisms. End-to-end timing requirements, from sensing to actuation, need to be made an integral part of the application model and should automatically be translated into resource allocations, e.g. virtual machines and network slices. By leveraging the real-time properties of the network compute fabric, communication service providers can begin to offer critical edge services for everyone, by making it simple to develop applications for the edge and cloud continuum.
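One simple way to picture translating an end-to-end timing requirement into per-component allocations is a proportional budget split. The stage names and weights below are hypothetical; a real orchestrator would derive them from measured worst-case behavior rather than fixed weights.

```python
def split_budget(end_to_end_ms, weights):
    """Split an end-to-end latency budget across pipeline stages in
    proportion to assumed per-stage weights, yielding per-stage
    deadlines that could drive resource allocation."""
    total = sum(weights.values())
    return {stage: end_to_end_ms * w / total for stage, w in weights.items()}


# Hypothetical 10 ms sensing-to-actuation budget, split over four stages.
budget = split_budget(10.0, {"radio": 2, "edge_compute": 5,
                             "backhaul": 2, "actuation": 1})
```

Each per-stage deadline could then be mapped onto a concrete allocation, e.g. a CPU reservation for the edge compute stage or a network slice for the radio segment.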
Here at Ericsson’s cloud research group, we’re investigating cloud technologies which can support this new category of advanced applications and services that require low, predictable latencies and high compute capacity. Notably, we see advances in the Linux SCHED_DEADLINE community.
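Linux’s SCHED_DEADLINE scheduler illustrates the kind of mechanism involved: each task declares a runtime, deadline and period, and the kernel admits a new task only if the total CPU utilization stays within capacity. The sketch below is a simplified version of that admission test (the kernel’s actual check also accounts for multiple CPUs and a configurable utilization cap), not an interface to the scheduler itself.

```python
def admit(running_tasks, new_task):
    """Simplified SCHED_DEADLINE-style admission test: admit the new
    task only if total utilization (runtime/period) stays within the
    capacity of a single CPU.

    Each task is a dict with 'runtime' and 'period' in the same time
    unit; utilization is the fraction of the CPU the task reserves.
    """
    utilization = sum(t["runtime"] / t["period"]
                      for t in running_tasks + [new_task])
    return utilization <= 1.0
```

Rejecting a task up front, rather than letting it degrade everyone’s latency, is exactly the kind of predictability a real-time cloud must offer for shared resources.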
In the industrial setting, it is key that the whole chain, from sensor to action, is real-time. This means that everything needs to be aware of the real-time requirements; the intelligent machines, the services, and the service infrastructure. We are working closely with external and internal stakeholders to develop use cases to identify and refine these requirements.
The technical components that make up the real-time cloud include real-time hypervisors and real-time virtual networks and routers, application execution environments, development toolchains, as well as adaptive and data driven operations technologies to manage the edge and cloud continuum, providing end-to-end latency guarantees.
Real-time cloud and 6G
In 5G and 6G networks, we combine recent advances in wired and wireless technologies, such as TSN (time-sensitive networking) and 5G URLLC (ultra-reliable low-latency communication), with novel ideas on virtualization and cloud orchestration to create a virtualized infrastructure that provides end-to-end deadline guarantees – from sensor to edge node and to the central cloud and back. Dedicated 5G networks provide deterministic wireless connectivity with high availability and reliability, and the real-time cloud extends this setup with a flexible yet predictable compute infrastructure.
Of course, the impact of real-time clouds does not end at manufacturing and processing. For example, 5G is expected to have a significant impact on FinTech, and with the future addition of real-time cloud services, this impact can be expected to increase further. The same is true for AR/VR, which is expected to be a key component in a wide range of applications, from public safety, e.g. connected firefighters, to online gaming. In healthcare, the possibility of reliable remote monitoring and data analysis promises to improve the health and safety of patients. Remote and robotic surgery can be enabled through real-time cloud services which deliver the critical latency and reliability.
The story of real-time cloud has only just begun, and we expect to see new types of execution environments targeting applications with stringent timing requirements. The necessary technology components include improved hypervisors, real-time software switches, deterministic orchestration and end-to-end schedulers. All of these rely on the close integration provided by the network compute fabric. Ericsson will be a key partner driving and evolving real-time cloud services.
Want to learn more? Read our explainer of the network compute fabric.
Find other insights on our network compute fabric page.
Author bio: Johan Eker works with data-driven operations of cloud environments at Ericsson. His research interests include programming language design for parallel hardware, real-time control systems, mobile communications, software design for mobile devices, adaptive resource management, IoT and cloud technology.
Ola Angelsmark leads the Cloud Application Execution Environment research group at Ericsson. One area of interest for the group is industrial edge application infrastructure, with a special focus on mission critical and real-time applications.
During her career at Ericsson, Azimeh Sefidcon has held several technology leading roles. She is now a Research Director and Head of the research area cloud systems and platforms, responsible for Ericsson’s research agenda in the field, with researchers in five countries. She serves on the Board of Directors for the Wallenberg center for quantum technologies (WACQT) and for the Swedish national strategic agenda for the future automation and process industry (PiiA).
Disclaimer: Any views or opinions represented in this blog are personal, belong solely to the blog post authors and do not represent those of ACM SIGBED or its parent organization, ACM.