This is the NTT DOCOMO R&D technical blog.

[English version] Experience gladiator spectacles brought to life with cutting-edge technology

This is the day-12 entry of the NTT DOCOMO R&D Advent Calendar 2024. (The Japanese version is here.)

Who we are

Hello, this is Afaf and Henrik from DOCOMO Communications Laboratories Europe, an NTT DOCOMO subsidiary located in Germany, where we get to work with novel and exciting concepts for future mobile communication networks. A topic we have been looking into is “in-network compute”, a concept that can revolutionize our digital experiences in the future. In-network compute can be a crucial building block for enabling virtual reality applications, real-time use of artificial intelligence, and much more. This blog post introduces the concept of in-network compute and what it can bring to our future mobile communication networks.

Introduction

We are approaching a future where lifelike holograms bring distant colleagues into the same space for live teamwork, and where retail shops use smart mirrors to let customers visualize products and receive tailored recommendations. However, this vision implies an unprecedented computational load that will strain traditional centralized Clouds, complicating efforts to satisfy the growing demand for rapid data processing. In light of these demands, the future of telecommunication networking extends beyond providing simple connectivity; it is about integrating compute power into the very fabric of the network. Luckily, In-Network Computing (INC) can deliver at the juncture of these two sides (i.e., networking and computing), enabling network operators to perform computations within their network infrastructure, closer to the data’s origin or destination, thus unlocking the instantaneous experiences that contemporary applications require and that were once challenging to achieve.

What is In-Network Computing (INC)?

INC pioneers a new approach that integrates acceleration capabilities (e.g., SmartNICs, GPUs, DPUs, etc.) right into the network fabric, much closer to the data source and even along the paths that data takes as it flows through the network. This allows data to be processed on the go, reducing the need for the back-and-forth trip to centralized data centers. Spreading compute power from the data source, across different network points, all the way to the core reduces reliance on centralized processing, slashes latency, and boosts scalability and resilience. This is how INC fuels innovative applications, similar to those imagined in the beginning, that network operators can deliver, unlocking fresh business prospects for both operators and customers.

As the potential of INC becomes more apparent, the excitement for innovative use cases and business opportunities grows. Yet realizing INC’s full potential is not without challenges for network operators, such as resource allocation, application placement, scalability, and management intricacies. In today's entry of our advent calendar, we are going to take a closer look at one of the INC use cases.

Practical example

A practical example can enhance our understanding of the INC concept and its applicability. We can examine augmented reality (AR) as a case study to explore how INC applies in a real-world scenario. Imagine standing in the Colosseum in Rome, reading a tour guide brochure that passionately describes the spectacles that once took place there. However, in the empty arena, it is difficult to engage with these narratives. We get it: without gladiators, connecting with the history can be tough.

Luckily, AR can greatly enrich your experience by reconstructing the arena packed with crowds and bringing those gladiators back to life. Much more thrilling, right? Well, getting to this point involves quite a bit of work, which we can summarize in a few essential steps.

First, the 3D objects (e.g., the Colosseum, gladiators, crowd) are loaded (refer to step 1 in the diagram below). This is an intensive task that needs to be conducted on Cloud servers. Your AR glasses then track your head and body movements to figure out where you are in the Colosseum (step 2). Based on this and on the 3D objects created in step 1, the system renders the Colosseum view in real time (step 3), which also requires significant computing power. Finally, the view is sent to your AR glasses (step 4), enabling you to fully immerse yourself in the Colosseum scene, with gladiators battling and crowds cheering (step 5).
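To make the pipeline a bit more concrete, here is a minimal Python sketch of the five steps as plain functions. Every function name and placeholder value is our own illustration, not part of a real AR engine.

```python
# Minimal sketch of the five-step AR pipeline described above.
# All functions are illustrative placeholders, not a real AR engine.

def load_3d_objects():
    # Step 1: heavy asset loading, typically done on Cloud servers.
    return {"colosseum": "static mesh", "gladiators": "animated mesh", "crowd": "animated mesh"}

def track_pose():
    # Step 2: the AR glasses estimate head/body position and orientation.
    return {"position": (12.0, 3.5, -7.2), "orientation": (0.0, 90.0, 0.0)}

def render_view(objects, pose):
    # Step 3: compute-intensive real-time rendering based on assets + pose.
    return f"frame rendered for pose {pose['position']}"

def transmit(frame):
    # Step 4: send the rendered view back to the AR glasses.
    return frame

def display(frame):
    # Step 5: the user sees gladiators battling and crowds cheering.
    print(frame)

objects = load_3d_objects()   # done once, on the Cloud
num_frames = 3                # pretend we render a few frames
for _ in range(num_frames):
    pose = track_pose()       # repeated on every head or body movement
    display(transmit(render_view(objects, pose)))
```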

The entire process must happen within a few milliseconds for you to have a seamless experience of the final rendered scene. End-to-end latency should remain below the hundred-millisecond range to prevent distortions and motion sickness; otherwise, one might prefer to go back to reading the brochure stories about gladiator spectacles. Data from your AR glasses traverses the access network, which transmits it to your operator’s core network (as shown below); the core network identifies your subscription, provides you with connectivity, monitors its quality, and forwards your data along the right path. Finally, your data reaches the Cloud for processing. Such a data transfer may induce latency surpassing the hundred-millisecond threshold. So how can the entire AR rendering pipeline be kept below that threshold, knowing that each step (tracking, rendering, etc.) needs to be performed repeatedly (whenever you change your position or move your head) and without noticeable delay?
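As a back-of-the-envelope illustration of why the round trip to a central Cloud is problematic, the sketch below adds up a purely hypothetical per-frame latency budget; every number is an assumption we picked for the example, not a measurement.

```python
# Hypothetical per-frame latency budget (all values are assumptions).
budget_ms = {
    "glasses -> access network": 10,
    "access -> core network": 15,
    "core -> central Cloud": 40,
    "cloud rendering": 30,
    "return path to glasses": 65,
}
total = sum(budget_ms.values())
print(f"end-to-end: {total} ms")  # 160 ms: already past a comfortable threshold
# Moving rendering from the central Cloud into the network shortens the
# round trip, which is exactly the gap INC aims to close.
```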

Bringing in the in-network computing

This is where INC comes into play. By leveraging computing resources (such as GPUs, SmartNICs, DPUs, etc.) embedded within the network fabric, your network operator can significantly speed up rendering. By offloading certain processing tasks to those processing units in the network, the AR rendering experience can become faster and more realistic.

Typically, the AR experience is provided by an application provider, which handles the development and maintenance of the application, including content and rendering techniques. Your network operator, in turn, supplies the necessary resources for processing and rendering (e.g., network, edge, Cloud, etc.) to ensure you enjoy a high-quality, low-latency experience. The application provider might choose to break the application into different modules based on the processing location (on your AR glasses, in the network, or in the Cloud) to optimize performance and reduce latency, thus maximizing your quality of experience (QoE).

Let us assume the application provider has divided the application into five application modules:

  • Module #1: Tracking the user’s position in the arena and head movements
  • Module #2: Pre-processing the tracking data in the network. This is useful for removing minor, jittery movements, thereby ensuring a more stable head-tracking stream.
  • Module #3: Rendering detailed, static background components (arena walls, etc.)
  • Module #4: Rendering dynamic elements such as moving objects (crowd, gladiators, etc.)
  • Module #5: Correcting the real-time view (eliminating lens distortions, etc.)

The diagram below shows the potential locations for implementing each module. Modules #1 and #5 may not require high processing power and can thus run on the AR glasses, which have limited battery and processing capabilities. Meanwhile, your network operator will receive modules #2, #3, and #4 from the application provider and will need to deploy them in the network and the Cloud.
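If we were to jot this split down in code, it might look like the mapping below; the module names follow the list above, while the data structure itself is just our illustration.

```python
# Illustrative placement of the five application modules (see diagram).
placement = {
    "module_1_tracking": "AR glasses",         # light processing, on-device
    "module_2_preprocessing": "core network",  # smooths jittery tracking data
    "module_3_static_rendering": "network/Cloud",
    "module_4_dynamic_rendering": "network/Cloud",
    "module_5_view_correction": "AR glasses",  # light processing, on-device
}
for module, location in placement.items():
    print(f"{module:28s} -> {location}")
```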

Alright, so far we have discussed the general concept of INC, the operator’s infrastructure that can help speed up the AR processing operations, and the concept of dividing an application into modules. Now it is time to dive into the challenges of implementing this whole idea. Just hang tight with us; there is going to be some new technical jargon to go through, but we will try to explain things in a straightforward manner.

One crucial aspect of INC is for your network operator to determine the optimal placement of these application modules and the appropriate compute and storage resources to allocate to each before deploying them throughout the network. To do so, the network operator can take into consideration (see the toy placement sketch after this list):

  1. requirements and constraints issued by the application provider and communicated to the operator (e.g., the quality of the links between modules, the quality of service (QoS), proximity to the user, etc.),
  2. the available network capacity (e.g., wireless resources available at the access network or at the wired links of the core network),
  3. the compute and storage capacities available in the Cloud and the core network.
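Here is the toy placement sketch mentioned above. It combines the three inputs in a simple greedy loop; all site names, capacities, and module requirements are invented for illustration, and a real placement algorithm would be far more sophisticated.

```python
# Toy placement check combining the three inputs listed above.
# Capacities, requirements, and site names are invented for illustration.

sites = {
    "edge":  {"compute_units": 40,  "to_user_ms": 5},
    "core":  {"compute_units": 120, "to_user_ms": 15},
    "cloud": {"compute_units": 500, "to_user_ms": 60},
}

modules = [
    {"name": "module_2", "compute_units": 10, "max_latency_ms": 20},
    {"name": "module_3", "compute_units": 60, "max_latency_ms": 80},
    {"name": "module_4", "compute_units": 50, "max_latency_ms": 40},
]

for m in modules:
    # Greedy choice: the first site that satisfies both the application
    # provider's latency constraint and the remaining compute capacity.
    for name, site in sites.items():
        if site["to_user_ms"] <= m["max_latency_ms"] and site["compute_units"] >= m["compute_units"]:
            site["compute_units"] -= m["compute_units"]
            print(f'{m["name"]} -> {name}')
            break
    else:
        print(f'{m["name"]} -> no feasible site')
```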

In the diagram below we can observe a more detailed overview of how the deployment of the modules might happen based on the previous requirements. First of all, your AR glasses request network registration from the core network (step 1 in the diagram), which, if you remember, manages your subscription, connection quality, and many other things. In step 2, the AR glasses establish data connectivity; in a 5G system, a session management entity coordinates the data connectivity of your AR session. In step 3, the application function (AF), which has an agreement with your network operator, requests processing power, storage, and the establishment of links within the core network and the Cloud for the five modules. The AF is subscribed to receive information from the core network, so once your AR glasses connect to the core network, the AF receives connectivity information about them. Of course, all this has to respect the aforementioned requirements.

Upon receiving this information, an entity in the core network, called the in-network compute entity, takes responsibility for coordinating network connectivity (in the form of links) and compute resources in order to achieve the desired QoS. For the links, this entity may interact with the session management entity to find suitable ones (step 4). The link capabilities may describe the needed bandwidth between two modules, which indicates the necessary speed for data transmission between them. As for the compute resources, the in-network entity interacts with a resource MANagement and Orchestration platform (in short, MANO) (step 5), which is designed to analyse the various constraints communicated by the in-network entity, decide on the optimal placement of the modules, and reserve the processing power for them (in the core network and the Cloud). The MANO platform then provides the in-network entity with the resource allocation decisions (step 6). In practice, the compute capabilities may be expressed in some kind of standardized data unit, such as gigahertz, floating-point operations per second, a benchmarking score, an amount of video RAM, etc. The reason for using a standardized data unit is to give the MANO platform and the in-network entity a common language which both can understand.
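To illustrate the “common language” point, here is a hypothetical normalization step in which capabilities expressed in different metrics are converted into one abstract compute unit; the conversion factors are invented for the example.

```python
# Hypothetical conversion of heterogeneous capability descriptors into one
# abstract "compute unit" so the MANO platform and the in-network compute
# entity speak the same language. Factors are arbitrary assumptions.
FACTORS = {"ghz": 1.0, "gflops": 0.02, "benchmark_score": 0.001, "vram_gb": 0.5}

def to_compute_units(capability: dict) -> float:
    return sum(FACTORS[k] * v for k, v in capability.items())

gpu_node = {"gflops": 2000, "vram_gb": 16}
cpu_node = {"ghz": 3.5, "benchmark_score": 12000}
print(to_compute_units(gpu_node))  # 48.0 units
print(to_compute_units(cpu_node))  # 15.5 units
```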

When the in-network entity has found good locations with sufficient compute capabilities, as well as links for transferring data between the modules, it responds to the application function as in step 7. The latter then has to contact the MANO platform, as shown in step 8, to deploy the modules in the locations indicated in the response.

Generally, these application modules are deployed in the form of images that run on Virtual Machines (VMs) or OS containers instead of being placed directly on dedicated hardware. This ensures quick deployment and easy adaptation to changes, for example if the MANO platform needs to move the application modules from their current location to another. An image of an application module is onboarded to the MANO in a packaging file which contains a descriptor, the software image, and other artifacts; to keep it simple, this file represents a blueprint that defines how the module needs to be configured and connected to the other modules. A zoom into the MANO platform and its operations is depicted in the diagram below, which also describes, in more detail, how steps 9 and 11 (from the previous diagram) are conducted.
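As a rough idea of what such a packaging file might carry, here is an illustrative descriptor for one module, written as a Python dictionary. The field names and values (including the registry URL) are our own simplification, not a standardized descriptor format.

```python
# Illustrative packaging descriptor for application module #4.
# All field names and values are hypothetical simplifications.
module_4_descriptor = {
    "name": "dynamic-element-renderer",
    "software_image": "registry.example.com/ar/module4:1.0",  # hypothetical
    "runtime": "container",
    "resources": {"vcpu": 8, "memory_gb": 16, "gpu": 1},
    "connectivity": [
        {"peer": "module_2_preprocessing", "min_bandwidth_mbps": 200},
        {"peer": "module_5_view_correction", "min_bandwidth_mbps": 100},
    ],
    "scaling": {"min_instances": 1, "max_instances": 10},
}
print(module_4_descriptor["name"])
```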

We take as an example application module #4, which renders the interactive elements of the gladiators’ spectacle. As we discussed earlier, module #4 needs to be deployed in the core network in the form of containers. The application function sends the descriptor to the MANO platform (step 1 in the diagram above), which consists of several components that we can simplify into three main parts. The first part, the orchestrator, receives the descriptor and, based on the configuration and connectivity requirements, instructs the application manager where to deploy the module. It also shares the connectivity details for module #4 to be connected with the other modules (this is step 2). The infrastructure manager then takes care of allocating the necessary compute, storage, and networking resources for the application.

You may not be the only one using this AR application to attend the spectacle; there might be other visitors of the Colosseum who have also ditched the brochure for a more immersive experience. Now, if it is a peak hour and many visitors are there, your network operator might have to deploy more instances of module #4 to handle all the data traffic (that is what we call scaling up). Conversely, if the number of visitors decreases, the number of instances can be reduced (yep, it is called scaling down), and this is handled by the application manager through the instructions provided by the orchestrator. Connectivity, scaling, etc., impact not just the instances (in this case the containers) but also the underlying resources. If your operator decides to scale down module #4, then compute, storage, and network resources have to be released, and that is where the infrastructure manager is involved. When the number of instances is reduced, the application manager informs the infrastructure manager to release the resources that those instances were using (step 3). The application manager can also request fault and alarm diagnostics related to the infrastructure from the infrastructure manager to prevent issues that could cause the AR application to crash (and for you not to find yourself back in the quiet, empty arena). These are just a few aspects that the MANO platform handles; there is much more happening behind the scenes.
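A toy version of the scaling logic just described might look as follows; the one-instance-per-20-visitors rule and all numbers are assumptions made for the example.

```python
# Toy scaling rule for module #4: one instance per 20 concurrent visitors.
# The threshold and the release step are illustrative assumptions.

def desired_instances(visitors: int, per_instance: int = 20) -> int:
    return max(1, -(-visitors // per_instance))  # ceiling division, at least 1

current = 3
for visitors in (95, 60, 12):  # peak hour, then the crowd thins out
    target = desired_instances(visitors)
    if target > current:
        print(f"{visitors} visitors: scale up {current} -> {target}")
    elif target < current:
        # Scaling down: the application manager tells the infrastructure
        # manager to release the freed compute/storage/network resources.
        print(f"{visitors} visitors: scale down {current} -> {target}, release resources")
    current = target
```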

Here is a non-exhaustive list of the instructions the orchestrator sends to the application manager:

  • deployment instructions,
  • connectivity configurations,
  • scaling instructions,
  • updates and lifecycle management instructions...

The application manager, in turn, communicates to the infrastructure manager:

  • resource allocation needs,
  • connectivity configurations,
  • scaling instructions,
  • resource release instructions,
  • requests for performance data,
  • requests for fault and alarm diagnostics...

Conclusion

We really hope this advent calendar entry helped clear up some of the mystery around the INC concept. The Colosseum AR application was just one example from the multitude of applications that can make use of the INC concept. If you are more interested in AI/ML applications, you can also take advantage of INC to split the layers of your neural network and distribute the computing tasks of those layers across the network. This helps optimize latency instead of relying solely on a centralized Cloud.
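For the curious, here is a minimal sketch of that idea: the first layer of a tiny network runs on the device, while the remaining layers are offloaded to an in-network compute node. The weights and the split point are arbitrary choices for the example.

```python
import numpy as np

# Minimal sketch of splitting a neural network's layers between the device
# and an in-network compute node. Weights and the split point are arbitrary.
rng = np.random.default_rng(0)
W1, W2, W3 = (rng.standard_normal((8, 8)) for _ in range(3))

def on_device(x):
    # First layer runs locally, e.g., on the AR glasses or handset.
    return np.maximum(x @ W1, 0)

def in_network(h):
    # Remaining layers are offloaded to a compute node inside the network,
    # avoiding the longer round trip to a centralized Cloud.
    return np.maximum(h @ W2, 0) @ W3

x = rng.standard_normal(8)
print(in_network(on_device(x)))
```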

And for your network operator, INC does not stop there; it elevates the game by enhancing its own internal processes and management tasks. With compute resources embedded throughout the network, tasks such as finding an optimal placement for the modules of your Colosseum application can be accelerated, and this helps the operator create a network that is more application- and consumer-focused.