4 Gaps of Network Functions Virtualization to Meet Carrier-Grade Expectations

November 03, 2016 | By Avi Dorfman @ Telco Systems


We are pleased to share with you an interesting article contributed by Avi Dorfman, a technology executive with over 20 years of experience in the development and delivery of telecommunications systems.

Avi Dorfman

VP R&D at Telco Systems


Network Functions Virtualization (NFV) is a core structural change in the way telecommunication infrastructure gets deployed. This in turn will bring significant changes in the way that applications are delivered to service providers. NFV will bring cost efficiencies, time-to-market improvements and innovation to the telecommunication industry infrastructure and applications. These gains come from disaggregation: the industry’s traditional approach to application delivery changes from a closed, proprietary, and tightly integrated stack model into an open, layered model, where applications are hosted on a shared, common infrastructure base.

NFV architecture benefits

Flexibility: Service providers looking to quickly deploy new services require a much more flexible and adaptable network -- one that can be easily and quickly installed and provisioned.

Cost: Cost is a top consideration for any operator or service provider these days, even more so now that they see Google and others deploying massive datacenters using off-the-shelf merchant silicon (commoditized hardware) as a way to drive down cost. Cost is also reflected in opex -- how easy it is to deploy and maintain services in the network.

Scalability: To adapt quickly to users' changing needs and provide new services, operators must be able to scale their network architecture across multiple servers, rather than being limited by what a single box can do.

Security: Security has been, and continues to be, a major challenge in networking. Operators want to be able to provision and manage the network while allowing their customers to run their own virtual space and firewall securely within the network.

ETSI NFV Architecture

Issues and challenges

Telecom vendors are currently developing proofs of concept for moving existing network functions to virtualized infrastructure, and quite a few challenges arise during the implementation and deployment phases. Many aspects come into play when the different network functions are deployed in virtualized infrastructure.

1. Meeting Carrier-Grade availability requirements

Present-day carrier network infrastructure provides reliable service and meets the availability requirement of five nines (99.999%). Existing high-capacity servers are designed for IT services and enterprise-class applications, with availability on the order of two to three nines (99% to 99.9%). NFV infrastructure based on standard off-the-shelf servers will not be able to meet carrier-grade availability expectations.
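To put these figures in perspective, the short sketch below converts a number of "nines" into the downtime it allows per year. It is plain arithmetic rather than anything taken from a standard; the function names are illustrative, and a 365-day year is assumed.

```python
# Illustrative arithmetic only: converts an availability level expressed
# as a number of "nines" into the downtime it permits per year.
# Assumption: a 365-day year (leap years ignored).

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def availability_from_nines(nines: int) -> float:
    """Return availability as a fraction, e.g. 3 nines -> 0.999."""
    return 1.0 - 10.0 ** (-nines)


def downtime_minutes_per_year(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for the given nines."""
    return MINUTES_PER_YEAR * 10.0 ** (-nines)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"{n} nines: {availability_from_nines(n):.3%} available, "
              f"{downtime_minutes_per_year(n):.1f} minutes of downtime per year")
```

Under these assumptions, two nines allow about 87.6 hours of downtime per year, three nines about 8.8 hours, and five nines only about 5.3 minutes.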

NFV architecture allows very agile life cycle management, with just-in-time creation of Virtual Machines to host the Virtual Network Functions; in the event of failure, the availability approach relies on spawning new instances. This is quite different from the traditional high availability architecture used in telecom systems.

In the traditional high availability architecture, platform redundancy is used to avoid a single point of failure, and in addition all the hardware and software components are hardened to prevent failures. Application-specific state replication is implemented to ensure continuity of operations. This high availability support will have to be ported to the NFV environment.

2. Limitations of existing network applications

Most of the legacy systems are designed with an assumption that the network application has exclusive access to the hardware resources (CPU, NIC, disk). Resources such as BSP and hardware accelerators are directly controlled by the application.

Examples of possible issues:

Internal task interaction of real-time applications assumes direct control of hardware resources. Sharing the hardware platform with other applications would cause some performance degradation. The virtualization hypervisor provides some level of isolation between different virtual machines; however, it is not the same as running an application directly on the hardware.
Existing network applications would mostly be scalable; however, the architecture may not support dynamic scaling in response to traffic changes. The load distribution algorithm would typically be static and assume that all the installed physical servers are available for use, resulting in under-utilization of each server. Some re-engineering would be needed to make use of the ‘elastic’ nature of the virtual infrastructure.
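The difference between static load distribution and elastic scaling can be sketched as follows. This is a minimal illustration, not a description of any particular product: the function names, the 70% target utilization, and the capacity figures are all assumptions chosen for the example.

```python
import math


def static_share(total_load: float, installed_servers: int) -> float:
    """Static distribution: load is always spread over every installed server."""
    return total_load / installed_servers


def instances_needed(total_load: float,
                     capacity_per_instance: float,
                     target_utilization: float = 0.7,
                     min_instances: int = 1) -> int:
    """Elastic scaling: run only as many instances as the current load needs.

    target_utilization leaves headroom on each instance (assumed value).
    """
    usable_capacity = capacity_per_instance * target_utilization
    needed = math.ceil(total_load / usable_capacity)
    return max(min_instances, needed)


if __name__ == "__main__":
    capacity = 1000.0   # hypothetical sessions/s one server or instance can handle
    installed = 10      # hypothetical number of installed physical servers
    for load in (0.0, 2000.0, 6500.0):
        per_server = static_share(load, installed)
        elastic = instances_needed(load, capacity)
        print(f"load={load:.0f}: static uses {installed} servers at "
              f"{per_server / capacity:.0%} each, elastic uses {elastic} instance(s)")
```

With a load of 2000 sessions/s, the static scheme keeps all ten servers running at 20% utilization, while the elastic scheme needs only three instances and frees the remaining capacity for other functions.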

3. Management of virtualized infrastructure and network applications

The management system of a traditional network typically consists of multiple Element Managers reporting to a Network Management System. Each Network Element is associated with at least one Element Manager. Existing Element Managers assume tight coupling between the Network Element and the platform itself.

Issues in existing management systems:

The operational state of the underlying platform being considered the same as that of the Network Function
The topological view showing Network Elements and Network Functions in their physical form, as a hierarchy of hardware modules, rather than showing the Network Functions as overlaid on the physical infrastructure
The Element Manager Fault Management Function mapping the platform fault directly to the Network Function

The management model will have to be extended to separate the NFV platform and network applications.
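One way to picture this separation is a data model in which the platform and the network function each carry their own operational state, and the function's state is derived from its instances rather than copied from a host. The sketch below is purely illustrative: the class names and the state rules are assumptions, not part of the ETSI NFV information model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class OperState(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


@dataclass
class PlatformHost:
    """A physical server in the NFV infrastructure (hypothetical model)."""
    name: str
    state: OperState = OperState.UP


@dataclass
class VnfInstance:
    """One running instance of a network function, placed on a host."""
    instance_id: str
    host: PlatformHost
    app_healthy: bool = True  # application-level health, independent of host

    def is_serving(self) -> bool:
        return self.app_healthy and self.host.state is not OperState.DOWN


@dataclass
class NetworkFunction:
    """A network function overlaid on the platform via its instances."""
    name: str
    instances: List[VnfInstance] = field(default_factory=list)

    def operational_state(self) -> OperState:
        # Derived from the instances, not mapped 1:1 from any single host.
        serving = sum(1 for inst in self.instances if inst.is_serving())
        if serving == 0:
            return OperState.DOWN
        if serving < len(self.instances):
            return OperState.DEGRADED
        return OperState.UP


if __name__ == "__main__":
    host_a = PlatformHost("host-a")
    host_b = PlatformHost("host-b")
    firewall = NetworkFunction(
        "firewall",
        [VnfInstance("fw-1", host_a), VnfInstance("fw-2", host_b)],
    )
    host_a.state = OperState.DOWN  # platform fault on one host
    print(host_a.state.value, firewall.operational_state().value)
```

In this example a fault on one host leaves the network function degraded rather than down, which is the distinction a tightly coupled Element Manager cannot express.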

4. Integration and testing challenges

Traditionally, different network applications were deployed as distinctly visible entities with well-defined interface reference points. Integration tools such as protocol analyzers and data probes rely on these points of control and observation. Most of the integration tools assume easy access to these interface reference points; however, in the NFV model, one or more Network Functions related to an end-to-end service may be assigned to the same high-end hardware resource, and access to the reference points between them would be restricted. Some virtual switches address this issue by forwarding all the traffic on the virtual NIC to the physical NIC (using the Virtual Ethernet Port Aggregator approach) to facilitate tapping of these interface reference points.

System performance testing under load conditions in the deployment configuration will pose some challenges. In the lab, it will not be possible to create an environment equivalent to that of the actual deployment, as the other VNFs sharing the same infrastructure are either unknown or cannot be instantiated.

Summary

In the last few years, Network Functions Virtualization has captured the attention of operators worldwide. NFV has been making waves as it represents the evolution of networking, promising to virtualise physical network infrastructure and create an environment that’s more adaptable than the legacy systems currently in use. Despite all challenges and issues, NFV is here to stay.

Wolfgang Fleischer 2016-11-05 06:18:17

Hi Avi, I fully concur with the benefits that you have mentioned at the start of your article. I do however not concur with your statements on the four gaps that you mentioned. Your statements are correct, if you assume that one has to carry over the former Telco approach and solutions to the new NFV world. However assuming this is in my opinion a wrong step into the future.

Let's take a look at point 1. Your statement is that the Telecom carrier-grade availability solution will have to be carried over to the NFV world. I do not agree with that. If we take a look at the end customer (the consumer; mobile handset user; etc.), then I believe that we would all agree that that person will be interested in possibly a 99.9999% yearly service availability (besides the unlimited throughput, which everyone wants to see). Well, in reliability and availability engineering we all pretty much learned that there is not a single system that can provide that degree of availability. But we have also understood that we can build compound systems that may be able to achieve that reliability / availability. The reliability / availability figures of the individual components can actually be much lower, even something like 99.9% yearly reliability. Now that, in my opinion, pretty much fits Cloud native deployments with additional add-ons that will not be Telecom-like solutions, but will look different - most probably simpler and therefore easier to handle and to implement. That is actually the strength that Cloud and IT provide to us, but it seems to me that we are not using that adequately, are not taking advantage of it. Why am I pretty confident in what IT and Cloud are providing? Because the financial industry and other industries have long been using those capabilities in order to provide highly reliable services themselves. It seems to me a little that we in Telecoms are hanging on too much to our old solutions and do not understand what the new world can actually provide to us (at much lower costs, while achieving identical results).
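As a rough illustration of the compound-system argument, the sketch below applies the textbook formulas for redundant (parallel) and chained (series) components. It assumes that failures are independent and that failover is instantaneous and perfect, which real deployments only approximate; the function names are illustrative.

```python
def parallel_availability(component_availability: float, replicas: int) -> float:
    """Availability of N redundant replicas where any one can serve.

    Assumes independent failures and perfect, instantaneous failover.
    """
    return 1.0 - (1.0 - component_availability) ** replicas


def series_availability(*availabilities: float) -> float:
    """Availability of a chain in which every element must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


if __name__ == "__main__":
    single = 0.999  # a 99.9% component, as in the comment above
    for n in (1, 2, 3):
        print(f"{n} replica(s): {parallel_availability(single, n):.7%}")
    # Two chained functions, each built from two redundant replicas.
    pair = parallel_availability(single, 2)
    print(f"chain of two redundant pairs: {series_availability(pair, pair):.7%}")
```

Under these idealized assumptions, two redundant 99.9% components already reach about 99.9999%. Correlated failures and imperfect failover reduce that figure in practice.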

I do agree with your point 2. and would just want to add that we operators DO EXPECT equipment manufacturers to actually take an approach where the network functions are re-architected to cater for Virtualization and Cloud deployment. Without those re-architecting efforts we will not be able to harvest the benefits that you have been talking about in the introduction.

On your point 3. regarding Management, I would like to add that the management model will not have to be enhanced; it will need to change to cater for the changed environment. For instance, the EMS has been a facility that permitted operations people to interfere with the network as they saw fit in order to provide the quality desired and contracted through SLAs. Now for the future, I would think that most of us agree that we will see an explosion in terms of connected instances (sessions) across our networks. For the sheer amount of devices (or sessions) and for the flexibility and self-determinism that we want to give to our customers, we will have to realize that a human-centric operations model will no longer be adequate a few years from now. So the real challenge will be "how to change the management model in order to fit the potentialities that NFV has provided to us". In our company we have already concluded that an enhancement of the existing management models will not suffice to deal with the changes that come about.

Point no 4. is a real challenge to our (Telco) thinking. Yes, we will probably no longer be able to monitor reference interfaces that are realized within the same compute resource. But I would put up the question: do we really need to monitor that interface? Do we need to monitor it at all times? Well, if we could agree that it does not need to be monitored at all times, then there are already some means that OTT developers and IT practices provide us with: pull the data you need, when you need them; the data are locally available in the applications / functions themselves. On the other hand, I would find it much more important to learn how we actually can monitor the performance of services directly, i.e. with means and measures that act on the same level as the service, and might even be integrated with the service itself. Currently we are still relying on information from lower layers (infrastructure) to tell us (actually to compute) how the upper layers (services) are performing. I can tell you that that does not work well in our company. Enterprise services that we are using ourselves perform perfectly in our monitoring systems, but do not perform at all on service level (service is unusable). Due to that gap between monitoring and service delivery, we have not been able to resolve the issue, as operations does not see the issue and internal customers can't really prove it.

We are already running a substantial number of functions based on NFV in our networks. So I would even state: NFV has arrived. Running a hybrid environment is not as much of an issue as people say it is, as long as it is not automated. The real challenge is to support automation once NFV has been addressed, and resolving this will keep us busy over the coming years. My statement here is: the simpler the overall environment (i.e. the fewer Telecom-specific solutions have been implemented), the easier the automation of the networks will become. Please note that I am not negating the need for very good service perception to customers and consumers (although there are changes too, if you look at the digital generation becoming working adults).

Kind regards
