
NFV Service Assurance – In Need of Big Data, Small Data or Both?

Sandra O'Boyle
1/12/2018

Operating NFV-based services and networks is the next priority for progressive communications service providers (CSPs). And it's not easy. In fact, CSPs are telling us that too much time and effort have been spent on VNF onboarding, and far too little time and investment on the reality of operating NFV environments.

NFV is shifting from a technology focus to operations, from "How do we do this?" to "How do we operate this?" A key challenge is integrating NFV network and service management into existing operations so that CSPs can run current networks and NFV cloud networks efficiently together.

How do we run NFV with existing networks and services? How do we assure on-demand services? And how do we offer dynamic customer SLAs? Without these answers, service providers cannot commercialize services and make the business case for moving to NFV and cloud networks.

The risk of not having the answers is that NFV will become a silo, a nice lab project that ends up being sidelined and not having any real impact on the bottom line. Ironically, the main driver behind NFV and SDN is a critical business-oriented one for service providers -- the need to launch personalized services quickly and be able to operate standardized, scalable networks.

More than 100 network operators and service providers worldwide participated in Heavy Reading's NFV Service Assurance and Analytics research study completed in the fourth quarter of 2017.

CSPs across the board say they are grappling with operationalizing NFV. This is not helped by organizational struggles including: internal knowledge and software skills gaps; differences of opinion between network and IT teams on new requirements and how to fill gaps in existing IT systems; as well as a lack of clear industry direction on what's required for NFV service assurance.

Nearly every issue around service assurance is rated a massive/significant challenge by at least 40% of CSPs.

The top five challenges rated "massive" or "significant" by CSPs in operationalizing NFV include:

Table 1: Share of CSPs rating each challenge "massive" or "significant"

Challenge                                        Share of CSPs
Assuring performance of multi-vendor VNFs        59%
Offering dynamic SLAs                            58%
Integration/API issues between OSS and MANO      57%
Handling volume of data from VNFs in real time   54%
Assuring hybrid networks in common platform      52%

    "Where we are today is that service assurance is a typical OSS IT function. It's offline, traditional, slow and not living up to the level where it needs to be, both in terms of automated processes or from a self-optimizing network on the radio side. What we are looking at is how to make this real-time on a granular level, which allows us to follow the sessions and respond and close the loop before a customer is impacted. We see this fitting in with MANO and the Orchestrator but still need a clearer picture on how it all works." -- Tier 1 European service provider

For CSPs that are already deploying NFV in live networks, key challenges include managing interoperability and performance across multiple VNF vendors. The traditional interfaces -- mobile signaling, management systems, and element managers (EMS)/configuration, for example -- are lagging behind. At the cloud layer, there is a virtualized infrastructure manager (VIM) that handles performance of individual VNFs. However, there's a gap in implementation when CSPs combine different vendors' VNFs to deliver a customer-facing service that needs to be provisioned, assured and monitored for end-to-end delivered service quality. CSPs need centralized platforms with real-time actionable data to proactively manage multiple network and service layers.

This is becoming a real service assurance issue with universal CPE platforms, where operators need to move beyond single-vendor SD-WAN VNFs to deploy multiple lightweight VNFs -- firewall, IP PBX, load balancer, application acceleration -- from different vendors.

In this case, CSPs see active testing and monitoring as essential for managing service quality, troubleshooting customer issues and assuring that services work accurately after provisioning or after service reconfiguration, as well as meeting dynamic SLAs.

Service providers also want active virtual probes or test agents to be lightweight, with a small CPU and memory footprint, and able to be containerized, so that active testing can be done non-intrusively without interrupting real-time traffic. CSPs tell us that once you deploy a VNF, you need a highly automated virtual circle or lifecycle, from order management to assurance to re-fulfillment. This has to happen in an orderly, automated fashion, like a well-oiled engine, without any noticeable disruption to the end user.

Active testing is also rated as highly valuable by 62% of CSPs, especially when automated and driven by an NFV orchestrator. A number of providers -- Netrounds, for example -- offer orchestrated, closed-loop assurance with APIs.
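To make the idea of orchestrator-driven active testing concrete, here is a minimal Python sketch of a post-provisioning check. It is an illustration only, not any vendor's API: the `Sla` class, the function names, the threshold values and the "activate"/"rollback" actions are all assumptions made for this example. It assumes a test agent has already sent a handful of probes and recorded a round-trip time for each one, or `None` when the probe was lost.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Sla:
    """Hypothetical SLA thresholds for one customer-facing service."""
    max_latency_ms: float
    max_loss_ratio: float


def service_kpis(rtts_ms: List[Optional[float]]) -> Dict[str, float]:
    """Turn raw active-probe results into service KPIs.

    Each entry is a round-trip time in milliseconds, or None if lost.
    """
    received = [r for r in rtts_ms if r is not None]
    # With no probes at all we cannot vouch for the service: treat as full loss.
    loss_ratio = 1 - len(received) / len(rtts_ms) if rtts_ms else 1.0
    latency = mean(received) if received else float("inf")
    return {"latency_ms": latency, "loss_ratio": loss_ratio}


def verify_after_provisioning(rtts_ms: List[Optional[float]], sla: Sla) -> str:
    """Return the action an (assumed) orchestrator would take next."""
    kpis = service_kpis(rtts_ms)
    within_sla = (
        kpis["latency_ms"] <= sla.max_latency_ms
        and kpis["loss_ratio"] <= sla.max_loss_ratio
    )
    return "activate" if within_sla else "rollback"


if __name__ == "__main__":
    sla = Sla(max_latency_ms=20.0, max_loss_ratio=0.01)
    print(verify_after_provisioning([10.0, 12.0, 14.0, 12.0], sla))
    print(verify_after_provisioning([10.0, 12.0, None, 14.0], sla))
```

In a real deployment the verdict would feed back into the orchestrator through whatever API it exposes. The point of the sketch is only that the decision is made on directly measured service KPIs, with no human in the loop.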

This reflects the industry's eagerness to increase programmability and automation of networks (see Image 1 below). Service providers also want service orchestration to drive automation and process improvement with as little manual intervention as possible to deliver the service, as human intervention is the main source of outages and service or configuration problems.

Image 1: Value of Service Assurance Solutions to CSPs
Source: Heavy Reading NFV Assurance and Analytics Survey, 4Q17; n=105 CSPs

Dr. Stefan Vallin, Director of Product Strategy for Netrounds, argues in his recent paper, "Service Assurance – In Need of Big Data or Small Data?," that data from active testing and monitoring yields detailed, real-time service KPIs, which can be referred to as "small data." This data provides great value by itself, but it is also an enabler for the successful application of big data and AI. Small data obtained from active testing and monitoring directly answers many of the most important service assurance questions, such as, "Are we meeting the level of service quality that we promised?"

Vallin goes on to make the following point: "If you can measure service quality directly, why would you try to reverse engineer it from noisy and incomplete data pulled from the resource layer? We traditionally have low-quality data from the resource layers -- we should put stronger requirements on devices to provide less data that is of higher quality and yields better answers to the questions being asked."

Image 2: Service Assurance Systems Categorized by Relevant Questions/Output
Source: Netrounds

For customer-centric service assurance, service providers need to visualize their end-to-end services, be able to prioritize issues and avoid faults that impact customers, and reduce meaningless data overload. CSPs that focus on selling enterprise services in particular are concerned they risk drowning in data and buckets of alarms that are not correctly prioritized based on customer or service impact. They expect that, in the future, they should stop caring about device alarms in real time and only care about service alarms and monitoring of the service, since the MANO would handle policy/re-route decisions. They would then use big data analytics to correlate across layers for troubleshooting and forensics on recurring faults and root cause analysis -- network failure, memory/CPU issues, fault correlation -- rather than real-time service assurance.

It's also about managing faults more efficiently: moving away from manual trouble ticketing systems and non-time-sensitive alarms, and having network domains remediate and fix their own issues. Only service-level faults that need to be prevented or fixed right away are pushed up, so that fixes for customer-impacting problems can be automated.
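The triage described above can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of any CSP's system: the `Alarm` class, the `triage` function, the 1-to-5 severity scale and the device-to-service mapping are all assumed for the example. Device alarms are escalated only when the device supports a service that monitoring reports as degraded; everything else is left to the network domain to remediate.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Alarm:
    device: str
    severity: int  # assumed scale: 1 = lowest, 5 = highest


def triage(
    alarms: List[Alarm],
    device_to_services: Dict[str, Set[str]],
    degraded_services: Set[str],
) -> Tuple[List[Tuple[Alarm, List[str]]], List[Alarm]]:
    """Split device alarms into service-impacting and domain-local.

    Returns (escalate, local): escalate pairs each alarm with the degraded
    services it touches, ordered by number of impacted services and then
    by severity; local holds alarms left for the domain to fix itself.
    """
    escalate: List[Tuple[Alarm, List[str]]] = []
    local: List[Alarm] = []
    for alarm in alarms:
        impacted = device_to_services.get(alarm.device, set()) & degraded_services
        if impacted:
            escalate.append((alarm, sorted(impacted)))
        else:
            local.append(alarm)
    escalate.sort(key=lambda pair: (-len(pair[1]), -pair[0].severity))
    return escalate, local


if __name__ == "__main__":
    alarms = [Alarm("fw-1", 3), Alarm("lb-1", 5), Alarm("pbx-1", 2)]
    topology = {
        "fw-1": {"vpn-acme", "voice-acme"},
        "lb-1": {"web-beta"},
        "pbx-1": {"voice-acme"},
    }
    escalate, local = triage(alarms, topology, {"vpn-acme", "voice-acme"})
    for alarm, services in escalate:
        print(alarm.device, services)
    print("left to domain:", [a.device for a in local])
```

Note that the highest-severity device alarm in the example (`lb-1`) is not escalated, because no degraded service depends on it. That is the shift in emphasis CSPs describe: priority follows customer and service impact, not raw device severity.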

    "The NFV environment needs to restore customer service itself and be service-aware, manage the customer's service and be responsible for the service." – Tier 1 Asia Pacific service provider

In summary, service providers need both big data and small data to effectively operate NFV networks. There's clearly a role in service assurance for big data analytics, machine learning and AI, with the caveat that it must be based on relevant, high-quality data input rather than massive amounts of low-level data from resource layers. To this end, service providers are starting to work with VNF vendors to manage the volume of data from VNFs in real time and to put stronger requirements on devices to provide less data that is of higher quality and that yields better answers to the questions being asked in Image 2 (above).

Higher quality data will help to train algorithms to become more sophisticated at predicting what's going to happen to prevent outages or degraded services.

The second takeaway is to adopt new data sources that can measure actual delivered service quality in real time and from the customer perspective -- the aforementioned "small data" that can directly provide CSPs with relevant service KPIs. This is the preferable way to understand the service quality and experience in the eyes of the customer, rather than trying to use resource data to generate service KPIs at a higher layer based on information from a lower layer. This is really important when it comes to configuring services, rolling back services quickly, but also being able to "take the pulse" of the customer experience in real time at key moments or on an ongoing basis.

This blog is sponsored by Netrounds. Read Stefan Vallin's full paper here: "Service Assurance – In Need of Big Data or Small Data?"

— Sandra O'Boyle, Senior Analyst, Heavy Reading
