Analysys Mason: Ensuring IP network resilience (2023) (English version) (22 pages).pdf, shared by a member and available to read online.
Perspective
Ensuring IP network resilience
September 2023
Simon Sherrington
Analysys Mason Limited 2023

Contents

1. Executive summary
2. IP network operators must pay more attention to network resilience
2.1 IP networks play a crucial role in underpinning critical applications and services
2.2 Failures can lead to substantial outages
2.3 IP network failures cause significant damage and financial cost
3. Traditional approaches to prepare for and pre-empt network incidents are not optimal
3.1 There are many gaps in organisations' IP resilience assurance strategies
3.2 Organisations face substantial barriers to ensuring IP network resilience
3.3 A new approach is needed
4. Organisations need to ensure network resilience from the network planning stage
4.1 Ensuring network resilience means ensuring that services can continue in extraordinary circumstances
4.2 Operators of IP networks need to design-in resilience
4.3 Combining data spread across multiple silos can support detailed analysis
4.4 Creating a digital twin of the network enables advanced visualisation and evaluation
4.5 Evaluation of IP resilience against a resilience maturity model can show where improvement is needed
4.6 Dynamic testing with real-network data can support IP network refinement
4.7 The digital twin can be used to test architecture change
5. Conclusions and recommendations
5.1 Key recommendations
6. Annex: Huawei resilient network solution
7. About the author

List of figures

Figure 2.1: Types of damage caused by IP network outages: percentage of respondents admitting each damage type
Figure 2.2: Percentage of customers affected by each surveyed company's worst network outage over the last year
Figure 2.3: Total cost (USD) of each surveyed company's worst network outage over the last year
Figure 3.1: Organisational barriers to ensuring IP network resilience
Figure 4.1: IP network resilience by design
Figure 4.2: Digital twin of an IP network
Figure 4.3: Model for evaluating IP resilience
Figure 4.4: IP networks resilience scenario analysis
Figure 4.5: Using the digital twin to analyse network weak points
Figure 6.1: Huawei resilient network solution
Figure 6.2: NetLIVE platform architecture

This perspective was commissioned by Huawei. Usage is subject to the terms and conditions in our copyright notice. Analysys Mason does not endorse any of the vendor's products or services.

This perspective contains the details of information on T/ZGTXXH0722023, IP Network resilience specification for computing network convergence network infrastructure, with contributions from (1) China Academy of Information and Communications Technology; (2) Huawei; (3) Research Institute of China Unicom; (4) Research Institute of China Telecom; (5) Computer Network Information Center, Chinese Academy of Sciences. We received details of the specification from Huawei. We have used reasonable and proper care to cross-check and investigate the material supplied to an appropriate level of detail. Analysys Mason will accept no liability for damages or losses resulting from errors or omissions in materials supplied to us.

1. Executive summary

Most of the digital services used by consumers and businesses run over IP networks, which provide the backbone transport infrastructure for broadband and mobile services and for corporate data networks. The largest national networks operated by telecoms service providers connect thousands of network elements, hundreds of millions of consumer and business customer devices and billions of IoT devices. The IP networks operated by large organisations such as banks underpin the critical digital infrastructure that enables those organisations to function. IP networks interconnect the data centres of the largest internet content and application providers that are used by billions of people daily.

Therefore, when parts of an IP network fail, the implications can be significant. Outages have taken broadband and mobile services offline, cut entire countries off from the internet, prevented financial institutions from processing payments, stopped large social media companies from providing services, and prevented people from making calls to emergency services. Large failures of IP networks can have grave consequences and can also be very costly.
Yet IP network outages happen remarkably frequently. This all clearly demonstrates that organisations need to invest in increasing the resilience of their IP networks.

Ensuring network resilience means being certain that service levels can be maintained to an acceptable level in the context of extraordinary events. These events might include equipment failures, malicious attacks or human error. Ensuring resilience is not the same as monitoring reliability. Nor does it mean ensuring network security. Ensuring resilience means taking a strategic approach to improving the robustness of the network by improving its architecture and configuration, in order to pre-empt and prevent problems. It involves building in resilience by design.

Despite the negative implications of IP network failures, many organisations are not doing everything they can to pre-empt and prevent problems. In response to a survey of
operators of IP networks conducted by Analysys Mason in August 2023¹, fewer than half of respondents (43%) stated they undertake risk analysis or fault simulation, or network element health checks; only 26% stated they simulate attacks on the network; and fewer than 20% said they undertake fault survivability or disaster recovery analysis.

A range of barriers are preventing operators of IP networks from doing more to ensure IP resilience. These include lack of in-house expertise, lack of budget and time, or inability to observe what is happening within the network. Insufficient expertise is particularly important: human errors cause a great number of outages, for instance during system upgrades or system reconfigurations.

Traditional approaches to improving IP network resilience are not working. Organisations need to consider a new approach. Given the scale of the risks, and the potential benefits from improving IP network resilience, operators need to adopt a strategic approach. They need to strive for IP network resilience by design, ensuring that resilience is built in at the network planning phase, and that the network architecture, device configurations, service constructs and operational processes are all designed to avoid problems, or to mitigate them without affecting customers.

¹ The survey comprised 23 respondents, all working for companies that operate IP networks and that have experienced some kind of failure or outage. All these companies also generated at least USD500 million annual revenue, with 22% of the companies generating over USD20 billion in revenue.

Organisations can benchmark their level of resilience and measure the progress of their initiatives to improve the resilience of their IP networks by using a resilience maturity model. The model should detail metrics and assessment criteria specifically designed to ensure network resilience and enable an organisation to evaluate the level of IP resilience for different sections of its network.

IP network
operators should deploy tools and services that support detailed visual analysis of the IP network. IP resilience by design requires a 360° view of equipment, configurations, network topologies, traffic flows and service utilisation. It requires the ability to analyse the potential impact of equipment, system, configuration, traffic or service changes, failures or malicious attacks. Operators of IP networks should invest in a single tool that enables detailed visualisation of a digital twin of the network.

Organisations can use the digital twin of the network to test complex scenarios and evaluate the performance and resilience of the network in the context of a range of threats, problems and failures. By using a digital twin that matches the real-world network, the testing and evaluation can be undertaken in a safe environment, before the live system is altered. The twin network can be used to test new
network architecture and configurations, so that problems can be anticipated and avoided, improving business continuity and business outcomes.

2. IP network operators must pay more attention to network resilience

- Crucial role and increasing complexity of IP networks
- Examples of major outages suffered by operators of IP networks
- Damage and losses caused by outages, including survey data

2.1 IP networks play a crucial role in underpinning critical applications and services

IP networks underpin communications services worldwide, providing the backbone transport infrastructure for broadband and mobile
services and for corporate data networks. IP networks serve huge numbers of devices. The largest national networks operated by consumer service providers connect thousands of network elements, hundreds of millions of consumer and business customer devices, and billions of IoT devices. The IP networks operated by large organisations such as banks underpin the critical digital infrastructures that enable those organisations to function. IP networks interconnect the data centres of the largest internet content and application providers that are used by billions of people daily. IP networks underpin many of the networks that enable emergency services communications, so if they fail, people cannot easily call for assistance in times of emergency.

IP networks are part of the critical infrastructure of any large communications service provider, government or large enterprise. Without IP networks people
cannot access the services on which they rely.

Despite the importance of IP networks in ensuring global digital connectivity, and despite the fact that IP networks are used to support increasing numbers of services (such as streaming services and gaming for consumers, or mission-critical national and business services) that do not tolerate poor network performance, IP networks are best-effort networks, not originally designed to guarantee service levels. They are reasonably robust but are based on an infrastructure designed to enable global signalling, addressing and traffic routing, with decisions made locally based on the information received from other parts of the network. This makes them liable to propagate problems if configured incorrectly. IP networks are designed to use statistical multiplexing and best-effort forwarding. Routes are chosen hop-by-hop (by the router) and based on availability of routes. Traffic is prone to congestion, which can cause delay and packet loss. There is little overall control and visibility.

At the same time, IP networks are becoming increasingly complex. The capabilities of devices range from on-premises user equipment capable of serving a single home or branch office, to multi-gigabit or terabit devices residing in operator and enterprise core networks. IP networks rarely contain equipment from a single vendor. They are typically multi-vendor environments. Devices also vary substantially in the protocols they run (for example IPv4 or IPv6). Corporate networks can run to hundreds of thousands of devices; service provider networks can include millions of devices.

Enterprise and government networks are also changing to accommodate new ways
of architecting and operating IT systems to deliver services. The IP networks being operated by large organisations such as banks are evolving to include applications hosted in multi-cloud environments encompassing public and private cloud solutions, and the applications are managed or delivered from data centres distributed across multiple locations. These distributed cloud and data-centre architectures require responsive, highly resilient and highly secure networks to ensure that services can be accessed by employees at central and remote or branch sites, as well as by customers, 24 hours a day, 7 days a week.

IP networks are also being used in more complex ways. Operators of IP networks are introducing services and applications that require end-to-end visibility and service management. Traffic engineering can be deployed to ensure quality of service for selected customers, services, applications or routes. Multiple overlay systems (for example, IP domain controllers, network management systems and SDN layers) can be used to influence how the network operates or how it manages traffic. However, there remain significant challenges in seeing and understanding what is happening throughout the network
whether at the physical, protocol, slice or service layer. The scale and complexity of IP networks is too great for human minds to be able to understand in sufficient detail. This makes it increasingly difficult to ensure their resilience.

2.2 Failures can lead to substantial outages

IP network failures can cause large-scale and lengthy outages of services that affect millions of users, sometimes for many hours. Public examples of major outages include the following.

- Operator A in Canada (July 2022): a Canada-wide service outage lasting nearly 20 hours caused by a router misconfiguration. The outage affected tens of millions of customers and led to the loss of cable, fixed telephony and wireless network services, including loss of ability for customers to call emergency services.
- Operator B in Japan (July 2022): a huge outage affecting more than 30 million mobile phone users as well as critical business services (such as ATM, delivery and weather systems) for more than 3 days. The outage was caused by a router misconfiguration in the core transport network during routine maintenance, which caused a cascade of problems. The misconfiguration disrupted the location registration function within the VoLTE network (devices must register their locations to make VoLTE calls). This triggered numerous retransmissions, which in turn caused traffic congestion. Distributed processing caused further spread of the congestion. To make matters worse, the subscriber database became overwhelmed. VoLTE nodes and the mobile packet gateway must authenticate for each call; each retransmission led to a new request and led to data inconsistencies at the subscriber database. This triggered more problems. It took Operator B in Japan more than 72 hours to fix the outage.
- In October 2021, Operator C in South Korea suffered a network-wide outage preventing the operation of fixed and mobile services for more than an hour. The loss of service affected schools, health services, financial trading organisations and consumers. The outage was initially attributed to cyber attacks, but the operator subsequently confirmed that a Border Gateway Protocol (BGP) configuration error had caused the downtime.
- In April 2020, a BGP configuration error believed to have been caused by a BGP optimiser led to traffic relating to 8000 prefixes (including those relating to Akamai, Amazon, Cloudflare, Facebook and Google) being routed through Rostelecom's network in Russia and then black holed. Although Rostelecom revoked the routes, they had already been propagated to peers by some other ISPs. Problems continued for around 5 hours.²
- In January 2020, the national provider of Gambia, Gamtel, announced a total internet outage affecting the entire country for more than 8 hours. The failure was caused by a faulty network card on the backup link, itself commissioned due to problems with cable cuts to the main
submarine cable linking the country to the internet.³
- The European Union Agency for Cybersecurity (ENISA) reports on network security incidents in the region. Its report Telecom Security Incidents 2021 states that network operators in Europe reported 168 incidents and the loss of over 5 billion user hours of service in 2021. The main assets affected by security incidents were addressing servers (23%) and switches and routers (18%). Switches and routers have been the largest causes of incidents for 5 years, accounting for 18% of all incidents reported to ENISA between 2017 and 2021.⁴

Outages caused by IP
network problems are not limited to telecoms service providers. In October 2021, Facebook (now Meta) suffered outages of its Facebook, Instagram, WhatsApp and Messenger platforms for billions of users for hours. In a public statement, it said the failures were caused by configuration changes to the backbone routers that co-ordinate traffic between its data centres. IP network problems are also known to have caused outages at financial institutions and other major critical national infrastructure providers, although the root causes are not typically made public.

2.3 IP network failures cause significant damage and financial cost

Analysys Mason conducted a survey of operators of large IP networks in August 2023 to understand the extent and impact of IP network failures. The responses show that IP network failures can have significantly damaging implications for organisations. 74% of the organisations surveyed reported that IP network failures had caused loss of customers, and 73% indicated that IP network outages had damaged customer satisfaction. 57% of the respondents indicated that they had been required to pay compensation to customers as a result of IP network failures.

² Rostelecom's Route Hijack Highlights Need for BGP Security
³ The Gambia's Internet Outage Through an Internet Resilience Lens (internetsociety.org); https:/
⁴ https://www.enisa.europa.eu/publications/telecom-security-incidents-2021?v2=1

Figure 2.1: Types of damage caused by IP network outages: percentage of respondents admitting each damage type⁵

The scale of losses can be substantial. Following its outage in July 2022, Operator A in Canada experienced significant public backlash, and subsequently committed to a 3-year CAD10 billion programme of investment to improve its network resilience. Operator B in Japan also suffered financial consequences. Nearly 2.8 million customers who were unable to use services for more than 24 hours were eligible for a deduction of 2 days of their subscription charge from their monthly bill, and more than 36 million customers had JPY200 deducted from their bills.

Damage caused by IP network failures is not limited to public examples. When asked about the worst IP network failure at their organisation over the last year⁶, the companies surveyed by Analysys Mason indicated
that the worst outages they had experienced over the previous 12 months had affected substantial numbers of customers. Nearly half of respondents said the outage had affected 40% of customers or more. The scale of financial costs of those individual outages often exceeded USD20 million and in one case exceeded USD100 million.

⁵ Question: “What impact have IP network failures had on your business? (IP network failures have never caused this / have sometimes caused this / have often caused this).”
⁶ Question: “Thinking about the worst IP network outage you have suffered over the last year, please estimate:
- How many hours were services down? Enter the number of hours.
- What was the percentage of customer affected? Enter the percentage value.
- What was the total cost (including lost revenue, compensation to customers, and cost to fix)? Enter the cost in units of USD million.”

Figure 2.2: Percentage of customers affected by each surveyed company's worst network outage over the last year⁷
Figure 2.3: Total cost (USD) of each surveyed company's worst network outage over the last year⁸
Source: Analysys Mason

The survey results demonstrate that ensuring IP network resilience must be a critical component of an IP network operator's strategy, or significant damage can be caused.

3. Traditional approaches to prepare for and pre-empt network incidents are not optimal

- Gaps in systems and processes used by organisations with IP networks, including survey data
- Organisational barriers to improving IP network resilience, including survey data

3.1 There are many gaps in organisations' IP resilience assurance strategies

Analysys Mason's survey of IP network operators shows that companies employ a wide range of approaches to ensuring IP network resilience. IP network audits are typically conducted at least once per month (69%), with 26% claiming they conduct such checks on an ongoing basis. This highlights the fact that most companies are at least somewhat proactive in the practice of audit. However, the survey results also reveal significant gaps in auditing processes. Significantly, only around half of all respondents to Analysys Mason's survey reported⁹ that they systematically undertake network topology analysis or IP optimisation analysis, fewer than half (43%) undertake risk analysis or fault simulation, or network element health checks, only 26% simulate attacks on the network, and fewer than 20% undertake fault survivability or disaster recovery analysis.

3.2 Organisations face substantial barriers to ensuring IP network resilience

Companies are hindered in their ability to ensure resilience of their IP networks by a range of organisational and technological factors. Organisational barriers to improving IP network resilience include the lack of in-house skills (cited by more than 50% of all respondents), as well as lack of time and budget. 43% of respondents also stated that preventing IP network outages was not a priority for executives within their company.

Figure 3.1: Organisational barriers to ensuring IP network resilience¹⁰

Technological barriers also prevent companies from improving their IP network resilience. The barriers most commonly cited by the respondents to Analysys Mason's survey included inability to observe IP network behaviour in sufficient detail (74% of respondents) and lack of real-time information (52% of respondents), with a range of other factors (such as lack of insight into topology, service performance and individual device configurations) cited by 43% of respondents.

⁷ Does not sum to 100% due to rounding.
⁸ Does not sum to 100% due to rounding.
⁹ Question: “Which of the following activities do you undertake to pre-empt and prevent failures? Select all that apply.”
¹⁰ Question: “What factors have hindered or limited your ability to prevent IP network outages? Select all that apply from the list of organisational factors.”

3.3 A new approach is needed

It is clear that organisations understand the importance of their IP networks for underpinning their services and their customers' services, and it is clear they are taking a range of measures to sustain services when unexpected events occur. However, traditional approaches to ensuring network resilience are leaving gaps and weak points, and are failing to prevent serious network outages. A new strategy is clearly needed.

Organisations need to consider a broader programme of measures including evaluation and measurement across a wide range of performance indicators, adoption of new resilience evaluation methods, and the introduction of tools enabling improved and much more granular visualisation of what is happening (or could happen) in the network. They should also consider benchmarking the resilience of their network against a structured framework, so they can clearly judge the resilience of their systems.

4. Organisations need to ensure network resilience from the network planning stage

- The need for resilience by design
- The benefits of creating a digital twin
- Using a resilience maturity model to evaluate IP network resilience
- Using the digital twin to evaluate the architecture for weak points and test extraordinary events

4.1 Ensuring network resilience means ensuring that services can continue in extraordinary circumstances

The evidence clearly demonstrates that organisations need to invest in increasing the resilience of their IP networks if they are to avoid the catastrophic failures and damage to their businesses that can result from network outages. Ensuring network resilience means being certain that service levels can be maintained to an acceptable level in the context of extraordinary
events. These events might include equipment failures, malicious attacks or human error.

Ensuring resilience is not the same as monitoring reliability. The network can have a very high level of reliability most of the time but may still fail very badly when things go wrong, especially when problems cascade throughout the infrastructure. Nor does it mean ensuring network security, which is also critical. Ensuring resilience means taking a strategic approach to improving the robustness of the network by improving its architecture and configuration to pre-empt and prevent problems. It involves building in resilience by design so that when security issues occur, or when events happen that cause issues within the network, those issues can be mitigated, contained and resolved in the fastest possible time, with the minimum disruption for services and customers.

4.2 Operators of IP networks need to design-in resilience

It is critical for operators of IP networks to put in place robust strategies that enable them to identify the root causes of problems, to fix issues quickly, to implement architectural or configuration changes that reduce the likelihood of future outages, and to make the IP network more resilient when issues arise. Simply monitoring uptime and service availability levels is not sufficient. Relying on people not to make errors under pressure is not sufficient. Operators of IP networks need a comprehensive programme of data collection, visualisation, analytics and IP network refinement, all benchmarked against clear metrics, and they need to pre-empt causes and model solutions.

Figure 4.1: IP network resilience by design

The starting point for this
process is for operators to understand their current network architecture and configurations, and to be able to visualise them.

4.3 Combining data spread across multiple silos can support detailed analysis

An initial step for IP network operators to take is to gain a detailed understanding of the current state of their IP infrastructure. This requires data collection for individual network elements, IP network topology and behaviour, and services running over the network.

Individual network elements need to be audited for factors such as configuration, location and utilisation levels. The audit must encompass devices at end-user premises as well as those within the core IP network. Given the mix of ages, types and locations of network elements within a large IP network, it is likely that some of the data collection and aggregation will need to be undertaken manually, using a variety of systems.

Configuration is a particular challenge given that thousands of devices might need to be audited. Ideally, the configuration of the devices would be checked against a database of known configuration issues and the process would be undertaken automatically using a software-based approach. The manual alternative would be very time-intensive. This is an important part of any resilience improvement strategy as configuration errors are known to have caused significant errors in IP networks.

The data can then be brought together within a visualisation tool to enable analysis.
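
The automated configuration check described in section 4.3 can be pictured with a minimal sketch. It assumes device configurations are available as plain text and that the "database of known configuration issues" is a list of regular-expression rules. The rule IDs, rule patterns, device names and IOS-style configuration lines are all hypothetical and chosen only for illustration; they are not taken from the report or from any vendor tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    pattern: str           # regular expression tested against the whole config
    must_be_present: bool  # True: flag if missing; False: flag if it appears


# Hypothetical "known configuration issues" database (illustrative only).
RULES = [
    Rule("R1", "Telnet management access is enabled",
         r"^\s*transport input .*\btelnet\b", must_be_present=False),
    Rule("R2", "No NTP server is configured",
         r"^\s*ntp server \S+", must_be_present=True),
]


def audit_config(device, config_text, rules=RULES):
    """Return (device, rule_id, description) for every rule the config violates."""
    findings = []
    for rule in rules:
        found = re.search(rule.pattern, config_text, flags=re.MULTILINE) is not None
        if found != rule.must_be_present:
            findings.append((device, rule.rule_id, rule.description))
    return findings


def audit_inventory(configs):
    """Audit a {device_name: config_text} mapping, in device-name order."""
    findings = []
    for device, text in sorted(configs.items()):
        findings.extend(audit_config(device, text))
    return findings


if __name__ == "__main__":
    # Made-up configurations for two made-up devices.
    sample = {
        "edge-1": "hostname edge-1\nline vty 0 4\n transport input telnet ssh\nntp server 192.0.2.10\n",
        "core-1": "hostname core-1\nline vty 0 4\n transport input ssh\n",
    }
    for finding in audit_inventory(sample):
        print(finding)
```

Keeping the rules as data rather than code is one possible design choice: it lets the issue database grow without changing the audit logic, which matters when thousands of devices need to be checked repeatedly.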

4.4 Creating a digital twin of the network enables advanced visualisation and evaluation

In most cases, the data needed to undertake a detailed assessment of the resilience of an IP network is likely to reside in various data silos and across multiple systems. Analysys Mason's survey shows that few IP operators have a single tool they can use to undertake sophisticated, holistic element, topology, traffic and service analysis. 74% of the companies surveyed stated that inability to observe network behaviour in sufficient detail is a barrier to ensuring and improving IP network resilience. One approach to overcoming this is to use software and advanced visualisation techniques to build a digital twin of the IP network (Figure 4.2).

Figure 4.2: Digital twin of an IP network
Source: Analysys Mason

Creation of a digital twin means integrating all relevant data sets within a single tool and using advanced visualisation techniques to create a holistic digital version of the network. This maps the physical world onto the digital world.
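
As a toy illustration of mapping a physical network onto a digital model, the sketch below represents the topology as an adjacency map and tests, in the model only, which single node failures would cut off part of the remaining network. It is a simplification under stated assumptions: a real digital twin would also carry configurations, traffic and service data, and the node names and function names here are hypothetical.

```python
from collections import deque


def build_twin(links):
    """Return an adjacency map {node: set(neighbours)} from (a, b) link pairs."""
    twin = {}
    for a, b in links:
        twin.setdefault(a, set()).add(b)
        twin.setdefault(b, set()).add(a)
    return twin


def reachable(twin, start, failed):
    """Nodes reachable from `start` when every node in `failed` is down."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in twin[node]:
            if neighbour not in failed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def single_points_of_failure(twin):
    """Nodes whose individual failure disconnects at least one surviving node."""
    weak_points = []
    for candidate in sorted(twin):
        survivors = set(twin) - {candidate}
        if not survivors:
            continue
        start = min(survivors)
        if reachable(twin, start, {candidate}) != survivors:
            weak_points.append(candidate)
    return weak_points


if __name__ == "__main__":
    # Hypothetical topology: two meshed core routers, one aggregation
    # router, and one access router hanging off the aggregation router.
    links = [("core1", "core2"), ("core1", "agg1"),
             ("core2", "agg1"), ("agg1", "access1")]
    print(single_points_of_failure(build_twin(links)))
```

The brute-force loop reruns a breadth-first search once per node, which is adequate for a small example; a production tool would more likely use a dedicated articulation-point algorithm and would model link, site and configuration failures as well.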

The digital twin enables visualisation of all the network elements, physical and logical topologies, traffic utilisation levels and flows. With a detailed understanding of what is happening within the IP network, it is possible to identify weak points, and to identify opportunities to increase the resilience of the network.

The digital twin can also be used to analyse network resilience under a wide range of possible scenarios by simulating malicious attacks, service provider failures, environmental disasters, equipment failures and operational or configuration errors. The simulations will identify additional weak points, or opportunities for optimisation, and the learning from those simulations can be used to make further improvements to the real-world network. Critically, creation of a digital twin enables evaluation
of potential threats and their impact on services, before the changes are made within the network itself.

Human failures are a significant cause of IP network outages. ENISA's report identified human error as the cause of 23% of all incidents it reported, and those incidents were typically catastrophic, accounting for 91% of all hours lost. Analysys Mason's survey of IP network operators also investigated causes of IP network failures and outages. This also confirmed the impact of human activity. While equipment failures feature highly in causes of failure cited by survey respondents, so too do human mistakes; 57% of respondents reported outages caused by errors made during system upgrades and 39% reported outages caused by errors made during reconfigurations.

Using a digital twin could enable IP network operators to avoid some of the human-induced errors. An organisation can test network adjustments, optimisation and maintenance activities in a safe environment before adjustments are made to the live network.

4.5 Evaluation of IP resilience against a resilience maturity model can show where improvement is needed

Organisations can measure the progress of their initiatives to improve the resilience
of their IP networks by tracking their performance against an IP resilience maturity model. One example of an IP resilience maturity model has been developed by the China Institute of Communications (CIC).¹¹ It has drafted a specification that is applicable to all types of IP network operator. The specification is designed to assure resilience during five operational phases: prevention, detection, response, recovery and ongoing adaptation. It envisages five levels of resilience (level 5 is the most resilient) and recommends that different parts of the network should be targeted to achieve different levels of
resilience; for instance, core networks should achieve resilience level 5, whereas standard internet services should achieve resilience level 3 or above. The specification recommends evaluation of resilience in six areas (Figure 4.3).

Figure 4.3: Model for evaluating IP resilience
[Table of resilience levels 1 to 5. Recoverable rows: impact resistance, scored 0, 1, 2, 3 and 4; service impact, with thresholds of 30%, 30%, 20% and 10%. The comparison operators and the remaining cells were lost in extraction.]

Service impact: [...] 30% of customer services affected to fewer than 5% of customers. More-detailed metrics for evaluation cover service quality (packet loss, delay and hop count), and proportions of customers that have been affected.

Recovery speed: time needed to restore services.

Fault limitation: a measurement of how widely faults propagate. To achieve level 5 status, faults are restricted to a single site. Scoring in this area can encompass measurement of how technologies are deployed and the software protocols that have been activated, covering factors such as software and hardware patch deployment; deployment of anti-loop protocols at layers 2 and 3; layer 2 and layer 3 fault domain isolation; and BGP fault isolation.

Evaluation of management structure in areas such as physical separation of business operations,
network management and maintenance in the core network; and physical or logical separation of business operations and network management and maintenance in the access network, as well as account log-in policies.

Visualisation capability: ability of the IP network operator to visualise network topology and network service paths in real time; to monitor routers and interior gateway protocol (IGP) and BGP activity and alarms in real time; and to view real-time network quality across metrics such as latency, bandwidth and packet loss.

The specification sets out detailed scoring mechanisms for each of these areas. With a model that can be used as an evaluation framework, an organisation can assess its starting point and monitor its progress in improving the resilience of the IP network.

4.6 Dynamic testing with real-network data can support IP network refinement

By using a digital twin of the live network, an operator can test the behaviour and resilience of the network under a range of scenarios (Figure 4.4). It becomes possible to visually analyse the risk and impact of extraordinary events that might cause parts or all of the network to fail, or to operate sub-optimally.
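The scenario testing described above can be illustrated with a minimal sketch. The Python below is an illustration under stated assumptions, not a description of any vendor's digital twin: the topology, the node names and the idea of modelling a service as a pair of endpoints that must remain connected are all hypothetical. It fails each node and each link in turn and reports which services lose connectivity, which is one simple way to surface single points of failure.

```python
from collections import deque

# Hypothetical toy topology (adjacency sets); names are illustrative only.
TOPOLOGY = {
    "core1": {"core2", "agg1", "agg2"},
    "core2": {"core1", "agg1", "agg2"},
    "agg1": {"core1", "core2", "access1"},
    "agg2": {"core1", "core2", "access2"},
    "access1": {"agg1"},
    "access2": {"agg2"},
}

# Assumption: a service is a (source, destination) pair that must stay connected.
SERVICES = [("access1", "core1"), ("access2", "core2"), ("access1", "access2")]


def reachable(topology, source, failed_nodes=frozenset(), failed_links=frozenset()):
    """Breadth-first search from source, skipping failed nodes and links."""
    if source in failed_nodes:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in topology[node]:
            if neighbour in failed_nodes or neighbour in seen:
                continue
            if frozenset((node, neighbour)) in failed_links:
                continue
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


def affected_services(topology, services, failed_nodes=frozenset(), failed_links=frozenset()):
    """Return the services whose endpoints can no longer reach each other."""
    return [
        (src, dst)
        for src, dst in services
        if dst not in reachable(topology, src, failed_nodes, failed_links)
    ]


def single_failure_report(topology, services):
    """Fail every node and every link one at a time; keep only harmful failures."""
    report = {}
    for node in sorted(topology):
        broken = affected_services(topology, services, failed_nodes=frozenset([node]))
        if broken:
            report[("node", node)] = broken
    links = {frozenset((a, b)) for a in topology for b in topology[a]}
    for link in sorted(links, key=sorted):
        broken = affected_services(topology, services, failed_links=frozenset([link]))
        if broken:
            report[("link", tuple(sorted(link)))] = broken
    return report


if __name__ == "__main__":
    for element, broken in single_failure_report(TOPOLOGY, SERVICES).items():
        print(element, "->", broken)
```

In this toy topology, losing either access link or either aggregation node disconnects services, whereas losing a single core-to-aggregation link does not, because an alternative path exists. A production digital twin would also model capacity, routing protocol behaviour and configuration, none of which is represented here.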
Figure 4.4: IP network resilience scenario analysis

A variety of scenarios can be tested by introducing a mix of controlled and uncontrolled disturbances into the digital twin network. These can range from the launch of new services or the management of system upgrades to equipment failures, configuration errors or external causes of failure such as malicious activity or environmental disasters. Ideally, the network can be stress-tested under a range of different scenarios, and the results seen visually by the network planners. The performance of different parts of the network can be evaluated to ensure higher levels of resilience in core areas, and sufficient resilience for less important sites or services.

4.7 The digital twin can be used to test architecture change

Many problems arise because IP network architecture has not been designed to maximise resilience. Single points of failure have caused cascading problems that have led to the failure of large sections of an IP network, and of the services it supports. Once the digital twin of the network has been developed, it becomes possible to
review the overall structure of the IP network, with a real-time analysis of topology and the routes used. With advanced visualisation, an IP network operator can undertake a detailed architecture evaluation. It is possible to analyse whether existing structures need to be altered, and whether introducing new hierarchical levels or mesh topologies within the network, or using IGP/BGP protection mechanisms, could improve resilience.

Figure 4.5: Using the digital twin to analyse network weak points
Source: Huawei

An IP network operator could review the impact of any architecture changes against its ability to pre-empt and avoid problems, as well as against its model for resilience improvement, looking at factors such as service impact, fault limitation and recovery speed. The review could additionally encompass an evaluation of the services running over the IP network, including capacity utilisation, a review of service-level agreements (SLAs) and quality of experience.

As weak points are identified, the digital twin can be adapted to determine how changes can pre-empt, prevent or minimise the impact of outages, or ensure swifter recovery. Once the changes have been safely tested in the digital twin, the live network can be adapted.

5. Conclusions and recommendations

The evidence shows that IP network failures and outages can have a significant impact on operators of IP networks and their customers. Organisations that
do not invest in the resilience of their IP infrastructure can suffer damage to reputation, reduced customer satisfaction, loss of revenue or customers, and the requirement to pay compensation to customers. Large-scale outages happen too frequently, and with significant consequences. Traditional approaches to ensuring network resilience are clearly not working. Organisations should consider new methodologies and systems for assuring the resilience of their IP networks.

The benefits of increasing the resilience of the IP network are likely to include improved customer experience, increased
customer satisfaction, and avoided costs and damages. As revenue losses will be avoided, investment in improved IP resilience is also likely to lead to increased revenue. In addition, new tools and approaches can help organisations to improve business continuity and avoid issues as they make important investments such as upgrades from IPv4 to IPv6, new service roll-outs or new network deployments.

5.1 Key recommendations

Organisations should take several steps to ensure IP network resilience.

Recommendation 1. Take a strategic approach to improving IP network resilience and ensure IP resilience by design.

It is evident that many operators of IP networks are suffering negative impacts from IP network failures. It is also evident that there are many additional precautionary measures they can take to pre-empt and avoid or limit IP network outages. This requires IP resilience by design: ensuring that resilience is built in, and that the network architecture, device configurations, service constructs and operational processes are all designed to avoid problems, or to mitigate them without impact on
customers.

Recommendation 2. Assess resilience against a maturity model.

Use of a detailed model that sets out targets for resilience, with clear metrics, steps to take and means of measuring them, can help an organisation improve the resilience of its IP network. The model can be used to assess the starting point and the progress made towards best-in-class resilience.

Recommendation 3. Deploy tools and services that provide a holistic view and visual analysis of the IP network.

IP resilience by design requires a holistic view of equipment, configurations, network topologies, traffic flows and service utilisation. It requires the ability to analyse the potential impact of equipment, system, configuration, traffic or service changes, failures or malicious attacks. Operators of IP networks have access to many data sets on the performance of their IP networks, although these are typically spread across a range of different applications. Operators should invest in a single tool or service that enables them to combine data from all the data sources available to them in a single application. The application should enable detailed visualisation of a digital twin of the network.

Recommendation 4. Undertake detailed scenario testing using the digital twin.

This can enable organisations to evaluate network performance under a range of stress conditions, and to experiment with configuration or architecture changes before making them in the live network. They can evaluate the performance of the network in the context of the resilience maturity model to determine whether adjustments are required.

6. Annex: Huawei resilient network solution

With the digital and intelligent transformation of enterprises, communication infrastructure is becoming more and more important as the backbone of daily operations. However, the continuous development of services and evolution of network architecture inevitably adds to the complexity of the network. As a result, the risks to network resilience increase, and the network's ability to cope with unexpected impacts decreases.

As a leading network equipment and service provider, Huawei is seeking to address industry challenges and explore industry best practices together with partners. Huawei provides solutions that can help IP network providers to ensure their networks are reliable, available and cost-effective.

Huawei's resilient network solution is based on four capabilities: the end-to-end capabilities of the technology (from chips and boards to networks); the database of risks and faults of network operations; the network deployments of leading
industry leaders; and the NetLIVE platform that enables monitoring and network optimisation.

Based on the NetLIVE platform and the Identify, Protect, Detect, Respond and Recover (IPDRR) framework, Huawei has built a set of solutions spanning identification to recovery, with a view to solving the resilience problems
faced by multiple industries, including telecoms, banking and energy (Figure 6.1).

Figure 6.1: Huawei resilient network solution
Source: Huawei

The NetLIVE platform architecture consists of five technical engines and four competence centres. It can be consumed from the centre cloud and the edge cloud (Figure 6.2).

Figure 6.2: NetLIVE platform architecture
Source: Huawei

Four competence centres

Collection centre. Collects, stores and models data for the solution. Makes the data visible to the users.
Network insight centre. Supports the network insight process and provides capabilities such as network management, network insight requirement management, lead management and milestone management. It builds the cross-domain network evaluation capability, integrates the single-domain network evaluation capability, accumulates network evaluation experience and lowers the network evaluation threshold. Network evaluation reports are managed in a structured, online manner and are generated automatically.

Design centre. Supports the completion of network design and mobile network verification activities, outputs scenario-based design documents and verification reports through AI-generated content (AIGC) and intelligent interaction, and supports onsite (remote) delivery and enterprise delivery for capability invoking.

Auxiliary operation centre. Implements and monitors the Reference Architecture, provides service performance optimisation services based on cloud big data and AI analysis capabilities, and helps customers to increase revenue.

Five technology engines

Network resource engine. Restores collected network data and stores the data in a structured manner. It describes the status of an operator's network and continuously records the changes. This engine serves as the basis for other engines, providing a network-specific data foundation for analysis and optimisation. Typical scenarios include network restoration, network modelling and network dynamic database.

Network simulation engine. Provides formal configuration verification, mechanism-driven modelling, data-driven modelling, hybrid modelling and automatic testing capabilities based on the existing simulation capabilities of the single-domain product line. It is used for deduction analysis in network planning, design, test and change scenarios. Typical scenarios include traffic load impact simulation, service routing and traffic simulation.

Machine perception engine. Provides capabilities such as image recognition, video acceptance, defect detection, 3D modelling and rendering, and AR rendering. It is used in scenarios such as engineering surveys, visualising
designs, and quality inspection and acceptance, and can be extended to various xToB scenarios. Typical scenarios include optical cable dumb resource device detection, digital device modelling and measurement, and industry application defect detection.

Analysis and optimisation engine. Provides white-box optimisation (mathematical formulas are used to express objectives and constraints and to find the optimal solution), black-box optimisation (the optimal solution is found by continuously interacting with the simulation system) and big data mining capabilities, which are used to find the optimal solution in scenarios such as data collection, network planning, design and auxiliary operation. Typical scenarios include site selection solution for the combination of two networks, identification of potential home broadband customers and scheduling of digital logistics workshops.

Generative engine of network integration service. Accumulates professional network knowledge and Huawei's knowledge of network integration, provides assistance through interactive interfaces, and improves frontline network planning, design and operation efficiency, user data collection, network planning and design, and onsite operations. Typical scenarios include multi-vendor IP configuration translation, method of procedure (MOP) document generation and intelligent auxiliary design for enterprise campus networks.

Huawei's NetLIVE-based resilient network solution has provided services for some significant customers and is performing well in multiple fields, such as live network configuration check, disaster recovery reconstruction, survivability optimisation, signalling storm prevention and flow control.

7. About the author

Simon Sherrington (Research Director) leads Analysys Mason's new Transport Network Strategies research programme, and its established Telecoms Strategy and Forecast programme. He also has a remit to expand Analysys Mason's research forecasts and cross-programme thought leadership. He has nearly 30 years of experience in the industry, having worked as an analyst, consultant, market researcher and publisher. He has commented and advised on many different aspects of the telecoms business during that time. His CV includes a wide range of
assignments covering fixed and mobile devices and networks, operator strategies and infrastructure evolution, as well as projects encompassing retail and wholesale, and business and consumer services. Simon joined Analysys Mason from Innovation Observatory, a business he founded in 2005 to help clients working
in the telecoms, media, IT and environmental technology sectors. Prior to that, Simon worked for Analysys Mason in a number of roles including Head of Custom Research, and early in his career he worked for CIT Publications (at the time a publisher of telecoms and media reports). Simon holds an LLB from the University of Exeter.¹²

¹² Analysys Mason Limited. Registered in England and Wales with company number 05177472. Registered office: North West Wing Bush House, Aldwych, London, England, WC2B 4PJ.

We have used reasonable care and skill to prepare this publication and are not responsible for any errors or omissions, or for the results obtained from the use of this publication. The opinions expressed are those of the authors only. All information is provided “as is”, with no guarantee of completeness or accuracy, and without warranty of any kind, express or implied, including, but not limited to, warranties of performance, merchantability and fitness for a particular purpose. In no event will we be liable to you or any third party for any decision made or action taken in reliance on the information, including but not limited to investment decisions, or for any loss (including consequential, special or similar losses), even if advised of the possibility of such losses.

We reserve the rights to all intellectual property in this publication. This publication, or any part of it, may not be reproduced, redistributed or republished without our prior written consent, nor may any reference be made to Analysys Mason in a regulatory statement or prospectus on the basis of this publication without our prior written consent.

© Analysys Mason Limited and/or its group companies 2023.