TM’s Autonomous Network Operations: A Self-Healing Network

“RM0 unifi: We’re sorry. Your service no (603XXXXXXXX) may be impacted by a network disruption. Expected recovery on 11/11/2023 12.52AM. Thank you for your patience.”


As a Unifi user, if you ever receive an SMS like the one above, guess what? It’s a glimpse into the future of telecommunications, where disruptions are identified and rectified before you even realise they happened.


The process of addressing network disruptions is quite typical. In a service failure scenario, a customer contacts customer service to report that their service has been disrupted. A field team is then dispatched to resolve the problem and notify customer service that the issue has been fixed. Customer service then gets back in touch with the customer to address any remaining concerns and close the ticket.


But let’s face it – the current long-winded process doesn’t match TM’s vision of powering a Digital Malaysia, and how a digital citizen should be served.


TM’s mission is to enable its customers to access world-class digital services. When it comes to managing our network operations to serve a Digital Malaysia, we asked ourselves – what does a world-class network operation process look like?


The scale of our network is extensive and growing. Presently, we have over 92,000 network elements spread out globally. These include satellites, content nodes, data centres, and cloud infrastructure, as well as approximately 690,000 kilometres of domestic fibre cables and 340,000 kilometres of international submarine cables. This infrastructure generates over 200,000 alarms daily, all managed by our 1,000-strong team of network engineers at TM’s Network Intelligence Centre (NIC).


The work to maintain this huge and expanding network and manage the large number of daily alarms is highly complex. It involves repacking and isolating the issues detected, investigating their root causes, and then progressing to various rectification and resolution processes. Given the complexity of managing and maintaining our network, answering the question of how to operate proactively becomes all the more crucial.


We soon realised that the answer to this question lay in system autonomy, and it prompted us to build a network that not only identifies disruptions, but also intelligently self-heals, leverages optimisation opportunities and seamlessly adapts to ever-increasing capacity demands with little to no intervention from our engineers.


To build this kind of network, we focussed on a few key components:


  1. Automation: Automating routine, manual tasks such as the detection and isolation of network malfunctions frees up our engineers’ valuable expertise for more strategic work, while ensuring that routine tasks are handled with precision and accuracy.


  1. Artificial Intelligence (AI) and Machine Learning (ML): Incorporating AI and ML into our networks ensures effective identification of common trends and behaviours within the network. The network then deploys the right rectification processes without human intervention.


  1. Big Data Analytics: Integrating and correlating all the information flowing through our vast network domains allows for real-time, data-driven decision-making, making the rectification of issues much faster and simpler.


  1. Intelligent Process Management: A network powered by AI, ML and analytics ensures constant improvement by identifying opportunities to continuously simplify and streamline our processes.


  1. Capability-building: Possibly the most important component of building and maintaining autonomous networks is our people. Our engineers need to be equipped with the right tools and skillsets to manage and run our networks to their fullest potential.


While we are still 2 years away from realising a full-stack autonomous architecture of TM’s entire network, the use cases to date have already begun to show significant outcomes for both our users and TM:


  1. Better Customer Experience: Proactive notifications to customers (such as the SMSes from Unifi), together with swift rectification processes enabled by AI, ML and analytics, have significantly improved our users’ experience with our connectivity services, bumping up our Net Promoter Score (NPS) to +40.


  1. Higher, More Meaningful Employee Productivity: Fully automated routine tasks enable our engineers to focus on and upskill through more complex, high-value tasks, which has led to a 40% increase in their productivity levels since 2021. It also future-proofs our engineers by orienting them towards high-value positions such as policy engineering, cloud specialisations, virtualisation, and AI/ML programming.


  1. Cost Optimisation: Various elements of our autonomous network are continuously reducing network downtime and maintenance costs, while increasing the network’s capability to support new services.


In the dynamic landscape of telecommunications, the journey towards autonomy represents a paradigm shift and a revolution in how we envision and manage connectivity. Unifi’s SMS notification is a humble yet powerful example of a future where disruptions are pre-emptively addressed, ensuring uncompromising quality in our customers’ experience.


Our commitment to powering a Digital Malaysia has led us to question the conventional, prompting us to reshape our network operations from reactive to proactive. With the realisation of our autonomous network just a year away, our fusion of technology, foresight, and innovation will continue to drive us in redefining the very essence of telecommunications, powering our digital nation.




Article summary: 

If you are a Unifi user and have ever received an SMS informing you of a service disruption and an expected recovery time, you have experienced a feature of TM’s self-healing network!

With over 92,000 network elements and a 1,000-strong team of engineers managing over 200,000 alarms daily, the work involved in maintaining our network is incredibly complex. We realised that reactively maintaining our network was a slow, unwieldy process. The network of the future should be a proactive and fully autonomous one, capable of predicting issues and maintaining itself with little to no intervention from our engineers.


In building a network of this nature, we focussed on a few components: 

  1. Automation: ensuring routine tasks were handled with precision and accuracy 
  1. Artificial Intelligence (AI) and Machine Learning (ML): to identify common patterns in our network and automatically deploy rectification processes 
  1. Big data analytics: to integrate information across our entire network to streamline and speed up decision-making 
  1. Intelligent process management: to help identify constant opportunities to simplify and streamline our processes 
  1. Capability-building: to equip our engineers with the right skillset to maintain an autonomous network.  


Our vision of shaping a Digital Malaysia has put us on a path to redefine the old ways of telecommunications and is propelling us towards a future driven by innovation, all to provide world-class services fit for the citizens of a digital nation.



