Delve into SONiC’s updates at OCP Summit 2024. This session highlights the past year’s pivotal software and community advancements, the breakthrough features of SONiC 202405 and 202411, and the insights on AI networking that are redefining Ethernet networking. Join us to explore the community’s growth and the full potential of open, innovative networking.
Yanzhao Zhang is the SONiC PM manager at Microsoft, driving the public SONiC community and managing the SONiC product roadmap and adoption on Azure cloud. Yanzhao holds bachelor’s and master’s degrees in computer science and has more than 20 years of software development and management...
Large-scale AI models impose stringent performance, reliability, and management requirements on underlying networking infrastructure. Network operating systems like SONiC must evolve to meet these demands effectively. This talk identifies the challenges posed by AI workloads, discusses strategic pathways for enhancing SONiC to address these challenges, and showcases specific technologies as exemplars of these advancements. Join us to explore how SONiC can evolve to support robust network infrastructure for AI workloads, ensuring scalability, efficiency, and seamless integration with modern AI-driven applications.
This presentation will offer a glimpse into the latest developments within the SONiC AI Working Group. We will cover progress in key areas: network fabrics for AI, routing and network design, and congestion control and load balancing. Attendees will gain insights into the challenges and solutions related to SONiC's support for AI clusters, the gaps already identified, and the next steps on the agenda.
Join us to explore how SONiC is evolving to meet the demands of high-performance, low-latency network architectures tailored for AI clusters and GPU-accelerated workloads.
Senthil Kumar Ganesan is a Tech Staff member at Dell Technologies. Senthil has specialized in networking OS development for 15+ years and has led network stack development for various operating systems. He has contributed to various open source projects such as Open Switch...
This presentation unveils a new capability of SONiC, positioning it as a powerful and effective AI/ML networking test appliance.
SONiC has demonstrated remarkable adaptability across various network use cases. This presentation will illustrate how SONiC can be harnessed effectively to test RoCEv2 performance between endpoints.
By leveraging robust features like ACL UDF for deep packet lookup actions in the ASIC, SONiC can be used to test AI/ML server NIC performance, specifically in handling network abnormalities such as out-of-order placement, congestion control, PFC, and ECN at line rate. In this context, SONiC serves as a powerful network fault injection appliance.
Moreover, with third-party container support in SONiC, integrating a packet editing tool as a SONiC container extends the solution into a comprehensive payload packet editing device for slow-path fault injection, further enhancing its capabilities.
Adam Yeung is a seasoned network software expert with nine US patents to his name. Adam is the SONiC Open Source engineering lead at Broadcom, where he also manages an engineering team responsible for the design and development of advanced networking features, including EVPN VxLAN, multi-chassis...
AI infrastructure requirements are multiplying by the day. More and more companies are turning away from expensive proprietary systems and adopting SONiC, the 100% open-source, hardware-agnostic network operating system, for their next-generation data centers. This session will discuss SONiC as the perfect vector to revolutionize the future of AI. SONiC escapes the constrictions of vendor lock-in while providing all the networking functionality the data center requires. We will demonstrate NVIDIA’s full commitment to SONiC as the leading contributor to the SONiC project, emphasizing NVIDIA’s dedication to providing open networking for everyone.
We will also demonstrate how to deploy and use SONiC within NVIDIA Air, our cloud-hosted environment for hosting digital twins and testing infrastructure. SONiC on NVIDIA Air offers a hassle-free, preconfigured lab for trying out SONiC instantly at no cost.
The SONiC Unified Management Framework (UMF) provides a solution for network operators to achieve real-time visibility and control over their networks. Through the integration of gNMI and OpenConfig data models, UMF enables streaming telemetry through on-change notifications, which is superior to polling. This streaming telemetry data provides valuable insights into network health, performance, and utilization. A wide range of telemetry data types can be supported, including metrics, events, and alarms. Metrics provide quantitative measurements of network performance, such as CPU utilization and memory usage. Events represent significant occurrences in the network, such as link failures and switch reboots. Alarms indicate critical conditions that require immediate attention, such as high temperature readings or power supply failures. This presentation also includes specific examples of how these capabilities sped up the mitigation of network failures.
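The difference between polling and on-change notification can be illustrated with a minimal sketch. It does not use gNMI or any UMF API; the function names and the sample data are hypothetical and only show why on-change reporting sends fewer updates for a value that rarely changes.

```python
from typing import Iterable, Iterator, Tuple

Sample = Tuple[int, str]  # (timestamp, value); hypothetical shape for illustration


def poll(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Polling: report every sample, whether or not the value changed."""
    yield from samples


def on_change(samples: Iterable[Sample]) -> Iterator[Sample]:
    """On-change: report a sample only when its value differs from the last one sent."""
    last_sent = None
    for timestamp, value in samples:
        if value != last_sent:
            yield timestamp, value
            last_sent = value


# Hypothetical link-state readings taken once per second.
link_state = [(0, "up"), (1, "up"), (2, "down"), (3, "down"), (4, "up")]

polled = list(poll(link_state))
changed = list(on_change(link_state))
print(len(polled), "polled updates vs", len(changed), "on-change updates")
```

In this toy data set the on-change stream carries only the initial state and the two transitions, while polling repeats the unchanged readings.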
The PhoenixWing initiative was kicked off at the 4th Networking Open Source Technology Ecological Conference in Beijing, China, on May 25th, 2024. Representatives from Alibaba, Cisco, Broadcom, Accton, Inspur, Huaqin, and Ruijie jointly announced the official kick-off of this initiative.
It is the first community achievement from the SONiC Routing WG. The short-term goal is to verify the deployment readiness of SRv6 solutions using the SONiC 202411 release on participating hardware devices. The enhancements not only contain SRv6 features such as SRv6 VPN and SRv6 Policy with BFD offload, but also include infrastructure optimizations: FPM channel optimization, BGP loading acceleration, and prefix-independent convergence. The long-term goal for this initiative is to add SRv6 to the SONiC interop lab and create a test suite that could help SRv6 adoption.
This presentation will introduce the initiative and report current progress towards the short-term goal.
This talk presents a refined methodology for Network Operating System code development that leverages reference virtual hardware. This approach has been instrumental in the evolution of IOS-XR and the recent advancements in SONiC. We discuss the comparative benefits and constraints of this method against abstract platform NOS simulators and introduce SONIC-NGDP-VS, a virtual switch/router that marries the agility of an open platform with the robust capabilities of Cisco’s Silicon One dataplane.
For the Silicon One-based Cisco 8000 routers, we have developed a suite of virtual platforms capable of running IOS-XR and SONiC. These range from simulations of compact systems to large-scale platforms with dual router processors. These models are used internally for development and test, and are also provided to customers.
To support the SONiC community, we have developed a new disaggregated solution that allows users to integrate SONiC source with our proven SAI/SDK and NPU models to construct their own Silicon One-based SONiC-VS. Developing against a commercial-grade SAI/SDK/dataplane increases the probability of seamless functionality on physical hardware.
APRESIA is a Japanese distributor of open networking products such as white box switches, network operating systems, and network orchestrators. KDDI, NTT PC Communications, and other customers have deployed SONiC in commercial networks, and we have supported their activities. In this presentation, I'd like to share takeaways from SONiC commercial deployments and operations:
- Use cases: management network, AI/ML network
- Requirements: change of interface name, uplink tracking, ZTP
- Network design: interoperability of EVPN/VXLAN with server NIC teaming and IPv6
- Troubleshooting: analysis of the sairedis debug log for an EVPN routing issue
Delve into the collaborative efforts of Broadcom and Cisco in implementing EVPN-VxLAN multi-homing on the SONiC platform. This presentation will unveil the architecture and the implementation details that enable robust, scalable, and redundant network topologies. Attendees will gain insights into the practical challenges and solutions encountered during this integration.
As the lead architect of EVPN & Network Fabric Technologies at Cisco Systems, Patrice primarily focuses on the Web portfolio. He expanded his expertise into the architecture and development of SONiC, with a particular emphasis on AI/ML and data center use cases. With 25 years of experience...
CMIS (Common Management Interface Specification) has emerged as the leading management interface for modules ranging from copper cables to coherent pluggable modules.
SONiC provides support for custom-built CMIS-compliant modules with extended capabilities, such as updating the firmware on the remote end of cables like AEC and ACC, and support for a custom memory map.
Learn how SONiC is extending the capabilities of CMIS-compliant modules and providing the following advantages:
1. Custom feature support with a custom memory map
2. Platform/ASIC/switch and module vendor-agnostic support
3. Dynamic port breakout support for modules supporting multiple applications
4. Streaming telemetry for DOM (Digital Optical Monitoring)
5. Custom host-side and module-side signal integrity support
We can also learn from NVIDIA’s story of moving from custom firmware handling of the modules to SONiC for managing the modules, with improved link reliability along with hitless upgrade of the NOS.
sFlow uses UDP datagrams to send packet samples to the collector, and the samples include the first N bytes of the actual payload data, which may contain sensitive information. UDP packets are not encrypted and are hence prone to man-in-the-middle attacks. Because UDP is connectionless, it is also prone to packet loss, which leads to inaccurate traffic measurements.
gNPSI (gRPC Network Packet Sampling Interface) addresses the security vulnerabilities in sending sFlow packets over UDP. gNPSI encapsulates the sFlow samples in gRPC format and hence adds authentication, retransmission, and encryption to the samples, enabling usage of these samples in critical network-control loops. gNPSI also changes the mode of the collector connection from dial-out to dial-in.
This talk will cover the integration of gNPSI into the SONiC sFlow stack and the benefits it brings to SONiC. It will cover the configuration, migration, monitoring, and performance parity aspects of gNPSI in SONiC.
The control of transceivers is becoming increasingly complex with the advent of technologies like 400ZR and Co-Packaged Optics (CPO). Platforms utilizing these transceivers require coordinated control with switch ASICs. Enabling development in a virtual environment can enhance software quality and allow for experimentation with novel transceiver control mechanisms. In SONiC, transceiver control is primarily managed by xcvrd. However, xcvrd is not enabled in SONiC's virtual environment (VS), making it challenging to conduct such tests and experiments.
This session will provide an overview of the necessary modifications to SONiC to enable xcvrd in the virtual environment. By making these adjustments, we can facilitate the development and testing of advanced transceiver control software in the VS environment without the need for physical hardware. Additionally, we will introduce a transceiver emulator developed to support the testing of sophisticated transceiver controls, such as those required for 400ZR and CPO.
Join us at the OCP Summit 2024 to explore the revolutionary world of Smart Switch for BYOD (Bring Your Own Datapath), a cutting-edge form factor empowered by SONiC. In this session, we will delve into the software architecture behind Smart Switch for enabling BYOD and its seamless integration with SONiC.
We will demonstrate SONiC's BYOD capability through a critical Azure scenario involving policy support on Private Endpoints. This session showcases the collaborative efforts between Microsoft and NVIDIA, illustrating how our existing application can be seamlessly transitioned to NVIDIA DPUs in Smart Switches without requiring any application modifications. Join us to explore how SONiC empowers intelligent networking with Smart Switches, ensuring you stay ahead of the curve in embracing future networking paradigms.
SONiC is revolutionizing enterprise networking. Dell, a key contributor from SONiC's inception, has extended its capabilities for enterprise use, promoting widespread adoption. Partnering with BE Networks, Dell streamlines SONiC deployment across data centers and the edge, enhancing automation, agility, and fabric resiliency. The solution supports advanced features like Network Time Traveler, Firmware Management, Cabling Autodetection, PoE Mgmt, Device & Port Security, and Topology Assurance.
Recent updates introduce packet broker capabilities with SONiC, allowing for advanced traffic management and monitoring, along with support for GenAI fabric management and enhanced fabric observability for AI workloads. These innovations are essential for managing complex AI environments and ensuring optimal network performance.
Broadcom is a large corporation (~25,000 employees, 23 semiconductor and infrastructure software divisions) with a huge R&D investment and 13+ DCs worldwide. In early 2019, Broadcom decided to upgrade its DC infrastructure from a legacy 3-tier architecture with high vendor lock-in costs to a more robust and scalable Clos architecture using the open NOS SONiC with an open Ethernet ecosystem. This presentation describes Broadcom's journey with SONiC from 2020 to 2024: the paradigm shift required to move towards disaggregation, the initial challenges we faced, and the lessons learnt after upgrading to SONiC. In the end, we achieved a tenant-based architecture and removed vendor lock-in and high hardware costs, while delivering a higher-quality, much faster, more scalable, and more reliable DC with lower overall TCO for the entire DC infrastructure.
Accomplished and Passionate Product Line Manager with 11+ years of progressive product management, technical marketing and solutions engineer experience spanning Enterprise Data Center and Campus Switching technologies.
In collaboration with the SONiC Routing WG, we have introduced prefix independent convergence into SONiC. The current FRR/SONiC route handling comes from data center use cases. The routing domain is another story, due to the scale and the introduction of VPN routes. Because the data center network (DCN) approach leads to prolonged route convergence times during network outages, traffic loss would be significant. This is one of the key infrastructure enhancements required to extend the SONiC domain from data center to routing use cases.

In this talk, we introduce how we restructure the FRR/SONiC routing-domain data structures:
- Decouple prefix information from next hop information
- Decouple next hop information from VPN context information
- Introduce next hop group ID and VPN context ID programming from FRR to SONiC
We introduce a quick fix based on existing routing information to minimize the traffic loss window. We also use BFD to detect BGP next hop connectivity.
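The decoupling listed above can be sketched in a few lines. This is a toy model only, not FRR or SONiC code; every class, field, and method name is hypothetical. It shows the core idea: prefixes hold a next hop group ID rather than their own next hops, so a failure is repaired by updating the shared group once, independent of the number of prefixes.

```python
from dataclasses import dataclass, field


@dataclass
class NextHopGroup:
    """Hypothetical shared next hop group, referenced by ID from many prefixes."""
    members: list[str]


@dataclass
class Fib:
    """Toy forwarding table: each prefix stores only a next hop group ID."""
    groups: dict[int, NextHopGroup] = field(default_factory=dict)
    routes: dict[str, int] = field(default_factory=dict)

    def add_route(self, prefix: str, group_id: int) -> None:
        self.routes[prefix] = group_id

    def lookup(self, prefix: str) -> list[str]:
        return self.groups[self.routes[prefix]].members

    def next_hop_down(self, next_hop: str) -> int:
        """Remove a failed next hop and return how many objects were updated."""
        touched = 0
        for group in self.groups.values():
            if next_hop in group.members:
                group.members.remove(next_hop)
                touched += 1
        return touched


fib = Fib()
fib.groups[1] = NextHopGroup(members=["10.0.0.1", "10.0.0.2"])
for i in range(1000):
    # 1000 distinct prefixes, all pointing at the same group ID.
    fib.add_route(f"10.{i // 256}.{i % 256}.0/24", group_id=1)

touched = fib.next_hop_down("10.0.0.1")
print(f"{len(fib.routes)} prefixes repaired by updating {touched} group")
```

With per-prefix next hops, the same failure would require rewriting every affected route, which is why convergence time would grow with the number of prefixes.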
Applications experience or detect a network problem, and only then do we troubleshoot. We encounter this pattern repeatedly. Why can't legacy network performance measurement solutions measure the application experience? The network fabric is built on top of IP, and the nature of IP is ECMP. There are many ECMP paths between the fabric edges that connect our applications. Legacy solutions lack the scale required to measure all the ECMP paths. In addition, they rely on metrics such as min/max/average that don’t reflect the experience of each application. The IP Measurements solution (IPM) is a network performance measurement solution applicable to any IP fabric (SRv6/MPLS/VXLAN). It provides the scale required to monitor the network experience of all applications through hardware integration. IPM provides detailed insights into the application experience through accurate metrics (latency histograms). Additionally, it reduces CAPEX and OPEX by eliminating external probing appliances. The SAI/SONiC implementation of IPM is available and is currently being upstreamed to the community SAI/SONiC. In this presentation we review the IPM solution and provide an update on the SAI/SONiC support.
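To illustrate why a latency histogram says more than an average, here is a minimal sketch. It is not the IPM implementation; the bucket edges and the sample data are hypothetical.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical bucket edges in microseconds. Bucket i covers
# [edges[i-1], edges[i]); the first bucket is below 10 us and
# the last bucket is 500 us and above.
BUCKET_EDGES_US = [10, 50, 100, 500]


def latency_histogram(samples_us, edges=BUCKET_EDGES_US):
    """Count how many latency samples fall into each bucket."""
    counts = [0] * (len(edges) + 1)
    for sample in samples_us:
        counts[bisect_right(edges, sample)] += 1
    return counts


# Hypothetical measurements: most packets are fast, a few are 20x slower.
samples = [20] * 95 + [400] * 5

average = mean(samples)
histogram = latency_histogram(samples)
print("average latency (us):", average)
print("histogram counts:", histogram)
```

The average alone looks healthy, while the histogram exposes the small share of packets in the 100 to 500 microsecond bucket that an affected application would actually notice.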
Ahmed Abdelsalam is a Technical Leader at Cisco. His main focus is IP technology. Ahmed has been working on several IP engineering projects, including SRv6 uSID and IP Measurements. He helps customers in their SRv6 and IP Measurement adoption journey. He contributed to the SRv6...
As AI workloads and digital transformations continue to reshape enterprise IT landscapes, the demand for robust, scalable, and flexible networking solutions is more critical than ever. This panel will examine the potential of SONiC (Software for Open Networking in the Cloud) for enterprise adoption. We will explore its journey from being successfully utilized by hyperscalers such as Google and Microsoft to the unique challenges and opportunities it presents for enterprises. Key discussion points include the role of enterprise distributions, the importance of vendor-agnostic integration, and the essential support mechanisms necessary for large-scale enterprise deployment. Join us to evaluate if SONiC is truly equipped to meet the needs of modern enterprises.
I am an enthusiastic guy who loves to meet new people and establish long-lasting relationships. My business expertise is sales and marketing. Let's talk about how we can be of help to each other; after all, helping others is what gives us positive energy and good vibes. Being kind is essential...
When scaling data centers for GPU-as-a-Service (GPUaaS) customers, connecting multiple data halls poses significant challenges, especially as the number of VxLAN endpoints grows within the network fabric. This presentation explores how EVPN VxLAN Multisite can address these challenges by dividing the network fabric into smaller segments and isolating them with split-horizon groups. We'll discuss the standard implementation and how SONiC, combined with Free Range Routing (FRR) and the Switch Abstraction Interface (SAI), achieves this. Additionally, we'll cover important considerations for network design to avoid common pitfalls and ensure a robust, scalable infrastructure for AI clusters.
ECMP (Equal Cost Multi-pathing) is the most commonly used technique to distribute traffic across various types of networks. WCMP (Weighted Cost Multipathing) is an additional method that further optimizes traffic distribution. In this talk, we will discuss WCMP implementation in SONiC disaggregated chassis and its benefits for Microsoft Azure Data Centers.
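As a generic illustration of the weighted idea (not the SONiC or Azure implementation, which the abstract does not describe), one common way to realize WCMP is to replicate each next hop in a member table in proportion to its weight and then hash flows into that table. The function names, next hop labels, and weights below are hypothetical.

```python
import zlib


def build_member_table(weights: dict[str, int]) -> list[str]:
    """Replicate each next hop according to its weight (equal weights give plain ECMP)."""
    table: list[str] = []
    for next_hop, weight in sorted(weights.items()):
        table.extend([next_hop] * weight)
    return table


def pick_next_hop(table: list[str], flow: tuple) -> str:
    """Hash the flow tuple so that all packets of one flow take the same next hop."""
    key = "|".join(str(part) for part in flow).encode()
    return table[zlib.crc32(key) % len(table)]


# Hypothetical weights: nh-a has three times the capacity of nh-b.
table = build_member_table({"nh-a": 3, "nh-b": 1})

# Hypothetical flow 5-tuple: source IP, destination IP, protocol, source port, destination port.
flow = ("10.1.1.1", "10.2.2.2", 6, 49152, 443)
chosen = pick_next_hop(table, flow)
print("member table:", table)
print("flow mapped to:", chosen)
```

Because the hash is deterministic, a flow stays on one path, while the share of table slots steers roughly three quarters of flows toward the higher-weight next hop.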