AI Computing Power Surge Imminent, Optical Modules Embrace High Certainty Growth
On 2022-04-19

Optical Modules: Key Optoelectronic Devices for Optical Communication

Optical modules are optoelectronic devices used for data transmission between communication equipment, primarily performing optical-to-electrical and electrical-to-optical conversion. Structurally, optical modules mainly consist of optical transmitter devices (TOSA, including lasers), optical receiver devices (ROSA, including photodetectors), functional circuits, and optical (electrical) interfaces. The transceiver components TOSA/ROSA are the core of the module: TOSA converts received electrical signals into optical signals, which are transmitted over optical fiber and then converted back into electrical signals by ROSA, followed by amplification. According to OFWEEK and the Forward Industry Research Institute, optical device costs account for approximately 73% of total optical module cost; TOSA represents 48% of the optical device cost, equivalent to 35% of the entire module cost, while ROSA accounts for 32% of the optical device cost, corresponding to 23% of the entire module cost.
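The cost-share figures quoted above compose multiplicatively; a minimal sketch checking the arithmetic (all percentages are the article's quoted figures):

```python
# Cost-share figures from the OFWEEK / Forward Industry Research
# Institute breakdown quoted above.
optical_device_share = 0.73    # optical devices as a share of total module cost
tosa_share_of_devices = 0.48   # TOSA as a share of optical device cost
rosa_share_of_devices = 0.32   # ROSA as a share of optical device cost

tosa_share_of_module = optical_device_share * tosa_share_of_devices
rosa_share_of_module = optical_device_share * rosa_share_of_devices

print(f"TOSA ≈ {tosa_share_of_module:.0%} of total module cost")  # ≈ 35%
print(f"ROSA ≈ {rosa_share_of_module:.0%} of total module cost")  # ≈ 23%
```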
From an industry chain perspective, optical modules are positioned in the middle of the optical communication industry chain. Upstream components include optical chips (active and passive) and optical components, which together form optical devices (also divided into active and passive). Optical devices, along with electrical chips, PCBs, etc., constitute optical modules, with optical devices representing the highest cost within optical modules, where optical chips alone constitute over 50% of the optical module cost. Optical modules are subsequently applied in optical communication equipment within the industry chain and are utilized in terminal applications in the telecommunications market, as well as in the data communication market dominated by major cloud service providers such as Google, Microsoft, Amazon, and Meta.
AI Drives Surge in Demand, Manufacturers Seize Opportunities
In this section, we will elaborate on the current status and trends of the optical module industry. Optical modules have historically served two major markets: telecommunications and data communication. The telecommunications share has gradually decreased, and data communication has become the primary market. Since 2022, cloud providers' capital expenditures have declined somewhat, but the emergence of ChatGPT has sparked an AI technology revolution that is driving the data communication market upward at the margin. AI has become the primary growth driver for optical modules and is expected to bring profound changes in both technology and demand over the long term.
Telecommunications Market Steadily Growing, Vast Prospects in Data Communication Market
Telecommunications Market: Birthplace of Optical Modules, Operator Capex Shifting Towards AI in the 5G Era

The telecommunications market is the birthplace of optical modules. In terms of transmission requirements, a 5G transport network consists of front-haul, mid-haul, and back-haul segments connecting cellular base stations, core networks, and data centers. Front-haul primarily uses 10G and 25G optical modules, mid-haul primarily uses 50G, 100G, and 200G modules, and back-haul primarily uses 100G, 200G, and 400G modules. Front-haul links, being short-distance and high-density, account for the largest volume of optical modules, so demand there is concentrated in lower-rate products. According to LightCounting, the global telecommunications optical module market was approximately $21.66 billion in 2020 and is projected to reach $33.54 billion by 2025, with growth rates of 3.4%/11.8%/9.2% from 2023 to 2025. Additionally, based on our analysis of historical industry development, optical module iterations in the telecommunications market occur roughly every 7-10 years, slower than in the data communication market.
On the other hand, 5G investment has passed its peak, and its growth rate is stabilizing.
Furthermore, fiber-to-the-home (FTTH) deployment continues to expand, with upgrades to 10G PON becoming increasingly prevalent; Asia-Pacific operators are leading the global upgrade to 10G access networks. Globally, according to Omdia, momentum for FTTH infrastructure construction is strengthening in most countries, with the number of FTTH-covered households worldwide expected to exceed 1.2 billion by 2027 and the global PON equipment market expected to surpass $18 billion by 2027.
In summary, in the telecommunications market, the proportion of capital expenditure on traditional mobile networks has decreased, with telecom operators expanding AI-powered network layouts as a second growth curve, combined with the expansion of FTTH scale and the increase in the number of 10G PON ports, resulting in a stable and growing demand for optical modules in the overall market.
Data Communication Market Driving Evolution, Spine-Leaf Architecture Boosting Demand
The data communication market is currently the largest and fastest-growing market for optical modules. According to LightCounting forecasts, the data communication market will grow rapidly in the coming years (even excluding AI). This rapid development stems mainly from the surge in data center traffic and changes in data center network architecture: on one hand, the explosion of cloud computing, big data, and the Internet of Things has driven a surge in data center traffic; on the other, upgrades to data center architecture and the increased number of connections between switches have jointly driven demand for optical modules.
From the perspective of data center architecture, networks are gradually transitioning from the traditional three-tier structure to the new spine-leaf structure. The spine-leaf architecture is a flattened network in which every spine switch connects to all leaf switches; data transmission can dynamically select among multiple paths, effectively relieving bandwidth pressure and improving transmission efficiency. Because the number of connected ports rises significantly under the spine-leaf architecture, demand for optical modules increases accordingly. Estimating from the optical modules required for general-purpose servers: in a spine-leaf architecture with TOR (top-of-rack) switches serving the cabinets, each cabinet holding 6-10 servers, and each server cross-connected to all upper-level devices, the convergence ratio of downstream to upstream ports is generally 3:1. On this basis, one server requires 4-6 optical modules, so a spine-leaf data center with 10,000 servers would need approximately 60,000 optical modules.
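The sizing rule above is a simple multiplication; a minimal sketch using the article's per-server figures (4-6 modules per server is the article's estimate, not a general rule):

```python
# Spine-leaf sizing rule quoted above: each server needs roughly
# 4-6 optical modules, so a 10,000-server data center needs up to
# ~60,000 modules.
def modules_needed(servers: int, modules_per_server: int) -> int:
    """Total optical modules for a spine-leaf data center."""
    return servers * modules_per_server

low = modules_needed(10_000, 4)
high = modules_needed(10_000, 6)
print(low, high)  # 40000 60000
```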
Furthermore, the spine-leaf network architecture also increases the connection density, interface speed, and switching capacity of internal devices, thereby driving the iteration of optical module products towards higher speeds. Currently, optical modules are in the early stages of transitioning from 400G to 800G, with the expectation that 800G optical modules will experience rapid growth after 2023. Looking back, before 2018, the optical module market was primarily dominated by 100G, mainly driven by the construction of 5G networks in the telecommunications market. After 2019, capital expenditures in the data communication end increased rapidly, with cloud computing driving the iteration of optical modules towards 400G. According to LightCounting’s predictions, by 2026, the optical module expenditures of the top five cloud computing companies – Alibaba, Amazon, Facebook, Google, and Microsoft – will exceed $3 billion, while 800G optical modules will dominate the market.
AI: ChatGPT Initiates a Technological Revolution, Vast Opportunities in the Optical Module Market
In this section, we elaborate on two aspects. First, the AI revolution led by ChatGPT has sparked a surge in computing power demand, becoming the primary driving force for the optical module market. Second, we use NVIDIA supercomputers and the GPT-3 large model as examples for estimation.
Large AI models raise the ceiling of computing power demand, and cloud providers' better-than-expected results catalyze further spending. The emergence of ChatGPT leads the AI wave: ChatGPT, released by the American startup OpenAI, is an AI chatbot built on a large language model (LLM) based on the GPT-3.5 architecture, capable of writing emails and code, translating, and providing broader search-style services. Its release set off an AI arms race that has significantly driven demand for computing power.
Specifically, large AI models involve a training side and an inference side. The training (learning) process uses large amounts of data to train a complex neural network model, determining the weights and biases in the network so that it can fit a specific function; it demands high computational performance and massive data, and the trained network has a degree of generality. The inference (judging) process applies the trained model, with its parameters fixed during training, to new data to compute results and draw conclusions. The main factors affecting large-model training are the number of model parameters, the amount of training data, and chip computing power. On parameter counts, according to Inspur's "Design and Optimization of AI Computing Cluster Solutions", large-model parameter counts have grown from 94M to 530B over the past four years, an increase of nearly 5,600 times. Chip computing power is currently supplied mainly by NVIDIA GPUs, which are deployed in the supercomputers that in turn drive optical module usage. NVIDIA's AI GPUs mainly include the A100 and H100, and the DGX GH200 supercomputer launched at the end of May is equipped with its Grace Hopper Superchip. Grace Hopper combines an NVIDIA Hopper GPU with an NVIDIA Grace CPU via NVIDIA NVLink-C2C, providing a total bandwidth of 900 GB/s and greatly increasing the optical module demand associated with supercomputers.
Additionally, cloud providers exceeded expectations in 2022, with NVIDIA data center revenue reaching a record high. This was mainly due to cloud vendors and enterprise customers increasing their demand for GPU chips for training AI, further releasing positive signals. With domestic and foreign cloud providers successively launching large AI models and increasing AI training, high growth certainty will continue to be sustained.
Predictions from NVIDIA computers and large model computing power: Significant increase in demand for 800G optical modules
Firstly, let’s take the NVIDIA DGX SuperPOD as an example to calculate the ratio of optical modules per GPU. DGX A100 and DGX H100 network clusters mainly use two types of networks, InfiniBand (IB) and Ethernet, divided by function into four categories: compute, storage, in-band management, and out-of-band management. Compute and storage use IB networks, while in-band and out-of-band management use Ethernet.
Take a DGX A100 cluster with 140 nodes as an example: each DGX A100 server contains 8 A100 GPUs, for 1,120 GPUs in total, and the cluster consists of 7 scalable units (SUs), each comprising 20 DGX A100 servers. The compute network adopts a fat-tree topology with three layers of leaf, spine, and core switches. Under the IB architecture a full fat-tree is realized, and per the cabling specified in NVIDIA's DGX SuperPOD whitepaper, each port corresponds to one optical module. The total number of optical modules required on the compute and storage sides works out to 8,048, so the ratio of A100 GPUs to 200G optical modules is approximately 1:7.2.
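The GPU-to-module ratio above follows directly from the cluster figures; a minimal sketch reproducing the arithmetic (the 8,048-module total is the article's figure, taken as given):

```python
# Reproducing the DGX A100 SuperPOD arithmetic quoted above.
scalable_units = 7
servers_per_su = 20
gpus_per_server = 8

servers = scalable_units * servers_per_su   # 140 nodes
gpus = servers * gpus_per_server            # 1120 A100 GPUs

total_200g_modules = 8048  # compute + storage fabrics, per the article
ratio = total_200g_modules / gpus
print(f"{ratio:.1f} modules per GPU")  # ≈ 7.2, i.e. a 1:7.2 ratio
```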
On May 29th, NVIDIA introduced the latest DGX GH200 supercomputer, equipped with 256 NVIDIA GH200 Grace Hopper Superchips, each of which can be regarded as a server. The DGX GH200 adopts a two-layer fat-tree topology, with 96 and 36 switches in the first and second layers, respectively; each NVLink switch has 32 ports running at 800G. In addition, the DGX GH200 is equipped with 24 NVIDIA Quantum-2 QM9700 IB switches for the IB network. Counting by ports, the DGX GH200 requires a total of 1,920 800G optical modules, with each GH200 chip corresponding to 12 800G optical modules.
Next, we estimate the incremental optical modules brought by large language models on the training and inference sides. On the training side, taking GPT-3 as an example, training involves an enormous number of floating-point operations (FLOPs). Training on NVIDIA A100 GPUs with a target training time of one day, GPT-3 would require over 80,000 200G optical modules; factoring in realistic FLOPs utilization, the required number may reach around 350,000.
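The training-side estimate above can be reconstructed under stated assumptions: a commonly cited GPT-3 training budget of ~3.14e23 FLOPs, A100 peak throughput of 312 TFLOPS, the 1:7.2 GPU-to-module ratio from the SuperPOD calculation, and an assumed 25% effective FLOPs utilization; these inputs are assumptions, not figures from the article.

```python
# Hedged reconstruction of the training-side estimate.
TRAIN_FLOPS = 3.14e23       # commonly cited GPT-3 training compute (assumed)
A100_PEAK_FLOPS = 312e12    # A100 peak throughput per second (assumed)
SECONDS_PER_DAY = 86_400
MODULES_PER_GPU = 7.2       # 200G modules per GPU, from the SuperPOD ratio

# GPUs needed to finish in one day at theoretical peak throughput
gpus_at_peak = TRAIN_FLOPS / (A100_PEAK_FLOPS * SECONDS_PER_DAY)
modules_at_peak = gpus_at_peak * MODULES_PER_GPU
print(round(modules_at_peak))    # ≈ 84,000 -> "over 80,000"

utilization = 0.25               # assumed effective FLOPs utilization
modules_realistic = modules_at_peak / utilization
print(round(modules_realistic))  # ≈ 335,000 -> "around 350,000"
```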
Currently, with the rapid increase in the number of large models, the demand for optical modules continues to rise. Additionally, from the perspective of optical module development, as the speed of optical modules increases, their cost proportion in Ethernet switches also rapidly rises.
At the same time, from 1.6T onwards, traditional pluggable optical modules approach their limits, and Co-Packaged Optics (CPO) is widely regarded in the industry as the future form of high-speed products. CPO co-packages the optical engine with the switch chip, ultimately replacing pluggable optical modules. Compared with the traditional pluggable form, CPO shortens the wiring distance between the switch chip and the optical engine, reducing the power consumed in driving electrical signals. According to Cisco's official website, replacing pluggable optical modules with CPO in a 51.2T system can cut the power required to connect the switching ASIC to the optical engine by 50%, and total system power by 25-30%.
Currently, CPO technology is still in the early stages of deployment; owing to reliability and cost factors, its advantages are not yet evident at lower speeds. According to LightCounting, CPO shipments are expected to start with 800G and 1.6T ports, with commercialization beginning in 2024-2025 and scale-up in 2026-2027. According to Yole, CPO will coexist with traditional pluggable optical modules for some time while its market share keeps expanding, taking the main market by 2034. In terms of scale, Yole puts the CPO market at approximately $6 million in 2020 (versus $900 million for optical modules), growing to an expected $300 million in 2026 (versus $2.2 billion for optical modules), a 6-year CAGR of 104% (versus 19% for optical modules). By 2032, the CPO market is projected to reach $2.2 billion at a 19% CAGR, while the optical module market over the same period is expected to be $3.6 billion at a 10% CAGR.
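The CAGR figures above follow the standard compound-growth definition; a minimal helper (the worked example uses illustrative numbers, not the article's figures):

```python
# Standard compound annual growth rate.
def cagr(start: float, end: float, years: int) -> float:
    """CAGR implied by growing from `start` to `end` over `years` years."""
    return (end / start) ** (1 / years) - 1

# Illustrative example: a market doubling over six years
# corresponds to roughly a 12% CAGR.
print(f"{cagr(100, 200, 6):.1%}")  # 12.2%
```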