Where is Light Born?
Last episode, Ep.0-1, we explored the massive flow of data heading toward the monitor. But a fundamental question arises here: Who generates all this data, and how?
We often handwave it by saying “the computer just handles it,” but inside, different components constantly divide up the work to keep this incredible speed.
Before the 0s and 1s flow through the cables, there is a source where they are born.
Today, I want to talk briefly about the two types of brains coexisting inside a computer: the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit). Let’s follow the starting point of how these two entities cooperate to create the light.
THE CPU
If you open a computer case, hidden deep beneath a massive cooler, lies a small chip. This is the CPU (Central Processing Unit).
As the name implies, this device processes everything at the ‘center’; it is the main brain of the computer. Every command—moving the mouse, typing on the keyboard, executing a program—arrives here first.
It excels at calculating complex formulas and making different decisions based on the situation (branching), and hardware tricks like Branch Prediction keep that conditional logic fast. For instance, when loading a complex website, the CPU runs through thousands of lines of JavaScript code and instantly handles tens of thousands of logical judgments, such as “If the user is in Dark Mode, paint the background black; if they aren’t logged in, show a popup.”
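To make that kind of branchy, serial logic concrete, here is a minimal sketch in Python. The function and the user-settings dictionary are hypothetical, invented only to illustrate the style of work a CPU is good at; they do not come from any real web framework.

```python
# A hypothetical sketch of the branchy, serial logic a CPU excels at.
# The field names ("dark_mode", "logged_in") are illustrative, not a real API.

def decide_page_style(user):
    decisions = []
    # "If the user is in Dark Mode, paint the background black"
    if user.get("dark_mode"):
        decisions.append("background: black")
    else:
        decisions.append("background: white")
    # "If they aren't logged in, show a popup"
    if not user.get("logged_in"):
        decisions.append("show login popup")
    return decisions

print(decide_page_style({"dark_mode": True, "logged_in": False}))
```

Each `if` is a fork in the road; the CPU's strength is racing through thousands of such forks, one after another, in the right order.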
However, the CPU has one fatal weakness. It is optimized for ‘Serial Processing’.
To use a metaphor, the CPU is like a ‘Competent Architect’. It is skilled at complex design and judgment, but inefficient at simple repetitive tasks.
The CPU basically processes data in order. No matter how fast its calculation speed is, it cannot realistically perform millions of repetitive tasks 60 times a second all by itself.
This is because the CPU is an ‘Architect’ who designs the overall plan and issues commands, not a ‘laborer’ who stamps out millions of dots one by one.
Therefore, this architect needs a partner with overwhelming quantity to handle this massive amount of simple calculation on their behalf.
THE GPU
Once the Architect (CPU) has finished the design, an entity is needed to receive those blueprints and actually do the work. That is the GPU (Graphics Processing Unit).
Usually, office laptops have ‘Integrated Graphics’ residing in a corner of the CPU, while high-end gaming PCs have a massive independent space called ‘Discrete Graphics (Graphics Card)’.
The GPU was born with a destiny exactly opposite to that of the CPU. If the CPU is the Architect, the GPU is like ‘Laborers specialized in simple manual labor’.
This difference is evident from the physical structure of the semiconductor. Even a high-performance CPU usually has only about 8 to 16 Cores. However, high-end GPUs (e.g., RTX 4090) house over 16,000 cores.
Of course, each GPU core is much simpler than a CPU core. They are clumsy at complex branching and logical judgment. But their weapon is Parallelism—or simply, numbers.
The Aesthetic of Parallel Processing (SIMD)
Imagine this. If the screen resolution is 1920x1080, there are about 2 million pixels to paint.
The CPU tries to paint these 2 million dots one by one, alone. No matter how fast its hands are, there is a limit. On the other hand, the GPU gives a command to thousands of laborers simultaneously.
Everyone, paint your assigned spot red!
At this single command, thousands of cores rush in Simultaneously (Parallel) and color their respective coordinates. They don’t need to care whether the person next to them is painting or not. The points on the screen are independent of each other.
This is the operating principle of the GPU called SIMD (Single Instruction, Multiple Data)—‘processing massive amounts of data simultaneously with a single instruction.’
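The contrast between the two styles can be sketched in plain Python. This is only a conceptual model: a real GPU executes the "one instruction, many data" step in hardware across thousands of cores, while here both versions run on the CPU.

```python
# Toy contrast between serial (CPU-style) and SIMD-style pixel painting.
# Conceptual sketch only: real SIMD happens in hardware, not in a Python list.

WIDTH, HEIGHT = 1920, 1080
NUM_PIXELS = WIDTH * HEIGHT        # ~2 million pixels
RED = (255, 0, 0)

# CPU style (serial): visit each pixel one by one, in order.
def paint_serial():
    framebuffer = [None] * NUM_PIXELS
    for i in range(NUM_PIXELS):    # one pixel per loop step
        framebuffer[i] = RED
    return framebuffer

# GPU style (SIMD): one instruction ("paint red") applied to all data at once.
# Every pixel is independent, so no core needs to wait on its neighbor.
def paint_simd():
    return [RED] * NUM_PIXELS      # single "instruction", many data

assert paint_serial() == paint_simd()   # same result, very different path
```

Both produce an identical frame; the difference is that the serial version takes two million steps, while the SIMD version is, conceptually, a single command shouted to everyone at once.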
Ultimately, the spectacular graphics we see on the monitor are the result created in a short time through the cooperation of the Architect (CPU) and thousands of Laborers (GPU).
Reference: The Birth of the GPU
In fact, the CPU and GPU didn’t have this relationship from the start.
1. The Era of the CPU
In the early 1990s, all graphics processing was solely the CPU’s job. Of course, there was a device for monitor connection (Graphics Card) back then too. But it was merely a ‘Video Adapter’ (VRAM + RAMDAC) with no computing capability.
2. The Arrival of 3D Accelerators
The situation changed with the advent of 3D games. Calculating polygons and mapping textures was too harsh for the CPU. At this time, 3D Accelerators like 3dfx’s ‘Voodoo’ appeared. The CPU did the complex calculations, and these accelerators handled the simple coloring.
3. The Birth of the GPU (GeForce 256)
In 1999, NVIDIA marked a historic turning point with the release of the ‘GeForce 256’. It wasn’t just coloring anymore; it could handle the ‘Calculation Realm’ such as 3D coordinate transformation and lighting effects (T&L). NVIDIA named this chip the GPU (Graphics Processing Unit) for the first time. It was the moment it was elevated from a simple auxiliary device to a ‘Processor (Unit)’ with independent computing capabilities.
Pipeline
Inside the computer, thousands of conversations take place in the blink of an eye. So, how do the CPU and GPU cooperate?
Draw Call
Imagine a 3D modeling program. When you rotate a massive building on the screen with your mouse, the CPU is incredibly busy.
It calculates the ‘Physical Situation (Physics)’—whether the building is collapsing or if a window has collided with another object. Once this logical judgment is finished, the CPU issues the final command to the GPU.
Alright, draw the building in this (X, Y, Z) space on the screen! (Draw!)
This command is called a Draw Call. It is like the architect giving instructions and stepping back. Only when this command arrives do the thousands of laborers (GPU) finally start moving. Based on the coordinates handed over by the CPU, the GPU builds the frame, calculates the lighting, and starts coloring.
At this time, the interpreter role between the two is played by Graphics APIs like DirectX, OpenGL, and Vulkan. They are protocols that help the CPU and GPU, which speak different languages, communicate without misunderstanding.
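The division of labor around a draw call can be sketched like this. Every name here is hypothetical—this is not DirectX, OpenGL, or Vulkan, just a toy model of the flow: the CPU does the logic, pushes a command into a queue, and the GPU executes only what it is told.

```python
# A hypothetical sketch of the CPU/GPU handoff around a draw call.
# None of these names come from a real graphics API; the deque merely
# stands in for the driver's command buffer.

from collections import deque

command_queue = deque()

def cpu_frame(building_position):
    # 1. CPU work: physics / logical judgment (simplified to a position update).
    x, y, z = building_position
    y = max(y - 1, 0)                      # e.g. gravity pulls the building down
    # 2. The draw call: "draw the building at (X, Y, Z)!"
    command_queue.append(("DRAW", (x, y, z)))
    return (x, y, z)

def gpu_execute():
    # The GPU starts only once commands arrive; it never decides WHAT to draw.
    results = []
    while command_queue:
        cmd, coords = command_queue.popleft()
        if cmd == "DRAW":
            results.append(f"rasterized building at {coords}")
    return results

cpu_frame((10, 5, 3))
print(gpu_execute())
```

The queue in the middle is where the real Graphics API lives: it translates the architect's intent into a stream of commands the laborers can execute.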
Bottleneck
The problem is that the working speeds of the two brains do not always match. If one side is too fast or too slow, a Bottleneck occurs where the entire work is delayed.
1. CPU Bound (When the Architect is slow)
- Situation: Strategy simulation games with countless units appearing on the screen (StarCraft, Civilization, etc.).
- Phenomenon: The GPU has already finished the drawing and is waiting for the next task, but the CPU is gasping for breath calculating the positions of the units, delaying the command (Draw Call).
- Result: Even with a high-end graphics card, the frame rate remains low.
2. GPU Bound (When the Laborers are slow)
- Situation: The latest high-spec games set to 4K resolution with full options.
- Phenomenon: The CPU pours out commands, but the GPU cannot handle that massive amount of pixels.
- Result: The screen stutters or lags. (Most high-end gaming environments fall into this category.)
Therefore, rather than simply buying the most expensive parts, it is important to configure a system where the balance between the two matches your use case.
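The bottleneck idea reduces to a simple rule: each frame must wait for the slower brain, so the frame time is roughly the maximum of the CPU time and the GPU time. The numbers below are illustrative only, not measurements from any real game.

```python
# A toy model of the bottleneck: frame time ~= max(CPU time, GPU time).
# The millisecond figures are made up for illustration.

def fps(cpu_ms_per_frame, gpu_ms_per_frame):
    frame_ms = max(cpu_ms_per_frame, gpu_ms_per_frame)
    return 1000 / frame_ms

# CPU Bound (unit-heavy strategy game): the GPU idles, waiting for draw calls.
print(fps(cpu_ms_per_frame=25, gpu_ms_per_frame=5))    # 40.0 FPS, CPU-limited

# GPU Bound (4K, full options): the CPU's commands pile up unprocessed.
print(fps(cpu_ms_per_frame=5, gpu_ms_per_frame=25))    # 40.0 FPS, GPU-limited
```

Note that both scenarios land at the same frame rate: upgrading the part that is already fast changes nothing, which is exactly why balance matters more than raw price.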
The Evolution of the GPU
Before finishing the story, we must check the ‘Second Evolution’ of the GPU. These laborers, who used to simply help with the CPU’s workload, have now started solving Math Problems directly.
CUDA & GPGPU
As mentioned earlier, the GPU is specialized in ‘performing simple calculations simultaneously.’ However, some researchers discovered a fascinating fact: “The math used to train Artificial Intelligence (matrix operations) and the math used to draw graphics are surprisingly identical.”
But there was a problem. These laborers were born to understand only graphics-related commands. To make them solve math problems, one had to forcibly disguise the math as graphics commands.
What broke down this language barrier was NVIDIA’s ‘CUDA’.
CUDA was a new platform that allowed us to command the laborers to “solve math problems directly” instead of graphics. Thanks to this, developers could pour the GPU’s immense computing power into Data Analysis and AI Training, not just graphics.
This is the beginning of GPGPU (General-Purpose computing on GPU). The Bitcoin mining craze and the current AI revolution (ChatGPT, etc.) were all made possible because graphics cards started solving ‘Math’ instead of ‘Pictures’. The reason NVIDIA grew from a simple graphics card company to a massive global enterprise lies in this ‘Intellectualization of Laborers’.
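The "same math" claim is concrete: the matrix multiply below is one routine, whether the matrix is a 3D rotation (graphics) or a layer's weights (AI). This is plain serial Python for clarity; under CUDA, each output element would be computed by its own GPU core, since every element is independent.

```python
# Why GPGPU works: one matrix-multiply routine serves graphics and AI alike.
# Serial Python sketch; a GPU would compute every C[i][j] in parallel.

def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    # Each output element depends only on one row of A and one column of B,
    # so all of them can be computed simultaneously -- perfect for a GPU.
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

# Graphics use: rotate a 2D point 90 degrees with a rotation matrix.
rotation = [[0, -1],
            [1,  0]]
point = [[1], [0]]                 # column vector (x=1, y=0)
print(matmul(rotation, point))     # -> [[0], [1]]

# AI use: the exact same routine applies a layer's weights to an input.
weights = [[0.5, 0.5]]
inputs  = [[2], [4]]
print(matmul(weights, inputs))     # -> [[3.0]]
```

Before CUDA, running the second call on a GPU meant disguising it as the first; CUDA let developers hand over the matrix directly.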
Tracing Light (Ray Tracing & Upscaling)
Of course, a revolution occurred in its main job, graphics, as well. Past GPUs only mimicked light (Rasterization). Shadows and reflections were all calculated tricks.
But now, GPUs trace actual rays of light through Ray Tracing technology. They physically calculate the reflection in a mirror or light refracting in water. The computational load increases explosively, but the GPU now solves this problem by borrowing the power of AI (Deep Learning).
Technologies have appeared that render roughly at low resolution and then use AI to fill in the missing detail, upscaling the image to high resolution (DLSS, FSR). Now, the GPU is not just a laborer stamping pixels, but an Artist that corrects and creates images on its own.
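The physical calculation behind a mirror reflection is surprisingly compact. The standard mirror formula, r = d − 2(d · n)n, gives the direction a ray bounces off a surface with unit normal n; the sketch below is one tiny step of what a ray tracer repeats millions of times per frame.

```python
# A minimal sketch of one core step of ray tracing: reflecting a ray.
# r = d - 2(d . n)n, where d is the ray direction and n the unit surface normal.

def reflect(d, n):
    dot = sum(di * ni for di, ni in zip(d, n))          # d . n
    return tuple(di - 2 * dot * ni for di, ni in zip(d, n))

# A ray traveling down-and-right hits a horizontal mirror (normal pointing up):
print(reflect((1, -1, 0), (0, 1, 0)))   # -> (1, 1, 0): it bounces up-and-right
```

Multiply this by millions of rays, several bounces each, sixty times a second, and the "explosive computational load" of ray tracing becomes obvious.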
Cooperation of Logic and Aesthetics
Spectacular graphics, smooth movement, and realistic light reflections. All these visual experiences are miraculous collaborations created in an instant by the Architect (CPU) with meticulous logic and the Laborers (GPU) with overwhelming execution power.
We now know what Form (Bitmap) data takes, what Path (Cable) it flows through, and fundamentally by Whom (CPU & GPU) it is born.
But there is still an unresolved question.
What rules are those countless 0s and 1s combined by? Specifically, what operations does the GPU perform so that simple numbers gather to become the beautiful ‘shapes’ we see?
Next time, we will enter the world of Rasterization, the process where these 0s and 1s are calculated and completed into a single scene.