Tuesday, September 22, 2020

An overview of PC build performance limitations and their resolutions

This post is the follow-up to the more philosophical discussion of why 'bottleneck' isn't a terribly helpful concept in understanding and fixing PC build issues. It is geared towards new builders and aims to offer a sounder perspective that isn't fixated on the 'bottleneck' concept.

This article isn't intended as a comprehensive guide to tuning or optimizing every aspect of a PC build. It's more of an overview to help new builders get oriented towards building for and optimizing the total play experience. I'll probably follow up with a final post in the series with more practical, step-by-step build development advice based on the framework I'm laying out here.

The total play experience

Focusing just on statistical performance optimization in building or optimizing a PC is, I think, too narrow a view. Instead, the goal should be to optimize the total play experience: this includes not only what is traditionally thought of as 'performance optimization' (i.e. trying to squeeze every single FPS out of a game), but also elements that are more about 'quality of life' and contribute less directly, if at all, to, e.g., a system's scores on a synthetic benchmark.

Broadly, the 'performance optimization' elements have to do with directly optimizing the system's rendering pipeline, so we'll start that topic with a brief discussion of what this is and the three main components that contribute to it -- CPU, GPU (or graphics card) and the monitor (or display) refresh rate.

But first, we'll start with the more indirect, 'quality of life' elements, because they're easier to understand and summarize.

Quality of life elements

There are many aspects of a system that improve the quality of the play experience. For instance, having a keyboard or mouse that is responsive and 'just feels right' may significantly improve a gamer's experience. The right desk or chair might greatly increase comfort. A reliable, low-latency Internet connection greatly improves the experience of online games. Not to take away from any of these factors, but at the highest level, I think there are three build elements that, when properly considered and set up, greatly contribute to quality of life in a modern gaming PC build: mass storage, RAM and monitor factors other than refresh rate (we'll discuss refresh rate in the performance optimization section).

Mass Storage

Mass storage refers to where your system stores large sets of files, such as your operating system, game downloads, documents and save game files. In a typical Windows system, mass storage devices are assigned drive letters like C, D, E and so on.

Physically, mass storage devices can be either mechanical Hard Disk Drives (HDDs), which store data on spinning magnetic platters, or Solid State Drives (SSDs), which store data as electrical charge in flash memory chips. HDDs are the older technology, are much cheaper per gigabyte of storage and tend to come in larger capacities (e.g. 2TB, 4TB and up). SSDs are newer, are more expensive per gigabyte and tend to come in smaller capacities (e.g. up to 2TB, though larger devices are starting to become available). Both HDDs and SSDs come in a variety of form factors. All HDDs and some SSDs need to be mounted to a mounting point in your PC case and connected to your motherboard with one cable for data and to your system's power supply with a second cable for power. Other SSDs are cards that are installed directly into a slot on your motherboard and don't require any separate cables.

SSDs are able to store and retrieve data significantly (dozens or hundreds of times) faster than HDDs. This means that if you install your operating system on an SSD, your PC will boot noticeably faster than it will from an HDD. And if you store your games and save game data on an SSD, level and save file load times will be faster.

In my experience of using computers over 30 years, switching to a system with an SSD had the single biggest impact on how 'snappy' I perceived my system to be as compared to any other innovation or evolution of hardware.

RAM

RAM (Random Access Memory) is very fast storage that your system uses for programs that are actively running. Most modern (circa 2020) games will run fine on a system with 8GB of available RAM, with 16GB providing reasonable room for future-proofing.

Keep in mind that these are recommendations for available RAM. The game you're playing isn't going to be the only thing using RAM on your system. The operating system itself has some overhead. And if, like most people, you have several browser tabs, chat programs and utilities running in the background while you game, each of those consumes RAM, leaving less than 100% of your total RAM available to the game.
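As a rough illustration of the difference between installed and available RAM, here is a minimal Python sketch. The function name and every overhead figure in it are hypothetical placeholders picked for the example, not measurements; your own system's task manager will show the real numbers.

```python
def available_ram_gb(total_gb, os_overhead_gb, background_apps_gb):
    """Estimate the RAM left for a game after the OS and background apps.

    All values are in gigabytes. The figures used below are illustrative
    assumptions, not measurements from a real system.
    """
    used = os_overhead_gb + sum(background_apps_gb.values())
    return total_gb - used


# Hypothetical background usage for the example.
background = {"browser tabs": 2.5, "chat program": 0.5, "utilities": 0.5}

for total in (8, 16):
    left = available_ram_gb(total, os_overhead_gb=3.0, background_apps_gb=background)
    print(f"{total}GB installed -> roughly {left:.1f}GB available to the game")
```

With these made-up numbers, the 8GB system has far less than 8GB left over for the game, which is why the recommendation is stated in terms of available rather than installed RAM.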

Running out of RAM while gaming is not fun. It can lead to the game becoming unstable or crashing in the worst case scenario. Short of this, when a typical Windows system runs low on available memory, it attempts to free it up by writing some data from RAM to one of the system's (much slower) mass storage devices in a process known as paging. This process consumes processor and disk resources as it runs, which can end up stealing those resources from the game itself and leading to lag or hitching.

There are other considerations that go into fully optimizing a system's RAM beyond capacity. These include the speed at which the memory operates, the number of memory modules, the fine tuning of memory timings and more. Different systems are more or less sensitive to different aspects of memory performance/optimization: AMD Ryzen processors, for example, have been shown to benefit significantly from fast RAM.

Monitor Factors

First of all, I strongly encourage you to think of your monitor as an integral component of your gaming system. It's the thing you're going to spend all your time looking at. If it doesn't deliver a good experience to you, it's going to be a bummer.

Refresh rate (discussed later) is only one factor affecting the quality of the play experience provided by your monitor. Other factors (by no means exhaustive) include:
  • How accurately the monitor reproduces color
  • How black its blackest blacks are
  • How big it is physically
  • Its pixel resolution
  • What its pixel response time is (i.e. how fast pixels can change color)

Rendering Pipeline Performance

The elements of a gaming system most traditionally identified with its performance and specs make up its rendering pipeline. Before discussing the components -- CPU, GPU and monitor refresh rate -- let's talk about what the rendering pipeline is.

Frame rendering

As you probably know, moving video and game images are made up of frames: still images that get shown to us very rapidly, one after the other. If the images are presented rapidly enough, this creates the illusion of motion. In games, the rendering pipeline is the process by which these frames get generated in realtime and are presented to the user.

Performance of the rendering pipeline on a given system is traditionally expressed in Frames Per Second (FPS), or framerate. As you know from your experience as a player, a game's framerate can vary from moment to moment based on how demanding the rendering work is at that moment. If a game isn't able to consistently generate enough FPS for a given user to perceive continuous motion, the play experience is compromised by stutter, lag, hitching, etc. The more FPS the system can present to the user, the smoother the play experience will seem (up to certain limits).

In order to ultimately 'deliver' a frame to the player, each component of the rendering pipeline must perform a specific function before 'shipping' the frame on to the next component. For example, the CPU must finish its work on a given frame before the GPU (the next component in the pipeline) can begin its work on it. The GPU needs the results of the CPU's work in order to do its job (this is a slight oversimplification but it's true enough for our purposes).

For a user to play a game at a steady 60 FPS, the system must present a new frame every 16.67 milliseconds. If the CPU takes so long to do its work before handing the frame off to the GPU that the entire rendering process can't complete within that 16.67 milliseconds, the frame rate will go down. This is an example of the proper understanding of the concept 'bottleneck.' If the frames take long enough to generate, the user will experience stutter, hitching and the like.
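The 16.67 millisecond figure is simply 1000 milliseconds divided by the target frame rate. The short Python sketch below works that out and checks whether a frame fits its budget. It follows the simplified model used in this post, where the GPU starts only after the CPU finishes, so the two times are added together. The function names and the per-frame timings are made up for illustration.

```python
def frame_budget_ms(target_fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / target_fps


def fits_budget(cpu_ms, gpu_ms, target_fps):
    """True if CPU work followed by GPU work completes within the budget.

    Simplification (as in the text): the GPU begins only after the CPU
    is done, so the two stage times are summed.
    """
    return cpu_ms + gpu_ms <= frame_budget_ms(target_fps)


print(round(frame_budget_ms(60), 2))                         # 16.67
print(fits_budget(cpu_ms=6.0, gpu_ms=9.0, target_fps=60))    # True: 15 ms fits
print(fits_budget(cpu_ms=10.0, gpu_ms=9.0, target_fps=60))   # False: 19 ms does not
```

In the last case the GPU's time hasn't changed at all; the extra 4 milliseconds of CPU work alone pushes the frame past its budget.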

CPU

The Central Processing Unit or CPU is the general-purpose computing 'engine' of your system. In the frame rendering cycle, the CPU is responsible for two main things. The first is computing the state of the game in every frame. You can think of the game's state as the complete set of information about every element of the game and every object in the game world.

Computing the game's state requires the CPU to do many things, for instance processing control inputs, updating the game's physics simulation and processing the behavioral instructions of enemy AI. The more complex the game, the more the CPU has to do each frame. A strategy game with hundreds of AI units on the screen at once will require more CPU resources to update its state than a simple 2D platformer.

Once the game's state has been calculated, the CPU uses that data to create the instructions that the GPU will use in the next step of the frame rendering cycle. These instructions are known as draw calls. The more visually complex the frame is, the more draw calls will be required for the GPU to render it and, therefore, the more draw calls the CPU will first need to make. Factors like how high the game's resolution is, texture quality, number of light sources in the scene and whether any post-processing effects (e.g. fog or blur) are being applied affect how much rendering work a frame requires, and some of them (such as additional light sources or effects passes) also add draw calls.

As far as performance goes, one CPU can be superior to another in its ability to run more instructions per second. When we speak of one CPU being 'faster' than another, this is essentially what we're referring to. A CPU's clock rate, usually expressed in megahertz or gigahertz (e.g. 4.5GHz), is a measurement of how many clock cycles per second the CPU can perform, with higher numbers indicating higher performance. All things being equal, a faster CPU will be able to drive more FPS than a slower one.

Modern CPUs typically contain more than one CPU core. Each core is capable of running instructions independently of the other cores and at the same time, allowing the CPU as a whole to do more things at once. Historically -- and still in 2020 -- most games are not coded in a way that takes advantage of lots of processing cores, so games generally benefit more from faster CPU clocks than they do from more cores. This is starting to change though. With the availability of inexpensive, higher core count consumer CPUs like AMD Ryzen and low-level APIs like DirectX 12 and Vulkan that let developers more easily create games that take advantage of and automatically scale to higher core counts, expect more and more games to optimize for multicore performance over pure clock rates in the years to come.

GPU

The GPU is an integrated set of components, including dedicated processors and (usually) memory, that can create visual images very quickly. Because their components are highly optimized for this task, GPUs can create these images much more quickly than general-purpose computing hardware (like a regular CPU) can, which is a necessity for high-FPS gaming: modern CPUs and system RAM can't both process the game state and render the visuals quickly enough, so the workload gets split between the CPU and GPU.

GPUs use raw visual assets (such as the textures used to create objects' appearances) and follow the instructions contained in the draw calls to produce the frame images that the player eventually sees. A draw call might (effectively) contain an instruction that says 'apply this texture to the surface of a triangle with its vertices at the following screen coordinates...' The GPU is the system component that actually executes these instructions and ultimately determines what color each pixel in the frame should be based on that frame's total set of draw call instructions.
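To make the idea of 'executing a draw call' a little more concrete, here is a toy Python sketch. It is only an illustration of the concept: a real GPU does this in dedicated hardware, on many pixels in parallel, and samples real textures instead of a single character. The names `DrawCall` and `render` are hypothetical and don't correspond to any real graphics API.

```python
from dataclasses import dataclass


@dataclass
class DrawCall:
    vertices: tuple   # three (x, y) screen coordinates of a triangle
    fill: str         # one character standing in for a texture


def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def render(draw_calls, width, height, background="."):
    """Decide the 'color' of every pixel from the list of draw calls."""
    frame = [[background] * width for _ in range(height)]
    for call in draw_calls:
        a, b, c = call.vertices
        for y in range(height):
            for x in range(width):
                p = (x + 0.5, y + 0.5)  # test the center of each pixel
                w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
                # Inside if the point is on the same side of all three edges.
                all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
                all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
                if all_non_negative or all_non_positive:
                    frame[y][x] = call.fill
    return frame


calls = [DrawCall(vertices=((0, 0), (4, 0), (0, 4)), fill="#")]
for row in render(calls, width=4, height=4):
    print("".join(row))
```

Even in this tiny example you can see why more draw calls and more pixels mean more work: the inner loops run once per pixel for every draw call in the frame.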

As GPUs evolve over time, manufacturers increase their capabilities by making their processing units faster, adding more processing units, adding more and faster memory and more. More powerful GPUs are able to process more FPS than less powerful ones. In some cases, new GPUs support entirely new capabilities. Nvidia's RTX GPUs, for example, have specialized processing cores that allow them to accurately simulate how light travels and interacts with different surfaces, creating much more realistic lighting effects in a process known as realtime ray tracing (in games where the developer supports it).

If a CPU is consistently able to hand the GPU an acceptable number of FPS but the GPU is unable to do its work quickly enough, then the GPU may be bottlenecking the system (in the legitimate sense of that term).
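One simple way to picture this is to treat each stage as having its own maximum frame rate and ask which one is lower. The Python sketch below does exactly that. It is a deliberately simplified model, and the function name and FPS figures are hypothetical examples rather than measurements of any real hardware.

```python
def limiting_stage(cpu_max_fps, gpu_max_fps):
    """Return (stage, fps): the slower stage and the frame rate it allows."""
    if cpu_max_fps < gpu_max_fps:
        return "CPU", cpu_max_fps
    if gpu_max_fps < cpu_max_fps:
        return "GPU", gpu_max_fps
    return "balanced", cpu_max_fps


# Hypothetical system: the CPU could prepare 140 frames per second,
# but the GPU can only render 75 of them at the chosen settings.
stage, fps = limiting_stage(cpu_max_fps=140, gpu_max_fps=75)
print(f"Limited by the {stage} at about {fps} FPS")
```

In this made-up case a faster CPU would change nothing; only a faster GPU (or lower graphics settings) would raise the frame rate.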

Monitor Refresh Rate

Once the GPU has finished rendering a frame, it sends that frame to the monitor (or other display device like a TV), which then displays it. Monitors are capable of updating themselves a certain number of times per second. This is known as the monitor's maximum refresh rate, or simply refresh rate, and is measured in cycles per second, or hertz (Hz). A 60Hz monitor can fully replace the image on the screen 60 times per second.

Like other parts of the rendering pipeline, fully refreshing the display takes some (tiny) finite amount of time: it doesn't happen instantaneously. Modern monitors update themselves by replacing the contents of the screen one line (or row) at a time from top to bottom. On a 1920 by 1080 monitor, there are 1080 rows of pixels on the screen. During each update cycle, the first line is updated, then the second, then the third, and so on. After the 1080th line is updated, the process starts over again for the next frame.
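To get a feel for how tiny these times are, the Python sketch below divides one refresh cycle evenly across the rows of the screen. This is an idealized assumption: real display timings also include short blanking intervals that this ignores. The function name is hypothetical.

```python
def line_time_us(refresh_hz, rows):
    """Microseconds spent on each row if one refresh is spread evenly over all rows."""
    refresh_period_us = 1_000_000 / refresh_hz
    return refresh_period_us / rows


print(round(line_time_us(60, 1080), 2))    # 15.43 microseconds per row at 60Hz
print(round(line_time_us(144, 1080), 2))   # 6.43 microseconds per row at 144Hz
```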

As mentioned earlier, a monitor with a higher refresh rate (paired with other system components capable of driving it) will result in a gaming experience that feels smoother and more immediate, within the limits of what the player can appreciate. Some people can actually perceive the flicker between frames of a 60Hz monitor. Others can't. Most gamers will perceive an improvement in smoothness when moving from a 60Hz to a 120 or 144Hz display. Elite gamers like esports athletes can benefit from 240Hz and even 360Hz displays, at least under certain circumstances. At an extreme, it's almost certain that no human being would benefit from a hypothetical 1000Hz monitor as opposed to a 500Hz one.

If your GPU is not capable of driving as many FPS as your monitor's refresh rate, you are leaving performance -- in the form of potential smoothness and immediacy -- on the table, assuming you can personally perceive the difference between the current and potentially higher frame rates. As a 40+ year-old person (whose vision and reflexes have therefore started to deteriorate with age) who doesn't play a lot of twitch-heavy game titles, I can appreciate the difference between 60 and 144Hz in the titles I like to play. But I can't perceive any difference between 144 and 240Hz or higher. A younger, elite player of twitch-heavy games might have a different experience, but a system capable of doing 240Hz (as opposed to 144) would be wasted on me.

The opposite situation can also be true: your GPU may be capable of providing more FPS than your monitor's refresh rate is capable of displaying. In practice, this usually results in one of two conditions:

  1. If you do nothing else, the system will continue to deliver frames to the monitor as it generates them. This means that the monitor may receive more than one frame per refresh cycle. At whatever point during the refresh cycle the new frame is received, the monitor will pick up refreshing the next display line based on the newer data in the second frame. This results in what players experience as tearing: a visible line on the screen that represents the border between where the monitor drew the frame based on the older vs. the newer frame data.
  2. The GPU/game setting called Vsync forces the GPU to synchronize its frame output with the monitor's refresh rate. Since a standard monitor is a 'dumb' device in this respect, this is accomplished by the GPU artificially limiting the framerate it outputs to coincide with the monitor's refresh rate. This means that even though your GPU and CPU might be able to output 300 FPS on a given game, Vsync will limit that output to 144 FPS if your monitor only supports 144Hz. This eliminates tearing but leaves frames on the table.

There are also what are known as adaptive refresh rate technologies like G-Sync and FreeSync which make the monitors that support them smarter. These monitors are able to communicate with the GPU and can synchronize their refresh cycles precisely at the arbitrary FPS value the GPU is outputting at a given moment (up to the monitor's maximum refresh rate).

If you are in this scenario (and assuming you don't have an adaptive display), whether you prefer lower FPS with no tearing or higher FPS with tearing is a matter of personal preference.
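The two conditions in the numbered list above can be sketched in a few lines of Python. The first function estimates which row a tear would appear on if a new frame arrives partway through a refresh, assuming the refresh is spread evenly from top to bottom. The second models only the Vsync cap described above, not every detail of how Vsync behaves. Both function names and the sample values are hypothetical.

```python
def tear_row(arrival_ms, refresh_hz, rows):
    """Row where a tear appears if a new frame arrives arrival_ms into the timeline.

    Assumes the monitor redraws rows evenly from top to bottom and switches
    to the newer frame's data as soon as it arrives.
    """
    cycle_ms = 1000.0 / refresh_hz
    fraction_done = (arrival_ms % cycle_ms) / cycle_ms
    return int(fraction_done * rows)


def fps_with_vsync(render_fps, refresh_hz):
    """With Vsync on, output is capped at the monitor's refresh rate."""
    return min(render_fps, refresh_hz)


# A new frame arriving 8 ms into a 60Hz refresh lands partway down the screen.
print(tear_row(arrival_ms=8.0, refresh_hz=60, rows=1080))
# A system that could render 300 FPS is held to 144 FPS on a 144Hz monitor.
print(fps_with_vsync(render_fps=300, refresh_hz=144))
```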

As noted elsewhere, there are verifiable benefits -- especially for competitive gamers -- to running at the highest FPS possible, even if your monitor's refresh rate is lower. A full discussion of this issue is beyond the scope of this post, but I do want to acknowledge it. I consider it a specialized issue relevant to certain gaming scenarios, not one of general build advice.

Putting It All Together

This table summarizes the six components of a build that most commonly impact quality of life and rendering performance, along with how the user will perceive a component with sub-optimal performance.


‘Quality of Life’ Components

| System Component | Description | What you’ll experience if this component is limiting system performance |
| --- | --- | --- |
| Mass Storage | Provides long-term storage for lots of files like your operating system, game files and savegame data | A slow mass storage device (such as a traditional HDD) will make loading games and saving/loading savegame data seem slow. |
| RAM | Provides super-fast working memory for running programs (including games) | If the system doesn’t have enough available memory for all running programs, games may hitch, lag or slow down as the operating system pages to use slower mass storage to hold data that would otherwise be in RAM. |
| Monitor factors (not refresh rate) | Affect the perceived quality of the visuals displayed | Inaccurate color reproduction; harsh or jagged edges around objects; motion trails; visual artifacts; overly bright or dim images (even after monitor adjustment) |

Rendering Pipeline Components

| System Component | Function | What you’ll experience if this component’s performance is being limited by the preceding one | What you’ll experience if this component gates the performance of the preceding one |
| --- | --- | --- | --- |
| CPU (or processor) | Provides general-purpose computational power to the system. In games, does computational work to update the game’s state and to create the draw call instructions the GPU will use. | N/A | N/A |
| GPU (or graphics card) | Executes draw call instructions to generate the frame images that will be presented to the user | Lower FPS; lag and stutter | Lower FPS; lag and stutter; missing out on certain visual effects / visual quality as you lower graphics settings to compensate |
| Monitor refresh rate | Refers to how many times per second the monitor can update every pixel on the screen | Visuals that seem less smooth than they might otherwise be (difference may not be perceptible to all users) | With Vsync disabled, frame tearing; with Vsync enabled, lower FPS than you would otherwise achieve |

