Think like a GPU

Why should you think like a GPU?

Processing power is everything in the world of graphics programming and interactive, real-time media. The more processing power you have, the more content you can generate in real time. Increasing hardware capability combined with falling hardware costs has helped make immersive media accessible to the average computer user. It’s incredible that you can run Processing, openFrameworks, and TouchDesigner on consumer hardware and learn to create interactive applications without breaking the bank.

There have been some stalls in hardware development despite the increase in accessibility. CPU manufacturers are often focused on miniaturization and power management to advance their position in the mobile market. You don’t see game-changing advancements in CPUs if you compare the performance of the 2016 Intel i7-7700K with the 2011 Intel i7-2700K.

This is in stark contrast to the development of GPUs over the same period. Here are some of the improvements from the 2011 NVIDIA GeForce GTX 560 to the 2016 NVIDIA GeForce GTX 1080:

  • 8x more GPU memory
  • Almost 2x core speed
  • More than 2x memory bandwidth
  • 4x L2 cache
  • 4x texture fillrate
  • 3x more pixel fillrate

GPUs have made great advances. This is why so much software is now turning toward GPU processing, and graphics programs are no exception. But this shift comes with some workflow changes that may not be obvious. Below are a few ways of thinking that will make working with GPUs easier.

Allocate data once & transfer fewer streams

Memory allocation and data transfer are two elements of your programming to rethink. CPUs are good at being dynamic: they can jump between tasks and have fast access to their RAM. A GPU, though, usually needs data transferred to it before it can do anything, and this transfer of data between the CPU and GPU is comparatively slow. It can also be triggered accidentally if you use up all of your GPU memory and the GPU begins offloading data to the CPU.

A good practice to start with when working with GPUs and shaders is to allocate all the data you’ll need once, up front. Consider this situation:

  • you have 1000 interactive bouncing balls that enter and leave the screen

In CPU programming, you might have no objection to making an arbitrary number of balls and then creating and destroying them as needed. This is not an issue when working purely with the CPU. When working with the GPU, you get a cleaner and more efficient workflow if you allocate your data once. This is a general strategy in GPU programming. Games try to allocate as much as possible on launch. Loading screens exist as a time when the application can dump its memory, allocate all the memory, and transfer all the data needed for the next level.

In the case of the bouncing balls, you would allocate and create all 1000 balls right on launch and then hide the ones that are not needed. As balls are required, they are already ready to go, and their position values are simply updated to reveal them. This is much faster than creating another ball, allocating new memory for it on the GPU, and opening a separate data transfer stream for it.

This technique of allocating data (generally by setting a maximum) allows you to avoid memory fragmentation and reduces the overhead associated with streaming data from the CPU to the GPU.
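To make that concrete, here is a minimal sketch of the idea in CUDA C++. The Ball struct, MAX_BALLS, and the per-frame batched copy are assumptions for illustration, not tied to TouchDesigner or any particular engine: the full pool is allocated and uploaded once at launch, and “spawning” a ball later just means flipping a flag and updating values in a buffer that already exists on the GPU.

    // Minimal sketch of the "allocate once" idea in CUDA C++.
    // The Ball struct, MAX_BALLS, and the per-frame copy are illustrative
    // assumptions, not a specific engine's API.
    #include <cuda_runtime.h>

    #define MAX_BALLS 1000            // allocate for the maximum up front

    struct Ball {
        float x, y;                   // position
        float vx, vy;                 // velocity
        int   active;                 // 0 = hidden, 1 = visible; reuse instead of re-allocating
    };

    int main(void) {
        Ball  host_balls[MAX_BALLS] = {};   // CPU-side copy, filled once at launch
        Ball *dev_balls = nullptr;

        // One allocation and one upload at startup, not per ball per frame.
        cudaMalloc((void **)&dev_balls, MAX_BALLS * sizeof(Ball));
        cudaMemcpy(dev_balls, host_balls, MAX_BALLS * sizeof(Ball),
                   cudaMemcpyHostToDevice);

        // "Spawning" a ball later is just flipping a flag and updating values
        // in the already-allocated buffer -- no new allocation, no new stream.
        host_balls[0].active = 1;
        host_balls[0].x = 0.5f;
        host_balls[0].y = 0.5f;
        cudaMemcpy(dev_balls, host_balls, MAX_BALLS * sizeof(Ball),
                   cudaMemcpyHostToDevice);   // one batched transfer per frame

        cudaFree(dev_balls);
        return 0;
    }

The same pattern carries over to a shader-based workflow: a fixed-size buffer or texture holds every ball, and revealing one means writing new values into a slot that is already allocated, rather than allocating anything new mid-frame.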


Process lots of data at once

Parallel computing may be a term you’ve heard in regard to GPUs. The easiest way to understand it is to realize that a GPU is essentially a large number of very simple CPUs. Take the Intel i7 mentioned earlier, reduce its clock speed and the complexity of its features, and strap hundreds of them onto a board, and you have a GPU. In practice this means that if:

  1. You have lots of data, and
  2. You have a repeatable calculation that needs to happen on every single element of that data set

Ding, ding, ding: this type of work is where GPUs shine. This is exactly what parallel processing is all about. Even though the individual processing units on the GPU are “weaker” than your regular CPU, there are so many of them that they can all perform the same task at the same time across a large data set much faster than your CPU could. This is why there is such a huge benefit to computing things like particle systems on the GPU: you can calculate the positions of millions and millions of particles because you’re repeating the same calculation across a large data set.

It’s equivalent to trying to decide which is more useful:

  1. A small number of very smart people
  2. A very large number of people of average intelligence

So when you’re trying to think like a GPU, you should think in terms of larger data structures. Embrace the larger but less capable workforce. Don’t think in terms of individual integers or floats; think in terms of arrays, lists, and matrices. It is all too common to see applications with many individual control channels, each processed by similar operations on an individual scale. That is CPU-style thinking. Instead, start thinking about a single large array of control values. Cram as much similar data as possible into the largest data set possible. This sets you up to send a single stream of data to the GPU, where you can perform the same operations in parallel on every element of your array.
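As a rough illustration, here is what that looks like as a CUDA kernel, assuming one large array of particle positions and velocities (the particle count, names, and time step are made up for the example). Every thread runs the identical calculation on one element of the array:

    // Minimal CUDA sketch of running the same calculation across one large
    // array. The particle count, names, and time step are illustrative.
    #include <cuda_runtime.h>
    #include <cstdio>

    #define NUM_PARTICLES (1 << 20)   // about a million particles in one array

    // Every thread performs the identical update on one element of the array.
    __global__ void update_positions(float *pos, const float *vel, float dt, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            pos[i] += vel[i] * dt;    // same simple math, repeated in parallel
        }
    }

    int main(void) {
        size_t bytes = NUM_PARTICLES * sizeof(float);
        float *pos = nullptr, *vel = nullptr;

        // Unified memory keeps the sketch short; a real app might manage
        // explicit host and device buffers instead.
        cudaMallocManaged(&pos, bytes);
        cudaMallocManaged(&vel, bytes);
        for (int i = 0; i < NUM_PARTICLES; ++i) { pos[i] = 0.0f; vel[i] = 1.0f; }

        int threads = 256;
        int blocks  = (NUM_PARTICLES + threads - 1) / threads;
        update_positions<<<blocks, threads>>>(pos, vel, 1.0f / 60.0f, NUM_PARTICLES);
        cudaDeviceSynchronize();

        std::printf("pos[0] after one frame: %f\n", pos[0]);
        cudaFree(pos);
        cudaFree(vel);
        return 0;
    }

The kernel body is trivial; the speedup comes from launching it across a million elements at once instead of looping over them one at a time on the CPU.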

Importance of thought process

I specifically chose to write about thinking like a GPU because it’s often the biggest hurdle when developers move from working on the CPU to working on the GPU. They try to be very agile and dynamic with memory, or they transfer lots of small data chunks to be processed repeatedly. There are a ton of CPU idioms that will lead to frustration and full-on programming roadblocks when you’re working with GPUs. The best thing you can do is learn how GPUs work and why they’re efficient. Then you need to be flexible, understand that you’re working with a different beast, and not be afraid to work in a way that may not make sense at first.

I remember writing my first GPU shaders and being utterly dumbfounded by the process. I kept pestering people more knowledgeable than myself just to try to understand it. If you’re able to let go of your old programming habits and take advantage of GPUs, you’ll be able to extend your real-time immersive media capabilities at least tenfold.