Paul Engine

Progress Report: Deferred Rendering and Render Passes in Paul Engine

Date published: 27/07/2025

This post will go into detail on how I designed the initial implementation of the modular render pass and frame renderer system in Paul Engine. To evaluate the design, I will be putting it to the test by building a deferred renderer with it and proposing improvements in various areas.

Intro

Currently, rendering in Paul Engine is limited to an abstract renderer class that can be used to submit meshes for drawing. There is no way to define an actual render pass other than making raw calls to the renderer. The engine needs a way to define independent render passes, so that renderer logic and graphics features can be expanded upon in a clean and efficient way.

How would we set up rendering logic right now? I guess we could write some code in the main update loop like this:


// Pseudo-code: this does not represent how the engine and the renderer
// are actually architected, it is just a quick example to give you an
// idea of the problem
void OnUpdate(const Timestep timestep)
{
  // Process inputs
  // ...

  // Update scene
  // ...
  
  // Render
  Renderer::BeginScene(m_ActiveCamera);
  for (const Mesh& m : m_ActiveScene) // iterate by reference to avoid copying each mesh
  {
    Renderer::SubmitMesh(m);
  }
  Renderer::EndScene();
}

But what if we want to extend this logic into multiple render passes? For example, a simple post processing pass. Again, we could just write more logic here, maybe even write a function to avoid a messy main loop:


void MainRenderPass()
{
  Renderer::BeginScene(m_ActiveCamera);
  for (const Mesh& m : m_ActiveScene) // iterate by reference to avoid copying each mesh
  {
    Renderer::SubmitMesh(m);
  }
  Renderer::EndScene();
}

void PostProcessPass()
{
  Renderer::BeginScene(nullptr);
  // Bind the post-process shader before submitting the quad that uses it
  m_PostProcessShader->Bind();
  Renderer::SubmitDefaultQuad(m_ScreenTexture);
  Renderer::EndScene();
}

void OnUpdate(const Timestep timestep)
{
  // Process inputs
  // ...

  // Update scene
  // ...

  // Render
  MainRenderPass();
  PostProcessPass();
}

Okay, we've added another render pass to the main loop. But what if the player wants to disable the post processing effect? Let's add an if statement:


void OnUpdate(const Timestep timestep)
{
  // Process inputs
  // ...

  // Update scene
  // ...

  // Render
  MainRenderPass();

  if (PlayerOptions::IsPostProcessingEnabled())
  {
    PostProcessPass();
  }
}

Problem solved! Or... maybe not. What if we have a long list of potential render passes that could be active in a render pipeline?

First of all, this can get very messy very quickly. Scrolling through a sea of if statements trying to find the render pass you're looking for will get annoying fast, and every time you need to extend the render pipeline you will wish there was a better way to set this up (trust me).
Furthermore, what if a future render pass requires a previous render pass to have been executed? What if the hypothetical render pass D should only run if render pass A and B were executed AND render pass C was not? You'll find yourself in a nightmare of ghoulish if statement conditions and branches within branches. What about inputs? What if the texture used in a render pass needs to be changed? What if we are using multiple framebuffers? Where should all of these resources exist? Should the main update loop be aware of the bloom mip chain? Should various shadow map textures be passed to the update function?
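
To make the dependency problem concrete, here is a minimal sketch of the hypothetical "render pass D" case. The `PassToggles` struct and the `RunFrame` function are names invented for this illustration and are not part of Paul Engine. Passes are represented only by their names so that the control flow is easy to follow.

```cpp
#include <string>
#include <vector>

// Hypothetical per-frame switches, invented for this example only.
struct PassToggles
{
  bool runA;
  bool runB;
  bool runC;
};

// Returns the names of the passes that would execute, in order.
// A real pipeline would issue draw calls instead of recording names.
std::vector<std::string> RunFrame(const PassToggles& toggles)
{
  std::vector<std::string> executed;

  if (toggles.runA) { executed.push_back("A"); }
  if (toggles.runB) { executed.push_back("B"); }
  if (toggles.runC) { executed.push_back("C"); }

  // Pass D depends on the outcome of three earlier decisions, so its
  // condition has to repeat all of them. Every new dependency makes
  // conditions like this longer and easier to get wrong.
  if (toggles.runA && toggles.runB && !toggles.runC)
  {
    executed.push_back("D");
  }

  return executed;
}
```

Calling `RunFrame({true, true, false})` records A, B and D, while enabling C as well drops D from the list. Even in this toy version, the calling code has to know about every pass and every relationship between them, and that is before any textures or framebuffers enter the picture.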

Second of all, whilst the condition in the if statement may not be computationally expensive, there may be a performance impact from the simple existence of the if statement. Modern CPUs try to guess the outcome of a conditional branch ahead of time through a built-in optimisation called "branch prediction". Now, it's important to note that I said "may". When the CPU predicts the branch correctly, the cost is negligible. When the prediction is incorrect, the work it started speculatively has to be thrown away, which can have a notable impact on performance. This isn't to say that all branches are evil. Performance profiling will tell you whether or not a branch is worth worrying about, and it's very likely that a branch is completely fine for applications outside of extremely performance-sensitive fields like high-frequency trading. In the case of game engines, performance is important, yes. But there are much bigger fish to fry in architecture design than micro-optimisations like avoiding branch mispredictions, at least right now (early in development). However, the cost is important to be aware of, and I personally think unnecessary branching should be avoided when reasonably possible. For more details on branch prediction, read this article by John Farrier.

Either way, performance impact or not, this is a mess that needs cleaning up.

If you want an example of how bad this can get, I can show you the deferred renderer I set up in the old version of Paul Engine (pre-rewrite / teardown). Take a look at DeferredPipeline.cpp here.

Whilst it may not look like the worst thing in the world, you'll have to take my word for it when I say that maintaining this thing was a huge headache. Notice all of the raw OpenGL calls, the messy parameters (why does the deferred renderer need a collision manager?) and the constant calls into singletons for various resources.