Building an ECS #2: Archetypes and Vectorization

This is the second in a series of posts about the guts of <a href="https://github.com/SanderMertens/flecs" rel="noopener ugc nofollow" target="_blank">Flecs</a>, an Entity Component System for C and C++. Each post covers a different part of the design and turns it inside out, so that by the end of the post you will have enough information to implement it yourself. The <a href="https://ajmmertens.medium.com/building-an-ecs-1-where-are-my-entities-and-components-63d07c7da742" rel="noopener">first entry</a> explained how we can efficiently keep track of which components an entity has. In this second entry we will explore one way we can store the component data. ECS is often touted for its performance benefits, and after reading this entry you will know why.

<h2>Arrays, arrays, arrays</h2>

As a rule of thumb, if you want to get things done fast, use arrays. The reason is that iterating over an array has a very predictable memory access pattern. CPUs take advantage of this by prefetching the data they expect you to access next from RAM into the CPU cache. If memory access is effectively random, as happens in object-oriented code when each object is allocated separately, the CPU can’t predict which data is needed next. This leads to junk data being loaded into the CPU cache, which in turn means more round trips to RAM. Accessing RAM is much slower than accessing the CPU cache, which can cause measurable slowdowns in application code. For a sense of scale, the following diagram shows approximately how many bunnies can travel from different storages to the CPU in the same time:

[Diagram: bunnies traveling from different storages to the CPU]
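To make the array argument concrete, here is a minimal sketch in C that stores the same data in two ways: once as a contiguous array, and once as separately allocated objects linked by pointers. The `Position` and `Node` types and all function names are hypothetical and made up for illustration; none of this is Flecs code. Both loops compute the same sum, and only the memory access pattern differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical component type, used only for this illustration. */
typedef struct Position {
    float x, y;
} Position;

/* Contiguous layout: every element sits at a fixed stride from the
 * previous one, so the address of the next access is predictable. */
float sum_x_contiguous(const Position *positions, size_t count) {
    assert(positions != NULL || count == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < count; i++) {
        sum += positions[i].x;
    }
    return sum;
}

/* Scattered layout: each object is its own heap allocation, and the
 * address of the next object is only known after loading this one. */
typedef struct Node {
    Position position;
    struct Node *next;
} Node;

float sum_x_scattered(const Node *head) {
    float sum = 0.0f;
    for (const Node *n = head; n != NULL; n = n->next) {
        sum += n->position.x;
    }
    return sum;
}

/* Frees every node in the list. */
void free_scattered(Node *head) {
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list whose x values are 1, 2, ..., count, one malloc per node.
 * Returns NULL if count is 0 or if an allocation fails. */
Node *make_scattered(size_t count) {
    Node *head = NULL;
    for (size_t i = count; i > 0; i--) {
        Node *n = malloc(sizeof *n);
        if (n == NULL) {
            free_scattered(head);
            return NULL;
        }
        n->position.x = (float)i;
        n->position.y = 0.0f;
        n->next = head;   /* prepend, so the final order is 1..count */
        head = n;
    }
    return head;
}
```

Keep in mind that separate `malloc` calls do not guarantee scattered memory. On a fresh heap an allocator will often hand out neighboring blocks, so a small benchmark built on this sketch may show little difference. The gap tends to appear in long-running programs, where allocations and frees are interleaved and objects end up spread across the heap.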