An Entity Component System with Data Locality in C++

An Entity Component System with Data Locality in C++

In this article, I will take you through the creation of an ECS that focuses on data-locality in C++. I’ll cover ECS-related topics, the architecture, and go over some cool Template-Metaprogramming ideas as well. This article is intended for people who have made a game or two previously, or anyone that’s interested in bettering their C++ skills. I don’t intend this article to be the be-all and end-all of ECS Architecture; what I’ve developed here was enough to suit my immediate needs.

24/05/2020 – Updated some of the code in the article, there were some interesting bugs due to static initialisation of ID Types that didn’t come up until certain random-order conditions were met. Also replaced a lot of the data storage code so that non-trivially copyable types can be used.

I recently started work on a new hobby project. For this new game, I know I will need to really push a lot of onscreen enemies, player characters, and bystanders. Further to this, I knew that the nature of the game would mean that there’s a lot of behavior that could be assigned to any of these characters.

From previous reading, I knew that the class-hierarchy that I would normally use wouldn’t be sufficient, and that I would need to utilize an Entity Component System (or ECS) that focused on the data-locality aspect of an ECS, as opposed to the flexibility offered by an ECS.

Ideally, I would want both:

  1. Flexibility to create Entities based on the Components given; adding new Components easily
  2. Data Locality as much as possible so that the Systems operating on these Entities could blast through the data

Why write another article on ECS?
There’s a few reasons, firstly, a lot of the material I’ve seen is about the flexibility of ECS as opposed to the Data Locality of ECS. The architecture in those examples, in my view, only goes halfway.

The second reason is that there seems to be only two types of information. Beginners articles like the above, and then hardcore ECS implementations on Github. I thought there was some room for middle-ground.

The third reason is that I wanted to go over some interesting Template Metaprogramming and this seemed like a great time to do that.

And the fourth and final reason is that I enjoy writing 🙂

And because I love to program these things myself, rather than use a well-tested and tried and true library, I opened up my editor and got to work.

Recap of Entity Component System

I can hardly do the topic justice, so I strongly advise you read the Wikipedia article, and read this article as well for good measure. Next read this, and this (and everything else on the gameprogrammingpatterns.com). Then google and read all the other articles you can.

Now that you’ve done that, I’m going to recap a few ideas and terms so that we’re both on the same page as you read this article. I think this is necessary because of the amount of confusing and conflicting information I came across when I was reading about ECS architecture and looking at code.

What is an Entity Component System?

Is an alternative to the more traditional class hierarchy that you used to see in Game Programming with C++.

  • There is a GameObject class which everything inherits from. It contains some data and logic for that data.
  • A Player class inherits from the GameObject class. It therefore features the parents data and logic. It extends this data and logic, perhaps even overriding the logic of the parent.
  • An Enemy class also inherits from the GameObject. It does much the same as the Player, except one class features logic for player movement, the other for enemy movement.
Simplified but typical class structure in your game

There’s a lot of problems with the above approach as your games get more difficult, but the bare-bones summary is:

  • The class hierarchy will inevitably become large and confusing. Some objects will exhibit different behavior because they override the logic of a parent even though they operate on the same data. This is what you want to do, but it can become confusing and unwieldly.
  • The data for every object in your game is scattered throughout memory. You’re chasing pointers looking for all the data before you can perform the logic against it.

This Inheritance-style Object Oriented Programming can be replaced with ECS’s Component-style OOP:

  • A Person Component is created, it consists of nothing but person data
  • Similar components are created for Enemy Component and GameObject Component. They only contain data
  • A Person System is created which only performs logic on any Entity that has a Person Component
  • A similar system is created for Enemy Components
  • A single Entity is created which is given a GameObject and Player Components
  • Multiple entities are created which contain GameObject and Enemy Components
Simplified ECS architecture

The benefits of this style are:

  • Objects in your game are no longer thought of as “A Mushroom-Man is an Enemy, which is also a GameObject and is also Renderable”. Instead, an Object is just whatever it contains.
  • Separation of discrete data from the logic, and operating on these separately is often easier to comprehend and work with
  • Anecdotally it is more flexible, but adding a particular component to an existing entity, you’ve managed to add more to your game with minimal effort. By creating a new system, you add logic to whatever entity has components that match the systems.

Definitions

Let’s go over a few terms/words and set some definitions. Some of these won’t make a lot of sense yet, but you can always refer back.

Component

Is a definition of a data structure. For example:


struct Physics { float x; float y; float velocityX; float velocityY; };

Entity

Is a single thing in your ECS. A player, enemy, bullet, car, rock, tree etc. Entities contain instances of Components.

The more things that are an entity, the bigger the benefit of using ECS.

... Entity myEntity; myEntity.AddComponent<Physics>({30, 49, .2f, .1f}); ...

Here, the Entity is being given a new instance of the Physics Component with the values of 30 and 49 for position and .2f and .1f for velocity.

The collection of Components an Entity has is called the Archetype (I’ve seem some articles refer to it as the Family).

System

Is the logic executor. Things die when their health is zero, cars collide with walls.

A system contains the logic for one or more Components. For example, a collision system will want any components with a physics instance. A kill system may want anything with a health component so that it can remove those entities from the game-world and award points to the player. Finally, a vampire mutation system might want any entity with an infected component and human component.

The actual systems you have in place, and their responsibilities will be determined by you when you’re making your game. Thinking about your game in these terms is a little hard to begin with, but you get the knack of it quickly.

Archetype

The list of Components that an Entity has instances of.

Systems will operate on any Entity whose Archetype is a superset of the Components the System operates on. For example, if an entity has components A, B, and C; and a system operates on entities with components A and C, it will operate on the aforementioned entity because it has both an A and a C (the system just doesn’t care about the B component).


Okay, with all of that out of the way, we can continue on with the actual implementation details.

Memory Layout

We’re going to be seperating entities into archetypes. This means that the data for ComponentA could be in more than one place. It’ll be grouped by archetype. Strictly speaking, this is a performance loss as we have data in separate places, but it affords us an amount of flexibility we otherwise wouldn’t have, without wasting big chunks of memory to do so (or pointer chasing, which would defeat the purpose of laying out memory contiguously anyway).

Here’s how it’s to be laid out:

Memory layout

Note that differrent archetypes can have the same component. A system doesn’t have to want all the components from an archetype in order to operate on it. A system will be invoked once for each archetype it can operate on. A system is only given data that it wants (not all components of the archetype).

Within each archetype, all the component data is stored contiguously, so a system will operate on one or more arrays of contiguous data. Your multiple caches in a modern CPU will cache each array. In my experience, you don’t normally operate on more than a couple of Components at a time.

Identifiers

If you’ve read the articles I linked to above, you will know that you don’t generally have an Entity as an object in your code. Instead you refer to every Entity via an ID, the actual entity and it’s components are typically stored in a manager-type class.

Within the guts of the ECS we also use IDs for Component-Types. Meaning that every instance of the Physics Component shares the same Type-ID.

These can all use the same datatype for the identifier. An integer works perfectly well for us, it’s quick to serialise and sort, uses a known amount of memory, and every one of them is unique. I’m going to use a 32-bit integer, because I’m building with a 32-bit compiler, so it’s the native word-size (slightly faster). If you’re going 64-bit, then use a 64-bit integer.

#include <cstdint> typedef std::uint32_t IDType; typedef IDType EntityID; typedef IDType ComponentTypeID;

We also need an Identifier for every distinct Archetype as well. An Archetype is unique if all of it’s ComponentTypeIDs are unique compared to another Archetype. We can do this with a sorted std::vector (we’ll later guarantee is a unique set):

typedef std::vector<ComponentTypeID> ArchetypeID;

It needs to be unique because Entities cannot have multiple instances of the same component. The reason we’re not using an std::set is for performance reasons, they’re not contiguous in memory.

ID Generation

Now that we know our IDs are all integers (or a set of integers), we need a way to generate them. Instead of using #defines or consts, we’ll do a little template programming:

template<class T> class TypeIdGenerator { private: static IDType m_count; public: template<class U> static const IDType GetNewID() { static const IDType idCounter = m_count++; return idCounter; } }; template<class T> IDType TypeIdGenerator<T>::m_count = 0;

One of the geat things about template classes, is that every time you instantiate a concrete class, you get a new class. Because there’s a static member, it’s a unique static for every instatiation of the template.

Or in other words:

TypeIdGenerator<Component>::m_count

is a different static member compared to

TypeIdGenerator<Entity>::m_count

So, we can use the TypeIDGenerator to generate different IDs for different types. That’s great, but then:

template<class T> class TypeIdGenerator { ... template<class U> static const IDType GetNewID() { static const IDType idCounter = m_count++; return idCounter; } ... };

This template member function is making a static copy of the static m_count and increments it upon every call. What this means is that for each T instantiation, you can generate a new set of IDs for each subsequent U.

The same T and U combination will always provide the same IDType because of the use of static.

The reason we’re doing this is to ensure that we can generate as many unique IDs as possible. I’ve seem some implementations re-use IDs for Entities that have been killed, but I think ensuring that entities have a range of IDs from 0 to 4294967295 is enough. I won’t use more entities than this, but if you think you will… well that’s quite a game.

This also means that a ComponentTypeID could be 424 and that could also be used by an EntityID. This is fine though, as these will never get confused in our implementation.

Component Implementation

When we look at all the different parts we need to build, it’s clear that everything starts with the Component. We know our components are just data. We also know that our component requirements will grow and change over time, so we don’t want to limit whatis available.

How we’ll do this is allow the user (i.e. yourself), unrestricted use of whatever “component” they can come up with. Typically, your componetns will be simple structs like we’ve seen before.

Our ECS can’t work with arbitrary “components” though, and we don’t want the user to be forced to inherit from a common base class (we are trying to avoid inheritance afterall). So instead, we’ll use a template class to wrap around our components when they’re provided to the system (we’ll get into the “provided to” part later).

class ComponentBase {}; template<class C> class Component : public ComponentBase { public: static ComponentTypeID GetTypeID() { return TypeIdGenerator<ComponentBase>::GetNewID<C>(); } };

This is a very small wrapper. We have a base class because we will have arbitrary “components” provided in the template parameter C; by having a non-template base-class we can store any templated version of the child class using a pointer and polymorphism.

All of this is really just to surface the ComponentTypeID. If you re-read the above about the TypdIdGenerator you can see how we’re able to generate a unique ComponentTypeID for every class / struct that we wrap up in Component<YourStructHere>. This isn’t a static method because then we run into the static initialisation order problem.

Entity Implementation

Alright; so if the Component is the simplest thing, and kind of the whole point, then I guess Entity would be the next step, we have to store all these components somewhere right?

No! Remember, an “entity” is just an ID. In the articles I linked to above, it should be clear why; but in case not, let’s recap.

For the purpose of data locality we want to store all of the data for all of the same components next to each other in memory. This gives fast access for the CPU because the component data is generally processed via one of our systems. And that system will want to process all of the data for a component. The system doesn’t give two-shits whether it’s EntityA or EntityB’s data, it just wants to process all of the Physics components as fast as possible.

If all the data for every physics component is all next to each other, the CPU will make use of the cache as best as possible.

And because an Entity is just a bag of components (it doesn’t have logic, all the logic is in the systems), then we don’t really need a class.

This means that we no longer will pass pointers or references to game objects around functions and systems anymore. Instead, we can just pass around integers, and the ECS systems will know what to do. Of course the ECS itself will need to provide ways to access individual components of an entity when needed, but we can do that via making requests to the ECS using the EntityID (we don’t need the entirety of the Entity and it’s constituent components, just the ability to get a component when the need arises).

Here’s the Entity class:

class Entity { public: Entity(ECS& ecs) : m_id(ecs.GetNewID()), m_ecs(ecs) { m_ecs.RegisterEntity(m_id); } template<class C> Entity& Add(const C&& value) { m_ecs.AddComponent(m_id, value); return (*this); } EntityID GetID() const { return m_id; } private: EntityID m_id; ECS& m_ecs; };

A few things to note: we’ll define what ECS is later. The other thing I want to mention is my use of references. I know most game-programming code you see tends to involve pointers, but references give you everything a pointer does, but with some added safety; so I prefer to use them where possible.

What we have here, is at the end of the day, a wrapper around the EntityID. It’s such a wrapper, that its destruction won’t change anything; your entity will still exist. Of course, depending on how you want to handle things, it might be prudent to add a destructor that calls something like ecs.RemoveEntity(m_id);.which in turn deletes the associated data.

Our usage pattern so far will look something like this:

struct Physics { float x; float y; float dX; float dY; } ... ECS ecs; ecs.RegisterComponent<Physics>(); Entity myEntity(ecs); myEntity.AddComponent<Physics>({2.f, 5.f, .34f, .1f}); ... ecs.RemoveEntity(myEntity.GetID());

That’s a pretty simple and easy to use API so far (I know things like “RegisterComponent” aren’t known yet).

The important thing about the Entity class is that it registers the entity with the ECS. Now, this might seem a little arse-backward; and maybe it is. I prefer to do things this way, but it would be just as sensible to have the ECS provide something like CreateEntity which return a new Entity, all nice and registered with it’s unique ID already set.

Many implementations of Entity Component Systems that I’ve seen have a few different “manager” classes. I’ve compressed it all into a single one, ECS, for ease of use. Many of these other implemtnations will ask you to do something like:

EntityID myEntity = ecs.CreateEntity(); ecs.AddComponent<Physics>(myEntity, {2.f, 5.f, .34f, .1f});

These sort of systems… I don’t like it. If you prefer it, do it your way, as you can see, my implementation is just forwarding those calls to the ECS& anyway. Whatever works for you. With my implementation, you can add a Component to an Entity either through the Entity interface or the ECS interface.

Archetype Implementation

typedef std::vector<char> ComponentData; struct Archetype { ArchetypeID type; std::vector<ComponentData> componentData; std::vector<EntityID> entityIds; };

Here we have a little struct that denotes an Archetype. There’s the ID, which we covered before, is a collection of ComponentTypeIDs. We also have a vector or vectors.

The ComponentData, is a byte array. All of the data related to a type of component (i.e. “Physics”) that is used by entities of this Archetype will be stored in this contiguous array. Because an archetype can have more than one Component, we have these in an outer vector.

Technically, we could combine all Component Types into one larger byte vector and move across it in strides. Each stride being the size of the combined components, each stride encompassing all of the component data for a single entity. However, modeern CPUs have multiple cache’s, and so, I’m told, multiple arrays is just as good.

Lastly, an archetype knows all of the entities whose component data it’s holding onto.

There’s a couple things to note here because it’s not obvious:

  1. If two Entities have a Physics Component, but one of them has a Ghost Component, they will be in different Archetypes. This means that their Physics Components won’t be next to each other in memory. We only have data locality for entities of the same archetype.
  2. The outer array of the componentData is synchronised with the ArchetypeID. I.e., the first ComponentTypeID of the ArchetypeID is the first byte-vector of componetnData.
  3. I mentioned that the ArchetypeID is sorted earlier. This is very important because it gives us consistancy between IDs. Imagine if an Entity has an ArchetypeID of {Physics, Ghost} and another entity had an ArchetypeID of {Ghost Physics}, when using the standard == operator of std::vector, these would not be equal. Keeping them sorted ensures that they are equal, regardless of the order the ComponentTypeIDs were added.

Entity Component System Core

Now it’s time to define what the ECS is. Here’s the class definition:

class ECS { private: struct Record { Archetype* archetype; std::size_t index; }; typedef std::unordered_map<EntityID, Record> EntityArchetypeMap; typedef std::vector<Archetype*> ArchetypesArray; typedef std::unordered_map<ComponentTypeID, std::size_t> ComponentIdDataSizeMap; typedef std::vector<SystemBase*> SystemsArray; public: ECS(); EntityID GetNewID(); template<class C> void RegisterComponent(); void RegisterSystem(SystemBase* system); void RegisterEntity(const EntityID entityId); void RunSystems(const float elapsedMilliseconds); Archetype* GetArchetype(const ArchetypeID& id); template<class C> void AddComponent(const EntityID& entityId, const C& value); template<class C> void RemoveComponent(const EntityID& entityId); void RemoveEntity(const EntityID& entityId); private: EntityArchetypeMap m_entityArchetypeMap; ArchetypesArray m_archetypes; EntityID m_entityIdCounter; ComponentIdDataSizeMap m_componentIdDataSizeMap; SystemsArray m_systems; };

It looks like a lot, but it’s not, let’s go through these private members.

typedef std::vector<SystemBase*> SystemsArray; SystemsArray m_systems;

We haven’t covered them yet, but the ECS will keep an array of SystemBase* that it will iterate over when we call RunSystems.

typedef std::unordered_map<ComponentTypeID, std::size_t> ComponentIdDataSizeMap; ComponentIdDataSizeMap m_componentIdDataSizeMap;

We’re going to sacrifice some space for speed by using an std::unordered_map over a std::map. I think we can all agree that the purposes of focusing on Data Locality is for speed concerns over memory concerns (not that you’d have that many components that it would ever matter).

What this provides the ECS, which comes in handy when adding and removing components, is the size of an individual component’s data. For example, when we add a Physics component, we know the data size, because we have access to the type. But by adding this component to an entity, we would be changing it’s Archetype (which remember is set of ComponentTypeIDs), so this will neccessetate moving the data around internally (which we’ll get to later). If we need to move the data of some other component type that the entity already has, we need to know how much data to move, and because we don’t have access to this other type, we need to at least have access to it’s size.

And so; we need to Register the component types that will be used, so that the ECS always knows the size of these types when shifting around data. Technically, you could simply capture the size of a component type when it’s first encountered; but I like having the added safety of the ECS shitting itself and crashing in the cases where an unknown ComponentTypeID is encountered.

EntityID m_entityIdCounter;

Every time we generate a new EntityID we increment this counter. We don’t need to be fancy.

typedef std::vector<Archetype*> ArchetypesArray; ArchetypesArray m_archetypes;

An array of all the possible Archetypes that we’ve come across.

struct Record { Archetype* archetype; std::size_t index; }; typedef std::unordered_map<EntityID, Record> EntityArchetypeMap;

It’s not enough to just track Archetypes, we also need to track which entity is which archetype. Normal separation of responsibilities would declare that this information should exist in the Entity class; but, because I’m going to keep a few of these Entity instances around in my code, and likely want to do something with all of them at once, I thought it best to keep them super-lightweight. Yes, it’s only a pointer and an std::size_t, but each byte you use is one byte that can’t be used to fill the cache with the next entity.

Likewise, this information is already in the Archetype itself! But it’s a lot faster to lookup a single EntityID and know exactly where it’s data is, compared to looking through every array of every archetype.

The Record is telling us not only what Archetype the mapped Entity belongs to, but also the index into that archetypes arrays this entity lives at. You’ve seen how an archetype has the data for the components inside it (remember, the Archetypes ArchetypeID is just a set of it’s ComponentTypeIDs); for us to access the right data for an entity, we need to know how deep into that array we need to go (and the m_componentIdDataSizeMap will tell us the size of that data).

Let’s get into the methods now…

ECS Constructor & GetNewID

ECS::ECS() : m_entityIdCounter(0) {} EntityID ECS::GetNewID() { return m_entityIdCounter++; }

Nothing to talk about really.

Registering a Component

template<class C> void ECS::RegisterComponent() { ComponentTypeID componentTypeId = Component<C>::GetTypeID(); if(m_componentIdDataSizeMap.contains(componentTypeId)) return; // can't re-register a type m_componentIdDataSizeMap.emplace(componentTypeId, sizeof(C)); }

This is how we fill out the m_componentIdDataSizeMap for later. Each component that we want to provide to our entities needs to be registered first. Registration consists of using our TypeIDGenerator from within the Component class to get an ID, and then using that ID to map the data-size.

Register a System

void ECS::RegisterSystem(SystemBase* system) { m_systems.push_back(system); }

We’ll go over Systems later, but for now, all we’re doing is holding onto a system pointer. It’s important that you understand that the System itself can’t be destroyed after providing it to the ECS. The ECS won’t manage the system, but have a record of it. You could adjust this so that it takes ownership of the memory associated.

Register an Entity

void ECS::RegisterEntity(const EntityID entityId) { Record dummyRecord; dummyRecord.archetype = nullptr; dummyRecord.index = 0; m_entityArchetypeMap[entityId] = dummyRecord; }

This is another straight-forward one. Remember, the Entity itself calls this, so you could make it private and have Entity as a friend class if you want.

When we first register an Entity, it doesn’t have an Archetype yet, because it has no components. So we’ll create a “dummy” record. We’re doing this because I think it’s important that a registered entity is kept track of, somehow. Otherwise, why register it.

You’ve probably realised you could skip some of this registration stuff. However, I like it because it helps me pick up mistakes and ensures that I’ve thought about my components and entities before throwing them around everywhere. Your mileage may vary.

Run Systems

This would typically be called every frame of your game.

void ECS::RunSystems(const float elapsedMilliseconds) { for(SystemBase* system : m_systems) { const ArchetypeID& key = system->GetKey(); for(Archetype* archetype: m_archetypes) { if(std::includes(archetype->type.begin(), archetype->type.end(), key.begin(), key.end())) { system->DoAction(elapsedMilliseconds, archetype); } } } }

We haven’t covered the code yet, but every systen has a Key. It’s sort of like an Archetype… but it’s more like, “any entity which features all of the components in my key will qualify”. It doesn’t matter if that Entity has other components, what matters is that it has every component that the system is interested in.

Or put another way, the Entities Components are a superset of the Systems Components.

So what’s happening is:

  1. For every System
    1. Get the Key
    2. For every Archetype
      1. Check if the System Key is a subset of the Archetype
      2. If it is, provide the system with this archetype

We will go over what DoAction is later.

Retrieve an Archetype

This is more of a helper function rather than one that would be called directly, so it should probably be private.

Archetype* ECS::GetArchetype(const ArchetypeID& id) { for(Archetype* archetype : m_archetypes) { if(archetype->type == id) return archetype; } // archetype doesn't exist, so create a new one Archetype* newArchetype = new Archetype; newArchetype->type = id; m_archetypes.push_back(newArchetype); // add an empty array for each component in the type for(ArchetypeID::size_type i = 0; i < id.size(); ++i) newArchetype->componentData.push_back({}); return newArchetype; }

It’s a helper function because if the requested ArchetypeID doesn’t exist, it will create a new one. When it creates a new one, it also creates, within the Archetype, an empty vector, within a vector. You’ll see Archetype later, but this is important because the componentData of the Archetype needs to be synchronised with the ArchetypeID (remember, the archetype Id is just a list of ComponentTypeIDs).

Add Component

Now we’re getting into some real meat:

template<class C, typename... Args> C* ECS::AddComponent(const EntityID& entityId, Args&&... args) { ComponentTypeID compTypeId = Component<C>::GetTypeID(); const std::size_t& compDataSize = m_componentIdDataSizeMap[compTypeId]; Record& record = m_entityArchetypeMap[entityId]; Archetype* oldArchetype = record.archetype; C* newComponent = nullptr; if(oldArchetype) { if(std::find(oldArchetype->type.begin(), oldArchetype->type.end(), compTypeId) != oldArchetype->type.end()) { // this entity already contains this component, we can't have // multiple so just exit return nullptr; } ArchetypeID newArchetypeId = oldArchetype->type; newArchetypeId.push_back(compTypeId); std::sort(newArchetypeId.begin(), newArchetypeId.end()); Archetype* newArchetype = GetArchetype(newArchetypeId); // go through each component of the new type. If the component // is in the original archetype, memcpy the data from the old // archetype into the new ones end. // if the components is not in the original, then it's the new data // that's being provided for(std::size_t j = 0; j < newArchetypeId.size(); ++j) { const std::size_t& newCompDataSize = m_componentIdDataSizeMap[newArchetypeId[j]]; std::size_t currentSize = newArchetype->componentData[j].size(); std::size_t newSize = currentSize + newCompDataSize; newArchetype->componentData[j].resize(newSize); ArchetypeID oldArchetypeId = oldArchetype->type; for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { if(oldArchetype->type[i] == newArchetypeId[j]) { // this ComponentTypeID of the new Archetype is sourced // from the old Archetype std::copy(&oldArchetype->componentData[i][record.index], &oldArchetype->componentData[i][record.index] +newCompDataSize, &newArchetype->componentData[j][currentSize]); goto cnt; // lol what are you doing?! } } // we couldn't find the ComponetnTypeId in the old archetype, so // that can only mean it's the new data that we will source from // the value being added newComponent = new (&newArchetype->componentData[j][currentSize]) C(std::forward<Args>(args)...); cnt:; } // we've now copied data from the old archetype to the new archetype // now we need to remove the data from the old archetype for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { const std::size_t& oldCompDataSize = m_componentIdDataSizeMap[oldArchetype->type[i]]; ComponentData::iterator startItr = oldArchetype->componentData[i].begin() + record.index; ComponentData::iterator endItr = startItr + oldCompDataSize; oldArchetype->componentData[i].erase(startItr, endItr); } std::vector<EntityID>::iterator willBeRemoved = std::find(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId); std::for_each(willBeRemoved, oldArchetype->entityIds.end(), [this](const EntityID& eid) { Record& moveR = m_entityArchetypeMap[eid]; --moveR.index; }); oldArchetype->entityIds.erase( std::remove(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId),oldArchetype->entityIds.end()); newArchetype->entityIds.push_back(entityId); record.index = newArchetype->entityIds.size()-1; record.archetype = newArchetype; } else { ArchetypeID newArchetypeId(1,compTypeId); Archetype* newArchetype = GetArchetype(newArchetypeId); std::size_t currentDataSize = newArchetype->componentData[0].size(); newArchetype->componentData[0].resize(currentDataSize + compDataSize); newComponent = new (&newArchetype->componentData[0][currentDataSize]) C(std::forward<Args>(args)...); newArchetype->entityIds.push_back(entityId); record.index = newArchetype->entityIds.size()-1; record.archetype = newArchetype; } return newComponent; }

Let’s go through it section by section:

ComponentTypeID compTypeId = Component<C>::GetTypeID(); std::size_t& compDataSize = m_componentIdDataSizeMap[compTypeId]; Record& record = m_entityArchetypeMap[entityId]; Archetype* oldArchetype = record.archetype;

The first part gets the ID of this Component, by putting it into our Component wrapper. We then look for the data size of this component in our map; this is where the program will crash if you haven’t registered a component previously (amend this behaviour if you want).

Next, we grab this entities current Record from which we can quickly grab a pointer to the Archetype it belongs to… or can we?

if(oldArchetype) // not the first component of this entity { ... } else // the first component of this entity is easier { ArchetypeID newArchetypeId(1,compTypeId); Archetype* newArchetype = GetArchetype(newArchetypeId); std::size_t currentDataSize = newArchetype->componentData[0].size(); newArchetype->componentData[0].resize(currentDataSize + compDataSize); newComponent = new (&newArchetype->componentData[0][currentDataSize]) C(std::forward<Args>(args)...); newArchetype->entityIds.push_back(entityId); record.index = newArchetype->entityIds.size()-1; record.archetype = newArchetype; }

Because of the “dummy” Record we created when registering the entity, it’s quick to know if it’s the first time we’ve added a component to it or not. When it’s the first time, we can execute a smaller chunk of code (which we’ll go through first).

On Line 7 above, we create this new ArchetypeID by creating a new vector featuring this one ComponentTypeId. Within the Archetype on line 11 we’re resizing this new vector (the only one in this new Archetype, to be large enough to hold the data of this new component.

Next we forward our constructor arguments for the type C to a placement new call into the memory we just created in the componentData[0].

We then add the entity to the archetype’s list and finally update the entities Record so that next time we add a component, we know where it’s data is held.

Back into the section where there’s an existing Archetype for this entity:

if(oldArchetype) // not the first component of this entity { if(std::find(oldArchetype->type.begin(), oldArchetype->type.end(), compTypeId) != oldArchetype->type.end()) { return; } ArchetypeID newArchetypeId = oldArchetype->type; newArchetypeId.push_back(compTypeId); std::sort(newArchetypeId.begin(), newArchetypeId.end()); Archetype* newArchetype = GetArchetype(newArchetypeId); for(std::size_t j = 0; j < newArchetypeId.size(); ++j) { const std::size_t& newCompDataSize = m_componentIdDataSizeMap[newArchetypeId[j]]; std::size_t currentSize = newArchetype->componentData[j].size(); std::size_t newSize = currentSize + newCompDataSize; newArchetype->componentData[j].resize(newSize); ArchetypeID oldArchetypeId = oldArchetype->type; for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { if(oldArchetype->type[i] == newArchetypeId[j]) { std::copy(&oldArchetype->componentData[i][record.index], &oldArchetype->componentData[i][record.index] +newCompDataSize, &newArchetype->componentData[j][currentSize]); goto cnt; } } newComponent = new (&newArchetype->componentData[j][currentSize]) C(std::forward<Args>(args)...); cnt:; } for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { const std::size_t& oldCompDataSize = m_componentIdDataSizeMap[oldArchetype->type[i]]; ComponentData::iterator startItr = oldArchetype->componentData[i].begin() + record.index; ComponentData::iterator endItr = startItr + oldCompDataSize; oldArchetype->componentData[i].erase(startItr, endItr); } std::vector<EntityID>::iterator willBeRemoved = std::find(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId); std::for_each(willBeRemoved, oldArchetype->entityIds.end(), [this](const EntityID& eid) { Record& moveR = m_entityArchetypeMap[eid]; --moveR.index; }); oldArchetype->entityIds.erase( std::remove(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId),oldArchetype->entityIds.end()); newArchetype->entityIds.push_back(entityId); record.index = newArchetype->entityIds.size()-1; record.archetype = newArchetype; }

It looks bigger, but if you focus on the loop it should make sense, much of what we’re doing here, we were doing before, this is just a little different because there’s existing data we need to worry about. If it’s still confusing, remember that by adding a new component, we are guaranteed to be changing the Archetype.

I’ve highlighted a couple of key bits. First, we sort the ComponentTypeIDs when constructing the new ArchetypeID. Secondly, we use a gulp goto statement. I know I know, they’re the worst thing ever; but honestly, I can’t think of a better way of handling a loop like this.

It boils down to:

  1. If this is an old component from the old archetype
    • Make room in the new archetypes Component Data to fit the old data
    • Copy the old data across
  2. If this is the new component
    • Make room in the new archetypes Component Data to fit the new data (same line of code as with old components)
    • Placement new the new component with the provided constructor arguments
  3. Delete old component data from the old archetype and fix the old archetypes entityID records so that they have the correct index

For the rest of the code, the snippet at the start of this section contains all the comments which should help you if you need it.

Remove Component

This one is very similar to AddComponent, except it’s the opposite way… hahaha. We know that the Archetype of the entity has changed, so we need to shift data again. In my real code, a lot of this is in some shared helper functions, but I’ve pieced it back together for the purposes of this article:

template<class C> void ECS::RemoveComponent(const EntityID& entityId) { ComponentTypeID compTypeId = Component<C>::GetTypeID(); if(m_entityArchetypeMap.find(entityId) == m_entityArchetypeMap.end()) return; // it doesn't exist Record& record = m_entityArchetypeMap[entityId]; Archetype* oldArchetype = record.archetype; if(!oldArchetype) return; // there's no components anyway if(std::find(oldArchetype->type.begin(), oldArchetype->type.end(), compTypeId) == oldArchetype->type.end()) { // this entity doesn't have this component return; } // find the new archetypeId by removing the old ComponentTypeId ArchetypeID newArchetypeId = oldArchetype->type; newArchetypeId.erase(std::remove(newArchetypeId.begin(), newArchetypeId.end(), compTypeId), newArchetypeId.end()); std::sort(newArchetypeId.begin(), newArchetypeId.end()); Archetype* newArchetype = GetArchetype(newArchetypeId); // go through each component of the new type. If the component // is in the original archetype, memcpy the data from the old // archetype into the new ones end. for(std::size_t j = 0; j < newArchetypeId.size(); ++j) { const std::size_t& newCompDataSize = m_componentIdDataSizeMap[newArchetypeId[j]]; std::size_t currentSize = newArchetype->componentData[j].size(); std::size_t newSize = currentSize + newCompDataSize; newArchetype->componentData[j].resize(newSize); ArchetypeID oldArchetypeId = oldArchetype->type; for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { if(oldArchetype->type[i] == newArchetypeId[j]) { // this ComponentTypeID of the new Archetype is sourced // from the old Archetype std::copy(&oldArchetype->componentData[i][record.index], &oldArchetype->componentData[i][record.index] +newCompDataSize, &newArchetype->componentData[j][currentSize]); break; } } } // we've now copied data from the old archetype to the new archetype // now we need to remove the data from the old archetype for(std::size_t i = 0; i < oldArchetype->type.size(); ++i) { C* oldComponent = GetComponent<C>(entityId); oldComponent->~C(); const std::size_t& oldCompDataSize = m_componentIdDataSizeMap[oldArchetype->type[i]]; ComponentData::iterator startItr = oldArchetype->componentData[i].begin() + record.index; ComponentData::iterator endItr = startItr + oldCompDataSize; oldArchetype->componentData[i].erase(startItr, endItr); } // each entity in the old archetypes entityIds has an index 1 less now std::vector<EntityID>::iterator willBeRemoved = std::find(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId); std::for_each(willBeRemoved, oldArchetype->entityIds.end(), [this](const EntityID& eid) { Record& moveR = m_entityArchetypeMap[eid]; --moveR.index; }); oldArchetype->entityIds.erase( std::remove(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId),oldArchetype->entityIds.end()); newArchetype->entityIds.push_back(entityId); record.index = newArchetype->entityIds.size()-1; record.archetype = newArchetype; }

Remove Entity

The last piece of this puzzle, at least for now, is removing an entity. This works much the same as removing a component, except you don’t need a new archetype.

void ECS::RemoveEntity(const EntityID& entityId) { if(m_entityArchetypeMap.find(entityId) == m_entityArchetypeMap.end()) return; // it doesn't exist Record& record = m_entityArchetypeMap[entityId]; Archetype* oldArchetype = record.archetype; oldArchetype->entityIds.erase( std::remove(oldArchetype->entityIds.begin(), oldArchetype->entityIds.end(), entityId),oldArchetype->entityIds.end()); m_entityArchetypeMap.erase(entityId); }

Super important that you see that you need to remove the components before you remove the entity. If you don’t, the entities data will stay sitting around in archetypes. This is because there’s no way to retrieve the type of the component in order to call its destructor. This code could be updated to make this possible, but it would require another level of indirection around the component data being stored so that the it inherits from a base class with a virtual destructor.


We’ve now covered everything except Systems and this is where things get cool.

The Goal of our ECS API

Let’s expand our example from earlier. At the end of the day, I would like to be able to write a program like this:

struct Position { float x; float y; }; struct Velocity { float x; float y; }; struct Randomness { float a; }; int main() { ECS ecs; ecs.RegisterComponent<Position>(); ecs.RegisterComponent<Velocity>(); ecs.RegisterComponent<Randomness>(); Entity ent1(ecs); ent1.Add<Position>({1,2}); ent1.Add<Velocity>({.5f,.5f}); ent1.Add<Randomness>({.25f}); Entity ent2(ecs); ent2.Add<Position>({5,24}); ent2.Add<Randomness>({0.8f}); System<Position,Velocity> system(ecs); system.Action([](const float elapsedMilliseconds, const std::vector<EntityID>& entities, Position* p, Velocity* v) { for(std::size_t i = 0; i < entities.size(); ++i) { std::cout << i << " PV -----------------\n" << "P: (" << p[i].x << ',' << p[i].y << ")\n" << "V: {" << v[i].x << ',' << v[i].y << ")" << std::endl; p[i].x += v[i].x * (elapsedMilliseconds / 1000.f); p[i].y += v[i].y * (elapsedMilliseconds / 1000.f); } }); System<Randomness, Position> system2(ecs); system2.Action([](const float elapsedMilliseconds, const std::vector<EntityID>& entities, Randomness* r, Position* p) { for(std::size_t i = 0; i < entities.size(); ++i) { p[i].x -= r[i].a; std::cout << i << " RP -----------------\n" << "R: (" << r[i].a << ")\n" << "P: {" << p[i].x << ',' << p[i].y << ")" << std::endl; } }); for(int i = 0; i < 3; ++i) ecs.RunSystems(1000); ecs.RemoveComponent<Randomness>(ent2.GetID()); for(int i = 0; i < 3; ++i) ecs.RunSystems(1000); ecs.RemoveEntity(ent1.GetID()); for(int i = 0; i < 3; ++i) ecs.RunSystems(1000); return 0; }

That’s the goal. Where I think the beauty is where I’ve highlighted. You simply provide the types you want to operate on, and provide a function of the form void(time pased, array of matched entities, array of components...). It’s important to remember that this is called per-Archetype that matches, but it’s still pretty sweet.

Here’s the System declaration.

class SystemBase { public: virtual const ArchetypeID& GetKey() const = 0; virtual void DoAction(const float elapsedTime, Archetype* archetype) = 0; }; template<class... Cs> class System : public SystemBase { public: friend class ECS; typedef std::function<void(const float, const std::vector<EntityID>&, Cs*...)> Func; System(ECS& ecs); virtual const ArchetypeID& GetKey() const override; void Action(Func func); protected: template<std::size_t Index1, typename T, typename... Ts> std::enable_if_t<Index1==sizeof...(Cs)> DoAction(const float elapsedMilliseconds, const ArchetypeID& archeTypeIds, const std::vector<EntityID>& entityIDs, T& t, Ts... ts); template<std::size_t Index1, typename T, typename... Ts> std::enable_if_t<Index1 != sizeof...(Cs)> DoAction(const float elapsedMilliseconds, const ArchetypeID& archeTypeIds, const std::vector<EntityID>& entityIDs, T& t, Ts... ts); virtual void DoAction(const float elapsedMilliseconds, Archetype* archetype) override; ECS& m_ecs; Func m_func; bool m_funcSet; };

System Interface

We know from looking at the ECS that we need to keep all the systems in a single array. So the base class for these is the interface/abstract class below.

class SystemBase { public: virtual const ArchetypeID& GetKey() const = 0; virtual void DoAction(const float elapsedTime, Archetype* archetype) = 0; };

The Action Function

This typedef defines the signature of the function you provide when setting the “action” of the system:

typedef std::function<void(const float, const std::vector<EntityID>&, Cs*...)> Func;

Because the System class is using a variadic template, we need to pass that variadic template down to the function format.

template<class... Cs> void System<Cs...>::Action(Func func) { m_func = func; m_funcSet = true; }

Constructor

template<class... Cs> System<Cs...>::System(ECS& ecs, const std::uint8_t& layer) : m_ecs(ecs), m_funcSet(false) { m_ecs.RegisterSystem(layer, this); }

This is where we register ourselves with the ECS. Again, you could have the ECS serve as the creation point for the System, like you could for the Entity, but I prefer it this way.

Getting the Key

Retrieving the key is simple enough:

template<class... Ts> ArchetypeID sort_keys(ArchetypeID types) { std::sort(types.begin(), types.end()); return types; } template<class... Cs> ArchetypeID System<Cs...>::GetKey() const { return sort_keys({{Component<Cs>::GetTypeID()...}}); }

Firstly, there’s a little helper function that uses std::sort, like we’ve used before, to sort the keys provided to the system via the variadic template. Think about it, I could define a System that operates on Position & Velocity instead of Velocity & Position. This ensure that the keys are all ordered.

I’ve mentioned before the ArchetypeID should also be a set, you could add a better constraint here if you wanted. I.e. a System can’t operate on Position & Position components.

Do Action 1

Let’s go over these in reverse order, starting from the bottom, this is the DoAction that the ECS calls.

template<class... Cs> void System<Cs...>::DoAction(const float elapsedMilliseconds, Archetype* archetype) { if(m_funcSet) DoAction<0>(elapsedMilliseconds, archetype->type, archetype->entityIds, archetype->componentData); }

Seems easy enough. Pulls out the parts from the Archetype*, but it also provides a parameter to the next template, 0.

Do Action 2

template<class... Cs> template<std::size_t Index, typename T, typename... Ts> std::enable_if_t<Index != sizeof...(Cs)> System<Cs...>::DoAction(const float elapsedMilliseconds, const ArchetypeID& archeTypeIds, const std::vector<EntityID>& entityIDs, T& t, Ts... ts) { using IthT = std::tuple_element<Index, std::tuple<Cs...>>::type; std::size_t index2 = 0; ComponentTypeID thisTypeCS = Component<IthT>::GetTypeID(); ComponentTypeID thisArchetypeID = archeTypeIds[index2]; while(thisTypeCS != thisArchetypeID && index2 < archeTypeIds.size()) { ++index2; thisArchetypeID = archeTypeIds[index2]; } if(index2 == archeTypeIds.size()) { throw std::runtime_error ("System was executed against an incorrect Archetype"); } DoAction<Index+1>(elapsedMilliseconds, archeTypeIds, entityIDs, t, ts..., reinterpret_cast<IthT*>(&t[index2][0])); }

First off, note that this template is only going to be instantiated, if the provided Index is not the sizeof...(Cs). sizeof...(Cs) returns the number of template parameters in the parameter pack Cs. So if we created a System<Position> the result is 1, if we created System<Position,Velocity> the result is 2.

Next, we have a variable number of parameters, again denoted by a template parameter pack, ts... This is going to be used recursively as we reinterpret the componentData arrays from Archetype into the original data types denoted by Cs.

Every time we reinterpret them, were going to add them to the end of this function and call it again. Each time we call it, we increment Index. Above the recursive call, we’re going to try and find a ComponentTypeID from the ArchetypeID (remember it’s a vector of ComponenttypeIDs) that matches the ComponentTypeID of the type in CS that’s in index Index.

Retrieving the type from a template parameter pack is provided by using IthT = std::tuple_element>::type;.

We are making an assumption that we will find a match!

Do Action 3

After calling the second DoAction enough times, there will come a point when the Index will be equal to the sizeof...(Cs), when this happens, our third DoAction will kick in.

template<class... Cs> template<std::size_t Index, typename T, typename... Ts> std::enable_if_t<Index==sizeof...(Cs)> System<Cs...>::DoAction(const float elapsedMilliseconds, const ArchetypeID& archeTypeIds, const std::vector<EntityID>& entityIDs, T& t, Ts... ts) { m_func(elapsedMilliseconds, entityIDs, ts...); }

This one has the same signature as the others, but now, we’re ignoring the t paramter (the original componentData arrays, and instead are only interested in the reinterpreted arrays of ts. These have been reinterpreted to the types of Cs.

Now that function call that we provided when we set the Action doesn’t need to know about any of the internal wrappers of the ECS.

Pretty cool huh?

If it seems confusing remember:

  1. The data is being stored in std::vector<char>
  2. In order to provide that data to the system action, we want to provide it as the original data types
  3. These recursive templates are going through each type the System wants (ECS handled making sure that the archetype passed to the system has all the required types) and reinterpreting the std::vector<char> into the real type
    1. It does this by iterating over the types until they match (based on ComponentTypeID)
    2. Reinterpreting that std::vector<char> and appending it to the ts... variable.
    3. The ts... variables keep getting passed along in every iteration until we get to the end

Further Work

There’s a few more things that could be done to improve this:

  1. Get a component from an Entity ID – Something like template<class C> C* GetComponent(const EntityID& eid)
  2. Remove an entity and all components – instead of needing to remove all Components first
  3. Destruct – right now, you’re leaking any types you don’t remove at the end. For me this is okay; but I’d love to figure out how to keep track of types used to deconstruct them properly (and remove the entity in number 2 above)
  4. Query a list – Get all the Entities that have the fllowing components. Something like template<class... Cs> std::vector<EntityID> GetEntitiesWith()

Well, that’s it for this article. I hope you learnt something new and I hope you’ve improved your skills. If you’d like to ask questions, drop a comment below or head to the forums. I’d love to know what you think and if there’s any room from improvement on the design; like I said, it serves my needs, but I’m always interested in getting better myself!

1 thought on “An Entity Component System with Data Locality in C++”

Leave a Comment