Wednesday, June 7, 2017

Array of Structures and Structures of Arrays


It can be time consuming and painful to recode data structures in a program to better improve efficiency by optimising the data layout to group data together which will be accessed together.

The goal is to improve cache utilisation. You don't want to iterate an array of structures where you are only accessing some of the members of each item. Say each item in the array is a struct of 32 bytes in size, but we only access 4 bytes of each one as we loop. We have loaded up our cache lines with loads of data we aren't even going to touch resulting in sub-optimal efficiency. An alternative approach in this scenario is to re-arrange the data in to a structure holding arrays for each member. When accessing the one member at a time when iterating, the cache will be used more efficiently as the cache lines will be filled up with only the data we are accessing. Alternatively re-writing the code to work on each item as a whole at a time might be an option, but it also might not be possible or be sub-optimal.

Conversely if the data is arranged as structures of arrays and when iterating multiple members are accessed at the same time, there might be related problems, but it is less likely. You still may want to try both approaches to see what works best for given data structures and their access patterns.

But it's not trivial in C/C++ to rearrange the data like this without having to re-write any code which deals with this data. You can't simply flick a switch and the compiler automagically figures out how to re-arrange the code generation to access the data in the alternative layout. Being able to do that would be pretty cool. Jonathan Blow claims in his Jai programming language to have the ability to do just that. Currently Jai isn't released so all we know is from presentations of this language, but it appears to work by decorating the declaration of the array with something which says how it will be laid out and the language generates the appropriate code without the programmer needing to change the code which is accessing it.

Now I would like to be able to do something similar in C/C++, so I had a bit of a think about it and have come up with this basic idea on how it might be possible to achieve something that comes close. Unfortunately to make it work requires some macros and templates, but nothing too fancy.


Here is the basic idea we want to arrive at:


template <size_t SIZE>
struct SomeStructure
{
  int     m_someMember[SIZE];
  float  m_someOtherMember[SIZE];
  // you could put anything in here, just each one needs to have "[SIZE]" added to the end
};



Then we can create arrays either way like this:



// As an array of structures
SomeStructure<1> AoS_array[1000];

// As a structure of arrays
SomeStructure<1000> SoA_array[1];



When we want to access them we do one of these:



int x = AoS_array[n].m_someMember[0];

int x = SoA_array[0].m_someMember[n];


Notice we need to swap the indexes. But we can do that easily with a macro if we know which type of array it is. One simple way I came up with was making a macro for when declaring an array which also sets a const bool which the accessing macro can use.

So here are the macros:


#define DECLARE_STRUCT_BEGIN(StructName) \
  template \
  struct StructName \
  {

#define MEMBER(t, name) \
    t name[SIZE];

#define DECLARE_STRUCT_END(StructName) \
  };

#define MAKE_AOS(array, S, SIZE) \
     S<1> array[SIZE]; \
     const bool array##_isSOA = false;

#define MAKE_SOA(array, S, SIZE) \
     S array[1]; \
     const bool array##_isSOA = true;

#define GET(array, a, n) \
     ((array##_isSOA) ? array[0].a[n] : array[n].a[0])


Probably this could be tweaked and improved, but it gets the idea across as to how it could be possible. I think it is fairly self explanatory.

Usage example:


DECLARE_STRUCT_BEGIN(S)
  MEMBER(int, a)
  MEMBER(int, b)
  MEMBER(int, c)
DECLARE_STRUCT_END(S)


MAKE_AOS(array1, S, 100)

MAKE_SOA(array3, S, 100)


void test()
{
  MAKE_AOS(array2, S, 100)
  
  for (int i = 0; i < 100; i++)
  {
    GET(array1, a, i) = i;
    GET(array2, a, i) = i;
    GET(array3, a, i) = i;
  }
}


I think I may be able to improve on this with templates instead of macros to to make the syntax more appealing, but just having something that works is a start and makes it a synch to switch the data structures to and from AoS to SoA to try out different layouts without mammoth amounts of work.

Also referring to my other post about serialisation and de-serialization (http://inspired-code.blogspot.com.au/p/automating-object-serializationdeserial.html) you could work the macros together where I presented a DECLARE_CLASS and DECLARE_MEMBER macro to automagically generate the code for serialising and de-serializing a structure, it shouldn't be too hard to combine the macros to combine the two ideas of declaring a structure/class that generates the serialisation and deserialisation and also is easy to switch from SoA to AoS. That is another improvement direction that this could be taken.


Along the lines of using templates to reduce the use of macros, possibly this is one direction to take it to do the declaring of an SOA or AOS and the consistent accessing in to either without relying on the preprocessor:



template <typename T>
struct ArrayAccessor
{
  explicit ArrayAccessor(std::functionsize_t)> func) : m_getter(func) {}
  inline T& operator[](std::size_t idx) {
    return m_getter(idx);
  }
  std::functionsize_t)> m_getter;
};


template <template<size_t> class t, size_t SIZE>
struct SOA
{
  template <typename T>
  ArrayAccessor get(T (t::*memberPtr)[SIZE]) {
    return ArrayAccessor([&](size_t idx)->T& { return (m_array[0].*memberPtr)[idx]; });
  }
  t m_array[1];
};


template <template<size_t> class t, size_t SIZE>
struct AOS
{
  template <typename T>
  ArrayAccessor get(T (t<1>::*memberPtr)[1]) {
    return ArrayAccessor([&](size_t idx)->T& { return (m_array[idx].*memberPtr)[0]; });
  }
  t<1> m_array[SIZE];

};



Then to declare an array as SOA or AOS you would do:

SOA<S,1000> array4;
AOS<S,1000> array5;

Access can be made to be the same:



#define ARRAY_MEMBER(array, member) \
    array.get(&std::remove_reference0])>::type::member)


Usage:


    ARRAY_MEMBER(array4, a)[i] = i;
    ARRAY_MEMBER(array5, a)[i] = i;


Ideally the templates could work out the type without needing so much typing or macros, but that is left as an exercise for the reader.

No comments:

Post a Comment