
Containers and memory fragmentation

Started by March 20, 2018 08:16 PM
4 comments, last by Codarki 6 years, 5 months ago

Hey all,

For a few days I've been trying to solve some problems with my engine's memory management. Basically, what I have is a custom heap with pre-allocated memory. Every block has a header and so on. I decided to leave it like that (not cache friendly) because my model is that every block will be large, and I will have pool allocators and stack allocators dealing with those blocks internally. So far so good; I figured out how to place my per-scene resources. There is one thing that I really don't know how to do, and that's dealing with containers. What kind of allocation strategy should I use here?

If I use a vector for my scene objects (entities, cameras, particle emitters ...), I will fragment my custom heap if I do it in the standard way; adding and removing objects will cause a lot of reallocations. If I use a linked list, this will not fragment the memory, but it's not cache friendly. I guess if I reserve a large amount of memory for those vectors it will work, but then I will waste a lot of memory. I was thinking of some sort of mix between a vector and a linked list, where you have a block of memory that can contain, let's say, 40 items, and if you go over that number a new block is created and relocation of the data would not be needed. There would be some cache misses, but it would reduce the fragmentation.

 

How do you guys deal with that? Do you just reserve a lot of data?

 

dgi

The 'list of vectors' approach exists in the standard containers as std::deque :)

Instead of wasting memory by allocating too much, you should be able to just allocate the right (large) amount. If you care about fragmentation, then you also care about limits (running out of memory), so you need to know the maximum number of each type of object. If you know those values, then you can allocate the right amount.


If you are running your own custom heap, you can also try to expand an existing block of memory when asked for a new one. This may require some custom logic (expand instead of new) but could be worth it because you keep your heap as compact as possible. Data fragmentation will always happen one way or another, but you can minimize it. There are techniques like keeping different heap chunks for objects of a certain size, where smaller objects are chunked together and larger ones too.
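The "different chunks for different sizes" idea above is a segregated-storage allocator. Below is a minimal, illustrative sketch (the class and size classes are my own invention, not from the post): small requests are routed to a per-size-class pool with a free list, so small and large objects never interleave in one region.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical size-segregated allocator: requests up to 256 bytes are
// served from fixed-size pools; anything larger falls through to the OS.
class SegregatedPools {
    struct Pool {
        std::size_t block;                  // block size for this size class
        std::vector<std::byte*> freeList;   // recycled blocks
        std::vector<std::byte*> slabs;      // owned backing memory
        std::size_t used = 0, slabCap = 0;

        explicit Pool(std::size_t b) : block(b) {}
        void* alloc() {
            if (!freeList.empty()) {        // reuse a freed block first
                void* p = freeList.back(); freeList.pop_back(); return p;
            }
            if (used == slabCap) {          // current slab full: grab a new one
                slabCap = 1024;
                slabs.push_back(new std::byte[block * slabCap]);
                used = 0;
            }
            return slabs.back() + block * used++;
        }
        void free(void* p) { freeList.push_back(static_cast<std::byte*>(p)); }
        ~Pool() { for (auto* s : slabs) delete[] s; }
    };
    Pool pools[3]{Pool{16}, Pool{64}, Pool{256}};

    Pool* classFor(std::size_t n) {
        for (auto& p : pools) if (n <= p.block) return &p;
        return nullptr;                     // too large for any size class
    }
public:
    void* allocate(std::size_t n) {
        if (Pool* p = classFor(n)) return p->alloc();
        return ::operator new(n);           // large allocation: default heap
    }
    void deallocate(void* p, std::size_t n) {
        if (Pool* c = classFor(n)) c->free(p);
        else ::operator delete(p);
    }
};
```

Because every block in a pool has the same size, freeing and reallocating never leaves unusable holes inside that pool — fragmentation is confined to whole slabs.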

You could also run some kind of garbage collection and memory reordering. Instead of raw pointers, hand out a struct that wraps a pointer (its size is the same as a raw pointer, so your objects won't get larger) and makes it possible to lock access to the pointer while your heap manager reorders your memory to eliminate free gaps, and so on. There are tons of possibilities ;)

For games made on your own this is rarely a concern.  On 32-bit computers you've got 2GB of space and it is rather difficult for one person to create 2GB worth of meaningful data.  On 64-bit executables you've got terabytes of addressable space but you'll likely want to limit to around 16GB or so rather than dump everything to swap space.  You can fill that space with bad programming or bugs where you leak enormous amounts of resources, but barring that it can be rather difficult for one individual to fill it up.

If you're on a team or if you develop for memory-limited platforms (like mobile devices and game consoles) you're more likely to hit space constraints. You need a way to figure out your own memory budgets. You'll absolutely need to cap the worst offenders: graphics textures and audio and UI visuals and 3D models all need size constraints.

There are many books, articles, web pages, and tutorials about controlling fragmentation.  There are some simple patterns: having one area for long-term objects and another for short-term objects, using small-object allocators, and rules for when and where memory can be allocated and must be released to prevent holes. But these all come with nuance and a long list of pros, cons, and trade-offs that must be made.  Just like address space, you're unlikely to brush against those limits as a single developer or small group, because fragmentation doesn't become an issue until the program is consuming a significant portion of the available address space.

And rather than writing your own, if you decide to use pool allocators and other patterns, I'd recommend searching for well-written, documented, community-supported libraries.  There are several to choose from, each with its own properties.

 

I don't know which platform, so I'll talk about Win32.

On a larger scale, a contiguous memory block in virtual memory might not be contiguous in physical memory. The OS might reorder and defragment the memory pages.

Reallocations cause fragmentation. I have personally used a custom static_vector class, which has a dynamic size and a static capacity: a hybrid of std::vector and std::array, where there are no allocations when you know the maximum size.

I did some experiments years ago to optimize memory management for multithreading. The goal was to have thread-specific memory management without locking, and then fall back to locked shared blocks.
My approach was to create a few power-of-two-sized pages: 8 KB, 64 KB, 512 KB, and dynamically sized. Pages were kept in lists of full, used, and free, and the memory manager would search the correct pages. I made many custom statically sized containers to manage the block reservations: e.g. a memory_block_64 class that is 64 bits in size (one bit per index), to indicate which 4-byte blocks were allocated. A lot of these would fit into the 32 KB L1 cache at once. They were the structure the memory manager used to find the exact location of an allocation.

I benchmarked it against standard new/delete. The end result was a major boost in allocations with some std containers, but deallocations were slower than std. My conclusion was that the effort to beat the standard new/delete in performance was too high, more so for a team project. It would introduce very hard-to-find bugs, even for a small program. And for a big project, I dread the thought of tracking a bug from a deallocation back to its origin.

I don't want to discourage you, because I too find it fun to optimize for cache lines and allocations, but just a warning: it might not be productive in the grand scheme of things.

This topic is closed to new replies.
