🎉 Celebrating 25 Years of GameDev.net! 🎉

Not many can claim 25 years on the Internet! Join us in celebrating this milestone. Learn more about our history, and thank you for being a part of our community!

Best Small String Optimization method, cross platform and safe

Started by
6 comments, last by JohnnyCode 3 years ago

Hi everybody!
I'm looking more closely to the SSO.

A lot of method is possible to achieve SSO:
- Additional buffer of size N next to the pointer and if-statement the size to know SSO or not SSO.
- Union to have SSO and not SSO, overlapping the memory and checking the bits to know SSO or not SSO.

There is surely more method but this is the two solid one I know.
The first one is safe and cross platform by default but the second one you must be careful with the endian and that use the undefined but used since years behavior of union.

Any experience on it?
Thanks!

Advertisement

Wouldn't you need to specify the optimization goal?

I mean, optimizing on storage gives hugely different results than optimizing on speed of finding such a string. Optimizing on creating such strings gives again a different answer, as far as I can see.

My memory tracker showed that String is where the allocation comes the most and most of them are small string or empty string. That counted around 10 000 allocation to init the system + loading the 3D test scene containing node names, mesh names and so on. SSO would reduce a lot this allocations.

class String {

    struct AllocInfo {
        uint32_t length;
        uint32_t physical;
        char * buffer;
    };
    struct SSOInfo {
        uint32_t length;
        char buffer[12];
    };

    union {
      AllocInfo ai;
      SSOInfo sso;
    } u;

    inline bool is_sso() const {
        return u.sso.length < sizeof(u.sso.buffer);
    }

public:

    String(char const *init) {
        u.sso.length = (uint32_t)strlen(init);
        if (is_sso()) {
            memcpy(u.sso.buffer, init, u.sso.length+1); // zero
        } else {
            uint32_t ru = (u.sso.length + 1 + 128);
            ru &= ~127;
            u.ai.physical = ru;
            u.ai.buffer = new char[ru];
            memcpy(u.ai.buffer, init, u.sso.length+1); // zero
        }
        assert(&u.sso.length == &u.ai.length); // guaranteed
    }

    ~String() {
        if (!is_sso()) {
            delete[] u.ai.buffer;
        }
    }

    char const *c_str() const {
        if (is_sso()) {
            return u.sso.buffer;
        }
        return u.ai.buffer;
    }
    
    size_t size() const {
        return u.sso.length;
    }
};

Copy constructor and assignment operator and such are left as an exercise for the reader.

If you have lots of strings that are 15-20 characters long, you could extend the size of the SSO buffer if you want. At some point, you can fit a second pointer into the allocated string store, too, although most of the time you don't need more than one.

Note that I assume you will never have strings whose length are greater than 32 bits, even on a 64-bit system. That's generally a fairly safe assumption :-) If you really, REALLY, care, you could verify the length of the input and throw runtime_error() if somehow someone pasted the entire library of congress into your file name picker.

enum Bool { True, False, FileNotFound };

If you are using C++ std::string probably already has SSO. Make sure this is really your problem. If you are passing stings around by value you will make a copy every time. Pass by reference where appropriate. I typically dislike std::sting for my usage since if you don't need SSO, you end up wasting a lot of memory. I mostly prefer COW strings which also alleviates the copy problem if you are lazy about passing by reference. Finally if stings are really a killer for you, you can also make a specialized heap that's optimized for them.

Edit: I should also add make sure you are using a string builder where appropriate. Concatenating strings with operators is another source performance loss. The one good thing about old C style strings is that it was obvious what was going on, so you never inadvertently wrote inefficient code. That's easy in C++, but if you understand how things work you can still make the right decisions.

Gnollrunner said:
If you are using C++ std::string probably already has SSO. Make sure this is really your problem. If you are passing stings around by value you will make a copy every time. Pass by reference where appropriate. I typically dislike std::sting for my usage since if you don't need SSO, you end up wasting a lot of memory. I mostly prefer COW strings which also alleviates the copy problem if you are lazy about passing by reference. Finally if stings are really a killer for you, you can also make a specialized heap that's optimized for them.

See, I've been wondering the same thing. There are many places where IMHO SSO is actually a pessimisation in modern code. Consider this example:

class Foo
{
	Foo(std::string&& strName) noexcept :
		strName(std::move(strName)) {}
private:
	std::string strName;
};

using FooStorage = std::unordered_map<std::string, Foo>;

FooStorage mFoos;
std::wstring strName = /*...*/;

mFoos.try_emplace(strName, std::move(strName));

I have many cases where an objects primary storage is in a map by its own name. Now, its not really optimal having a second std::wstring in the map, so in c++20 we could do:

using FooStorage = std::unordered_map<std::string_view, Foo>;

FooStorage mFoos;
std::wstring strName = /*...*/;

mFoos.try_emplace(strName, std::move(strName));

which IMHO is vastly superior in this case. However, since std::string uses SSO, we really cannot do this, since the data-storage of strName for a small string changed while the string is being passed into “Foo”. For cases where the element is stored by std::unique_ptr or similar, we can work around this by using the actual member-variable to construct the key - but if the element is stored directly in the map, there is not way to achieve this.

So its good to see that I'm not the only one who is at least a bit wary of SSO. (Hopefully this reply is still on-topic enough; otherwise I might make a separate thread to discuss this topic).

It depends as for optimizatoin how you proccess your objects, do you serach by name, or use pooled object handling. I gues more common is second one as there is a lot of situation when objects are pooled and processed. Consider mainly that and remain cache friendly.

This topic is closed to new replies.

Advertisement