
MS C++ annoying data alignment.

Started by August 01, 2018 08:19 PM
13 comments, last by Gnollrunner 6 years, 1 month ago

I've noticed some rather annoying data alignment behavior with Microsoft C++ when compiling for x64. Perhaps it's in the C++ standard, but I don't see the logic behind it. Here's the situation.

If I have a class, call it A, with a vtable pointer and a 4-byte integer, then sizeof(A) returns 16, which is what I would expect because of 8-byte alignment. If I add another 4-byte integer to class A, sizeof(A) still returns 16, which is again what I would expect. However, if instead of adding the second integer to A I create a new class B that inherits from A and add the second integer to B, sizeof(B) returns 24. This I don't get. I understand the initial 8-byte alignment, but since you can't possibly have a B without an A, I don't see why the start of B's own members also has to be aligned on 8 bytes. It's just wasted space, and it means that subclassing can incur a space penalty for no apparent reason. Does anyone have an explanation for this?

I also tested this with a non-virtual class that has a double in place of the vtable pointer, just to see if the vtable was causing something odd to happen. It exhibited the exact same behavior, so it seems to be some standard data alignment thing.
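The non-virtual variant described above can be sketched as follows. This is an illustration only: the names PlainBase, PlainDerived and print_plain_sizes are made up, and the sizes in the comments assume a typical x64 target where double is 8-byte aligned. Because PlainBase is a plain aggregate, GCC and Clang would also be expected to report 24 for the derived struct in this particular case, not just MSVC.

```cpp
#include <cstdio>

// Hypothetical names. No virtual functions, so no vtable pointer is involved.
struct PlainBase {
    double d;  // 8 bytes, gives the whole struct 8-byte alignment
    int a;     // 4 bytes, typically followed by 4 bytes of tail padding
};

struct PlainDerived : PlainBase {
    int b;     // expected to start after the full sizeof(PlainBase)
};

// Prints both sizes so the result can be compared across compilers.
void print_plain_sizes() {
    std::printf("sizeof(PlainBase)    = %zu\n", sizeof(PlainBase));
    std::printf("sizeof(PlainDerived) = %zu\n", sizeof(PlainDerived));
}
```

Calling print_plain_sizes() from main prints the two sizes for whichever compiler and target you build with.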

 


class Base 
{
public:
	int a;
	virtual ~Base() {}
};

class Derived : public Base 
{
public:
	int b;
};
int main() { return 0; }

For the above code, running the MSVC 2017 compiler with its class layout reporting option gives the following output:


cl test.cpp /Zp8 /c /d1reportSingleClassLayoutBase
cl test.cpp /Zp8 /c /d1reportSingleClassLayoutDerived

class Base      size(16):
        +---
 0      | {vfptr}
 8      | a
        | <alignment member> (size=4)
        +---

class Derived   size(24):
        +---
 0      | +--- (base class Base)
 0      | | {vfptr}
 8      | | a
        | | <alignment member> (size=4)
        | +---
16      | b
        | <alignment member> (size=4)
        +---

Looking at the layout, it should be clear that the base class is inserted into the derived class at its full padded size, without any attempt to reuse the base's trailing padding.

57 minutes ago, Nishant Singh said:

 

Looking at the layout, it should be clear that the base class is inserted into the derived class at its full padded size, without any attempt to reuse the base's trailing padding.

Here's the point I was trying to make


#include "stdafx.h"
#include <stdio.h>

class Base 
{
public:
	int a;
	virtual ~Base() {}
};

class Derived : public Base 
{
public:
	int b;
};

class Base_And_Derived
{
public:
	int a;
	virtual ~Base_And_Derived() {}
	int b;
};


int main()
{
	printf("sizeof(Base)             = %d\n", (int) sizeof(Base));
	printf("sizeof(Derived)          = %d\n", (int) sizeof(Derived));
	printf("sizeof(Base_And_Derived) = %d\n", (int) sizeof(Base_And_Derived));

	return 0;
}

This outputs:


sizeof(Base)             = 16
sizeof(Derived)          = 24
sizeof(Base_And_Derived) = 16
Press any key to continue . . .

I don't see why the compiler can't align the data the same way it would if you put both members in the base class.

 

Interesting question. For A you can see why it wants to be 8-byte aligned, since you could create a standalone instance of A and expect it to be aligned. For B, I think you are asking why it cannot encroach on the padding in A.

Consider that A might do something involving the sizeof operator. For instance, it is common (though not recommended) to do things like memset(this, 0, sizeof(A)), adjusted to skip the vtable pointer. If B starts encroaching on A, you violate this principle of independence between the two. This is my best guess at the reason.

You also don't know where a (or any other member variable) sits in the structure; it may be at the end, with the padding before it. Bear in mind also that you can cast a B to an A, and everything still has to work (TM).
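The independence concern can be illustrated with a small sketch. Everything here is hypothetical (the types, the values, the function name), and it avoids virtual functions so that a byte-wise copy is at least a meaningful comparison. Member-wise assignment of the base part only touches the base's own members, whereas a byte copy of sizeof(Base) bytes would also write the padding, which matters on any ABI that stores derived members there.

```cpp
#include <cassert>

struct Base {
    Base() : d(0.0), a(0) {}  // user-provided constructor (illustrative)
    double d;
    int a;                     // typically followed by 4 bytes of tail padding
};

struct Derived : Base {
    int b = 0;                 // may or may not share Base's tail padding
};

// Overwrites only the Base part of an object and returns Derived::b afterwards.
int b_after_base_assignment() {
    Derived obj;
    obj.b = 42;

    Base other;
    other.a = 7;

    // Member-wise copy: writes d and a only, never the padding bytes.
    static_cast<Base&>(obj) = other;

    // By contrast, a memcpy of sizeof(Base) bytes into the Base subobject
    // would also write the padding, and could clobber b on an ABI that
    // places b inside that padding.
    assert(obj.a == 7);
    return obj.b;
}
```

With member-wise assignment the function returns 42 regardless of where the compiler places b.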

22 minutes ago, lawnjelly said:

 For instance it is common (though not recommended) to do things like memset (this, 0, sizeof (A)) .. although allowing for the vtable pointer size.

This would wipe out half a class, and if it had a vtable, that would be gone too. I would say this is highly illegal to begin with. The only place where this should be done is when 'this' is actually an A and not some child of A, but then who knows. Maybe they are trying to guard against this kind of abuse. The result, however, is to make objects bigger than they should be, which seems like a bad trade-off.

Quote

Bear in mind also that you can cast B to be an A, and everything still has to work (TM).

Casting B to A is at most a pointer shift, and if you aren't using multiple inheritance, it's a no-op as far as the machine is concerned. I still don't see a good reason why the compiler can't align stuff efficiently.
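The pointer-shift point can be checked with a short sketch. The types Other and C and the function names are hypothetical, added only to show the multiple-inheritance case. The C++ standard does not promise these addresses, but both the Microsoft and the GCC/Clang layouts are expected to behave this way for classes shaped like these.

```cpp
#include <cassert>

struct A { virtual ~A() {} int a = 0; };
struct B : A { int b = 0; };

struct Other { virtual ~Other() {} int o = 0; };  // hypothetical second base
struct C : A, Other { int c = 0; };

// Single inheritance: the A subobject starts at the same address as the B.
bool single_inheritance_cast_is_noop() {
    B obj;
    A* as_base = &obj;  // implicit upcast
    return static_cast<void*>(as_base) == static_cast<void*>(&obj);
}

// Multiple inheritance: the second base sits at a non-zero offset,
// so the upcast has to adjust the pointer.
bool second_base_cast_shifts_pointer() {
    C obj;
    Other* as_other = &obj;
    return static_cast<void*>(as_other) != static_cast<void*>(&obj);
}

// Runs both checks; intended to be called from main.
void check_casts() {
    assert(single_inheritance_cast_is_noop());
    assert(second_base_cast_shifts_pointer());
}
```

Neither cast depends on whether B's members share A's padding, since the A subobject starts at the same place either way.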

2 minutes ago, Gnollrunner said:

This would wipe out half a class, and if it had a vtable that would be gone too. I would say this is highly illegal to begin with.  The only place where this should be done is when 'this' is actually A and not some child of A, but then who knows. Maybe they are trying to guard against this kind of abuse.  The result is to make objects bigger than they should be however, which seems like a bad trade off.

I have a feeling the underlying assumption is that B should be independent of A in this scenario, because there are a number of special cases in which it can cause problems. If your goal is to tightly pack a and b in a structure, there are usually a number of ways of doing it without breaking the rules. Here's an example:

#pragma pack is your friend, you can get 1 byte alignment.


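#pragma pack (1)   // pack the following structs with 1-byte alignment
struct a
{
	int val;
	void DoSomething() { val++; }
};

struct b
{
	int val;
	void DoSomething() { val--; }
};

#pragma pack ()    // restore the default packing

// A and B are built by composition rather than inheritance
class A
{
public:
	a m_a;
};

class B
{
public:
	a m_a;
	b m_b;
};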

In this example you can have functionality in a and b without deriving one class from the other.

This appears to be an issue with the Visual C++ compiler (or rather the Windows C++ ABI). Test program:


#include <iostream>


struct A {
  virtual ~A() {}
  int a;
};

struct B : A {
  int b;
};


int main() {
  std::cout << "sizeof(A) = " << sizeof(A) << "\n" << std::flush;
  std::cout << "sizeof(B) = " << sizeof(B) << "\n" << std::flush;
}

Result on g++ (Ubuntu 7.3.0-16ubuntu3) 7.3.0:


sizeof(A) = 16
sizeof(B) = 16

Same result for clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final).
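The difference between the two results can be captured in a compile-time check. This is a sketch under stated assumptions: a 64-bit target, and the macro _MSC_VER being taken to mean the Microsoft ABI while everything else is treated like the g++/clang result above. The constant kExpectedSizeofB and the function print_abi_sizes are made-up names. The usual explanation is that the Itanium C++ ABI used by GCC and Clang lets a derived class place members in the tail padding of a base like this one, while the Microsoft layout does not.

```cpp
#include <cstddef>
#include <cstdio>

struct A {
    virtual ~A() {}
    int a;
};

struct B : A {
    int b;
};

// Expected sizeof(B), taken from the numbers reported in this thread.
// Assumption: _MSC_VER implies the Microsoft ABI on a 64-bit target.
#if defined(_MSC_VER)
constexpr std::size_t kExpectedSizeofB = 24;
#else
constexpr std::size_t kExpectedSizeofB = 16;
#endif

// Prints the measured sizes next to the expectation.
void print_abi_sizes() {
    std::printf("sizeof(A) = %zu\n", sizeof(A));
    std::printf("sizeof(B) = %zu (expected %zu)\n", sizeof(B), kExpectedSizeofB);
}
```

Calling print_abi_sizes() from main shows whether a given toolchain matches the numbers posted here.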

 

2 minutes ago, lawnjelly said:

I have a feeling the underlying assumption is that B should be independent of A in this scenario, because there are a number of special cases in which it can cause problems. If your goal is to tightly pack a and b in a structure, there are usually a number of ways of doing it without breaking the rules, here's an example:

#pragma pack is your friend, you can get 1 byte alignment.




First off, I don't want 1-byte alignment. That can kill performance on some processors. I think x86 allows it, but it used to incur a performance hit on fetches and stores. I'm not sure if that's still true. Other processors would throw a bus error and crash. You can always find corner cases where not padding between a base and a derived class might cause a problem in some odd code. I'm just not sure any of those cases are actually guaranteed to work by the C++ standard to begin with. I would personally rather have it pack data by its "natural" alignment.

Also, your example doesn't let you subclass and override functions, so it's not really the same thing. In any case, I restructured the code and put the two members together (at the cost of a bit of versatility), and memory usage in my test runs went from 47 MB to 38 MB. That's nothing to scoff at.
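The restructuring mentioned above was not posted, so the following is only a guess at its general shape. All names (Node, DerivedNode, derived_slot) are hypothetical. The idea is that the base class declares the second integer itself, so the derived class adds no data members and cannot grow, while virtual functions can still be overridden.

```cpp
#include <cstdio>

// Hypothetical layout: the base reserves the slot the derived class needs.
struct Node {
    virtual ~Node() {}
    virtual const char* name() const { return "Node"; }
    int a = 0;
    int derived_slot = 0;  // unused by Node itself, fills the would-be padding
};

struct DerivedNode : Node {
    // Overriding still works, unlike with the composition approach.
    const char* name() const override { return "DerivedNode"; }

    // The derived class uses the reserved slot as its own member.
    int& b() { return derived_slot; }
};

// Prints both sizes; they should be identical because DerivedNode adds no data.
void print_node_sizes() {
    std::printf("sizeof(Node)        = %zu\n", sizeof(Node));
    std::printf("sizeof(DerivedNode) = %zu\n", sizeof(DerivedNode));
}
```

The cost is the loss of versatility mentioned above: every Node pays for the slot, and the base has to know what its subclasses will need.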

10 minutes ago, a light breeze said:

This appears to be an issue with the Visual C++ compiler (or rather the Windows C++ ABI).




 

Thanks a lot! At least this gives me some confirmation that it's probably not a C++ language standard issue.

42 minutes ago, a light breeze said:

This appears to be an issue with the Visual C++ compiler (or rather the Windows C++ ABI).




 

Now that is interesting, I stand corrected. :)

I just tried it: with a non-virtual class, the sizes on my 64-bit Linux GCC are 4 bytes for A and 8 for B, whereas with the virtual class it is 16 in both cases. Presumably this depends on the default alignment / optimization switches. If the vtable pointer is 8 bytes and a takes 4 bytes, then perhaps it is aligning both A and B to a 16-byte boundary. Thus if you cast B to A and then use the sizeof operator, the results are a tad confusing. However, the independence requirement may still be a thing (B not encroaching on A).

I can't test this more as I have to go away but interesting question! :)
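The 16-byte-boundary guess can be tested directly with alignof. This is a minimal sketch with a made-up function name, reusing the A and B from the test program. On typical 64-bit targets the vtable pointer is the most strictly aligned member, so the expectation is an alignment of 8 for both classes, with sizeof(A) rounded up to a multiple of that rather than aligned to 16.

```cpp
#include <cstdio>

struct A {
    virtual ~A() {}
    int a;
};

struct B : A {
    int b;
};

// Prints size and alignment for both classes.
void print_alignment() {
    std::printf("A: sizeof = %zu, alignof = %zu\n", sizeof(A), alignof(A));
    std::printf("B: sizeof = %zu, alignof = %zu\n", sizeof(B), alignof(B));
}
```

If alignof reports 8 for both, then a size of 16 for B on GCC means b is sitting in what would otherwise be A's trailing padding, not that either class is 16-byte aligned.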

This topic is closed to new replies.
