substruct C

What is substruct C

Substruct C adds the base type inclusion syntax known from C++ to plain old C. This does not harm the syntax of older C: the ':' colon syntax fits in nicely. Field members from the (single!) base type are accessible as if written directly in the declared struct itself.

The real benefit stems from another fact: the C compiler knows about the inheritance chain and allows upcasts implicitly, i.e. without an explicit cast and without any warning. This provides static type checking for object-oriented C libraries such as GTK.

Well, the current implementation scheme of GTK does not use the base type inclusion syntax. Such libraries have had to work around the fact that base type inclusion has not been available in C so far; instead they put an instance of the base type as the first member. The substruct C patch knows about this and accepts it too.

So in the end, you will benefit from static type checking in plain old C with very large libraries. Your first GTK program will take less writing and contain fewer errors. Moreover, the declarations are much more compatible with C++ declarations, so there is a good chance that some basic struct types can be shared between C and C++ (as long as C++ does not put virtual members in there, or even RTTI (horror)).

This is also another step that should support me later in an attempt to create a C with Classes implementation.
(The C++ file extension '*.cc' comes from the name C with Classes, as the language was called in its early stages.)

The inclusion-syntax of substruct C

The inclusion follows the same basic syntax as in C++, but without any access-scope modifier in front, and only one struct identifier can be included, so this is single inheritance only.

So you can write

   struct B {
      int a;
      int b;
   };

   struct C : B {
      int c;
   };

and that will provide a struct definition as if you had written this:

   struct C {
      int a;
      int b;
      int c;
   };

Hence you can access the member `a` without needing to know in which base type it was introduced. Just write `c->a` when you need it.

Note: C++ will read the above code as expected. All member fields included from the base struct are treated as public members, so the result is the same as with the C-language base type inclusion syntax here. And remember, C has no access scope anywhere.

The substruct pointer aliasing

In order to benefit from the inclusion syntax, the compiler must be able to convert pointers from `C*` to `B*` without warning. This is simply an inheritance-type upcast where the actual pointer value stays untouched, just as it would with an explicit upcast.

   struct B* p; struct C c;
   p = &c;

Now you can use all functions written for `struct B` on instances of type `struct C`, since `struct B` is its base type. Again, there will be no warnings if you pass a `struct C*` in place of a `struct B*` argument to a function.

This also works with the current C-based object-oriented libraries such as GTK that do not use the base type inclusion syntax. The compiler will do the type matching against the first field (if that is a struct instance).

The distributed substruct C patch includes the submorph C checking as an extension. With some additional compiler switches, it will then tell whether it may be safe to cast between two structs by comparing their field memory layout. This is quite often very helpful when working against huge object libraries such as GTK.

GTK-example


Note: all _new functions in GTK return `GtkWidget*`. This is obviously far from object-oriented programming, as even a novice can tell, and it is the reason an explicit cast is needed at the _new function.

The current patch provides a matching function that does all the substruct matching for struct pointers, but it cannot match function types all that well. Does anybody want to do the extension?

Installation

The substruct feature is distributed as a patch against gcc-2.95. You can get the current substruct.patch right here. Go to the directory where you unpacked the gcc-2.95 tree, run `cat substruct.patch | patch`, and `make` a new compiler. The new compiler does not behave differently unless you enable C++ extensions.

News: gcc-2.95.1 should be fine for this patch. The files used in my patch aren't touched by the update diff from gcc-2.95 to gcc-2.95.1.

Invocation

Just add `-+` (i.e. enable-cplusplus-extensions) or any later gcc standard (like `-std=gnu9x`) that has the C++ extensions as a default. This will enable the substruct syntax and pointer aliasing (and it will also enable '//' comments).

You can say `-Wcast-qual` to switch on additional warnings that show you when the substruct match was invoked and what it yielded. The output gives you good hints about which manual casts you would have to apply to your code to get rid of warnings from older C compilers.

The patch carries additional messages that are enabled with `-W` (i.e. extra warnings), which are mainly for debugging.

Of course, you will see warnings about the use of this unusual construct, so that you don't accidentally stumble over it. You can disable them with `-fno-long-long` (originally meant to suppress warnings on the use of the `long long` type).

The patch also installs two new compilation standards into gcc:

Latest news: an explicit `-fsubmorph` option now exists too.

On the choice of no multiple inheritance

Multiple inheritance in this context would be a non-starter. We could include all the members of secondary base types as primary names in the new structure, but what for? Any function you have written for one of the secondary base types is useless, since you cannot get the address where the secondary base's members start. You would not even be able to write wrapper functions; it is just the same problem. It is wiser by far to put secondary types in as named instances (has-a) and write wrapper functions.

So any upcast possible in this single-inheritance type tree amounts to a `void*` cast, i.e. the actual address of the struct does not change. This is very different in C++, where pointers may be magically moved to point to inherited secondary base types. Hence, in C++,
    struct C : A, B { int c; };
    void* vp; B* bp; C x;
    vp = &x; bp = &x; printf ("%p\n", (void*) bp);
    bp = (B*) vp; printf ("%p\n", (void*) bp);
will actually yield different numbers (as long as `A` is not empty), ... imagine that.

Discussion

I would be pleased to open a discussion on putting the inclusion syntax for struct definitions into the official C language standard. If people could be sure about the availability of inclusions and their simple upcast relations, I suspect it would be widely used in big projects that define very many struct types. Including GTK, of course.

I like this patch foremost because it does not pollute my GTK source code with those macro casts, while at the same time it provides me with static type checking at compile time, as opposed to the dynamic type checking at run time in GTK (enabled by those very macro casts). Static type checking is badly needed in such huge projects.

Future

Maybe I will get the hooks, so that I can make up my mind and add that function-type matching. That would be quite nice.

Otherwise, I'll go on to create a new front-end subdir in gcc to hold a "C/C" front end. Just look into your old books on "C++ 1.x", and you will know what it is heading for. Especially ... no name mangling! Mangled names make C++ code almost impossible to use from plain C. IMHO contemporary C++ is so different that it is just a different language that happens to be able to read plain C header files (so why are they still calling their headers `*.h`?).

Noteworthy differences between C with Classes and contemporary C++:
- copy operations are always bitwise, not memberwise.
- you must use the `overload` keyword to overload. In my C/C implementation those will implicitly be inline functions, so they do not generate (name-mangled) procedures.

Comments

Any comments welcome. (Sorry, no guestbook here.)


(C) Oct'99 Guido Draheim @gmx.de substruct C last change 04.Oct'99