C++ Coding Standards

Here are a bunch of conventions that I normally follow for C++ coding. In general, a project's coding standards should be subject to debate, but must not be violated tacitly.

I have tried to align these conventions with what I see as the prevailing conventions used in books and sources on the Internet (e.g. Boost). However, even with existing guidance, it is challenging to define a single self-consistent set of conventions that precludes no useful programming objective.

Executive Summary

Terminology

Definition: Prefer something means that you should use it unless some requirement definitively compels you otherwise.

Definition: A top-level entity is bound to a symbol at the top of some namespace (but not necessarily the root namespace), as opposed to being a member of a function or class, etc. A top-level datum is also called a global. A top-level function is also called a free function.

Text within square brackets is intended to motivate the stated conventions.

Files

C++ translation unit files have the suffix ".cpp". C++ header files have the suffix ".hpp". The base name should match the affected class name (if any) exactly. Header files that are required to compile under either C or C++ have the suffix ".h". The name of the directory in which the file is located should match the affected namespace name, if any. Namespace nesting, if any, should be reflected as directory nesting.

Normally, each file should pertain to only one top-level class. [Having a file for every class leads to a large number of files -- perhaps 100 or so for a medium sized project -- but it makes the definition of a referenced class easy to find. It also tends to make a vast improvement in parsing time, because the number of unnecessarily included headers can otherwise be enormous. The alternative is to group classes that are expected to be used together into a component, but that's usually counter-productive because well-written classes are likely to be reused in unpredictable ways.] If you have a set of tightly coupled classes that belong in the same file, then you probably ought to select one as the "principal" class, and make the others public member classes of that class.

Neither the declaration nor the definition of a given entity should exist in more than one source file, even if it is trivial. [That leads to inconsistencies that are easily provoked, and often difficult to diagnose.] Use a common header file instead.

A type's header file should define the type (but not necessarily all of its member functions) completely. Trivial member functions (that is, candidates for inlining) are defined in the header file (unless they access a PImpl, in which case they cannot be inlining candidates anyway). If all of a class's member functions are defined in its header file, then it will not have a corresponding ".cpp" file unless it has static data members to be defined therein.

The header guard macro should begin with an uppercase rendition of the affected namespace name (if any) with namespace components joined with underscores, followed by an underscore and then a lowercase "x", followed by an uppercase rendition of the filename, with capitalized words joined with an underscore, underscores replaced with "_u", and any special characters replaced with an underscore, and finally suffixed with an underscore. (Macros ending in "_H_" and "_HPP_" should not be used for any other purpose.) For example:

foo/MyClass.hpp:

	#ifndef FOO_xMY_CLASS_HPP_
	#define FOO_xMY_CLASS_HPP_

	#include "foo/MyBase.hpp"

	namespace foo {
		class MyClass : public MyBase {};
	} // namespace foo

	#endif // FOO_xMY_CLASS_HPP_
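
The naming rule above is mechanical enough to express as code. What follows is only a sketch of one reading of it: guard_name() is a hypothetical helper (it is not part of any library), it takes the namespace as a '/'-separated path, and it assumes that every capital letter in the filename starts a new word.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: builds a header guard macro name from a namespace
// path (components separated by '/') and a header filename.
inline std::string guard_name(
  const std::string &ns_path, const std::string &filename
) {
	assert(!filename.empty());
	std::string result;
	// Namespace components: uppercase, joined with underscores.
	for(std::string::size_type i=0; i<ns_path.size(); ++i) {
		const unsigned char c=static_cast<unsigned char>(ns_path[i]);
		if(c=='/') {
			result+='_';
		} else {
			result+=static_cast<char>(std::toupper(c));
		}
	}
	if(!ns_path.empty()) {
		result+="_x"; // Lowercase "x" separates namespace and filename
	}
	for(std::string::size_type i=0; i<filename.size(); ++i) {
		const unsigned char c=static_cast<unsigned char>(filename[i]);
		if(c=='_') {
			result+="_u"; // Literal underscore in the filename
		} else if(std::isupper(c)) {
			if(i>0) {
				result+='_'; // Assumed word boundary
			}
			result+=static_cast<char>(c);
		} else if(std::isalnum(c)) {
			result+=static_cast<char>(std::toupper(c));
		} else {
			result+='_'; // Special character, e.g. the '.'
		}
	}
	result+='_';
	return result;
}
```

Under these assumptions, guard_name("foo", "MyClass.hpp") yields "FOO_xMY_CLASS_HPP_", matching the example above, and a header outside any namespace gets no "x" prefix.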

A client translation unit should #include any header that it uses directly, rather than relying on another #include'ed header that happens to #include the necessary header, because that can change. [Header guards make the parsing cost of redundant headers negligible, except in pathological cases, which you can handle by putting a redundant guard around the #include statement.] In particular, to implicitly verify that a header #include's all of the other necessary headers, the first thing that the associated ".cpp" file (if any) should do is to #include its own header.

A type's header (and its forward header, if any) should be #include'ed at the top-level of the translation unit, rather than inside a block, unless it is a C header #include'ed inside a top-level extern "C" block, or a legacy C++ header #include'ed inside a top-level namespace block.

Globals and free functions that are not intended to constitute part of a particular class's interface are declared in "namespace-name/globals.hpp". The file "namespace-name/globals.cpp" defines those globals and those free functions that are not inline.

As a rough guideline, none of your source files should exceed 1000 lines, excluding comments and blank lines. You might need to violate this if you employ highly sophisticated algorithms, but keep in mind that the quality of your program's design is usually much more important than the sophistication of its algorithms.

Template specializations may be defined in the template's header, in the header of the specialization type, or in their own header, depending on what makes sense according to the interface principle. If a specialization lives in its own header, then the filename should be a concatenation of the template name followed by the specialization type name(s), joined with underscore(s).

The default for template code generation is to duplicate the object code for every translation unit that references the template class. It is often a good idea to use a different method for compilation efficiency reasons. However, this can lead to difficulty in reusing code from one project to another because there is no single such method that is universally accepted. (However, it is hoped that a dominant method emerges over the next few years. The "repository" method seems to be the most desirable, but most contemporary compilers don't implement it well enough.)

Header Dependencies

A client should not use the full definition of a type when a declaration will do (in particular, when the type is used only in pointer, reference, extern and function declarations), because that leads to wasted parsing effort and can create circular header dependencies. However, a client should not forward declare any class that might turn into a typedef without the client's knowledge. Instead, the client should #include the type's forward header, which lives in a file with a suffix "_fwd.hpp" if it is substantial, or more commonly in "namespace-name/types.hpp" if it is expected to remain trivial. [Having a separate forward header file for every single class is not a good idea, because most of them would have 3 lines of header guard and 1 line of declaration.]

Also, "namespace-name/types.hpp" may include typedefs, forward declarations of structs and unions, and trivial enum, struct and union definitions (unless they are intended to constitute part of a particular class's interface). This approach relies on "namespace-name/types.hpp" being inexpensive to parse, so anything that is or might become nontrivial should live in its own header instead. For example, a top-level enum with 50 values, or anything that uses the "types.hpp" of another namespace, should have its own header file. Trivial global constants may also need to live in "types.hpp" in order for its type declarations to compile, in which case "globals.hpp" should #include "types.hpp".

Because "globals.hpp" most likely has a lot of clients, it also is required to be inexpensive to parse. If any of its declarations require nontrivial type definitions, then you should consider whether such a declaration constitutes part of another class's interface, and therefore belongs in that class's header. However, this makes the declaration difficult for a programmer to locate. Furthermore, when functionality is added to a class, whether as a member or as a global, clients that do not make use of that functionality may be burdened with unnecessary dependencies. To deal with these problems, such declarations should be made public static members of an uninstantiable struct. (This is actually just a degenerate case of the SINGLETON pattern.)
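
As an illustration of the last point, here is a minimal sketch of a struct that exists only to group declarations. The names foo::Util, clamp() and repeat() are hypothetical, and the private, undefined constructor is just one way to prevent instantiation.

```cpp
#include <cassert>
#include <string>

namespace foo {

// Hypothetical holder for declarations that would otherwise be globals.
struct Util {
	// Requires low<=high.
	static int clamp(int value, int low, int high) {
		assert(low<=high);
		if(value<low) {
			return low;
		}
		return value>high ? high : value;
	}
	static std::string repeat(const std::string &s, unsigned count) {
		std::string result;
		for(unsigned i=0; i<count; ++i) {
			result+=s;
		}
		return result;
	}
 private:
	Util(); // Declared but never defined, so no instance can exist
};

} // namespace foo
```

Clients call foo::Util::clamp(15, 0, 10) and only need to #include the header that defines Util, which can carry whatever nontrivial types these functions require without burdening "globals.hpp".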

Any type that has its own forward header file should also have its own header file. [That way, a client never has to use the forward header when it really wants the complete type.] However, the converse is not necessarily true, for example, because a given type's forward header lives in "types.hpp", or is nonexistent because it is not expected to have clients. The header file, if any, of a typedef or enum normally #include's its own forward header (whether or not it lives in "types.hpp") after any prerequisite types have been completely defined.

Keep in mind that sometimes when you least expect it, trivial types turn into substantial types, typedefs turn into classes or enums and vice versa, and the set of prerequisite types of a typedef changes. Therefore, a client of "types.hpp" should anticipate needing its list of required headers to be updated occasionally to track such changes. If the number of clients is limited, then this usually isn't a big deal. Otherwise, you can avoid this problem by moving things into trivial headers proactively.

Another disadvantage to putting type declarations in "types.hpp" is that when you change them (e.g. add a new forward declaration for a class), your build system will probably think that it needs to recompile most of the affected namespace, as well as parts of other namespaces. However, if you have observed proper physical design practices, then this will take at most a few minutes, which is generally tolerable.

Prior to the first use of a function in a translation unit, all its overloads that might come into play should have been declared. Similarly, prior to the first use of a template, all its relevant specializations (as well as the primary template) should have been declared. In order to accomplish this, function overloads and template specializations for "basic" types (built-in types and types declared as a result of #include'ing "types.hpp" from the same namespace) for the same function or class should be declared in the same file (usually "globals.hpp", or the class header file in the case of a class template), and each remaining overload or specialization for that function or class should be declared in the header of the type to which it pertains. Avoid overloads and specializations that pertain to multiple non-basic types.

Sometimes, in order to define a class A a declaration of a class B is required, and in order to define class B a definition of class A is required. (It bears mentioning that this is often a symptom of poor design.) In that case, inline member functions of class A that dereference a pointer to type B must be defined in A's header but outside of the class definition. For example:

A.hpp:

	#ifndef A_HPP_
	#define A_HPP_

	// Can't include "B.hpp" here, because class A is not defined yet.
	#include "types.hpp" // Presumed to contain "class B;" declaration

	class A {
		B *b_;
	 public:
		void x();
	};

	#include "B.hpp"

	inline void A::x() {
		b_->do_something();
	}

	#endif // A_HPP_

Since class A's header depends on class B, "B.hpp" needs to #include "A.hpp" before its header guard, like this:

	#include "A.hpp"

	#ifndef B_HPP_
	#define B_HPP_

	class B {
		A a_;
	 public:
		void do_something();
	};

	#endif // B_HPP_

In order to prevent the ramifications of small changes from propagating uncontrollably among your header files, you should do this (even though it increases parsing time slightly) whenever B's class definition requires A's class definition, and A's header might depend on B.

Note that the distinction between definition and declaration is important here. A true cyclical definition dependency is impossible to resolve. Unfortunately, the only indication of this error is that the preprocessor hits the nesting limit (which is required by the standard to be finite). You just have to know that when this happens, it means either that you used a definition where a declaration would have sufficed, or that you're trying to do the impossible.

In the previous example, if A::x() weren't inline, then "A.hpp" wouldn't have to load B's header. That's the essence of the PImpl idiom, which can be used to reduce compile-time dependencies, and therefore prevent the compiling effort from becoming O(n^2) in the amount of source code. In practical terms, judicious use of the PImpl idiom in a very large project (e.g. 2500 classes) typically reduces the compile time from a few days to less than an hour, which is a boon to productivity.
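
For readers who haven't met the idiom, here is a minimal single-file sketch. Counter and its members are hypothetical names, and the comments mark which parts would live in the header and which in the ".cpp" file.

```cpp
#include <cassert>

// ---- What would live in "Counter.hpp" ----
class Counter {
	class Impl; // Declaration only: clients never see the definition
	Impl *impl_;
	Counter(const Counter &); // Not defined: copying is disabled
	Counter &operator=(const Counter &);
 public:
	Counter();
	~Counter();
	void increment();
	int value() const;
};

// ---- What would live in "Counter.cpp" ----
class Counter::Impl {
 public:
	int count;
	Impl() : count(0) {}
};

Counter::Counter() : impl_(new Impl) {}
Counter::~Counter() { delete impl_; } // Impl is completely defined here
void Counter::increment() { ++impl_->count; }
int Counter::value() const { return impl_->count; }

// Usage sketch.
inline void counter_demo() {
	Counter c;
	c.increment();
	c.increment();
	assert(c.value()==2);
}
```

Because the header mentions Impl only by name, changes to Impl (or to the headers it needs) force recompilation of "Counter.cpp" alone, not of Counter's clients.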

#include'ing an unnecessary type header should not change the semantics of the translation unit, unless a circular dependency is created as a result. [Otherwise, surprising consequences can result when changes to a translation unit require additional headers.] Furthermore, the order of inclusion should not affect the semantics either. However, for the sake of compilation efficiency, unnecessary headers should not be #include'ed.

Formatting

Tab stops should be 8 characters, and file width should be 80 characters. [These are not really the optimal settings for C++ programming, but they make things like enscript and sdiff happy by default. One way or another, it is very beneficial to have a consistent standard on this when multiple contributors might edit the same files.] Only the first 79 columns should be used, because not all legacy tools and printers handle exactly full lines consistently.

Use only the 95 printable ASCII characters (including space), horizontal tab and newline. Don't use trigraphs unless you're actually dealing with archaic equipment that can't handle ASCII. Don't use a tab within a literal string [because it's indistinguishable from spaces in a printout]; use the "\t" escape instead.

[NOTE: As for the remainder of these formatting rules, it's not actually very important to maintain a consistent standard, unless you publish source code as part of your usual business model. Still, it's useful to have a workable and self-consistent set of baseline formatting rules that you can fall back on, especially during a transition from C to C++.]

The body of a block should be indented by a full tab stop. It is preferred that the open curly-brace be on the same line as the associated "if" or "while", etc., because that way you can fit more code on each page. Non-bracketed (single statement) loop or conditional bodies are discouraged, but should occupy the line following the loop or conditional if they are used. Empty bodies should always appear as "{}" without any whitespace between the brackets. Bracketed bodies may also appear as a single line if they contain only a single short statement.

A blank line should separate function definitions that appear outside of a class definition.

Access specifiers and labels should be indented by 1 space.

Wrapping literal text should be avoided by tabbing out by one or more stops and indenting 1 space. Nesting blocks more than 7 deep is discouraged (use functional decomposition to avoid it), but can be dealt with by tabbing out at least 2 stops and indenting 1 space. Any further indentation should remain 1 space to the right of a tab stop.

Wrapping expressions should be avoided by splitting them into multiple lines with closing parentheses aligned vertically and sub-expressions indented by 2 spaces per level of expression nesting. [Very long expressions tend to be common when you use STL, which is recommended and encouraged.] Similarly, argument and parameter lists should be indented by 2 spaces when they are split, and the closing right parenthesis or angle bracket should be vertically aligned with the first character of the construct that it closes.

The contents of a namespace need not be indented. However, the closing curly bracket that terminates the namespace must be commented as such.

Here's an example illustrating some of these conventions:

	namespace foo {

	bool func() {
		cout <<
	 "This is a really long string that didn't fit on the previous line\n";
		for(int i=0; i<4; ++i) {
			for(int j=0; j<4; ++j) {
				for(int k=0; k<4; ++k) {
					if(
					  the_name_of_a_function(
					    this->data_member
					  )
					) {
						switch(i+j+k) {
						 case 1:
		 // Preferable to put this for() loop in a sub-function
		 for(int l=0; l<4; ++l) {
			 cout << l << endl;
		 }
		 break;
						 default:
							return false;
						}
						// Preferable to use {} here
						if(k)
							return true;
					}
				}
			}
		}
		return false;
	}

	inline void nothing() {}

	} // namespace foo

For pointer or reference declarations, put a space between the referent type and the '*' or '&' (rather than between the '*' or '&' and the identifier). However, complex type specifiers should not contain discretionary spaces except within template argument lists. For example:

	int *a(bool&(*)()), b; // Same as "int *a(bool&(*)()); int b;"
	int& c=b, d=b; // WRONG! Looks like "int &c=b; int &d=b;", but it ain't
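
The pitfall in the second declaration can be checked directly. This small sketch (declarator_demo() is a hypothetical name) shows that only the first declarator is a reference:

```cpp
#include <cassert>

// Demonstrates that '&' binds to the declarator, not to the type.
inline void declarator_demo() {
	int b=1;
	int &c=b, d=b; // c refers to b, but d is a separate int
	b=2;
	assert(c==2); // c tracks the change to b
	assert(d==1); // d kept the value it was initialized with
}
```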

The template keyword and its parameter list may occupy the line(s) preceding, and be vertically aligned with, the template declaration. If the member initialization list of a constructor doesn't fit on the same line as the parameter list, then it should appear on the following line(s) indented by 2 spaces. In that case, the colon preceding the initialization list belongs on the same line as the right parenthesis closing the parameter list, and the left curly bracket opening the constructor body begins a new line. For example, this:

	template<
	  typename T1, typename T2, typename T3
	> class ThreeThings {
		T1 thing1_;
		T2 thing2_;
		T3 thing3_;
	 public:
		ThreeThings(
		  T1 thing1_P, T2 thing2_P, T3 thing3_P
		) : thing1_(thing1_P), thing2_(thing2_P), thing3_(thing3_P) {}
	};

could also be written like this:

	template<typename T1, typename T2, typename T3>
	class ThreeThings {
		T1 thing1_;
		T2 thing2_;
		T3 thing3_;
	 public:
		ThreeThings(T1 thing1_P, T2 thing2_P, T3 thing3_P) :
		  thing1_(thing1_P), thing2_(thing2_P), thing3_(thing3_P)
		{}
	};

If you have a template name with two adjacent >'s, then you must put a space between them to make the tokenizer happy. When that happens, there should be a space after the corresponding '<' for readability. For example:

	Foo< A<int> > // good
	Foo<A<int> > // bad

Nested preprocessor directives should be indented by inserting a number of spaces equal to the level of nesting between the '#' and the first letter of the directive. However, directives that are intended to be easily added or removed (such as header guards or "#if 0" directives) should not be considered as a level of nesting for the purposes of indentation. For example:

	#ifndef PARANOID_H_
	#define PARANOID_H_

	#ifdef WINDOWS
	# ifndef NDEBUG
	#  define PARANOID
	# endif
	#endif

	#endif // PARANOID_H_

Namespaces

Except where there is no alternative (e.g. 'main' and 'errno'), all symbols should live in some namespace with a globally-unique name. [Using a namespace allows you to keep identifiers short without risking naming clashes. If there happens to be a namespace name clash, it is relatively simple to correct.]

Although a given namespace may have a very large number of contributors, the set of contributors who are empowered to allocate new symbols (a.k.a. "pollute the namespace") should be limited to about 10 or fewer individuals. This makes the namespace much easier to manage.

Identifiers recited by the C++ library standard or the Boost libraries are to be avoided for other uses, such that they can be imported into other namespaces as needed without causing clashes.

Avoid "using" directives, unless you actually control the namespace being imported. In particular, never put "using namespace std;" in production code. [If you do that, then all of your symbols are in jeopardy of clashing with future versions of the library.] Instead, import only the symbols that you plan to use.
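
A minimal sketch of the recommended alternative follows. The namespace foo and the function join() are hypothetical; the point is only that two names are imported and nothing else.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace foo {

// Import only the names that are actually used; every other std name
// must still be written with its "std::" prefix.
using std::string;
using std::vector;

// Hypothetical function, present only to exercise the imported names.
inline string join(const vector<string> &words, const string &separator) {
	string result;
	for(vector<string>::size_type i=0; i<words.size(); ++i) {
		if(i>0) {
			result+=separator;
		}
		result+=words[i];
	}
	return result;
}

// Usage sketch.
inline void join_demo() {
	vector<string> words;
	words.push_back("a");
	words.push_back("b");
	assert(join(words, ", ")=="a, b");
}

} // namespace foo
```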

A namespace alias can be used to globally switch among similar namespaces without obscuring any of them completely. The include directory corresponding to a namespace alias should contain a trivial wrapper for each of the target namespace's include files. For example:

lib/AClass.hpp:

	#include "the_library_v2_07/AClass.hpp"
	namespace lib = the_library_v2_07;

Such directories can and should be generated automatically.
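
The effect of the alias can be seen in a single-file sketch. Both versioned namespaces and the version() function are hypothetical stand-ins for real library contents.

```cpp
#include <cassert>

// Hypothetical versioned namespaces; in practice each would have its own
// include directory.
namespace the_library_v2_06 {
inline int version() { return 206; }
} // namespace the_library_v2_06

namespace the_library_v2_07 {
inline int version() { return 207; }
} // namespace the_library_v2_07

// Changing this one line retargets every client that writes "lib::".
namespace lib = the_library_v2_07;

// Usage sketch.
inline void alias_demo() {
	assert(lib::version()==207);
	// The other version is still reachable by its full name.
	assert(the_library_v2_06::version()==206);
}
```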

Identifiers

[There are two main reasons to use different identifier styles for different kinds of identifiers: to avoid collisions between identifiers that would otherwise "want" to have the same name (e.g. type, datum, "get" method, and "set" parameter), and to make it easier to glean information about an identifier from its invocations without having to examine its declaration. Another important consideration is that maintenance costs are reduced if two kinds of identifiers use the same style where an identifier is likely to change from one kind to the other. (For example, since a member function might become a member functor, it is convenient for data and function members to use the same style.)]

Uppercase identifiers capitalize the first letter of each word, and do not contain underscores. Lowercase identifiers use underscores to separate words, and do not capitalize any letters. All-caps identifiers are like lowercase identifiers, except that they use only uppercase letters.

Acronyms always count as a single word. Here are a few examples:

CORRECT     INCORRECT
NfsMount    NFSMount
nfs_mount   n_f_s_mount
NFS_MOUNT   N_F_S_MOUNT

User-defined types (typedefs and classes) are uppercase. Don't use suffixes (e.g. the "_t" suffix that is popular in C).

Namespace names are lowercase, in order to distinguish them from class names. [For example, foo::Bar is defined in "foo/Bar.hpp", whereas Foo::Bar is a member type of class Foo, which is defined in "current-namespace/Foo.hpp".] A namespace name should be especially terse, since you'll probably have to type it a whole bunch of times. Namespace names should not contain underscores.

Struct, union and enum names are type names in C++, and are therefore uppercase. If C compatibility might be required, then append an underscore to the struct, union or enum name, and create a typedef without the underscore for it. Enumeration constants are lowercase. (An enum should usually live inside a class, because otherwise its enumeration constants tend to pollute the namespace.)
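
These rules might look as follows in practice. Point and Light are hypothetical names, chosen only to show the underscore-plus-typedef form and an enum nested in a class.

```cpp
#include <cassert>

// C-compatible struct: the tag carries the underscore, the typedef
// supplies the name that clients actually use.
typedef struct Point_ {
	int x;
	int y;
} Point;

// The enum lives inside the class, so red, amber and green don't pollute
// the enclosing namespace.
class Light {
 public:
	enum Color { red, amber, green };
	explicit Light(Color color_P) : color_(color_P) {}
	Color color() const { return color_; }
 private:
	Color color_;
};

// Usage sketch.
inline void naming_demo() {
	Point p={1, 2};
	Light light(Light::green);
	assert(p.x+p.y==3);
	assert(light.color()==Light::green);
}
```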

Template type parameters should be short (e.g. "T", "U", "V", "T1", "T2", ...) if they are fully abstract, and should be prefixed by some descriptive name (e.g. "ResultT") if they are required to satisfy special properties. (Such properties should be documented in comments.) Type parameters should be declared "typename" rather than "class". Integral template parameters should be lowercase.

The name of a member type sometimes "wants" to match the name of a type at the top of the namespace. For example, one might define a base class to satisfy the requirements of a template parameter, and it might be intuitive for the template class to have a typedef of the same name as that base class, which resolves to the parameter. The compiler won't have any problem with this, but it can be confusing to somebody deciphering the code, so it should be avoided. In this case, if the base class is abstract (uninstantiable), then append "Base" to its name. Otherwise, append "Type" to the typedef name.

Member functions and data members are lowercase. They end with an underscore if and only if they are private or protected. "Get" methods are usually named after the data member that they return, but without the trailing underscore.

[Another popular convention is to use underscore for data as opposed to functions. In the majority of instances, the data members are private and the functions are public, so it doesn't make any difference. However, in the cases where it makes a difference, I think that it's better to be able to infer the access level from the name, since you can usually infer whether it is a function by context. The disadvantages to this are that it's more work to make a previously private member public, and that the name of a protected "get" method needs to be prefixed with "get_" to avoid a name clash with the data member it returns.]

Globals are lowercase. A global (or class static) object should not be used unless you are sure that its constructor and destructor will never access another global (or a static of another class) that also is or might become an object [because there is no general way to avoid the nasty creation order sensitivity problems that are likely to result, without sacrificing functionality, violating the interface principle, or using library-building tricks and never exposing the individual object files]. Otherwise, use the SINGLETON pattern instead.
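
One common way to realize the SINGLETON pattern in this situation is a function-local static, sketched below. Registry and its members are hypothetical names. Note that this addresses creation order only; destruction order at program exit still needs thought.

```cpp
#include <cassert>
#include <string>

namespace foo {

class Registry {
	std::string name_;
	Registry() : name_("default") {}
	Registry(const Registry &); // Not defined: copying is disabled
	Registry &operator=(const Registry &);
 public:
	// The object is constructed the first time this function is called,
	// so it is ready even if the caller is another global's constructor.
	static Registry &instance() {
		static Registry the_instance;
		return the_instance;
	}
	const std::string &name() const { return name_; }
	// Requires a non-empty name.
	void set_name(const std::string &name_P) {
		assert(!name_P.empty());
		name_=name_P;
	}
};

} // namespace foo
```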

Preprocessor macros are in all-caps (with the exception of special lowercase letters in header guards, as noted in Files). They should be used only when there is no alternative, such as for #include guards. Prefer inline functions, traits (template) classes and global constants to macros wherever possible. Macros that are required to act like functions may be lowercase instead, but don't do that. (I just said to prefer inline functions, didn't I?) Using a macro to redefine a C++ keyword is absolutely forbidden.

Formal parameters that correspond to a data member should be named after the data member with an underscore appended. Because the C++ standard reserves identifiers that contain consecutive underscores, a "P" should be appended instead if the data member name already ends in an underscore.

Here's an example:

	template<typename T>
	class MyClass {
	 public:
		typedef T DataType;
	 private:
		DataType data_;
	 public:
		DataType data() const { return data_; }
		void set_data(DataType data_P) { data_=data_P; }
	};

When you use the TEMPLATE METHOD pattern, sometimes the name of an abstract operation "wants" to match the name of the template method itself. When that happens, you should prefix the name of the abstract operation with "x". You may need to apply this rule recursively, because the abstract operation might also be a template method.

Comments

Comments may take any form that parses correctly, doesn't wrap lines, and most importantly gets the point across. C++ line-style comments using '//' are preferred over the C block-style comments using '/*' and '*/' (except when the comment might need to parse under C) because of the unterminated comment problem.

Prefer "#if 0" over commenting out lines of code [because it nests], especially when disabling more than one or two consecutive lines.

Comments are generally good, but they can also be a maintenance hassle.

Comments should not explain what should be evident from the code itself. For example, don't list the other classes that a class references.

Comments should not be used if they can be replaced with code of equal or greater meaningfulness. For example, don't comment that the copy constructor shouldn't be used; make it private instead. Don't put an invariant in a comment when you could write check_invariant() instead.
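
Both examples can be sketched in a few lines. Account is a hypothetical class; the private, undefined copy operations reflect the idiom available at the time of writing.

```cpp
#include <cassert>

class Account {
	long balance_;
	// Private and undefined, so an accidental copy fails to build.
	Account(const Account &);
	Account &operator=(const Account &);
 public:
	Account() : balance_(0) {}
	// The invariant is stated as code instead of as a comment.
	void check_invariant() const { assert(balance_>=0); }
	// Requires amount>=0.
	void deposit(long amount) {
		assert(amount>=0);
		balance_+=amount;
		check_invariant();
	}
	long balance() const { return balance_; }
};
```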

An outstanding required change should be described in a comment that begins with "TBD: ". An issue that is serious enough to warrant preventing compilation should instead be described with "#error TBD: ...". Any misleading or controversial aspect of the code that does not require corrective action should be described in a comment beginning with "NOTE: ", as this helps to prevent people from "fixing" that which isn't broken.

Comments should not describe things that are likely to change without breaking the code that is being described. For example, don't list a class's clients, because additional clients should be free to use it.

Comments should not explain well-known design patterns. Just cite the pattern, and the reader can look it up if he doesn't know it.

Comments should be used to explain anything that is potentially important, but otherwise not evident. In particular, any potentially misleading code construct should be commented, especially if it might be mistaken for an oversight. Examples of such constructs include non-empty fall-through case labels, empty blocks (other than constructor and destructor bodies), and early return statements.

Comments should spell out any extraordinary guarantees made by a service (i.e. a class or function), such that clients can utilize them and modifications to the service do not violate them. Similarly, extraordinary requirements should also be spelled out (or preferably eliminated if possible).

Comments may use the following abbreviations:

Abbreviation  Meaning
arg           Argument
ctor          Constructor
dtor          Destructor
XXref         Copy constructor
elt           Element
idx           Index
iff           If and only if
iter          Iterator
Basic         The basic (exception) guarantee
Strong        The strong (exception) guarantee
NoThrow       The no throw (exception) guarantee

Verification

[Verification is an often overlooked, yet increasingly important, aspect of software development. As a rule, the vast majority of bugs exist and are relatively easily detected at the class level, but many bugs are difficult to detect at the system level. Moreover, bugs detected at the system level are more difficult to reproduce and subsequently isolate. As if that weren't bad enough, by the time the system can be integrated for testing, other classes might evolve to rely on existing defects, intentionally or otherwise. Without an adequate verification strategy, you will inevitably be faced with the dilemma of either being late to market or shipping a defective product.]

The requirements and guarantees of a class should always be clear, either through idiom, comments, or other documentation. [Otherwise, it becomes impossible to tell whether a class or its client is at fault for a bug. Furthermore, multiple clients might make conflicting assumptions about the class, and that's usually costly to resolve.]

The best person to document a class is its principal author.

The best time to document the responsibilities of a class is just before you implement its header file.

The best time to document a method is just before you implement it.

Where practical, requirements and guarantees should be monitored via assert(), as well as other constructs that are free of side-effects, that are compiled out if NDEBUG is defined, and that call abort() if a defect is detected. [This definitively specifies properties of the class, automatically verifies those properties over the life of the code, and does not adversely affect the production build.]
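
A minimal sketch of what this can look like for a single function follows. The function largest() is hypothetical; its requirements appear once in the comment and once as assertions that vanish when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstddef>

// Requires: data is non-null and size>0.
// Guarantees: the result is one of the elements of data.
inline int largest(const int *data, std::size_t size) {
	assert(data!=0);
	assert(size>0);
	int result=data[0];
	for(std::size_t i=1; i<size; ++i) {
		if(data[i]>result) {
			result=data[i];
		}
	}
	return result;
}
```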

Each nontrivial class should be tested by a C++ file with the suffix "_t.cpp" that has a main() routine that exits with zero status if and only if the test passes. The test program should run normally without any command line arguments; however, the presence of arguments may affect the behavior of the test.
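
The shape of such a test file might be as follows. IntStack is a hypothetical class under test, defined inline here only so that the sketch is self-contained, and the helper int_stack_failures() holds the body that main() would run.

```cpp
#include <cassert>
#include <vector>

// Class under test; normally it would come from its own header.
class IntStack {
	std::vector<int> elts_;
 public:
	void push(int elt) { elts_.push_back(elt); }
	// Requires a non-empty stack.
	int pop() {
		assert(!elts_.empty());
		const int elt=elts_.back();
		elts_.pop_back();
		return elt;
	}
	bool empty() const { return elts_.empty(); }
};

// Body of the test. Returns the number of failed checks, so that main()
// in "IntStack_t.cpp" could end with "return int_stack_failures()!=0;".
inline int int_stack_failures() {
	int failures=0;
	IntStack s;
	if(!s.empty()) { ++failures; }
	s.push(1);
	s.push(2);
	if(s.pop()!=2) { ++failures; }
	if(s.pop()!=1) { ++failures; }
	if(!s.empty()) { ++failures; }
	return failures;
}
```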

The best person to test a class is its principal author.

The best time to write the test program for a class is just after you implement it. [That way, the class is still fresh in your mind, so it's easier to think of things to test, and easier to find bugs that the tests expose.]

The test for a given class should assume that all other classes are free of defects. [If another class is suspect, then that class's test file should be augmented.]

Implementing classes from the bottom up facilitates immediate testing. [Otherwise, you need to write throw-away stubs for the as-yet unwritten classes that you depend on, and those stubs might need to be nontrivial in order to meaningfully verify the class.]

A test case for a newly discovered bug should be written before the bug is fixed. [This verifies that the bug is correctly understood and is correctly addressed by the fix.]

The build system can be used to rerun only those tests that depend on a file that changed since they were last run.

As a rule, the verification infrastructure needs improvement if the fraction of actual bugs that escape testing at the lowest possible level exceeds 25%, or if the fraction of bugs that cause incorrect behavior at the system level, without provoking a test or assertion failure, exceeds 5%.

General

Any accurate warnings produced by the compiler must be eliminated if possible. For example, cast unused parameters to "(void)", and replace "if(a=b)" with "if((a=b)!=0)". A warning should be disabled only if there is general agreement that it is extraordinarily difficult to avoid and that its failure mode is either harmless or extremely improbable. (As a rule, any warnings that would be reported by "g++ -Wall" ought not to be disabled.)
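
A small sketch of both fixes follows. The functions scale() and copy_until_zero() are hypothetical. The loop uses the same explicit-comparison idiom as "if((a=b)!=0)", only in a while condition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical callback: the signature demands a second parameter that this
// particular implementation has no use for.
int scale(int value, int context) {
    (void)context;   // marks the parameter as deliberately unused
    return value * 2;
}

// Copies a zero-terminated sequence, terminator included, and returns the
// number of elements copied before the terminator.
// Requires: both pointers are nonzero and "to" has room for the sequence.
std::size_t copy_until_zero(const int* from, int* to) {
    assert(from != 0 && to != 0);
    std::size_t count = 0;
    // The explicit "!= 0" tells the compiler (and the reader) that the
    // assignment inside the condition is intentional.
    while ((*to = *from) != 0) {
        ++to;
        ++from;
        ++count;
    }
    return count;
}
```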

Do not rely on undefined or implementation-defined behavior. In particular, don't rely on endianness, packing order, int size, two's complement, or shifting by the bit width or more. If the efficiency of native processing is really necessary in such cases, then hide the built-in type behind a typedef whose requirements are documented. Using "#ifdef" or numeric_limits<> for conditional compilation, the typedef can then resolve instead to a class when you port to a platform that does not satisfy those requirements.
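
The following sketch shows one way to hide a built-in type behind a typedef with documented requirements. The names word32 and rotate_left32 are hypothetical, and the use of static_assert assumes a C++11 or later compiler, which is newer than the rest of this page.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical typedef.  Documented requirements: an unsigned type (so that
// arithmetic wraps in a well-defined way) with at least 32 value bits.
#if UINT_MAX >= 0xFFFFFFFF
typedef unsigned int word32;
#else
typedef unsigned long word32;
#endif

// The requirements are checked at compile time, so a port to an unusual
// platform fails loudly instead of misbehaving quietly.
static_assert(!std::numeric_limits<word32>::is_signed,
              "word32 must be unsigned");
static_assert(std::numeric_limits<word32>::digits >= 32,
              "word32 must have at least 32 value bits");

// Rotates the low 32 bits of x left by n.
// Requires: n < 32.
inline word32 rotate_left32(word32 x, unsigned n) {
    assert(n < 32);
    x &= 0xFFFFFFFFul;       // ignore any bits above bit 31
    if (n == 0) return x;    // avoids a right shift by 32 below
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFFul;
}
```

Client code only ever names word32, so swapping the typedef for a class later should not ripple outward.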

Use const and mutable. [They help the compiler find mistakes.]
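
As an illustration, here is a hypothetical class Samples whose sum() method is logically const but refreshes a cache internally. All names are invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class that caches a sum so it is not recomputed on every call.
class Samples {
public:
    Samples() : cached_sum_(0), cache_valid_(false) {}

    void add(int sample) {
        values_.push_back(sample);
        cache_valid_ = false;   // the cached sum is now stale
    }

    // Logically const: the observable state does not change, even though the
    // cache (declared mutable) may be refreshed.
    long sum() const {
        if (!cache_valid_) {
            long total = 0;
            for (std::size_t i = 0; i < values_.size(); ++i) {
                total += values_[i];
            }
            cached_sum_ = total;
            cache_valid_ = true;
        }
        return cached_sum_;
    }

private:
    std::vector<int> values_;
    mutable long cached_sum_;
    mutable bool cache_valid_;
};

// The const reference documents that the argument is not modified, and the
// compiler rejects any call to a non-const method such as add() here.
long sum_of(const Samples& samples) {
    return samples.sum();
}
```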

Write exception-safe code. (In particular, destructors mustn't throw.) See Guidelines for Error Handling in C++.

Catch exceptions by reference. [This avoids slicing the exception object down to the caught type, and it's more efficient to boot.]
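
To see the difference, consider this sketch with a hypothetical two-level exception hierarchy. Catching the base class by value copies only the base part of the thrown object, so the derived behavior is lost. (A compiler may warn about the by-value handler, which is rather the point.)

```cpp
#include <string>

// Hypothetical exception hierarchy, for illustration only.
struct Error {
    virtual ~Error() {}
    virtual std::string name() const { return "Error"; }
};

struct ParseError : Error {
    virtual std::string name() const { return "ParseError"; }
};

// Catching by value: the handler receives a copy of the Error part only.
std::string caught_by_value() {
    std::string seen;
    try {
        throw ParseError();
    } catch (Error e) {
        seen = e.name();
    }
    return seen;
}

// Catching by reference: the handler sees the object that was thrown.
std::string caught_by_reference() {
    std::string seen;
    try {
        throw ParseError();
    } catch (const Error& e) {
        seen = e.name();
    }
    return seen;
}
```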

Prefer passing and returning nontrivial objects by reference. [This usually improves performance.] However, beware the common mistake of returning a reference to something that goes out of scope by the time the function returns.
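
A sketch of the safe and unsafe cases, using hypothetical functions longest() and joined(): a returned reference is fine when it refers to something the caller already owns, but a result built inside the function has to be returned by value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Takes its argument by const reference and returns a reference to an
// element of the caller's own container, which outlives the call.
// Requires: !names.empty().
const std::string& longest(const std::vector<std::string>& names) {
    assert(!names.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < names.size(); ++i) {
        if (names[i].size() > names[best].size()) best = i;
    }
    return names[best];
}

// The result is a local object, so it is returned by value.  Declaring the
// return type as "const std::string&" here would hand back a reference to
// an object that no longer exists once the function returns.
std::string joined(const std::vector<std::string>& names) {
    std::string result;
    for (std::size_t i = 0; i < names.size(); ++i) {
        result += names[i];
    }
    return result;
}
```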

Make destructors virtual (unless you know that the class will never be used with dynamic binding).
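
For instance, in the following sketch (Shape and Triangle are hypothetical), deleting through a base class pointer runs the derived destructor only because the base destructor is virtual.

```cpp
// Hypothetical base class intended for dynamic binding.
class Shape {
public:
    virtual ~Shape() {}   // virtual, so deleting through Shape* is safe
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    explicit Triangle(bool* destroyed) : destroyed_(destroyed) {}
    virtual ~Triangle() { *destroyed_ = true; }   // records that it ran
    virtual int sides() const { return 3; }
private:
    bool* destroyed_;
};

// Deletes a Triangle through a pointer to its base class and reports
// whether the Triangle destructor was invoked.
bool derived_destructor_runs() {
    bool destroyed = false;
    Shape* shape = new Triangle(&destroyed);
    delete shape;
    return destroyed;
}
```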

Make sure that the referent type is defined (not just declared) where you delete. [Otherwise, the destructor may not get properly invoked.] (The Boost smart pointers have a built-in check for this.)

Observe the "Big Three" rule: If you define your own destructor, copy constructor, or copy assignment operator in place of the implicit one, consider whether you need to define all three.
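
Here is a minimal sketch of a class that needs all three, assuming a hypothetical IntBuffer that owns a raw array. The copy assignment uses the copy-and-swap idiom, which is my choice for the example rather than something this rule requires.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a dynamically allocated array.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size)
        : size_(size), data_(new int[size]()) {}   // elements start at zero

    // Destructor: releases the array.
    ~IntBuffer() { delete[] data_; }

    // Copy constructor: makes a deep copy instead of sharing the array.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Copy assignment: copy first, then swap, so that a failed allocation
    // leaves *this untouched and self-assignment is harmless.
    IntBuffer& operator=(const IntBuffer& other) {
        IntBuffer temp(other);
        std::swap(size_, temp.size_);
        std::swap(data_, temp.data_);
        return *this;
    }

    std::size_t size() const { return size_; }

    // Requires: i < size().
    int& at(std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};
```

Without the copy constructor and copy assignment, two IntBuffer objects would end up deleting the same array.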

Use operator overloading (except for copy assignment) only to abstract the distinction between the class and some built-in type (such as an integer, pointer, or function), for example, for the purposes of using the class as a template argument. In other words, an observer shouldn't have to read the overloaded operator code in order to know what it does. Otherwise, use ordinary functions.

Make local declarations as late as possible. (C++ allows this because it's a good thing.) In particular, use loop scope declarations.

Prefer pre-increment and pre-decrement over post-increment and post-decrement, respectively. [The former are sometimes substantially faster, and never significantly slower.]

Prefer automatic allocation over dynamic allocation.

The ownership of every dynamically allocated object, as well as the responsibilities of that owner to make the object available to other users, should be made clear. In particular, function names should reflect the passing of ownership between caller and callee. See A Pointer Discipline.

Use "0" instead of "NULL". [NULL might not be defined correctly after a C header is #include'ed.]

Use a typedef if the implementation type might change. [This improves readability, and it makes it much easier to change the implementation type.]

Prefer many simple classes over few complex classes. (This is called "cohesion" in the literature because it physically collocates logically related entities.)

Prefer avoiding "friend". It's usually a symptom of poor design. In particular, extending friendship to something outside of the present header file is especially dangerous.

Prefer avoiding exception specifications, with the possible exception of non-inline functions with empty exception specifications. [See GOTW #82.] (This is a rare case in which comments are preferable to code.)

Prefer references over pointers.

Prefer smart pointers (std::auto_ptr, boost::shared_ptr) over dumb pointers.
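
The sketch below shows a shared-ownership smart pointer releasing its object without any explicit delete. It uses std::shared_ptr, the standardized descendant of boost::shared_ptr, purely as an assumption so that the example compiles without Boost on a C++11 or later compiler. The class Tracked and both functions are hypothetical.

```cpp
#include <memory>

// Hypothetical resource that keeps a count of live instances.
class Tracked {
public:
    explicit Tracked(int* live) : live_(live) { ++*live_; }
    ~Tracked() { --*live_; }
private:
    Tracked(const Tracked&);              // not copyable: declared, never defined
    Tracked& operator=(const Tracked&);
    int* live_;
};

// Number of live Tracked objects while two smart pointers share one.
int live_while_shared() {
    int live = 0;
    std::shared_ptr<Tracked> first(new Tracked(&live));
    std::shared_ptr<Tracked> second = first;   // shares, does not copy
    return live;
}

// Number of live Tracked objects after every owner has gone out of scope.
int live_after_release() {
    int live = 0;
    {
        std::shared_ptr<Tracked> first(new Tracked(&live));
        std::shared_ptr<Tracked> second = first;
    }   // the last owner goes away here and deletes the object
    return live;
}
```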

Prefer collections (std::vector, etc.) over arrays.

Prefer composition (aggregation) over inheritance.

Use multiple inheritance only with a great deal of care. [It leads to ambiguities, and it scales poorly.] Normally, each "extra" base class should be either a code sharing class, in which case nothing other than the extra base class itself should up-cast, or a protocol class, in which case the extra base class should contain only pure virtual functions and no data members. In either case, every constructor of the extra base class should be protected.
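
As a sketch of the protocol class case, here Printable is the "extra" base class: it has only pure virtual functions (plus a virtual destructor), no data members, and a protected constructor. The names Printable, Widget, Label and describe() are hypothetical.

```cpp
#include <string>

// Protocol class: an interface and nothing else.
class Printable {
public:
    virtual ~Printable() {}
    virtual std::string print() const = 0;
protected:
    Printable() {}   // only derived classes may construct it
};

// Ordinary (primary) base class.
class Widget {
public:
    explicit Widget(int id) : id_(id) {}
    virtual ~Widget() {}
    int id() const { return id_; }
private:
    int id_;
};

// Inherits its implementation from Widget and a protocol from Printable.
class Label : public Widget, public Printable {
public:
    Label(int id, const std::string& text) : Widget(id), text_(text) {}
    virtual std::string print() const { return text_; }
private:
    std::string text_;
};

// Clients of the protocol depend on Printable alone.
std::string describe(const Printable& item) {
    return item.print();
}
```

Because Printable carries no data and no implementation, nothing in it can collide with Widget.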

Prefer templates over dynamic binding. (Use dynamic binding if and only if it is necessary in order to meet design requirements, or the additional build cost and executable size associated with using a template could become prohibitive.) [Template error messages are a little cryptic under g++, but they're still better than weird runtime errors. Templates also avoid costly virtual function calls and down-casts.]
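
A sketch of the template alternative: sum_transformed() accepts any type that provides a suitable apply() method, with no common base class and no virtual calls. Doubler, Negator and sum_transformed() are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Two unrelated types that happen to offer the same operation.
struct Doubler {
    int apply(int x) const { return 2 * x; }
};

struct Negator {
    int apply(int x) const { return -x; }
};

// The call to apply() is bound at compile time for each Transform type, so
// a type without a suitable apply() is rejected by the compiler rather than
// failing at run time.
template <typename Transform>
int sum_transformed(const std::vector<int>& values, const Transform& transform) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += transform.apply(values[i]);
    }
    return total;
}
```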

Prefer dynamically bound classes over unions.

Prefer avoiding implicit object conversions. In particular, make single argument non-copy constructors explicit.
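
For example, with the hypothetical class Meters below, the explicit keyword keeps a bare double from silently turning into a Meters object.

```cpp
// Hypothetical unit type wrapping a double.
class Meters {
public:
    explicit Meters(double value) : value_(value) {}
    double value() const { return value_; }
private:
    double value_;
};

double doubled(const Meters& distance) {
    return 2.0 * distance.value();
}

// doubled(Meters(1.5)) compiles.
// doubled(1.5) does not, because the constructor is explicit; without the
// keyword, the double would be converted to a temporary Meters unannounced.
```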

Prefer <iostream> over <stdio.h>.

Concept checking (which is just a fancy term for raising a compiler error if the requirements of a template are violated) should occur within some scope whose name is indicative of this purpose (such as a method called "check_concepts()"). [This makes it easier to correctly interpret diagnostics as a problem with the template's arguments, rather than with the template itself.]
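
Here is one way this could look, as a sketch only. The class SortedBag is hypothetical, and the "if (false)" trick is just one of several ways to get the requirements compiled without being executed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container that keeps its items in ascending order.
// Requirements on T: copy constructible, and comparable with operator<.
template <typename T>
class SortedBag {
public:
    void insert(const T& item) {
        check_concepts(item);
        typename std::vector<T>::iterator pos = items_.begin();
        while (pos != items_.end() && *pos < item) ++pos;
        items_.insert(pos, item);
    }

    std::size_t size() const { return items_.size(); }
    const T& at(std::size_t i) const { return items_[i]; }

private:
    // Does no work at run time.  It exists so that a T that violates the
    // requirements is reported from inside a scope named check_concepts().
    static void check_concepts(const T& sample) {
        if (false) {   // compiled and type-checked, but never executed
            T copy(sample);                  // T must be copy constructible
            bool ordered = sample < sample;  // T must support operator<
            (void)copy;
            (void)ordered;
        }
    }

    std::vector<T> items_;
};
```

Instantiating SortedBag with a type that lacks operator< and calling insert() should then produce a diagnostic that mentions check_concepts.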

Write code for people, not for computers. In particular, strive for cohesion, use descriptive names (especially for entities with nontrivial scope), and try to have relied-upon properties (in particular, the validity of all in-scope references and dereferenced iterators and pointers) guaranteed by some minimal enclosing scope.

Prefer using proven library classes over inventing new classes.

Prefer building classes from proven library components over writing classes from scratch.

Prefer using proven design patterns over inventing new class relationships.

Design before you code, but remain flexible to changes both in the requirements and in the design throughout the product's life cycle. Generally, this means building the design out of orthogonal elements, such that a "granule" of change in the requirements is likely to result in a small design change, which in turn leads to minimal code changes. That's basically what design patterns are all about. See On Planning.

Links

Here are some links that offer lots of good advice that transcends coding conventions:

Anders Johnson, last modified $Date: 2004/02/25 $
