Value Types, Reference Types, and Writing with Clarity!

Recently I served as a technical review editor for a book on C# and .NET. Among other issues, I noticed that the bullet points explaining "pass by value" and "pass by reference" semantics were not only unclear, they appeared to contradict each other. Quite annoyed, I wrote up a particularly scathing review comment, and the authors took my advice (plus, I hope, similar advice from other tech reviewers).

Unfortunately, when I read the section in the final published copy of the book, I suspected the authors had jumped from the frying pan into the fire: they added more content which, instead of clarifying the major points, served to muck them up even more, in my opinion. A clear and unequivocal understanding of value types vs. reference types in .NET is of the utmost importance.

Therefore, I present my own attempt. Einstein said that a theory should be as simple as possible, but no simpler. With that in mind (and to their credit, much of this relies on the MS Patterns and Practices whitepaper):

Value Types and Reference Types

All .NET Framework data types are either value types or reference types.

Value Types
Memory for a value type that is a local variable or method argument is typically allocated on the current thread's stack; a value type that is a field of a class is stored inline within that object on the managed heap. A value type's data is maintained completely within this memory allocation. For a stack-allocated value type, the memory is maintained only for the lifetime of the stack frame in which it is created. The data in a value type can outlive its stack frame when a copy is created by passing the data as a method parameter or by assigning the value type to a reference type. Value types are passed by value by default. "By value" means that an argument is passed into a function by passing a copy of the value. In this case, changing the copy doesn't affect the original value.
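
The copy semantics described above can be sketched in a few lines of C#. This is only an illustration: the Point struct and the MoveRight method are hypothetical names invented for the example, not taken from the whitepaper.

```csharp
using System;

// Hypothetical value type used only for this illustration.
struct Point
{
    public int X;
    public int Y;
}

static class ValueCopyDemo
{
    // The parameter p is a copy of the caller's struct.
    static void MoveRight(Point p)
    {
        p.X += 10; // changes only the local copy
    }

    static void Main()
    {
        Point original = new Point();
        original.X = 1;
        original.Y = 2;

        MoveRight(original);
        Console.WriteLine(original.X); // prints 1

        Point assigned = original;     // assignment also copies the data
        assigned.X = 99;
        Console.WriteLine(original.X); // prints 1
    }
}
```

Neither the method call nor the assignment touches the caller's data, because each one works on its own copy.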

If a value type is passed to a parameter of reference type, a wrapper object is created (the value type is boxed), and the value type's data is copied into the wrapper object. For example, passing an integer to a method that expects an object results in a wrapper object being created.
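
A minimal sketch of boxing, assuming a hypothetical Describe method that accepts an object. The boxed copy lives in its own wrapper object, so later changes to the original variable do not reach it.

```csharp
using System;

static class BoxingDemo
{
    // Hypothetical helper: any value type passed here is boxed first.
    static void Describe(object value)
    {
        Console.WriteLine(value.GetType().Name);
    }

    static void Main()
    {
        int number = 42;
        object boxed = number;      // boxing: 42 is copied into a wrapper object
        number = 100;               // does not affect the boxed copy

        Console.WriteLine(boxed);   // prints 42

        int unboxed = (int)boxed;   // unboxing copies the value back out
        Console.WriteLine(unboxed); // prints 42

        Describe(number);           // boxes 100; prints Int32
    }
}
```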

Reference Types
The data for reference type objects is always stored on the managed heap. Variables of a reference type hold only a reference (a pointer) to that data. The memory for reference types such as classes, delegates, and exceptions is reclaimed by the garbage collector when they are no longer referenced. It is important to know that the object itself is never copied when it is passed to a method: by default, the reference is passed by value, so the method receives a copy of the reference that points to the same object. If you change the object's data through the parameter in the function, you also change the original. True "by reference" passing, where the function receives a reference to the caller's variable itself and can make that variable refer to a different object, requires the ref or out keyword in C# (ByRef in Visual Basic .NET).

When a reference type is passed by value, a copy of the reference is made and that copy is passed; the object it refers to is not copied *.
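
A short sketch may make the distinction concrete. The three method names are hypothetical; StringBuilder is used only because it is a familiar mutable reference type.

```csharp
using System;
using System.Text;

static class ReferencePassingDemo
{
    // Receives a copy of the reference; both copies point at the same object.
    static void Append(StringBuilder sb)
    {
        sb.Append(" world"); // mutates the shared object
    }

    // Reassigning the parameter rebinds only the local copy of the reference.
    static void Replace(StringBuilder sb)
    {
        sb = new StringBuilder("replaced");
    }

    // With ref, the parameter is an alias for the caller's variable.
    static void ReplaceByRef(ref StringBuilder sb)
    {
        sb = new StringBuilder("replaced");
    }

    static void Main()
    {
        StringBuilder text = new StringBuilder("hello");

        Append(text);
        Console.WriteLine(text); // prints hello world

        Replace(text);
        Console.WriteLine(text); // prints hello world

        ReplaceByRef(ref text);
        Console.WriteLine(text); // prints replaced
    }
}
```

Only the ref version can make the caller's variable refer to a different object.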

Additional Notes on VB.NET:

Boxing in Visual Basic .NET tends to occur more frequently than in C# due to the language’s pass-by-value semantics and extra calls to GetObjectValue. Use the DirectCast operator to cast up and down an inheritance hierarchy instead of using CType. DirectCast offers superior performance because it compiles directly to MSIL. Also, note that DirectCast throws an InvalidCastException if there is no inheritance relationship between the two types.

Further, it should be noted that in the .NET Framework 2.0, generics provide a much more efficient mechanism that avoids the overhead of boxing, particularly with collections.
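
As a rough illustration of the point about collections, compare the non-generic ArrayList, which stores every element as object, with the generic List<int>, which stores the integers directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class GenericsDemo
{
    static void Main()
    {
        // ArrayList stores object, so each int is boxed on the way in
        // and must be unboxed with a cast on the way out.
        ArrayList untyped = new ArrayList();
        untyped.Add(1);
        int a = (int)untyped[0];

        // List<int> stores the ints themselves: no boxing, no cast.
        List<int> typed = new List<int>();
        typed.Add(1);
        int b = typed[0];

        Console.WriteLine(a + b); // prints 2
    }
}
```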

I think the above is both simple and elegant. It has sufficient information to cover the most important points, but not "too much information". It is presented clearly, and it does not assume that the reader already knows the definitions of key terms that are used. I can understand what I wrote, and I suspect most others can.

Why can't book authors learn to do this? Developers buy and read technical books in the hopes of receiving clarity, not muck.

* On the difference between passing a value object by reference and a reference object by value, which Bruce Wood raises in his comment below: MVP Jon Skeet (whose writing I much admire because he understands the word "clarity" as it applies to writing) illustrates it here. In particular, the finer point is, as Jon describes, "This difference is absolutely crucial to understanding parameter passing in C#, and is why I believe it is highly confusing to say that objects are passed by reference by default instead of the correct statement that object references are passed by value by default."
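
The other half of that distinction, a value type passed by reference, can be sketched as follows. Both method names are hypothetical.

```csharp
using System;

static class RefValueDemo
{
    // Default: the method works on a copy of the int.
    static void IncrementCopy(int n)
    {
        n++;
    }

    // With ref, the method works on the caller's variable itself.
    static void IncrementOriginal(ref int n)
    {
        n++;
    }

    static void Main()
    {
        int count = 5;

        IncrementCopy(count);
        Console.WriteLine(count); // prints 5

        IncrementOriginal(ref count);
        Console.WriteLine(count); // prints 6
    }
}
```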

Bruce's other comment, clarifying the finer distinction of where memory is allocated for value types (as class fields vs. as local variables or method arguments), should also be noted.