As I started to plan this series out I realized a couple of things. Firstly that OOA&D is a subject that I am particularly fond of and secondly that aside any courses taken, to get good at it, you need experience and it is precisely due to this that I have found myself teaching new developers about this subject more than any other. In this first post in the series, I am going to talk about some general tips and rules about designing your classes.
I would like to again remind you that this series is meant for developers which are already reasonably fluent in an object oriented language like C# or Java.
Classes and state
I am obviously going to be talking about classes and interfaces a lot throughout this series. I am going to start however with something that I do not see mentioned enough in my opinion: the relation between classes and state. I am using the word state in a broader meaning to describe both the set of data that we are acting upon and also the data that may describe the temporary configuration of the class.
Since in OO languages everything can be a class (and very often is), classes with function can 1) act upon or process a state or 2) have a state.
- Classes that act upon a state are classes that do not hold references to a state, but are simply passed the state (a set of data) through a public method or delegate and process it. Examples of this is a command processor, with a public method that accepts a queue of commands as a parameter which dequeues each command and executes it, returning the results.
- Classes that contain a state are classes that contain (have a hard reference to) a set of data that are used for calculations or configuration of the instance. Examples of this is a repository which holds a reference to the entity context and a set of methods to use the context for their functionality.
By state I mean a class or structure which represents a set of data, like a date or a list of purchases.
None of the above is “wrong” nor is any “correct” or a good practice. On the other hand while designing your solution and assigning responsibility to the classes, it is best to be fully aware in which of the above categories your new classes will be as each is best suited for different purposes.
A class that acts upon a state without holding a reference to any state allows objects of that class to be more easily shared (between threads and objects) as there are no concerns about synchronization. Classes with state however must handle synchronization to protect against state corruption and race conditions. For classes that act upon a state, it is best to pass the state in the constructor and store the state in a read-only immutable data type to bypass these issues.
While the above is reason enough to make you carefully consider which to use in a scenario, I have found out that this is something best thought about and designed early in your development cycles.
It is important to name your classes so as to be easy to understand its function. A class with a name like MessageParser is pretty explanatory (depending on the situation), while a class simply named Handler leaves too much for the imagination. Naming may not seem very important, but it is my strong opinion that it is often one of the most important (and often difficult) decisions about a class. A well-chosen name can mean time off when debugging and significantly less time when revisiting the code after some time. Do not underestimate naming.
About method visibility and static
One of the questions that I hear more often is what visibility to use for the methods of a class and when to mark them as static. When designing a class I always follow the following process:
- Mark the methods relating to the interface of a class as public.
- Mark helper methods as private static. Static methods, do not have access to any state the class may hold or any instance methods, so they are inherently thread-safe. Marking them as private prevents use by other classes which is essential for static methods, as they can lead to spaghetti code.
- Mark the method as private when the method needs access instance methods or state.
- Mark the method as internal when the method makes sense to be accessed within the declaring assembly, but not outside. This is more often the case with properties rather than methods since properties represent a state, to which we may want to prohibit public access.
Anything else is code smell. I will talk a bit more about this when talking about the single responsibility principle when talking about SOLID.
Avoid static classes
Static classes are a problem since they are accessible from everywhere in the assembly. Apart from the issue with crosscutting concerns that arises and the spaghetti code that may be the result from excessive use of static classes, another very important issue is with testability. Since methods in static classes are accessed by class name, there is a hard dependency on the class which cannot be abstracted to an interface. This in turn makes creating a stub of the static class impossible, which means that any classes that use it cannot be isolated for testing.
Keep methods short
One of the more important rules that you can follow is this: Keep the body of your method short. I see in lots of places some numbers (like 10 lines max) for your method body, but I do not believe that setting hard limits is practical. It is however good when you do not have the experience to recognise how long is too long. I usually say that developers inexperienced with OOA&D should be afraid to write methods longer that 10 lines. Tips to keep your methods short:
- Put your loop bodies into a new method (see visibility tips above).
- Put the body of if.. statements into a new method.
- If you have a series of actions that essentially does one things (for example getting the top and average score from the db) then even if this is just 5-6 lines, if you have other code in your method, then put them into a new method. (We will see the repository pattern as it applies to the EntityFramework soon)
Avoid nested if statements
I avoid nested if statements like the plague. I can handle two nested if's (rarely and barely) but any more than two nested if’s is a no-no. The trouble with nested if’s is not so visible at first. The code works (presumably) and stepping through the code during debug is an easy task to follow. However, in reality it is very easy to create code that will be easy to follow if you revisit the code after some time. The reason of course is due to branching, and our brain is not too good at keeping a large map of possible paths of execution in memory. Even if you are capable of such a task, it doesn’t mean you should. Instead of branching internally, send execution off to a method with an easy-to-understand name. This way you will be able to follow the execution easily in place and step into the method to drill down deeper without the need for keeping the context (previous operations) in mind.
Again, if you don’t agree with this point, just write a couple of methods with more than 2 nested if’s and talk to me again when you revisit the code…
Also, on this subject, please see my previous post about a stack overflow question with more if’s that I could stand.
Avoid long switch statements and large series of if…else statements
This is another problem that I see semi-regularly from new\junior developers. I guess this happens both due to a lack of experience to help identify why exactly this is bad, but also due to the fact that if…else and switch statements are so easy to implement that you rarely even think about alternatives. Some of the problems that these lead to are:
- They are hard to follow. It is significantly easier for the switch statement, where conditions are always constants, but only if the case bodies are short, few in number and goto is not used. As a rule of thumb, if I see 2 else’s I consider it as a candidate for refactoring, and 3 or more else’s prompt an immediate action. Similarly, if I see a switch statement taking more than 10-15 lines or uses goto, I consider it for refactoring.
- Since all cases in the switch share a scope and all if and else statements all have access to the method scope variables, they can be tightly coupled with the variables declared inhibiting reliability.
- They are error prone. Missing an else in a series of if… else if… else if… else, very often will result in radically different results than expected, and it will be very difficult to detect, as there is no help from the compiler about this. If you do not use goto statement (pro tip: DON’T), then switch statements are not as error prone, however, they still have their problems.
A much better solution would be to create a chain of responsibility or a pool of typed handlers to process each case (ideally using IoC to load the handlers into place), both are patterns we will explore with greater detail in later articles. This will greatly increase readability and maintainability, and abstract code to a dedicated,easy to find class.