Writing unit tests is an essential part of development in RAMCloud. We don't consider a coding project "done" until it has good unit tests, and in particular you shouldn't push changes to the main repository unless they have proper unit tests. Our goal is to get (nearly) 100% test coverage by lines of code. Here are some of the conventions we follow when writing unit tests:
Unit tests should be isomorphic to the code
This is the most important rule. The structure of the tests should match the structure of the code:
- There should be one test file for each code file. If the name of the code file is Foo.cc, then the name of the test file should be FooTest.cc.
- Within a test file, there should be one collection of tests for each method in the code file, and the tests for each method should be in the same order as the methods in the code file.
- If there is more than one test for a particular method, then the order of the tests should reflect the order of the code they are testing in the actual method.
Making the tests isomorphic to the code makes it much easier to maintain the tests as the code evolves: when you modify the code, it's easy to find any existing tests for that section of code, so you can see whether you need to add additional tests. If you restructure a code file by moving methods around, you must restructure the test file to match.
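As an illustration of this layout, here is a minimal sketch with both files shown in one listing. The class `Foo`, its methods, and the `test_<method>_<condition>` naming scheme are hypothetical, and plain `assert` stands in for the test framework's assertion macros:

```cpp
#include <cassert>
#include <string>

// ---- Contents of a hypothetical Foo.cc ----
class Foo {
  public:
    // First method in the file.
    static int parseDigit(char c) {
        if (c < '0' || c > '9')
            return -1;                // first branch
        return c - '0';               // second branch
    }

    // Second method in the file.
    static std::string repeat(const std::string& s, int n) {
        std::string result;
        for (int i = 0; i < n; i++)
            result += s;
        return result;
    }
};

// ---- Contents of the matching FooTest.cc ----
// Tests appear in the same order as the methods in Foo.cc; within one
// method, they follow the order of the code they exercise.
void test_parseDigit_notADigit() { assert(Foo::parseDigit('x') == -1); }
void test_parseDigit_basics()    { assert(Foo::parseDigit('7') == 7); }
void test_repeat_zeroCount()     { assert(Foo::repeat("ab", 0) == ""); }
void test_repeat_basics()        { assert(Foo::repeat("ab", 3) == "ababab"); }
```

If `repeat` were moved above `parseDigit` in the code file, the two `test_repeat_*` tests would move above the `test_parseDigit_*` tests as well.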
Tests are white-box, not black-box
We write unit tests to test the specific implementation of a feature, not just its interface. Typically, you should be looking at the code when writing tests, and you should write unit tests to exercise each piece of the code. This means that, in general, you can't write the tests before you've written the code (but see the next rule for an exception).
When fixing a bug, write the test first
Whenever you fix a bug, there should be a test for that bug fix. Furthermore, you should write the test before fixing the bug: write the test, make sure it triggers the bug, then fix the bug and make sure that the test now passes. If you wait until after fixing the bug before writing the test, it's easy to end up with a test that doesn't actually exercise the bug. Alternatively, you can write the test after the bug has been fixed, but then you should back out the bug fix temporarily to make sure that the test really fails without it.
Tests should be microscopic
It's okay to have a few unit tests that exercise overall functionality, but most tests should be very focused. Each test should validate a specific condition or section of code. It's better to have many individual small tests, rather than one large test that validates many different conditions and/or code sections. Tests that do long sequences of operations are hard to understand and maintain, so split them up.
Write utility methods for setup
Sometimes there's a fair amount of setup that must be done before you can run the actual test, and the setup is the same for every test. This might lead you to believe you should simply combine all of the tests together into one mega-test so you don't have to repeat the setup code over and over, but that's not the best way to handle this. Instead, write a method in the test class that does the setup, then just invoke this one method from each of a collection of smaller tests. Or, do the setup in the constructor for the test class.
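A minimal sketch of both options, assuming a hypothetical `Table` class under test and a hypothetical `TableTest` test class; plain `assert` stands in for the test framework's assertion macros:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical class under test: a tiny key-value table.
class Table {
  public:
    void write(const std::string& key, const std::string& value) {
        data[key] = value;
    }
    bool remove(const std::string& key) { return data.erase(key) > 0; }
    std::size_t size() const { return data.size(); }
  private:
    std::map<std::string, std::string> data;
};

// Test class: the constructor performs the setup shared by every test.
class TableTest {
  public:
    Table table;

    TableTest() { fillTable(3); }

    // Utility method: writes `count` objects named key0, key1, ...
    void fillTable(int count) {
        for (int i = 0; i < count; i++)
            table.write("key" + std::to_string(i), "value");
    }
};

// Each small test builds a fresh TableTest instead of repeating the setup.
void test_remove_existingKey() {
    TableTest t;
    assert(t.table.remove("key1"));
    assert(t.table.size() == 2);
}

void test_remove_missingKey() {
    TableTest t;
    assert(!t.table.remove("noSuchKey"));
    assert(t.table.size() == 3);
}
```

Each test stays a few lines long and checks one condition, and a change to the setup only has to be made in one place.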
Use mocking to isolate classes
Ideally, each class should be tested in isolation. However, some higher-level classes have complex dependencies on lower-level classes that can make unit testing hard (e.g. we don't want to have to set up a running RAMCloud cluster to run unit tests, but some classes will invoke RPCs to other servers). In situations like this, write mock classes that can be used instead of lower-level classes to simplify writing unit tests. We have already written many such classes. For example, if a class invokes system calls, there is a MockSyscall class that can be used to avoid actually making system calls.
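The general shape of a mock can be sketched as follows. All of the names here (`SysClock`, `MockSysClock`, `Lease`) are hypothetical and chosen only for illustration; this is not the interface of MockSyscall or of any other existing RAMCloud class:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical low-level dependency: hides a system call behind a
// virtual method so that tests can substitute a different implementation.
class SysClock {
  public:
    virtual ~SysClock() {}
    virtual uint64_t nowMs() = 0;
};

// Mock version: returns whatever time the test has set, without making
// any system call.
class MockSysClock : public SysClock {
  public:
    uint64_t currentMs = 0;
    uint64_t nowMs() override { return currentMs; }
};

// Hypothetical higher-level class under test.
class Lease {
  public:
    Lease(SysClock* clock, uint64_t durationMs)
        : clock(clock), expiresAtMs(clock->nowMs() + durationMs) {}
    bool isExpired() { return clock->nowMs() >= expiresAtMs; }
  private:
    SysClock* clock;
    uint64_t expiresAtMs;
};

// The test controls time directly, so it is fast and deterministic.
void test_isExpired() {
    MockSysClock clock;
    clock.currentMs = 1000;
    Lease lease(&clock, 50);
    assert(!lease.isExpired());
    clock.currentMs = 1050;
    assert(lease.isExpired());
}
```

Passing the dependency in through the constructor is one design choice among several; what matters is that the class under test reaches the lower-level class through a seam that a test can replace.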
Write tests that don't break easily when unrelated code changes
It's easy to write tests that are extremely sensitive to the overall structure of the system, such that the tests will break if almost any code changes. When this happens, it becomes hard to make code changes, because we have to update a lot of tests afterwards. Try to avoid such tests. Ideally, a test is sensitive to the behavior of a particular piece of code (the code it is validating), but not sensitive to other code in the system. One way to create more robust tests is to set up a test by invoking API methods of a class, which are less likely to change, rather than by directly modifying fields of structures, which are more likely to change.
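To make the contrast concrete, here is a small sketch using a hypothetical `Backlog` class (its fields are public only so that the fragile style can be shown); plain `assert` stands in for the test framework's assertion macros:

```cpp
#include <cassert>
#include <vector>

// Hypothetical class under test: keeps a running byte total alongside
// the list of queued item sizes.
class Backlog {
  public:
    std::vector<int> items;
    int totalBytes = 0;

    void enqueue(int bytes) {
        items.push_back(bytes);
        totalBytes += bytes;
    }
    bool overLimit(int limit) const { return totalBytes > limit; }
};

// Fragile setup: writes a field directly. This test must be revisited
// whenever the internal representation changes, and it leaves the object
// in a state the real code could never produce (no items, 150 bytes).
void test_overLimit_fragileSetup() {
    Backlog b;
    b.totalBytes = 150;
    assert(b.overLimit(100));
}

// More robust setup: reaches the same condition through the API.
void test_overLimit_apiSetup() {
    Backlog b;
    b.enqueue(100);
    b.enqueue(50);
    assert(b.overLimit(100));
}
```

If `totalBytes` were later replaced by a value computed from `items`, only the first test would need to change.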
Tests should run quickly
Please avoid tests that take a long time to run (e.g. hundreds of milliseconds or longer). Our entire unit test suite currently runs in about 10-20 seconds, and we'd like to keep the running time from growing much beyond this, so that it's not painful to run the tests. Occasionally long-running tests are unavoidable, but try to avoid them as much as possible.
Avoid timing-dependent tests
When testing asynchronous behaviors of the system, it's tempting to write tests that look like this:
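A minimal sketch of that pattern follows. `AsyncAction` is a hypothetical stand-in for real asynchronous work (it simply reports completion once a fixed delay has elapsed), the delay values are arbitrary, and plain `assert` stands in for the test framework's assertion macros:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an asynchronous action: it reports completion
// once `delay` has elapsed since construction.
struct AsyncAction {
    Clock::time_point readyAt;
    explicit AsyncAction(std::chrono::milliseconds delay)
        : readyAt(Clock::now() + delay) {}
    bool isDone() const { return Clock::now() >= readyAt; }
};

// Timing-dependent test: start the action, sleep for a guessed interval,
// then check the result exactly once.
void test_asyncAction_sleepThenCheck() {
    AsyncAction action(std::chrono::milliseconds(5));
    // The sleep time is a guess at how long the action will need.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    assert(action.isDone());
}
```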
Unfortunately, the timing for the asynchronous action may not be very predictable, which creates a dilemma. If you set the sleep time low, then the test may fail because of variations in running speed (e.g., due to a high load on the machine). If you set the sleep time high enough to be absolutely certain that the action can complete, then the test will take a long time to run.
It's better to structure such tests like this:
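A minimal sketch of the polling structure follows. `AsyncAction` is a hypothetical stand-in for real asynchronous work (it simply reports completion once a fixed delay has elapsed), the iteration limit and the 1 ms sleep are arbitrary choices, and plain `assert` stands in for the test framework's assertion macros:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an asynchronous action: it reports completion
// once `delay` has elapsed since construction.
struct AsyncAction {
    Clock::time_point readyAt;
    explicit AsyncAction(std::chrono::milliseconds delay)
        : readyAt(Clock::now() + delay) {}
    bool isDone() const { return Clock::now() >= readyAt; }
};

// Polling test: check repeatedly with a very short sleep between checks,
// and stop as soon as the action has completed.
void test_asyncAction_poll() {
    AsyncAction action(std::chrono::milliseconds(5));
    bool done = false;
    // Generous upper bound (about one second); normally the loop exits
    // after only a few iterations.
    for (int i = 0; i < 1000; i++) {
        if (action.isDone()) {
            done = true;
            break;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    assert(done);
}
```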
This has the advantage that the test will almost always complete quickly. You can be conservative in allowing the loop to execute many iterations before failing, since this will almost never happen.
Sometimes unit tests aren't feasible
Although in general we expect all code to have unit tests, there are a few situations where it isn't really possible to write unit tests. One example is our low-level network drivers. These classes tend to be highly dependent on the NIC hardware; it typically isn't possible to run tests using actual NICs, and if the NIC is mocked out then the test isn't particularly useful (the code might work with the mocked NIC but not with the real one). In addition, drivers tend to have fairly straight-line code, so almost all of it gets exercised immediately in a real RAMCloud cluster. Don't give up easily on writing unit tests, though; in most cases it's possible to find a way to write meaningful tests.