When I first started programming, I clearly remember feeling I had to add comments that repeated exactly what the code below them was doing, as if they were the script for some sort of voice-over. I want you to know, as I now do, that this is not the way to comment one’s code. 😅
An important goal of good code is to be readable so that future contributors can build with and upon it as needed. Good commenting is part of the toolset for reaching that goal. In this post we shall first present principles of code commenting, and then a few tips.
Principles for commenting
Comment your code as little as possible. 😉 Now, this does not mean you should stop caring about readability and clarity; on the contrary. Pack as much information as possible into the code itself. For instance, naming a variable well can allow you to skip telling the reader what it is.
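As a small sketch of that last point (the data and variable names here are made up for illustration), compare a name that needs a comment with a name that does not:

```r
# Hypothetical data, only for illustration
signup_dates <- as.Date(c("2024-01-02", "2024-03-15"))
today <- as.Date("2024-03-20")

# This name only makes sense thanks to the comment:
# d is the number of days since signup
d <- as.numeric(today - signup_dates)

# This name carries the information itself, no comment needed
days_since_signup <- as.numeric(today - signup_dates)
```

Both lines compute the same thing, but only the second can be understood further down the script without scrolling back up.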
The tidyverse style guide states “use comments to explain the ‘why’ not the ‘what’ or ‘how’”. The what and how should be deducible from your code.
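To make the distinction concrete, here is a made-up example (the data and the reason given in the comment are assumptions, not taken from a real project) with the same line of code commented twice:

```r
# Hypothetical test scores, NA meaning "did not take the test"
scores <- c(12, 15, NA, 9)

# A "what" comment, which only repeats the code:
# compute the mean of scores without the NAs
mean_score <- mean(scores, na.rm = TRUE)

# A "why" comment, which tells the reader something the code cannot:
# NA means the participant skipped the test, not that they scored zero,
# so we drop those values rather than replacing them with 0.
mean_score <- mean(scores, na.rm = TRUE)
```

The first comment could be deleted without any loss; the second records a decision a future contributor might otherwise question.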
In “The Art of Readable Code” by Dustin Boswell and Trevor Foucher, the chapter on knowing what to comment starts with the key idea “The purpose of commenting is to help the reader know as much as the writer did”.
Code comments should be viewed as little flags, little alerts. There should be as few of them as possible. Otherwise, your reader will get used to ignoring them. Furthermore, you’ll get extremely bored writing them. 💤
Example of a recently encountered useful comment:
# This query can not be done via GraphQL, so have to use v3 REST API
Code comments should not be used as a band-aid for bad code design. If it’s very difficult to explain a piece of code, possibly more time should be spent on said code. Or, you could add a comment… “#TODO fix code debt”. 😇
Tips for (not) commenting
Name things well
Using concise but precise names will help make your code readable. In the book The Programmer’s Brain, Felienne Hermans presents Feitelson’s three-step model for better variable names:
- “Select the concepts to include in the name.”
- “Choose the words to represent each concept.”
- “Construct a name using these words.”
Isn’t this a life-changing tip?
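Here is how I might walk through the three steps for a hypothetical setting, the number of times a failed download is retried before giving up (the scenario and names are invented for this sketch):

```r
# 1. Concepts to include: what is counted (retries), of what (downloads),
#    and the fact that it is an upper limit.
# 2. Words for each concept: "retries", "download", "max".
# 3. Construct a name using these words:
max_download_retries <- 3L

# Names that drop a concept leave the reader guessing
n <- 3L        # of what?
retries <- 3L  # retries of what, and is it a limit or a running count?
```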
Use helper functions or explaining variables
The post “Explaining variable” by Pete Hodgson (found thanks to a tweet by Jenny Bryan) was truly eye-opening for me, so I recommend reading it.
The principle is to replace a piece of code with, for instance, a well-named Boolean variable or a function.
Here’s an example with a function. Instead of writing some code à la
if (!is.na(x) && nzchar(x)) {
  use_string(x)
}
the idea is to write something like
is_non_empty_string <- function(x) {
  !is.na(x) && nzchar(x)
}
if (is_non_empty_string(x)) {
  use_string(x)
}
where is_non_empty_string() could even be defined in a separate R script (called utils-blabla.R or so).
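The Boolean-variable flavour of the same idea could look like the sketch below. The name has_content is my own choice, and use_string() is a hypothetical stand-in defined here only so that the snippet runs:

```r
# Hypothetical stand-in for whatever consumes the string
use_string <- function(x) toupper(x)

x <- "hello"

# The condition gets a name, so the if() below reads like a sentence
has_content <- !is.na(x) && nzchar(x)

if (has_content) {
  result <- use_string(x)
}
```

A variable is enough when the condition is used in one place; a function pays off as soon as the same check appears several times.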
Wrap external functions with a nicer interface
This is closely related to the previous tip. Say you want to use a function with an unclear name. Do not hesitate to wrap it in a function with a better name if you think it’ll improve readability. (You can also use this technique to switch the argument order.)
# in utils.R
remove_extension <- function(path) {
  tools::file_path_sans_ext(path)
}
# in other files
remove_extension(path)
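For the argument-order variant, a minimal sketch could wrap base R’s grepl(), which takes the pattern first, in a function that takes the string first. The wrapper name contains_pattern() is hypothetical:

```r
# in utils.R
# grepl() expects (pattern, x); this wrapper puts the string first
contains_pattern <- function(string, pattern) {
  grepl(pattern, string)
}

# in other files
contains_pattern("report-2024.csv", "2024")
```

Putting the data first also makes the wrapper easier to use in a pipeline.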
Think twice before adding a comment on your own PR
In my own experience, aspects I want to add in a line comment on a GitHub Pull Request are often prime content for actual code comments (or a reason for refactoring my code!). It’s not always true, but I try to pay attention to whether the comment should stay in the GitHub history only, or alongside the code.
Have someone review your code
As much as you try to think about what a collaborator (or future you) would like to know when reading the code, it’s handy to have an actual collaborator tell you where a comment might be warranted. The collaborator might not ask for a comment, but they might ask a question whose answer should be tracked in the code.
Spare no effort on roxygen2 comments
Thanks to Mark Padgham for insisting on this subsection!
Some comments are special: the roxygen2 comments that create documentation! Documentation is good!
Even internal functions can be documented using the same syntax, although you’ll want to add the #' @noRd tag to make sure no manual page is created. This convention is encouraged in the rOpenSci packaging guide and in the tidyverse style guide.
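As an illustration, here is how the helper from earlier in this post might be documented as an internal function. The wording of the documentation is my own:

```r
#' Check whether a value is a non-empty string
#'
#' @param x A character vector of length 1.
#' @return `TRUE` if `x` is neither `NA` nor `""`, `FALSE` otherwise.
#' @noRd
is_non_empty_string <- function(x) {
  !is.na(x) && nzchar(x)
}
```

With the @noRd tag, roxygen2 does not generate a manual page for this function, but the documentation stays next to the code for whoever reads the source.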
Use comments for the script’s outline
In the RStudio IDE at least, there’s an outline (table of contents) on the right of the script that you can expand to navigate the code. Functions are used for organization, but you can also add comments like
# header level 1 ----
bla
## header level 2 ----
blop
to have “header level 1” and “header level 2” appear in the outline. Having an explicit structure can help code readers.
Conclusion
In this post, I explained why to comment your code as little (and as well!) as possible, with a few ideas of how to do that. I’d really recommend reading “The Art of Readable Code” if you can get your hands on a copy. Furthermore, unsurprisingly, practice helps; I myself need a lot more of it. Feel free to comment 😉 with your own commenting tricks.