Posted on September 15, 2010 by Jaime Buelta

Commenting the code

I’m always surprised to come across comments like that about code comments. I can understand someone arguing that writing comments is boring, or that you forget about it, or whatever. But saying that code shouldn’t be commented at all looks a little dangerous to me.

That doesn’t mean that you have to comment everything, or that adding a comment is an excuse for unclear code, or that the comment should repeat what the code already says. You have to keep a balance, and I agree that it’s difficult and that everyone can have their own opinion about when to comment and when not to.

Also, each language has its own “comment flow”, and you’ll definitely write more comments in a low-level language like C than in a higher-level language like Python, as the latter is more descriptive and readable. Ohhh, you have to comment so many things in C if you want to be able to understand what a function does in less than a couple of days… (the declaration of variables, for example)

As everyone has their own style when it comes to commenting, I’m going to describe some of my personal commenting habits (with some example Python code) to open the discussion and compare them with your opinions:

I put comments summarizing code blocks. That way, when I have to locate a specific section of the code, I can go faster by reading the comments and skipping the code until I get to the relevant part. I also tend to separate those blocks with blank lines.

# Obtain the list of elements from the DB
.... [some lines of code]

# Filter and aggregate the list to obtain the statistics
...  [some lines of code]

UPDATED: Some clarification here, as I think I have probably chosen the wrong example. Of course, if a block of code grows beyond a few lines and/or is used in more than one place, it needs a function (and a function should ALWAYS get a docstring/comment/whatever). But sometimes I think that a function is not needed, yet a clarification helps to know quickly what the code is about. The original example will remain to show my disgrace, but maybe this other example is better (I have copy-pasted some code I am working on right now and changed a couple of things).
It’s probably not the cleanest code in the world, and that’s why I have to comment it. Later on, maybe I will refactor it (or not, depending on the time).

                # Some code obtaining elements from a web request ....

                # Delete existing layers and requisites
                update = Update.all().filter(Update.item == update).one()
                UpdateLayer.all().filter(UpdateLayer.update_id == update.item_id).delete()
                ItemRequisite.all().filter(ItemRequisite.item == update).delete()

                # Create the new ones
                for key, value in request.params.items():
                    if key == 'layers':
                        slayer = Layer.all().filter(Layer.layer_number == int(value)).one()
                        new_up_lay = UpdateLayer(update=update, layer=slayer)
                        new_up_lay.save()
                    if key == 'requisites':
                        req = ShopItem.all().filter(ShopItem.internal_name == value).one()
                        new_req = ShopItemRequisite(item=update, requisite=req)
                        new_req.save()

I briefly describe every non-trivial operation, especially mathematical properties or “clever tricks”. Optimizations usually need some extra description explaining why a particular technique is used (and how it’s used).

# Store found primes to increase performance through memoization
# Also, store first primes
found_primes = [2,3]

def prime(number):
    ''' Find recursively if the number is a prime. Returns True or False'''

    # Check on memoized results
    if number in found_primes:
        return True

    # By definition, numbers below 2 (including 1) are not prime
    if number < 2:
        return False

    # Any even number is not prime (except 2, checked before)
    if number % 2 == 0:
        return False

    # Try dividing the number by every lower prime (excluding 2)
    # Use this function recursively to find those primes
    lower_primes = (i for i in range(3, number, 2) if prime(i))
    if any(number % p == 0 for p in lower_primes):
        return False
        return False

    # The number is not divisible, it's a prime number
    # Store to memoize
    found_primes.append(number)
    return True

(Dealing with prime numbers is something that deserves lots of comments!) EDIT: As stated by Álvaro, 1 is not prime. Code updated.
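
For comparison, here is a minimal sketch of the same "comment the trick" habit applied to a different approach. It is not the code above: it assumes Python 3, uses the standard library's functools.lru_cache for the memoization instead of a hand-kept list, and the name is_prime is only illustrative.

```python
from functools import lru_cache


# Cache every result so repeated questions about the same number are free
@lru_cache(maxsize=None)
def is_prime(number):
    """Return True if the number is prime, False otherwise."""
    # Numbers below 2 are not prime by definition
    if number < 2:
        return False

    # 2 is the only even prime
    if number % 2 == 0:
        return number == 2

    # A composite number always has a divisor no bigger than its square
    # root, so checking odd candidates up to that point is enough
    divisor = 3
    while divisor * divisor <= number:
        if number % divisor == 0:
            return False
        divisor += 2

    return True
```

The comments here mostly explain why each shortcut is valid, which is the part a reader cannot recover from the code alone.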

I put TODOs, caveats and any indication of further work, planned or possible.

# TODO: Change the hardcoded IP with a dynamic import from the config file on production.
...
# TODO: The decision about which one to use is based only on getting the shorter one. Maybe a more complex algorithm has to be implemented?
...
# Careful here! We are assuming that the DB is MySQL. If not, this code will probably not work.
...

UPDATE: That is probably also related to the tools I use. S.Lott talks about Sphinx notations, which are even better. I use Eclipse to develop, which automatically picks up any “TODO” in the code and makes a list of them. I find myself using “ack-grep” for that more and more, curiously…
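
As a rough idea of what such tools do with these markers, here is a small sketch that lists the TODO comments in a piece of source text. The function name find_todos and the pattern are my own assumptions for illustration, not how Eclipse or ack-grep actually work internally.

```python
import re

# Matches a comment marker followed by TODO, e.g. "# TODO: fix this"
TODO_PATTERN = re.compile(r"#\s*TODO\b:?\s*(.*)")


def find_todos(source):
    """Return (line_number, message) pairs for every TODO comment."""
    todos = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            todos.append((line_number, match.group(1).strip()))
    return todos
```

Keeping the marker consistent (always "TODO", always in a comment) is what makes this kind of search reliable.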

I try to comment structures as soon as they have more than a couple of elements. For example, in Python I make extensive use of lists/dictionaries to initialize static parameters in a table-like format, so I use a comment as a header to describe the elements.

# Init params in format: param_name, value
init_params = (('origin_ip','123.123.123.123'),
               ('destiny_ip','456.456.456.456'),
               ('timeout',5000),
              )
for param_name, value in init_params:
    store_param(param_name, value)
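
The header comment earns its keep once the rows get wider. A runnable sketch with a third column follows; the store_param shown here is a hypothetical stand-in that writes to a dict, and the parameter names, values and units are made up for the example.

```python
# Hypothetical stand-in for store_param: it just keeps values in a dict
stored_params = {}


def store_param(param_name, value, unit):
    stored_params[param_name] = (value, unit)


# Init params in format: param_name, value, unit
init_params = (
    ('timeout', 5000, 'milliseconds'),
    ('retries', 3, 'attempts'),
)
for param_name, value, unit in init_params:
    store_param(param_name, value, unit)
```

Without the header line, a reader has to find the loop below the table to learn what each column means.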

Size of the comment is important: it should be short, but clarity comes first. So I try to avoid shortening words or using acronyms (unless widely used). Multiline comments are welcome, but I try to avoid them as much as possible.
Finally, when in doubt, comment. If at any point I have the slightest suspicion that I’m going to spend more than 30 seconds understanding a piece of code, I put a comment. I can always remove it the next time I read that code and see that it is clear enough (which I do a lot of times). Both being bad, I prefer one unnecessary comment to one missing necessary comment.
I think I tend to comment slightly more than other fellow programmers. That’s just a personal, completely unmeasured impression.

What are your ideas about the use of comments?

UPDATE: Wow, I got a mention on S.Lott’s blog, a REALLY good blog that every developer should follow. That’s an honor, even if he disagrees with me on half the post 😉

On one of my first projects in C, we followed a quality standard that required 30% of the code lines (not counting blank ones) to be comments.

Category: python Tags: C, english, python, software engineering

16 Comments on “Commenting the code”

joaquin abian
September 15, 2010

Good points. I personally follow closely the same strategy as yours. Except for keeping lines shorter than 80 chars long! :-).

- khelben
  September 15, 2010
  
  In my current job we use a limit of 120, not 80. Anyway, I think that the 80 characters limit makes a lot of sense in Python as it produces more “compact” and cleaner code. I think it’s more related to the whitespace, indentation, etc…
  
  - joaquin abian
    September 16, 2010
    
    Ooops! 120 looks like too much to me. I’m currently cleaning up some messy Perl code that often extends beyond 100-120 chars. This has shown me the painful way how thoughtful PEP 8 is on average. My main screen is still a 4:3 and probably this also makes 80 chars more appropriate for me…
EOL
September 15, 2010

Doing exactly the same thing here, with the addition of a few temporary comments for temporary debug tests, and parts that I’m working on and tag with comments:

#!!!!!! Update, after updating calling function

#!!!!!! Temporary debug message

IceBrain
September 15, 2010

I think comments should describe “why” and not “what”.

In your example:
“Obtain the list of elements from the DB”

I’d rather use Extract Method to write something like:

def obtain_list_elements_from_db():

And then call that from the original code. It makes the code cleaner and you’re less prone to forget to update the comment if you happen to switch from a DB to a file, for example.

In your other example:
“Store found primes to increase performance through memoization”

you’re actually saying _why_ you’re storing the primes, so it makes sense to have a comment.

Jeff Atwood’s post on Code Smells makes some interesting points: http://www.codinghorror.com/blog/2006/05/code-smells.html

- khelben
  September 15, 2010
  
  Although I think the basic premise is “describe ‘why’ and not ‘what’”, sometimes I feel it’s necessary to also describe the ‘what’.
  
- Ben Finney
  September 17, 2010
  
  Definitely concur. If you have a chunk of lines in among another sequence that you think are discrete enough to need their own comment describing their higher-level purpose, then you’ve just discovered a bunch of lines that deserve their own function, *and* a name for that function.
  
Tim Ottinger
September 15, 2010

My entire philosophy on code comments is found on this card:

http://agileinaflash.blogspot.com/2009/04/rules-for-commenting.html

Consider the balance that gives you. If you make a structure with named elements, tuple structure comments disappear. Likewise, if you break blocks into functions with names, block comments disappear.

I suspect you’ll find my rules to be a little different balance than your own, and of course I’m interested in hearing how they work for you if you want to try them for a week or two.

Tim

Tim Ottinger
September 15, 2010

Finally, I wanted to mention that I will add comments as I learn what the code is doing, then attempt to obviate them by refactoring until the code is crystal clear.

I don’t mind adding copious comments as an intermediate step, but I like them to be (mostly) gone from the end state. 🙂

Kenny Meyer
September 15, 2010

> Finally, when in doubt, comment.

I really liked this point; I guess if all programmers followed this simple rule most code would be much better and a lot of bugs would never happen.

Great article.

alvaro
September 16, 2010

Good article, but 1 is not a prime number.

- khelben
  September 16, 2010
  
  You’re right! Code updated 😉
  
khelben
September 23, 2010

I’ve updated a couple of things, in case someone finds that interesting

zyczenia wielkanocne
April 8, 2011

Just desire to say your article is as amazing. The clearness in your post is just great and i could assume you are an expert on this subject. Fine with your permission let me to grab your RSS feed to keep updated with forthcoming post. Thanks a million and please continue the gratifying work.
