Explicit Code requires no comments - Only bad code does

Following the recent discussion on commenting source code by Brandon Savage and Marco Tabini, I wanted to add upon Marcos code example to show that it not even needs any inline comments and can be greatly enhanced in readability with taking 2-5 minutes of time and refactoring the code. The original code sample was:

function __construct(array $pathInfo) {
    $section = $pathInfo[0];
    $file = $pathInfo[1];
 
    // Assume that the location of this file is inside the main trunk, and that pathInfo has already been filtered.
 
    $path = dirname(__FILE__) . '/../../' . $section . '/xml/static/' . $file . '.xml';
 
    if (!file_exists($path)) {
        Controller::singleton()->doError(404);
    }
 
    parent::__construct('main/xsl/template.xsl', $path);
}

This sample is somehow related to the response and view rendering of an application and its pretty understandable. For me there are some squirks though that would make me have to scroll to other classes to see how they interact. For example the path information falls from heaven. Why is it an array of 2 values and are these sections set the same all the time? And what exactly happens when the front controller does the 404 error? Is there a die() or exit in the doError()

The Agile movement postulates the phase of refactoring for a finished piece of code. Rebuilding the hacked solutions that did it to make them more understandable. I applied this to Marcos code snippet. The refactoring produces about double the code, but its done entirely by using NetBeans nice Rename variable and copy paste on extracting methods, which takes not more than 5 minutes.

class StaticXmlXsltRendererImpl extends AbstractXsltView
{
    function render(array $filteredPathInfo) {
        $templateValuesXmlPath = $this->getTemplateValuesXmlFile($filteredPathInfo);
 
        if(!$this->templateExists($templateValuesXmlPath)) {
            Controller::singleton()->doError(404);
        } else {
            parent::render('main/xsl/template.xsl', $templateValuesXmlPath);
        }
    }
 
    private function getTemplateValuesXmlFile($filteredPathInfo) {
        $section = $this->getSection($filteredPathInfo);
        $file = $this->getFile($filteredPathInfo);
 
        // Assume that the location of this file is inside the main trunk
        return dirname(__FILE__) . '/../../' . $section . '/xml/static/' . $file . '.xml';
    }
 
 
    private function getSection($pathInfo) {
        return $pathInfo[0];
    }
 
    private function getFile($pathInfo) {
        return $pathInfo[1];
    }
 
    private function templateExists($path) {
        return file_exists($path);
    }
}

In this code snippet the class name, variable names and methods clearly communicate what is done, which wasn't the case in the previous example. The only change for the clients is the change of using the constructor to using the render() method, which communicates the intend more. Also the render method now clearly communicates that either the error or the parent rendering is executed and leaves no doubt about it.

The getTemplateValuesXmlFile() method still uses the comment to show the assumption about relative pathes, but this is only the case because application configuration is made implicit into the code. This has to be extracted to be an explicit configuration constant and the comment can go.

private function getTemplateValuesXmlFile($filteredPathInfo) {
        $section = $this->getSection($filteredPathInfo);
        $file = $this->getFile($filteredPathInfo);
 
        return APPLICATION_ROOT . '/' . $section . '/xml/static/' . $file . '.xml';
    }

In my opinion commenting code is necessary only for non-refactored code that has been hacked into existence and is hard to understand. Either the programmer has to get it done and has no chance to clearly communicate the intend. Or what is even worse the to be changed legacy code is hard to understand but you can't take the chances to refactor it because its also already in production and has no tests. Now from the second point it is obvious that upon ugly code you put only more and more ugly code to fix the problems. This is what leads to the legacy maintenance problems that pretty much every programmer faces. And then commenting comes into play: You have to add a new feature X and it has to be done fast. You don't really understand the code or how it works together but you know you can put the new code for the feature into existence but its really unintuitive. So you begin to comment it excessively, because its the only way to clearly show its intend.

There are two mindestting factors that help to write code that is understandable without having to excessivly comment it:

  • From the beginning, do not write code for the computer but for developers. Changing this attitude really helps to write understandable code like the one above.
  • Only leave code better than it was before, never worse.

There are five technical practices that - when followed - allow to write clearly communicating code from the beginning:

  1. Giving classes, variables and methods good names. This is a no-brainer but few people seem to follow it anyways.
  2. Following the object-oriented Single Responsibility Principle by never giving a class more than one responsibility. Macros example seems to follow this one.
  3. Methods should never switch in the level of detail. Micro-work at the datastructure level should never be mixed with macro level delegation to executing large chunks of code. Macros code violates this by mixing the path building micro-level work with the macro-level work of rendering the XSLT template. The path building code can be hidden behind a method to communicate intend more clearly.
  4. Exchange if conditions with private methods that explain the condition being checked for. In Marcos example this is not really necessary, because file_exists already is quite a good description to the condition. But in cases of logical combinations of conditions the method extracting is a superior way to explain the conditions intend without having to write a comment.
  5. Seperate Query and Command: A method never should do a query which returns the state of an object and a command which executes a set of rules on the state.

These practices sum up to one guideline: Make code explicit. This obviously requires less commenting since a comment of explicit code would be duplication and duplication is bad. What if you have a project that does not follow this guidelines? Then of course comments should be used to explain code, but in the long run this should be refactored to self-explaining code. Additionally every new feature should be programmed explicitly to follow the "leave code better than before" principle.

In my opinion two refactoring tools are missing that would greatly help PHP programmers write nice to read code: Extract method and Replace magic value with constant. Can someone integrate them into NetBeans please?

Update: Fixed a creepy copy-paste code bug, thanks to azeroth for pointing out. Moved methods around a bit to be more reading friendly.

Share This Post

  • Share on Twitter
  • Facebook
  • Share on deli.cio.us
  • Share on Digg
  • Share on reddit
  • Share on StumbleUpon

Comments


azeroth
May, 01. 2009

Am I the only one who thinks that the original code sample is much easier to read and edit than the huge piece of one-liner functions that came out of it?



azeroth
May, 01. 2009

One more thing, shouldn't it be
if(!$this-templateExists($templateValues)) { ... ?



beberlei
May, 01. 2009

@azeroth: thats right, i got the condition wrong, fixed it thanks :)

The one-liner methods are not really what you need to read unless you really need to make a change to their functionality. They are not necessary for understanding the render() method which is why they should be extracted.

You can inline them again into the getTemplateValuesXmlFile function if you want, but this would still leave a much more clean communication of the path building process than before.

The benefit is in being able to see the macro level at a much more concentrated level and understand it without having to pick-guess on the micro level datastructure work also. Its not that you always need to grasp all the detail on all quirks of a method, but that fast understanding is required to allow a qualified decision on where to make a change.



wolfgang stengel
May, 01. 2009

I agree with azeroth, it seems like total overkill to blow up the code this much. I'd rather read two lines of comments.



fqqdk
May, 01. 2009

You don't have to read the whole class, that's the fun in it! Just the part that you need! The concerns are separated, and there are multiple places to look for things, but the call hierarchy here is pretty much obvious, and it's harder only, when you don't have a scroll wheel on your mouse, and your IDE can't fold the methods.
But of course if you have to hack this live via ssh, using mcedit, than I can understand your problem about blown-up code. But that also means, that you are not interested in automated testing and deployment, or anything like that, and that is why you don't appreciate this kind of "bloat".



jack
May, 01. 2009

"variable names and methods clearly communicate what is done"

They don't.

I agree with azeroth. The code above is now way harder to read for me too. It may be easier to maintain and test, due to the separation of concern into smaller chunks of code, but ultimately, you have to read more to get the overall picture.

In my opinion, the code provided is a fine example of why refactoring does not help to document code and it just proves that self commenting code is indeed a myth. You are trading maintability for readability. And to cope with that, you should add comments/docs.

To prove my point, add the following to your render method:
/**
* Renders a template with the values given in the provided XML file
* @param Array $filteredPathInfo
* @return void
*/

Not only will this visually distinguish your code blocks more easily, it will also give all the information I need to actually use the code. I don't have to know how the code's inner workings. Just what to pass in to get the described effect or output. That's programming to an interface.

If there is any room for discussion about commenting, then it should be about inline comments. But regular documentation is a must.



beberlei
May, 01. 2009

I dont say that regular PHP docs are not necessary. They are actually required alot if you do lots of OO programming, you need to hint Zend Studio, Netbeans or PDT the correct type for auto-completion. What is unnecessary however is the description at top since the class name and the method name and parameters already say that XsltXmlView is rendered by taking a file path. The Zend Framework for example requires excessive documentation of getters and setters that are just a duplication of what the method name already describes. This is the case for most doc blocks.

Additionally in java or C#/.NET the whole doc is just bloat, since you already hint in the explicit method signature. This can be achieved in PHP also, if you replace the $pathInfo array in the render method with an actual PathInfo object. Then you could move the getSection() and getFile() methods to that object and simplify even more. The problem with phps arrays, while very hand for almost all tasks of processing, they don't communicate their structure too well.

@jack: of course you have to read more to get the complete picture in my case. but who ever said that you have to know everything? if you change a method do you really need to know the inner working of each and every utility method that contributes? if they communicate intend clearly you can skip to understand them fully. This allows to focus on different aspects. Humans are just not equipped to keep every detail in mind. That is why we talk so much about abstraction and models.

What I hate about hard to maintain code is the fact that you spend lots and lots of hours to only understand how it works, because you have to. And this pattern always keeps coming back. You might have to understand the same huge code block several times in your life, because you have to change little different things. This is just unnecessary.

In general the example code is probably not good enough, there is lots of open source code out there that can probably serve as an better example of how to write unmaintainable code that has to be excessively commented (probably even code by myself).



jack
May, 01. 2009

@beberlei I completely agree that in cases of getters/setters it is redundant to state the obvious. But we cannot expect that what is obvious to us when we write the code is obvious to others as well.

The short description in the Doc Block reads naturally to humans. Code is suboptimal for expressing ideas. This is already proven by the fact that we mimic natural language in our code by assigning meaningfulCamelCasedMethodNamesThatMakeMyEyesBleed() along with $verboseParamNames just to be better able to understand what's going on. When doing so, you already admit that natural language is easier to grasp, even for a seasoned developer.

Simply put, we are not code parsers. The grammar offered by programming languages is inferior to that of natural language in terms of conveying meaning to us. Code is harder to read. No one speaks code in daily life.

A completely different question: is the bot protection result a reference to THX-1138?



beberlei
May, 01. 2009

yes its a reference. that number is stuck in my head, its used in sw ep4 also.



richard thomas
May, 01. 2009

Your kidding right? Even if your code was easier to read which is debatable your saying twice as much code is better then a simple line or 2 of comments!

I dub you the "Bloat Master".. Please go work for microsoft



dagfinn reiersøl
May, 01. 2009

Hmmm...interesting.

This may not be a typical example of something I would want to refactor, but it's an interesting exercise. There's more than one way to do it, and I would probably do something different. Your class is not really having one responsibility only. Should a renderer class really be messing with file system paths? I don't think so. I think extracting a class (something like a TemplateXmlFile) might be more appropriate.

Something like this (let's hope it formats correctly):

function __construct($filteredPathInfo) {
$this-templateFile = new TemplateXmlFile($filteredPathInfo);
}

function render() {
if ($this-templateFile-exists()) {
parent::render(
'main/xsltemplate.xsl',$this-templateFile-getPath());
} else {
Controller::Singleton-doError(404);
}
}

It still mixes levels of abstraction too much, but I would say this is better.



dagfinn reiersøl
May, 01. 2009

The blog software stole the greater-than signs in my method calls and my indentation, but I hope it's readable, sort of.



infolock
May, 01. 2009

I'm sorry but I do not agree with this method of breaking pieces apart. Creating functions to return array values like are being done, or one-line statements is over-kill and a waste of processor speed.

All that's being done is re-defining already existing functions that are built-in to PHP, causing further headache and more processor speed to do simple applications.

For example, these 3 function definitions are just redundant and poorly designed functions:

private function getSection($pathInfo) {
return $pathInfo[0];
}

private function getFile($pathInfo) {
return $pathInfo[1];
}

private function templateExists($path) {
return file_exists($path);
}

All 3 of these functions are useless, waste of processor/memory usage, and redundant... it's re-invinting the wheel with a larger wheel covering the smaller wheel.

And as for removing __construct for render, that depends on the class, it's purpose, and overall if it's necessary. If the only thing required is for the class to be instantiated and execute on it's on based on params (or no params), then creating a render function is, once again, unnecessary and redundant.

Redundancy is one of my pet peeves. If you're recreating the wheel just for the sake of making it "easier to read for other developers", then you need to re-evaluate what you consider "easier to read". To me, proper commenting styles (such as PHP DOC) is as easy as I'm going to go.

If the developer looking at my code cannot read my comments and then follow the flow from there, then they should not be making changes to that application due to their inability to follow fundamental understanding of PHP.

I understand where this article is coming from in a way, but to me it's a novice's perspective that hasn't been thoroughly thought through.



signe
May, 01. 2009

I have to agree with what many others have already said. The "refactored" code will be much slower due to the extra, redundant function calls, and is no cleared to read than the original was. It's much more like a body that's a week in the water. Bloated.

Even the names chosen for the redundant functions aren't clear unless you read the code for them. What is a "section"? Does getFile get the contents? getPath and getFilename would be more appropriate, but still useless.



g-rant
May, 02. 2009

It's a big jump to go from "commenting code is necessary only for non-refactored code" to "understandable without having to excessivly(sic) comment"

Excessive means "more than what is justifiable, tolerable, or desirable", so the latter is like saying "let's not comment more than we need to." But that's the whole crux of the discussion: "how much do we need?"

A fine point I'd like to throw out is that sometimes more important than commenting *what* code does, is commenting *why* we did it this way. Most of us have gone back to refactor code only to find, after spending some time, that there was a reason we did it the original way that precludes some of the refactoring.



marcel esser
May, 03. 2009

I can read the first code sample and understand it in a heartbeat. The second one is insanely complicated.



hilo;jo;
May, 03. 2009

@azeroth, no you are not.



checks
May, 03. 2009

Two pieces of code which work the same - one is 14 lines long and the other is over 30 lines long.

Adding complexity doesn't make anything easier to understand.



G-rant - commenting on WHY vs. WHAT - you are spot on.




ilia jerebtsov
May, 05. 2009

private function getSection($pathInfo) {
return $pathInfo[0];
}

See, this bothers me. If you want to make helper functions that work on a specific type of path, and furthermore, you want to absolutely, verifiably ensure that you're working with that specific type, then why not make it into a class?

class FilterePathInfo {
public $Path; // encapsulate as needed
public function getSection() {
return $this-Path[0];
}

public function getFile() {
return $this-Path[1];
}
}

Then, you can just use type hinting to ensure that you're passing this particular class, making it obvious, intellisensable, etc.

private function getTemplateValuesXmlFile(FilteredPathInfo $path) {
// Assume that the location of this file is inside the main trunk
return dirname(__FILE__) . '/../../' . $path-getSection(). '/xml/static/' . $path-getFile() . '.xml';
}

If you want to make things really explicit, then you can add a flag to the class to say whether the path is filtered or not, and assert when needed.

Most code clarity issues can be resolved simply by putting things into their right places and giving them the right name, and that's one thing that OOP is great at. It's a pity that not many people seem to use it correctly.



fqqdk
May, 29. 2009

do this blog have an rss feed?



fqqdk
May, 29. 2009

foundit :)



wimple pimple
Jun, 25. 2009

The good things with comments is that they do not blow the code bigger while adding more and more information.

Imagine: You can refactor (improve) your code by commenting sothat it stays put in the machine and readability for the team increases. Additionally you can come back here years after and even add comments by ensuring the code still runs - without any additional tests.

So I can only strongly remind that you should not ideally apply your remember-rhymes but to handle them practically. If you can reduce code you can reduce complexity. comments help there a lot.

there is one downside: if your code becomes too readable and the project is public and popular, even the dumbest coders can join in and hack.


Write a Comment