Thursday, December 13, 2007

Commentstipation

When it comes to commenting the code, the cat gets the typing fingers of even creative and knowledgeable programmers. Often we are focused on just getting the code working (We'll comment later! The code has to work first!) and then we forget (what was that again? it's obvious, also time to go home...)!

Code should be self-explanatory and hence self-commenting, as far as possible. But this definitely does not mean that there should be no comments.

Good comments are hard to come by because of -
a) I-Am-A-Programmer-Not-A-Writer attitude
Well, you are a writer too.

b) "I don't know English that well" excuse
Grammatical errors are fine, spelling mistakes are also OK. Write in your native tongue and translate it, if need be. And if what the code does is that tough to explain, that is all the more reason to document it!

Remember the programming adage: Documentation is like sex, something is better than nothing! ;-) But, it must be said, inaccurate documentation is worse than no documentation. An addition to the adage is perhaps needed...having it the wrong way is probably worse than not having it at all! ;-)

TIPS
--------
* A comment should, at the very least, explain and focus on what is special in the block of code. A very good comment also explains why it is being done.

* The best comments explain the what and the why succinctly.

* How to comment
1. Ask yourself "What?"
2. Ask yourself "Why?"
3. Write what you would say to someone so that he could do what you have done. Use simple language. (This point intentionally abstruse, to serve as a memory-aid)
4. Review and edit what you have written, the best you can.

* Do not state the obvious! But also remember that what is obvious to you may not be obvious to another person. It's a thin line between advising and preaching.
Code like:
x=0; // Assigning the value 0 to the variable x
is a big no-no. Do not insult the intelligence of the reader. Plus, you'll have to scroll more while editing the code!
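
As a rough illustration of the what-and-why style, here is a minimal sketch. The function, its name and the retry policy are all made up for the example; only the shape of the comment matters.

```c
#include <assert.h>

/* Hypothetical limit, invented for this example. */
enum { MAX_DELAY_MS = 8000 };

/*
 * What: returns how long to wait (in ms) before retry number `attempt`.
 * Why:  the delay doubles on every retry so that a struggling server is
 *       not hammered with requests; it is capped so that the user never
 *       waits more than MAX_DELAY_MS.
 */
int retry_delay_ms(int attempt)
{
    int delay_ms = 1000;
    while (attempt-- > 0 && delay_ms < MAX_DELAY_MS)
        delay_ms *= 2;
    return delay_ms;
}
```

Note that the comment says nothing about the loop or the assignment; the code already says that.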

Monday, December 10, 2007

Coding Nirvana - Cryptic Mystic Droppings

CRYPTIC MYSTIC DROPPINGS
FROM THE ELEVATED STATE
OF
CODING NIRVANA

1. Code is just symbols.

2. Code needs a Thread of Life to achieve its Purpose.

3. It's all about the data.

4. It's all virtual. It can be faked but it doesn't really matter; it isn't really matter anyway.

More on each of these droppings (shit!) later.

Saturday, December 08, 2007

Coding Nirvana - Four Noble Truths

CRYPTIC MYSTIC DROPPINGS
FROM THE ELEVATED STATE
OF
CODING NIRVANA

-----------------------------------------------------------
FOUR NOBLE TRUTHS
-----------------------------------------------------------
1. There is suffering
2. There is a cause of suffering. The cause is "Copy-paste"
3. There is the cessation of suffering - "Abstraction Of Commonality"
4. There is a way leading to the cessation of suffering — the Noble Eightfold Path

------------------------------------------------------------

"Copy-paste is the root cause of all programming suffering. Copy-paste is evil. The Eternal Conflict is between copy-paste and the Abstraction of Commonality."
- The Virtual Mystic

Sunday, November 25, 2007

C Powershot - Pointers

INTRO
---------
How should one interpret the following lines of C?

int *p;
"easy! p is an integer pointer!"

int **p;
"p is a pointer to an integer pointer" or perhaps you might say "p is a double pointer to an integer"

int ***p;
"hmm...er..ahem..why would anybody use such a thing! *@#$ ?"


POWERSHOTS - Interpreting a pointer declaration
----------------------------------------------------------------
The interpretations given in the preceding section, even if somewhat correct, do not scale and could inhibit our ability to understand alien (written by other people) code. The words influence the way we think, so it's necessary that we choose the right abstractions. For example, if a pointer is 4 bytes, why shouldn't a double pointer be 8 bytes? :-) The right abstraction would not even allow us to stray down such lines of thought!

Here's a better way to interpret pointers.

SNN1.1 Pointers are variables which can hold the address of a memory location, usually the address of a variable.

Pointer Part
Consider the statement
int *p;
What the "*p;" portion of the statement says is only this : p is a pointer variable

Let's call the "*p" portion here the pointer part of the pointer declaration. It tells us this much: p is a pointer, and the * operation can be applied to it.

CPS1.1 Whatever follows the * symbol is the pointer variable.

SNN1.2 Pointer variables in C have the * operation (fancier name: dereference) defined on them.

The dereference operation gets the contents of the memory location held in the pointer. That is, if you dereference a pointer, you get what the pointer points to.

****

Type Part
Consider, again, the statement:
int *p;
What the "int " part means is this: when you apply the * operator on p, what you get will be interpreted as an integer. It can be used as an integer.

Let's call the "int " portion here the type part of the pointer declaration.

CPS1.2 Whatever remains in the statement after you blank out the pointer portion will be the type of what you get when you dereference the pointer.

FINGER-HIDING TECHNIQUE: Just hide the *p section with your finger, what remains is the type part. This finger-hiding technique can come in handy in other situations as well. It's nifty and mighty useful. It is an application of what I call the typedef principle, we will come to that in a later episode.

Summary:
Pointer declaration = pointer part + type part.

To reiterate, int *p means: p is a pointer, which will be dereferenced as "int"
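
A small sketch of the pointer part and the type part at work. The variable and function names are just for illustration, and the size comparison at the end holds on typical platforms rather than being something the C standard promises.

```c
#include <assert.h>

/* Returns 1 if a pointer-to-pointer is the same size as a pointer. */
int pointer_parts_demo(void)
{
    int x = 42;
    int *p = &x;       /* pointer part: *p,  type part: int   */
    int **pp = &p;     /* pointer part: *pp, type part: int * */

    *p = 7;            /* *p is used as an int: this writes to x */
    assert(x == 7);
    assert(*pp == p);  /* *pp is used as an int *                */
    assert(**pp == 7); /* apply the rule once more to reach x    */

    /* Each pointer holds one address, whatever it points to. */
    return sizeof(pp) == sizeof(p);
}
```

So a "double pointer" is not twice as big; it is just a pointer whose type part happens to be another pointer type.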

EXAMPLES
---------------

EX1
int **p;
Pointer part: *p ====> p is a pointer
Type part: int * ====> when you dereference p, what you get should be interpreted as "int *".
You already know what int * means according to the power-shots! Apply them again, and keep repeating until only a plain type remains.

EX2
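int (*p)[6][6];
Pointer part: *p =====> p is a pointer. The parentheses are required: [] binds more tightly than *, so without them (int *p[6][6]) p would be a 6x6 array of int pointers instead.
Type part: int (__)[6][6] ====> when you dereference p, what you get should be interpreted as "int [6][6]".
This is an array of integers with 6 rows and 6 columns.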
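
When a declaration mixes * and [], one way to check how the compiler reads it is to compare sizes. In this sketch (grid and q are hypothetical names), the parenthesised form is a pointer to a 6x6 array, while the unparenthesised form is a 6x6 array of pointers, because [] binds before *.

```c
#include <assert.h>

int array_pointer_demo(void)
{
    int grid[6][6] = {{0}};
    int (*p)[6][6] = &grid;  /* p is a pointer to a 6x6 array of int  */
    int *q[6][6];            /* q is a 6x6 array of pointers to int   */

    (*p)[2][3] = 5;          /* dereference p, then index the array   */
    assert(grid[2][3] == 5);

    /* *p is the whole array; q is 36 separate pointers. */
    assert(sizeof(*p) == 36 * sizeof(int));
    assert(sizeof(q) == 36 * sizeof(int *));
    return grid[2][3];
}
```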

EX3
int (*p)(int i, int j);
Pointer part: *p =====> p is a pointer. The parentheses are required because otherwise, due to the precedence rules, the * would be associated with int and not p (giving a function that returns an int *).
Type part: int __ (int i, int j) ==>when you dereference p, the type of data you'll get is "int (int i, int j) ".
This is an integer function which takes two parameters.
Yup, p is a function pointer. (But you do know better now, right? p is just a pointer, when you dereference it you will get something that can be used as a function)
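
A minimal sketch of EX3 in use. The functions add, multiply and apply are invented for the example; the parameter p has exactly the declaration discussed above.

```c
#include <assert.h>

int add(int i, int j)      { return i + j; }
int multiply(int i, int j) { return i * j; }

/* p is a pointer; dereferencing it gives something usable as a function. */
int apply(int (*p)(int i, int j), int a, int b)
{
    return (*p)(a, b);
}
```

Calling apply(add, 2, 3) goes through the pointer to add, and apply(multiply, 2, 3) goes through the pointer to multiply.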

EX4
int (*p(int a)) (int *b);
Careful here: (int a) binds to p before the * does, so p itself is a function, not a pointer.
Pointer-part: *p(int a) ====> what you get by calling p with an int is a pointer.
Type-part: int (__) (int *b) ==> when you dereference that pointer, you get "int (int *b)", a function which takes an int* parameter and returns an int.
Finally, p is a function which takes an integer and returns a pointer to a function which takes an int* parameter and returns an int!

Aside: Actually, the type part of the declaration is what is within the parentheses enclosing the pointer-part, but that would have confused you; also, this is not needed in the vast majority of cases. Parentheses always rule and dictate, as you should have guessed from the previous example as well!

You'd be much better off using typedefs for complex declarations like this one. But that does not mean that one should not know how exactly it is being interpreted. :-) More on typedefs later. For the time being, referring you to http://www.gotw.ca/gotw/046.htm where this particular example was taken from.
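
Until then, here is a rough sketch of what the typedef version of this declaration could look like. The names handler_fn, increment and negate are made up. The raw prototype is repeated first so that the compiler itself checks the two spellings agree: p is a function that takes an int and returns a pointer to a function taking an int * and returning an int.

```c
#include <assert.h>

int (*p(int a))(int *b);         /* the raw declaration, as a prototype */

/* The type of function that p hands back a pointer to. */
typedef int handler_fn(int *b);

int increment(int *b) { return ++*b; }
int negate(int *b)    { *b = -*b; return *b; }

/* Same declaration as the prototype above, spelled with the typedef. */
handler_fn *p(int a)
{
    return a ? increment : negate;
}
```

If the two declarations of p did not describe the same type, the compiler would reject the definition.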

POSSIBLE GOTCHAS
-----------------------------------------
1. Function-pointers can, on some architectures, require more space than normal pointers, for instance if code memory uses a different addressing size/scheme. I have not encountered this myself, though.
2. Please use parentheses liberally (but judiciously!) inside declarations, and use the * operator explicitly while dereferencing the pointer. These are often skipped, which leads to confusing (nah, 'misinterpretable') code.

Wednesday, October 03, 2007

Looking For A Function

I am looking for a function
y = f(x1,x2...,xn)
such that given y and n, it would be possible to uniquely determine each of the x-factors.

1) It's OK for n to have an upper bound k, if k>=10 or so.
2) It is also essential that y be small

Any pointers would be welcome.

Wednesday, September 26, 2007

Code Is Prose



Coding is a form of expression. We can draw many parallels between coding and writing; we write code, we are the authors of the code. When we code, we are actually translating our ideas and understandings into the programming language. By that line of reasoning, a program is a user manual (or essay or poem, take your pick) that we write in a programming language.

Code is for reading
(and execution too)
-----------------------------
The computer doesn't care how we write the code as long as it works - no indentation is fine, cryptic lines are fine. People do care, so we should try to get as close to natural language as possible. Do not go overboard though! Names like checkWhetherTheQueueIsEmpty() are no-nos! Advise, but do not condescend.

Code is read many more times than it is written; code is WORM (Write Once Read Many). Debugging will be done on the code that you write more times than you write actual functionality into it. The code-maintainer will curse you less (he will curse anyway!) if the code you wrote, even if it is wrong, is easily understood. So code with meaning and gain some good karma!

But the clincher argument would be that we would have fewer of those pesky comments to write!

"Programs must be written for people to read, and only incidentally for machines to execute."
- Abelson & Sussman, SICP

N COMMANDMENTS
----------------------------
* Avoid meaningless names.

* The meaning should not be ambiguous.

* Do not ever use a name that will not occur naturally to a person debugging the code.

* Do not abbreviate unnecessarily. Even if the abbreviation is logical to you, it might not be to another person.

* Vowel-swallowing is not desirable, nt_dsrbl at all.

* Be consistent. If you use underscores in your names to separate words, please don't use camel-case elsewhere and vice versa.
a) Capitalization -
b) Abbreviations - If you have to abbreviate (more often than not, this is a case of "I like to, hence I will"), then at least abbreviate consistently.
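
To make a few of these commandments concrete, here is a small sketch with two functions that do exactly the same thing. The queue type and both names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct queue { size_t count; };

/* Vowel-swallowed and ambiguous: check the queue for what, exactly? */
int chk_q(const struct queue *q) { return q->count == 0; }

/* Reads like prose at the call site: if (queue_is_empty(&jobs)) ... */
int queue_is_empty(const struct queue *q) { return q->count == 0; }
```

The second name is the one a person debugging the code would think to search for.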


Recommended reading
-----------------------------
Literate programming
Writing Unmaintainable code (original)
Writing Unmaintainable code(expanded)

Wednesday, August 29, 2007

Saved By A Unit Test

Consider a function called openCDTray( ) which ejects a CD from the drive.

This function should be operated only when the CD tray is closed. The function also has the following constraint (for effect): attempting to open the tray when it is already open could result in the tray falling off, and reattaching the tray is a cumbersome activity! :-)

The system maintains the status of the tray in a global variable/object called gCDStatus. openCDTray( ) should check the status and then only attempt to eject; otherwise all hell would break loose. But does the function implementation take care of this? Maybe the developer thought that nobody in their right mind would do such a thing and omitted the check. It's so obvious!

A unit test-case to check the response of the function in such a scenario could simply do the following:
1. Set gCDStatus to TRAY_IS_OPEN.
2. Call the function.
3. Check the result. The function should not have succeeded.
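
The three steps above can be sketched roughly as follows. This is only an illustration in plain C: TRAY_IS_CLOSED, the EjectResult type and the error-code return (instead of an exception) are assumptions made for the sketch, and the hardware eject is left out.

```c
#include <assert.h>

typedef enum { TRAY_IS_CLOSED, TRAY_IS_OPEN } TrayStatus;
typedef enum { EJECT_OK, ERR_ALREADY_OPEN } EjectResult;

TrayStatus gCDStatus = TRAY_IS_CLOSED;

/* Unit under test: refuses to eject if the tray is already open. */
EjectResult openCDTray(void)
{
    if (gCDStatus == TRAY_IS_OPEN)
        return ERR_ALREADY_OPEN;
    /* the real hardware eject would go here */
    gCDStatus = TRAY_IS_OPEN;
    return EJECT_OK;
}

/* Unit test: returns 1 if the function behaved as it should. */
int test_open_when_already_open(void)
{
    gCDStatus = TRAY_IS_OPEN;               /* 1. set the status     */
    EjectResult result = openCDTray();      /* 2. call the function  */
    return result == ERR_ALREADY_OPEN       /* 3. it must not succeed */
        && gCDStatus == TRAY_IS_OPEN;       /*    and must not touch the status */
}
```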

But our developers would never skip such basic checks! Even so, a unit-test can catch errors that could be missed in a cursory inspection of the code.
1. Typos in the assertion-check.
if ( gCDStatus = TRAY_IS_OPEN) throw ExceptionAlreadyOpen;
or if you consider that to be improbable too
if( gCDStatus = CD_EJECTED) throw ExceptionAlreadyOpen
, where CD_EJECTED is a similar-looking, but different valid value for gCDStatus

2. The openCDTray( ) function might have been modified (copy-paste!) and the programmer inadvertently does something that causes a change of the gCDStatus value.

The unit-test also helps to find out whether the assertion-check has in fact been skipped. This becomes crucial during integration and has to be guaranteed before functional testing starts.
1. The user of the function may not be aware of all the preconditions and hence may not ensure all of them before calling the function.
2. The definition of another part of the system may have changed.

In the example of openCDTray( ) , we would be saved a trip to the CD repair shop by the unit-test!

Sunday, August 19, 2007

Unit Test Scripts

A unit-test function corresponding to each unit-test case broadly consists of the following sections.

1) Precondition Tweaking (Prologue)
Prepare the conditions necessary for the function to execute. Do this for all preconditions not mandated by the test-case; not supplying precondition(s) to see if the function fails may be the test-case.

2) Invocation (Test)
Call the function.

3) Post-condition Verification (Epilogue)
Check whether all the post-conditions have been enforced by the function. Check whether the result tallies with the expected result of the unit-test case.

4) Logging
This may be part of the unit-testing framework itself, if you are using one.

Link the test-function with the test-harness and the unit to be tested. Call these test-functions from a driver program to see how your unit copes!
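
Here is a minimal sketch of one such test-function with the four sections marked. The unit under test (a tiny fixed-size stack) and all the names are hypothetical, and the logging is a bare printf rather than any particular framework.

```c
#include <assert.h>
#include <stdio.h>

#define STACK_CAPACITY 4
typedef struct { int items[STACK_CAPACITY]; int size; } Stack;

/* Unit under test: returns 0 on success, -1 if the stack is full. */
int stack_push(Stack *s, int value)
{
    if (s->size >= STACK_CAPACITY)
        return -1;
    s->items[s->size++] = value;
    return 0;
}

/* Test case: pushing onto a full stack must fail and change nothing. */
int test_push_on_full_stack(void)
{
    /* 1) Prologue: build the condition the test-case calls for. */
    Stack s = { {1, 2, 3, 4}, STACK_CAPACITY };

    /* 2) Invocation */
    int result = stack_push(&s, 99);

    /* 3) Epilogue: verify the result and the untouched state. */
    int passed = result == -1
              && s.size == STACK_CAPACITY
              && s.items[STACK_CAPACITY - 1] == 4;

    /* 4) Logging */
    printf("test_push_on_full_stack: %s\n", passed ? "PASS" : "FAIL");
    return passed;
}
```

A driver program would simply call each test-function in turn and count the failures.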

Friday, August 17, 2007

Unit Test Environment

A Unit Test Suite consists of code which verifies and validates the unit within the unit-testing environment. It consists of -
1) Test Script - Code which calls different functions of the unit under different conditions.
2) Test Harness - In order to achieve its functionality, the unit under test might need the help of other modules. Substitute all such external functions with dummy versions, stripped to the bare minimum.

The unit under test should be run under this unit-testing environment. For the unit of a complex, heterogeneous system, it is more practical and useful to have a unit-testing environment which is much simpler than the actual deployment environment.
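
A rough sketch of a harness in miniature, squeezed into one file for readability. The sensor module, the unit and every name here are made up; in a real setup the dummy read_sensor_raw() would live in the harness and be linked in place of the real module.

```c
#include <assert.h>

/* --- Test harness: dummy stand-in for an external module --- */
int stub_raw_reading;                 /* the value the test wants returned */
int read_sensor_raw(void) { return stub_raw_reading; }

/* --- Unit under test --- */
/* Converts a raw reading (tenths of a degree) to whole degrees, 0..100. */
int read_temperature(void)
{
    int degrees = read_sensor_raw() / 10;
    if (degrees < 0)
        return 0;
    if (degrees > 100)
        return 100;
    return degrees;
}

/* --- Test script --- */
int test_temperature_is_clamped(void)
{
    stub_raw_reading = 2500;          /* pretend the sensor reports 250.0 */
    return read_temperature() == 100;
}
```

The dummy does nothing clever; it only lets the test decide what the "outside world" says.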

Aside: It is tempting to make the unit-test environment "more real, just to see if it works too; anyway I am testing, so why not?". If you do this, you will end up doing something that's neither unit- nor functional-testing, and you won't get the possible benefits of either!

More about test-scripts and test-harnesses later.

Wednesday, August 15, 2007

Limits Of Unit Diagnosis

Unit-testing can uncover many kinds of deficiencies, gaps and faults. It is not the responsibility of a unit test to verify whether the function actually works or not; that is the domain of functionality-tests. Indeed, a unit may be successfully tested by the unit-test suite and still not work!

1) Inaccurate definitions - If the function succeeds even if some preconditions deemed necessary are not fulfilled, then probably the designer has made an error in the specification.
2) Unaddressed requirements - That lengthy switch-case might be missing a few cases!
3) Certain kinds of bugs - Typos in parts of the code that affect the post-condition enforcement can be caught.
4) Programmer indiscipline - Perhaps that developer might be missing a few basic but essential checks?

If nothing else, it provides a way of ensuring complete coverage and execution of the developed code, an objective that would be difficult or time-consuming to achieve otherwise.

Monday, August 13, 2007

Unit Testing As Claim Validation

"Give me a lever long enough and a fulcrum on which to place it, and I shall move the world." - Archimedes (Mathematician and inventor of ancient Greece, c. 287-212 BC)

A function is much like Archimedes's lever, a block of code that claims that it will do something if some things have already been done. A unit test should ensure that the function can actually do exactly what it claims to, when all its conditions are met.

The definition of a function, no matter how complex it is, can be decomposed as
1) Preconditions (Conditions/Environment)- Things which must be done or conditions which must prevail when the function is called. These might include constraints on the input parameters and global state variables.
2) Body (Action) - The main block of code which achieves the functionality.
3) Postconditions (Side-effects)- Things that the function will ensure by the time the body of the function finishes executing. These can be the function's return value or the modification of the state of the system.

Unit-tests should determine whether all the postconditions are fulfilled when all the preconditions are fulfilled by the caller of the function. It is important to remember that unit-tests are more like mathematical verification of the function-model; they check whether the code conforms to the definition of the functions.
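
One way to picture this decomposition is a function whose claim is written down as assertions. This is only a sketch; the Account type and withdraw() are hypothetical, and real code might report precondition failures differently.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int balance; } Account;

/*
 * Preconditions:  acct is not NULL, and 0 < amount <= acct->balance.
 * Postconditions: the balance is reduced by exactly amount,
 *                 and the new balance is returned.
 */
int withdraw(Account *acct, int amount)
{
    /* Preconditions: what the caller must have ensured. */
    assert(acct != NULL);
    assert(amount > 0 && amount <= acct->balance);

    /* Body: the action itself. */
    int old_balance = acct->balance;
    acct->balance -= amount;

    /* Postconditions: what the function claims to guarantee. */
    assert(acct->balance == old_balance - amount);
    return acct->balance;
}
```

A unit test then sets up the preconditions, calls withdraw(), and checks the postconditions from the outside.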

Saturday, August 11, 2007

Unit Testing Primer

(Breaking this article into smaller, more readable fragments. Blog-style!)

Unit-testing deals with the verification of what a unit can achieve on its own, without help from other units. In other words, assuming all the other modules required by the unit are available, can the unit achieve what it claims to do? This is the question that unit-testing seeks to answer.

I started off a skeptic regarding the usefulness of unit tests; they seemed to be such a waste of time. What won me over? Simply put, unit-testing helps us to avoid stupid, elementary mistakes. And god knows (as do we!) how common those are, especially during bug-fixes!

My primary motivation to write this series of articles is to-
1) define the scope of a unit-test - There seem to be misconceptions about what unit-testing is all about, what a unit-test should and should not do,
2) list the benefits of unit-testing - there's more to it than meets the eye at first sight. All love is not at first-sight!
3) share my understanding(s) - I had great fun exploring unit-testing, especially regarding what could be achieved with dummy or mock versions of external functions (Harnesses)

Here goes...

Unit testing = Claim Validation

Diagnosis Limits

Unit Testing Environment

Test Scripts

Test Harnesses

Example (Read this first if you will!)

Unit-Testing Frameworks

Run the unit-test suite frequently and regularly, especially at integration-points to ensure that development is proceeding in the right direction. Unit-testing can go a long way in -
1) making your team's code bullet-proof and,
2) importantly, avoiding last-minute-release-time-chaos-induced bloopers.

I must also add here that I am not a fan (yet?) of test-driven development. I would recommend unit-testing for finding errors that could easily escape visual checks or code walk-throughs.

- Thomas Jay Cubb

Please also read Learning To Love Unit Testing.

TODO - Diagrams

Wednesday, July 25, 2007

Draw The Line

While coding within code-frameworks, we will be required to insert functionality into boilerplate code. There might be lots of TODOs marked in the framework code. When doing so, if what you are doing would be long-term, call hook functions to your own non-framework code rather than coding the functionality in-place.

Call your designed functions through a "hook" function and don't code the functionality within the generated code. Draw the line!
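
A minimal sketch of where the line goes. The framework callback framework_on_connect() and the hook my_on_connect() are invented names, and the two "files" are shown together only to keep the example short.

```c
#include <assert.h>

/* ---- my_handlers.c: our own, non-framework code ---- */
int my_connect_count = 0;

int my_on_connect(int client_id)
{
    my_connect_count++;
    return client_id >= 0 ? 0 : -1;   /* reject invalid ids */
}

/* ---- framework_generated.c: boilerplate we try not to touch ---- */
int framework_on_connect(int client_id)
{
    /* TODO: handle the new connection */
    return my_on_connect(client_id);  /* the only line we add here */
}
```

If the framework code is regenerated, the one-line hook call is all that has to be put back.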

Advantages:
1. It will be easier to move to another framework. Otherwise, there'll be lots of last-minute copy-pasting and hence errors introduced.
2. Better visibility. When something goes wrong, you know who to blame!
3. More room for experimentation. You might want to tweak the framework's configuration and generate new code.

Tuesday, July 03, 2007

Principles Of Codeline Management

Might seem like common-sense, but learnt the hard way!

1 Build Inequality
a) What is working on one build might not work on another. We must retain information regarding important working builds to facilitate later investigation!
b) Some codelines are more important than others. 10 units of work on a build close to the mainline is probably worth more than 20 units of work on a distant line.

2 Fix Early
Find errors and fix problems at the earliest. Yes, getting things working on your build is very important.

3 Merge Early
Get close to the mainline, at the earliest possible stabilization point.

4 Merge And Fix Early
Find errors and fix problems on a build that is as close to the mainline as possible. To paraphrase a proverb, “Early to merge and early to fix makes the release smooth and nice!” ?

5 Useful Artifacts
If an artifact is worth keeping, save it. If it is not worth keeping or it is not likely to be used, remove it.

6 Docubits
Documentation: every bit helps, and a little bit of documentation at every step doesn’t hurt!

Hopefully I will be writing more on each of these points later.

Wednesday, June 20, 2007

CM Synergy Crash Course

Bits and pieces of CM Synergy info picked up. Hope this supplements the sparse Synergy documentation. Also posted at the CruiseControl website

Introduction

CM Synergy aka Telelogic Synergy is "a task-based configuration management system". But the philosophy of the tool is radically different from other version control systems around. These differences sometimes add up and hit you hard, especially if you have prior experience with any of those other tools!

Synergy-ese (Essential)

CM Synergy uses different terminology from other CM tools. Some of these "new" terms are ill-chosen and may take some getting used to. If you are moving to CM Synergy from another tool or are forced to adapt (new job etc), then these rough translations may help you get up to speed faster.

Task (almost)= objects
A task is just a collection of versioned objects. Synergy adds the idea that these objects are grouped together for a purpose and hence it is a "task".

Check-out
You check out objects and make changes to them. Synergy recommends that checkouts be made against a task. What happens when you check out an object is that it gets added to the task. In effect, you are making the task as you work. When you check out stuff, it becomes a modifiable "working" copy on your hard disk.

Check-in
After you make those changes, you check them in. You can check in the task; Synergy tracks the objects associated with the task and checks in each of those objects. Alternatively, you can also check in objects individually.

Folder = Label, Tag (Do not draw the "directory" parallel!)
Tasks are put in folders. Since a task is a collection of objects, the folder will, effectively, contain the objects that make up those tasks.

Reconfigure = Update
Reconfigure is how you get files from the repository. A set of folders would have been allotted to you within Synergy; when you reconfigure, you would be getting the contents of those folders. The fancy Synergy-ese term for what you get when you reconfigure is the not-for-mortals-sounding "reconfigure template".

Project
A project is a list of the versions of all the files (objects) being used by you.

Finding Synergy

(TODO)

Advanced Synergy-ese (Non essential)

(TODO)

Sync

Reconcile

Get Out Of Jail Free!

(TODO)

Conclusion?

(TODO)

- Thomas Jay Cubb

Monday, April 23, 2007

Root Or Fruit?

As far as bug-fixes are concerned, there are broadly two ways to go:
1) Remove the cause (Solution!)
2) Mask the effect (Hacks!)

For complex systems, before we can possibly arrive at a complete solution, we will need to work at the problem both ways. Attack the root or attack the fruit? Though it is obvious that fixing the root is the ideal solution, often we have to make do with a combination of the two.

Attacking the cause straightaway may not be such a great idea.
1) Incomplete understanding of the system - You might be familiar only with a part of the system.
2) Time-pressure- How many lines of code can you possibly go through before the deadline?

Magnify the effect! Change the effect into something more drastic. This will trigger new lines of thinking.
Remove the effect! Bugs are caused by code; remove the code!
Take breaks. The a-ha moment will often emerge when you are away!