This page documents the conventions I use when writing computer code. Hopefully folks will find them helpful. I will try to explain the reasoning behind each convention clearly, because many coding convention guidelines omit the rationale, and I find that frustrating.
See also Google's Python Style Guide. Overall it is spot on and goes into good detail and examples. There are a few points I disagree with, but they are not super important. My main gripe is there are some points they assert without explaining the underlying reasoning.
All these conventions reinforce certain core tenets.
elif svd['method'] == 'some.literal.string' and filter(lambda x: type(x) == type((0,)) and x[1], svd.results.get('test_totals',{}).items()):

Clear naming is absolutely critical, in my opinion. This applies very broadly: names of products, projects, directories, files, classes, methods, functions, variables, modules, packages, etc. Clear names make all the difference. I often will spend ten minutes thinking about the best name for a key class or method. The fact is, naming is something you can't avoid. You can get away without writing comments or documentation, but every file needs a name and so does every variable. Therefore, the absolute minimum you can do is make the names clear. And it goes a long, long way.

Conversely, naming that is confusing or unclear from the beginning, or that becomes confusing through a refactoring without the accompanying renaming, is wasteful. The maintainer is going to waste time (and therefore money) acting on confused assumptions based on your bad or broken names. If you have a variable called serverIP, which initially contains just a string IP address in dotted quad notation, and then later you refactor the code so this variable contains ip:port, you need to rename the variable to serverIPPort. It's worth the effort to keep the code straightforward and not full of nasty surprises and tricks.
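As a minimal sketch of the serverIP to serverIPPort rename described above: the helper name splitServerIPPort and the sample address are hypothetical, chosen only for illustration. The point is that once the value holds ip:port, the name says so, and the plain serverIP name is reserved for the value that really is just an IP address.

```python
def splitServerIPPort(serverIPPort):
    """Split an "ip:port" string into an (ip, port) tuple.

    Hypothetical helper for illustration; it assumes a dotted quad
    address followed by a colon and a numeric port.
    """
    serverIP, portText = serverIPPort.rsplit(":", 1)
    return serverIP, int(portText)


# The variable name matches its contents: an IP address AND a port.
serverIPPort = "192.0.2.10:8080"

# After splitting, serverIP again holds only a dotted quad address.
serverIP, serverPort = splitServerIPPort(serverIPPort)
```

A maintainer who sees serverIPPort will not pass it to something expecting a bare address, which is exactly the kind of confused assumption the rename prevents.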
See also Andy Lester's article on the two worst variable names.
Avoid abbreviations such as message->msg, index->idx, value->val, createDispatcher->crtDisp. I find these highly problematic and irritating. First, they don't follow a single clear rule about how the abbreviation is achieved (sometimes truncation, sometimes dropping just vowels, sometimes dropping certain consonants). Second, they aren't clearly pronounceable. Pronounceability helps when discussing code and when thinking about it in an audible voice in one's own mind. Third, the premises that originated this convention (presumably ease of typing or length limits imposed by early languages and tools) are no longer relevant. All decent editors have word completion and/or code completion. Modern languages and tools don't have tiny eight-character length limits anymore. Also, as a native English speaker I find it hard enough to parse these things. I assume this is especially difficult for non-native speakers. Editor's note: Never abbreviate the word "password" in code. Don't use "pass". Don't use "passwd". Don't use "pwd". Don't use "pword". Don't do it. I will hunt you down. You must be stopped. The following exceptions are accommodated because of their extreme popularity: database->Db (so connectToDatabase->connectToDb), identifier->Id.
Acronyms stay fully capitalized, regardless of camelCase boundaries. Examples (how I prefer it): startHTTPDownload, leaveURLAlone, disconnectTCP. This is because acronyms are always capitalized by their nature. It's part of what makes them acronyms.
For the most part, I follow PEP 8, so review that and follow it for the basic formatting stuff. See also PEP 20. Note that Python's convention of all-lowercase module names supersedes my guideline about acronyms always being capitalized.
One brief aside regarding the "A Foolish Consistency is the Hobgoblin of Little Minds" section of PEP 8. When it comes to formatting and style, I do tend toward the extreme of consistency, though hopefully not past that into foolishness. However, when it comes to the Python standard library itself, there is no such thing as foolish consistency. Even PEP 8 admits "The naming conventions of Python's library are a bit of a mess". The Python standard library is riddled with blatant inconsistencies that reveal it to be the product of dozens of authors and pretty bad consistency (much worse than Java in many cases). Examples abound, but just look at os.mkdir() vs. os.makedirs(). So many times I have typed os.mkdirs() only to get an AttributeError later. I mean, WTF? It's in the same module, for crying out loud. I have my opinion about how this should be (os.makeDir() that behaves like os.makedirs()), but I don't care that much as long as they are consistent. If a library is consistent, I'm flying. At that point I rarely need to read documentation. I can use most common libraries for IO, dates, the filesystem, and networking just by looking at the API and assuming it does what makes sense. Without consistency, though, it totally gums up the works and slows me to a frustrating crawl.
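For readers who have not hit the os.mkdir() vs. os.makedirs() mismatch themselves, here is a small sketch of it. The directory names are arbitrary, and everything is created under a temporary directory so the example cleans up after itself.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as baseDir:
    # os.mkdir() creates exactly one directory level.
    singleDir = os.path.join(baseDir, "one")
    os.mkdir(singleDir)

    # os.makedirs() creates any missing intermediate directories too.
    nestedDir = os.path.join(baseDir, "a", "b", "c")
    os.makedirs(nestedDir)

    createdBoth = os.path.isdir(singleDir) and os.path.isdir(nestedDir)

# The name one would guess by analogy with mkdir is not in the os module,
# so calling os.mkdirs(...) raises AttributeError.
hasMkdirs = hasattr(os, "mkdirs")
```

Two functions in the same module abbreviate "make" differently, so the guessable plural of mkdir simply is not there.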
Use camelCase names for variables and methods. Do not use lowercase_with_underscores. This makes switching between Java and Python easier and seems to be the overall winner in the OO languages I am familiar with. Note that this contradicts PEP 8, but in my experience camelCase is the winner across multiple OO languages, and at this point trying to convert to lowercase_with_underscores seems like an uphill battle. I could potentially be convinced to stick to PEP 8 here, but as of now I use camelCase, as do many Python libraries.
When there is a from somepkg import * line in a module, the reader may have to do annoying busywork to track down which module contains a particular function. (Readability Is King)
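A short sketch of the difference, using two standard library modules. The function describeThumbnail and the file name are made up for illustration. With wildcard imports of both modules, a reader meeting join() or hypot() has to go searching; with plain imports, the origin of every name is spelled out at the call site.

```python
# Harder to trace (shown as comments only):
#   from os.path import *
#   from math import *
#   path = join(directory, "thumbnail.png")   # join from where?

# Easier to trace: each call names the module it comes from.
import math
import os.path


def describeThumbnail(directory, width, height):
    """Return a short description; a hypothetical example function."""
    path = os.path.join(directory, "thumbnail.png")
    diagonal = math.hypot(width, height)
    return "%s (diagonal %.1f)" % (path, diagonal)
```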
Prefer the % operator over + for building strings. I find it more elegant, and it makes the string easier to change later. I use it exclusively, in accordance with the Make One Choice principle.
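To make the comparison concrete, here is the same message built both ways. The variable names and values are invented for the example.

```python
hostName = "example.org"
port = 8080
retryCount = 3

# Concatenation: every non-string value needs str(), and the shape of the
# final message is scattered across many fragments.
messageWithPlus = ("Connecting to " + hostName + ":" + str(port)
                   + " (retry " + str(retryCount) + ")")

# % operator: the template reads as one piece, and rewording it later
# means editing a single string literal.
messageWithPercent = "Connecting to %s:%d (retry %d)" % (hostName, port, retryCount)
```

Both produce the same text; the second keeps the wording and the data separate, which is what makes later changes easier.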
if statement. (Make One Choice)
When building lengthy inline data structures such as dictionaries or lists, prefer multiple statements (separate initialization and population code) to overly long inline data structures. This adheres to the Fewer Statements Per Line principle. For example,
original:
_platformCfg = {
    "FedoraLinux" :
        { "releases" : {"1":0,
                        "2":0,
                        "3":0},
          "longName" : "Fedora Core Linux",
          "dfCmd" : "df -k",
          "sttyUnset" : "stty noflsh echo",
          "netstatCmd" : "netstat -na | grep \":%s \" | grep LISTEN",
        },
    "RHLinux" :
        { "releases" : {"6.2":1,
                        "7.1":1,
                        "7.2":1,
                        "7.3":1,
                        "8.0":1,
                        "9":0,
                        "2.1WS":1,
                        "2.1ES":1,
                        "2.1AS":1,
                        "3WS":1,
                        "3ES":1,
                        "3AS":1,
                        "4WS":1,
                        "4ES":1,
                        "4AS":1},
          "longName" : "Red Hat Linux",
          "dfCmd" : "df -k",
          "sttyUnset" : "stty noflsh echo",
          "netstatCmd" : "netstat -na | grep \":%s \" | grep LISTEN",
        },
    "SuSELinux" :
    <REMAINDER OMITTED FOR BREVITY>
preferred:
_platformCfg = {}
_fedoraCfg = {}
_fedoraCfg["releases"] = {"1": 0, "2": 0, "3": 0}
_fedoraCfg["longName"] = "Fedora Core Linux"
_fedoraCfg["dfCmd"] = "df -k"
_fedoraCfg["sttyUnset"] = "stty noflsh echo"
_fedoraCfg["netstatCmd"] = "netstat -na | grep \":%s \" | grep LISTEN"
_platformCfg["FedoraLinux"] = _fedoraCfg
_rhCfg = _fedoraCfg.copy()
_rhCfg["releases"] = {"6.2": 1, "7.1": 1, "7.2": 1, "7.3": 1, "8.0": 1,
    "9": 0, "2.1WS": 1, "2.1ES": 1, "2.1AS": 1, "3WS": 1, "3ES": 1,
    "3AS": 1, "4WS": 1, "4ES": 1, "4AS": 1, "5SERVER": 1, "5CLIENT": 1}
_rhCfg["longName"] = "Red Hat Linux"
_rhCfg["netstatRe"] = "tcp.+:%s.+LISTEN"
_platformCfg["RHLinux"] = _rhCfg
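One detail of the preferred version is worth spelling out: dict.copy() makes a shallow copy. The sketch below uses trimmed-down, hypothetical dictionaries (baseCfg, derivedCfg, sharedCfg) rather than the real configuration. Rebinding a key on the copy, as the preferred code does with "releases", leaves the base untouched, while mutating a nested value in place would change both.

```python
baseCfg = {}
baseCfg["releases"] = {"1": 0, "2": 0, "3": 0}
baseCfg["dfCmd"] = "df -k"

# Shallow copy: top-level keys are copied, nested objects are shared.
derivedCfg = baseCfg.copy()

# Rebinding a key on the copy does not affect the base dictionary.
derivedCfg["releases"] = {"6.2": 1, "7.1": 1}

# Mutating a nested dictionary that was NOT rebound affects both,
# because both top-level dicts still point at the same inner object.
sharedCfg = baseCfg.copy()
sharedCfg["releases"]["4"] = 0   # also visible through baseCfg
```

So the copy-then-override pattern is safe as long as nested values are replaced rather than edited in place; if in-place edits are needed, copy.deepcopy() from the standard library is the usual alternative.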
Why?
Prefer Collection.isEmpty() over Collection.size() == 0 because it expresses the intent more directly.
isEmpty and special case that as needed.
if statement or other block beginner. (Make One Choice)
"${MY_VAR}". This can avoid bugs when the value has embedded spaces. In certain circumstances you will need to omit the double quotes to get the correct behavior, but this will usually behave properly and handle values with spaces properly. (Make One Choice)
Use UPPER_CASE_WITH_UNDERSCORES for all variable names because that seems to be the clear convention.
This article pre-dates my blog, but you can post any comments you have on this article on the corresponding entry on my technology blog.