On Software Engineering

I recently read an article in The Atlantic which argues that the iterative process of software development commonly used for, and particularly suited to, Internet-based distribution is not compatible with the rigour and discipline required for its practitioners to be considered “engineers”.

As the article explains, software engineering was invented as an aspirational and not descriptive title. Programming is one activity in the creation and production of software, but even this one small aspect of the activities that may be undertaken by someone with the job title of software developer or engineer can range from largely mechanical to highly creative. The article attempts to address this creative side of software by comparing it to a more colloquial definition of “to engineer”, meaning “skilfully, artfully, or even deviously contriving an outcome”, and then dismisses it as something no reasonable person would want when building software. Yet the best code is often sincerely described as a work of art, and this is why many people would rather have one good programmer on their team than even three or four mediocre ones.

However a beautiful building must also be functional and not fall down. It is the engineering that ensures that. Similarly while software development can be incredibly creative and fun, it also requires a large number of less exciting activities such as checking the functionality matches what the customer requested or expected and ensuring the product is free of bugs and security flaws. When working on larger projects which require multiple teams it is also necessary to have processes to detect and prevent unwanted functionality, for example malicious back doors and inappropriate hidden “easter eggs” containing adult material. This latter aspect of the job is something that demands the professionalism of a true engineer.

Whether many self-described software engineers are worthy of the title is debatable but what the author fails to realise is that the iterative software methodology he describes as a move away from software engineering is in fact the opposite—it intrinsically requires a high degree of discipline and rigour to execute well—the very thing that he says is lacking. Perhaps as an industry we are on the right path to realise our aspirations after all.

BCS Magazines: All Gloss, No Substance

I have been a member of the British Computer Society (BCS) since I was an undergraduate, when the regular magazines from BCS gave me a glimpse into how working in IT might involve more than the algorithms and computational theory being taught in lectures.

In recent times I find it is rare that the BCS’s increasingly glossy publication, IT Now, contains anything of interest. Too often an article appears to be a modified version of a corporate marketing piece with the specifics of the company’s product removed so as to maintain the illusion of being editorial rather than advertorial. The result is even worse than an actual sales pitch since all substance is lost!

A recent special issue on open source software was especially disappointing. The articles were not the usual marketing speak, but they were still vague and anecdotal rather than informative and analytical. For example, an article entitled “Cracking the Desktop” fails to mention Firefox, one of the most successful open source desktop applications.[1] A case study of the challenges faced when deploying this faster and more secure alternative to older versions of IE to corporate desktops would have been informative, yet the article looks at the cost benefits of switching to OpenOffice, an obsolete technology compared to online office tools such as Google Docs.

“Can Open Source Be Secure?” also exemplified the lack of editorial rigour in IT Now. The phrase “Experts do not agree” should not be allowed without referencing at least two sources (i.e. the “experts” on either side of the argument) yet the article contains no citations at all. The label ‘Journalist hiding their own opinions…’ from http://www.tomscott.com/warnings/ should perhaps be applied here.

I have renewed my BCS membership for another year on the basis that my local branch and Specialist Groups provide some value. The new Academy of Computing project should be given a chance to demonstrate that it can be the UK’s Learned Society of Computer Science, but the chance of another copy of IT Now not going directly to my recycling bin is slim.

BCS members can participate in a discussion about the future of IT Now.

  [1] As of today, Mozilla reports 127 million downloads since 21st January 2010 versus 48 million OpenOffice.org downloads since 11th February 2010.

Technology in Motion

Tim Bray observes that several previously stable technologies are currently in a state of flux, with many of them on the cusp of potentially changing the way that applications will be engineered in the future.

It’s a great survey of what’s hot right now. Change in itself is nothing unusual within IT, since there are always new technologies pushing the boundaries of current thinking; what makes this moment different is that several core technologies are moving simultaneously. What potential is there for a confluence?

Retrofitting Abstraction

We interrupt our regular programming to speak geek.

Recently there’s been a lot of discussion about programming languages at work and Python is receiving a lot of attention for many reasons. I’ve been a big fan of Python for a long time, and the more Perl I see out in the wild, the less I like it: “There’s More Than One Way To Do It” doesn’t scale.

Since it has been two years since I wrote any Python in anger, I am feeling a bit rusty, but an example came to my attention today that reinforced my belief in Python’s coolness. Conventional wisdom is that a programmer should abstract away the implementation from a published API so that there are no downstream dependencies to manage if the underlying implementation changes. In a language like Java, this typically means making all object attributes private and providing public accessor methods [1].

In a large program this makes sense since setx(value) will perform additional operations such as validation behind the scenes. However, in a small program this involves writing huge amounts of boilerplate code that will have no benefit until the day the program reaches the size and importance where validating a value before setting the attribute becomes useful. And even if you know in advance that accessor methods are going to be useful, they result in large amounts of unimportant code that future maintainers will need to read and understand.

The Python solution is typically elegant. In the beginning all public attributes are accessed directly: C.x = "foo" is more readable and quicker to type than C.setx("foo"). If you then decide that reading and writing to x need to be controlled, then you write your accessor methods, define “x” as a property and Python ensures all those instances of C.x are handled using the correct accessor. Here’s an example from docs.python.org.

class C(object):
    # The value lives in _x; every accessor must use exactly this name.
    def __init__(self): self._x = None
    def getx(self): return self._x
    def setx(self, value): self._x = value
    def delx(self): del self._x
    x = property(getx, setx, delx, "I'm the 'x' property.")
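
To make the retrofitting concrete, here is a minimal sketch of the "later" stage, where validation is added behind an attribute that callers already use directly. The Temperature class, its celsius attribute and the absolute-zero check are hypothetical names invented for illustration; they are not taken from the Python documentation.

```python
class Temperature(object):
    """Hypothetical class whose 'celsius' began life as a plain attribute."""

    ABSOLUTE_ZERO_C = -273.15

    def __init__(self):
        self._celsius = 0.0

    def get_celsius(self):
        return self._celsius

    def set_celsius(self, value):
        # Validation retrofitted later; call sites keep writing t.celsius = ...
        if value < self.ABSOLUTE_ZERO_C:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    # Route every read and write of 'celsius' through the accessors above.
    celsius = property(get_celsius, set_celsius, doc="Temperature in Celsius.")


t = Temperature()
t.celsius = 21.5          # same syntax as a plain attribute assignment
print(t.celsius)

try:
    t.celsius = -500.0    # now rejected by set_celsius
except ValueError as err:
    print("rejected:", err)
```

The point of the sketch is that the two assignments look identical to the caller; only the class definition changed when validation became worthwhile.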

  [1] aka “getters” and “setters”, since reading a variable is usually accomplished using getx() and updating a variable using setx(value)

THE Security Engineering Textbook now available online (for free!)

Professor Ross Anderson at Light Blue Touchpaper writes:

With a single bound it was free!

My book on Security Engineering is now available online for free download here.

Professor Anderson’s book is an invaluable reference guide for anyone wishing to implement “secure” computer systems, or simply gain a better understanding of the field. This is great news.

Python v Ruby

Although I have not read this comparison paper [pdf] in detail, I found looking at their example code side-by-side to be most illuminating. Ruby’s pureness (mainly in object-orientation) may give it a theoretically superior (“cleaner”) syntax, but from a practitioner’s point of view, Python’s syntax appears to be refreshingly uncluttered and easy to use. Whether you come from an imperative or functional background, Python appears familiar, and with computing, conquering the fear of the unknown is half the battle!

An example:

Ruby:
p = proc do |x|
       print "Hello ",x ,"\n"
    end
p.call "Fred!"

Python:
p = lambda x: print("Hello", x)
p("Fred!")

Ruby’s pure object syntax leads to the ugly p.call ... while Python’s p("...") looks pleasing to the imperative and functional camps. Object-orientation is a great paradigm, but as anyone who has tried to “knock up” a quick program in Java will know, trying to shoehorn objects into a small program is a waste of effort.

The paper concludes:

Overall there is not much different between Ruby and Python, though Ruby offers some cleaner syntax due to its object oriented model. […] The disadvantage of Ruby is the fact that it is very poorly documented. Our advice is that you choose Ruby, if you are new to both the languages, otherwise weigh the option of continuing with language you are using.

So, in other words, Ruby has little to offer over Python except its theoretically purer syntax, and at the same time it is poorly documented which is likely to mean that the learning curve is a steep and painful climb. My advice: choose Python [1].

[1] This discussion has been entirely free of Rails. I haven’t used it and the paper I am responding to here doesn’t mention it—but I hear it’s a strong contender in the web app arena.

Death by laser pointer

It seems that people have now been using PowerPoint for sufficiently long that “Death by PowerPoint” is a rare event at conferences these days. Alas, this morning I felt the life being sucked from me by misuse of a laser pointer.

The two most obvious problems that afflict laser pointer users are that:

  1. it causes the speaker to turn their back on the audience so that they can point at the screen;
  2. a bright spot whizzing around the screen in a random manner is very distracting.

I think a more fundamental problem is that the ability to point at one’s slides also encourages the speaker to talk to the slides rather than using them as a visual aid to support the talk.

Thesis Submission!

I actually finished binding the third copy of my thesis at 15:55 yesterday afternoon. Unfortunately, this being Cambridge, the Board of Graduate Studies closes at 4pm so I had to wait until this morning to actually submit my thesis… but now it’s DONE! 🙂 I shall have to prepare for the viva at some point, but for the moment I am looking forward to a bit of a rest and preparing for my trip to Australia.

The final word count was 42,776. Many thanks to everyone who proof-read chapters for me!

Highlighting “TODO” Items in LaTeX

Several people have asked me how I make TODO appear in the margin of my thesis wherever I want to highlight an area that needs further work. The trick is to define a new LaTeX command called “todo” in your preamble:
\newcommand{\todo}[1]{\marginpar{\textsf{\textbf{TODO}}}
\texttt{\small{TODO:#1}}}

(all on one line)

TODO items can then be marked up in the text as:
\todo{Add a reference here.}

S5: A Simple Standards-Based Slide Show System

One of the problems of using LaTeX for presentations[1] is that positioning graphics is annoyingly fiddly. The web’s cascading style sheets, with their highly flexible and very powerful layout abilities, ought to be ideal for producing presentation slides, and indeed somebody has produced an impressive framework for doing so.

[1] Don’t get me started on Powerpoint — producing anything with more than one mathematical formula is as annoying as positioning graphics in a LaTeX slide.

Searching the ACM Guide to Computing Literature

While tracking down references for my background chapter recently, I have found the ACM Guide to Computing Literature very useful. Unfortunately its search feature is frustratingly useless. For example, searching for Access Control Policies XPath returns no hits, whereas googling for the same terms and restricting the search to acm.org returns the paper I was looking for as the first hit.

Given the simplicity of the query (four keywords from the title of a paper published in an ACM proceedings), I really don’t understand why the search engine is so bad. My current workaround is to use Firefox’s bookmark keyword feature to search Google instead. Just create a bookmark to:
http://www.google.com/search?&q=site:acm.org%20%s
set the keyword to be something easy to type like “acm”, and then typing “acm <keyword(s)>” in the location bar executes your search.
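
For anyone curious what the keyword bookmark does with the “%s”, the substitution can be sketched in a few lines of Python. This is only an illustration of the idea: the function name expand_bookmark is made up here, and the percent-encoding via urllib.parse.quote is an assumption, since Firefox’s exact escaping of the typed keywords may differ.

```python
from urllib.parse import quote

# Bookmark URL from the post; "%s" marks where the typed keywords are substituted.
BOOKMARK_TEMPLATE = "http://www.google.com/search?&q=site:acm.org%20%s"


def expand_bookmark(template, keywords):
    """Return the template with "%s" replaced by the percent-encoded keywords.

    Hypothetical helper: it imitates the bookmark keyword feature and is not
    Firefox's actual implementation.
    """
    return template.replace("%s", quote(keywords))


url = expand_bookmark(BOOKMARK_TEMPLATE, "Access Control Policies XPath")
print(url)
```

Typing “acm Access Control Policies XPath” in the location bar corresponds to the call above, with the site:acm.org restriction already baked into the bookmark.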

Thesis Titles

Apparently the title of my thesis has to be fixed in advance of my submitting the dissertation itself. Unfortunately choosing exactly the right title is proving harder than writing the thing!

Possibilities are:

  • Trust and Risk in Access Control for Global Computing
  • Trust and Risk in Access Control for a Global Computing Infrastructure
  • Using Trust and Risk for Access Control in Global Computing
  • Trust- and Risk-Based Access Control: Access Control for the Global Computing Infrastructure
  • Trust/Risk-Based Access Control: Access Control for Global Computing

I doubt this will make much sense to anyone reading this but comments are always welcome. 🙂

Boy Band filter — Coming to a computer near you soon?

Christophe Rhodes’ interesting JCSS talk on detecting musical structure has suggested that there is hope that one day computers may be able to automatically detect and filter out boy band music — yay!

More relevantly for my own research, one of Christophe’s motivations is the poor quality of musical meta-data from collaboratively assembled databases such as freedb.org. A trust-based system would allow a user to favour entries submitted by authors they have previously found to provide consistently formatted and accurate data.