Archive for the ‘Books’ Category

Coders At Work Review


Once in a while, you read a book that is filled with ‘aha’ moments. If you have written complex software for a while or want to become a good programmer, then ‘Coders at Work’ is a must-read. This fantastic book interviews 15 master programmers, including well-known names such as Don Knuth, Ken Thompson, Jamie Zawinski and Peter Norvig.

Some comments on the content of the book:
Programming languages
Many of the programmers interviewed started with BASIC and considered it an okay language. What is probably more surprising is the universal hatred of C++ in this group. In fact, several people such as Peter Norvig and Ken Thompson (who goes on a tirade against C++) consider it a downright ugly and cumbersome language to work with.

Jamie Zawinski – C++ is just an abomination
Brad Fitzpatrick – The syntax is terrible and totally inconsistent and the error messages, at least from GCC, are ridiculous.
Ken Thompson – By and large I think it’s a bad language. It does a lot of things half well and it’s just a garbage heap of ideas that are mutually exclusive. Everybody I know, whether it’s personal or corporate, selects a subset and these subsets are different. So it’s not a good language to transport an algorithm – to say, “I wrote it; here, take it.” It’s way too big, way too complex. And it’s obviously built by a committee.

On Programming and Curiosity
Almost everyone interviewed still programs (some only occasionally) and enjoys hacking and taking things apart. Many were misfits and took unusual career paths to get to where they are today. There is a rebel and hacker streak in all of them. Most of them stumbled into programming and discovered at some point that they were good at it. Everyone emphasized the practice of writing good, readable code. Everyone laments that you can no longer understand a system from the bottom up, as systems have become more and more complex and layers of abstraction have multiplied.

On categorizing programming and building software
The opinion is pretty much evenly split on whether programming is a science, art, craftsmanship or engineering with a slight bias towards craftsmanship.

On Recommended Books
Among the books recommended, “The Art of Computer Programming” by Don Knuth topped the list for obvious reasons. Another book that was recommended by several people was “The Psychology of Computer Programming” by Gerald Weinberg.

On the state of computer science
The mood on the state of developments in computer science was fairly pessimistic, and most people pointed to the fact that many of the breakthrough ideas in computer science were conceived in the ’70s (with the notable exception of the internet and web programming).

The only downside here is the interview with Fran Allen. It should not have made the book. I got the distinct feeling that much of the work that she claimed credit for was implemented by others while she was the manager of those projects (probably a good one, but that is hardly the same as being a good programmer).

I have added some notes (for further reading) and quotes from the book on the wiki.

Supercrunchers – How Data Analysis is Changing our Lives


I had been planning to read Supercrunchers by Ian Ayres for a while. The theme of the book is how you can extract information from raw data by applying statistical techniques such as regression analysis and correlation algorithms. The author calls this process “Supercrunching”.

The book starts with an introduction to supercrunching with the example of Orley Ashenfelter, who devised a mathematical formula to determine the quality of wine (and hence its price). Wine connoisseurs initially dismissed the idea that a formula could beat their intuition and years of experience, but the formula has stood the test of time and outperformed the experts in the field by a large margin. Another example is the loss of Kasparov to Deep Blue, but the author fails to mention that Kasparov lost not only because of the huge amount of data that Deep Blue had at its disposal but also because of the team of scientists who were continually refining the algorithms to get to the right moves.
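To make the idea of a "formula" for wine a little more concrete, here is a minimal sketch of ordinary least-squares regression with a single predictor. The numbers are made up purely for illustration; they are not Ashenfelter's data, and the function name `fit_line` is my own. His actual equation uses several variables, so treat this only as the simplest possible instance of the technique.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: growing-season temperature (deg C) vs. a price index.
temps = [15.0, 16.0, 17.0, 18.0, 19.0]
prices = [10.0, 14.0, 18.0, 22.0, 26.0]

intercept, slope = fit_line(temps, prices)

# Use the fitted line to predict the price index for a new vintage.
predicted = intercept + slope * 17.5
```

Once the line is fitted, predicting a new vintage is just plugging a temperature into the equation, which is exactly what upset the connoisseurs.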

The first chapter ‘Who’s doing the thinking for you?’ dwells on the topic of recommendation engines such as those of Netflix, Amazon, and Pandora. (IEEE Spectrum article on the winning algorithm by the current leaders of the Netflix prize). Recommendation engines and collaborative filtering have become almost the norm on the web today, with almost every news site (most emailed, most read, most shared), shopping site (people who bought this also bought …), media site (recommendations for artists, songs, albums) and social networking site (such as LinkedIn and Facebook’s ‘People you may know’ feature) having such features built in. A slightly different example in the book is that of Walmart, which uses answers to certain questions to filter out (non-conformist) candidates for certain job profiles based on the data it has gathered from current employees. Internet search giant Google recently began crunching data from employee reviews and promotion and pay histories in a mathematical formula which it says can identify which of its 20,000 employees are most likely to quit. Fraud detection is another area which this chapter skims, but you could write a whole book on just data mining and fraud detection.
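The "people who bought this also bought" feature can be approximated with nothing more than co-occurrence counting. The sketch below is a toy illustration under my own assumptions: the baskets are invented, the function names `build_cooccurrence` and `also_bought` are hypothetical, and real recommendation engines (such as the Netflix prize entries) use far more sophisticated models.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            # Record the pair in both directions so lookups work for either item.
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_bought(counts, item, top_n=2):
    """Return the items most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


# Hypothetical purchase histories, one set of items per customer.
baskets = [
    {"book", "lamp"},
    {"book", "lamp", "pen"},
    {"book", "pen"},
    {"book", "lamp"},
    {"lamp", "desk"},
]
counts = build_cooccurrence(baskets)
recommendations = also_bought(counts, "book")
```

Here the lamp turns up alongside the book in three baskets and the pen in two, so the lamp is recommended first.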

The second chapter ‘Creating your own Data with a Flip of a Coin’ deals with testing hypotheses by running randomised trials, especially in cases where the cost of running the trial is low. Lots of companies are known to use this. For example, when Google or Yahoo change their front page or layout, different users see different versions of the page (which often differ from each other in a single aspect), and user behavior is then measured and ranked against a whole bunch of criteria. This goes on for several iterations till the design is perfected. This technique is called A/B testing. The efficacy of trial runs can be improved by using the Taguchi methods. I found the author’s approach of using randomised trials to find answers similar to Nassim Taleb’s advocacy of ‘Monte Carlo methods’ for random sampling in ‘Fooled by Randomness’.
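One common way to decide whether version B of a page really beat version A is a two-proportion z-test. The book does not prescribe this particular test, so take the following as a minimal sketch under my own assumptions; the visitor and click counts are invented, and the function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: clicks out of visitors for page versions A and B.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)

# Conventional 5% threshold; the choice of threshold is itself an assumption.
significant = p_value < 0.05
```

The test only means something if visitors were assigned to A or B at random, which is the whole point of the chapter's coin flip.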

The third chapter ‘Government by chance’ elucidates how randomised trials are helping change and improve social policies. Examples include Mexico’s Progresa initiative and MIT’s Poverty Action Lab.

The chapter ‘How should physicians treat evidence based medicine?’ shows how evidence-based medicine is profoundly changing medical practices. The most interesting story in the chapter is that of Ignaz Semmelweis, who used statistical techniques to figure out that the transmission of puerperal fever (a form of septicaemia) could be prevented by having doctors wash their hands in a chlorinated solution. Unfortunately he was not only ridiculed by many but also admitted to an asylum, where, in an ironic twist of fate, he died of septicaemia only a fortnight later. All this happened before Louis Pasteur developed the germ theory of disease. The chapter also relates stories of how rule-based systems such as Isabel are helping avoid misdiagnosis and assisting doctors in treating patients faster.

The chapter ‘Experts Versus Equations’ expands on the underlying and subtle theme of the book – that supercrunching is slowly but surely defeating the experts in various fields (given accurate and adequate data) and that experts’ intuition is not always right. It goes on to explain how biases and emotions cloud our judgement and reduce the accuracy of our predictions. (A good aside at this point is reading up on positive confirmation bias – our tendency to search for data that confirms our perception – and cognitive dissonance – the uncomfortable feeling caused by holding two contradictory ideas simultaneously.) The example given is how a very simplistic algorithm outperformed legal experts in predicting how the justices of the US Supreme Court would vote.

In the chapter ‘Why Now?’, the author makes a compelling case for why supercrunching is becoming more and more relevant and accessible to ordinary people due to the rise in data crunching ability (Moore’s Law), the falling cost of storage for large datasets (Kryder’s law) and the ease of distribution and verification (the Internet). The author introduces neural networks in this chapter and relates the story of Epagogix – a firm that uses neural networks to improve the commercial gross of a film by tweaking movie scripts. Yes – unbelievable, but who knew. It gives a whole new meaning to formulaic films. :)

In the chapter ‘Are we having fun yet?’ the author touches upon the topic of education and how Direct Instruction is changing education in America. This was probably one of the most counterintuitive examples in the book. Direct Instruction replaces the discretion of the teacher with a behavioral script for teaching students. Teachers have resisted this method of teaching despite overwhelming evidence that it is the best way of teaching students. Again, the conflict between intuition and hard numbers crops up in this chapter. Another theme in this chapter is how megacorps are using data generated by us (such as our buying decisions and browsing behavior) to make decisions that increase their profitability. A recent NYTimes article touches upon this topic as well. (What Does Your Credit-Card Company Know About You?)

The final chapter ‘The Future of Intuition (and Expertise)’ explains some basic statistics in layman’s terms, such as the 2SD rule, Bayes’ theorem and the margin of error in statistical surveys. I wish this section were more extensive though.
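For readers who want to see these ideas in action, here is a minimal sketch of two of them: the margin of error of a survey (using the 2SD rule, which says roughly 95% of outcomes fall within two standard deviations of the mean) and Bayes’ theorem applied to a screening test. All numbers are hypothetical and the function names `margin_of_error` and `posterior` are my own, not the book's.

```python
import math


def margin_of_error(p_hat, n):
    """Approximate 95% margin of error for a surveyed proportion (2SD rule)."""
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return 2 * std_err


def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem: probability of the condition given a positive test."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical survey: 50% support among 1,000 respondents.
moe = margin_of_error(0.5, 1000)

# Hypothetical screening test: 1% prevalence, 90% sensitivity,
# and a 10% false positive rate.
prob = posterior(prior=0.01, sensitivity=0.9, false_positive_rate=0.1)
```

With these inputs the survey's margin of error comes to a little over three percentage points, and a positive test implies only about an 8% chance of actually having the condition, which is the kind of result where intuition tends to go badly wrong.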

Overall a good read (3/5), but I wish it had more examples and were more comprehensive. Ian Ayres also writes on the Freakonomics blog. Also, I recently read that IBM is working on software to analyze trends based on real-time data.

More @ Google Talk by Ian Ayres.

97 Things Every Software Architect Should Know


I recently contributed to a book called ‘97 Things Every Software Architect Should Know’. It is a collection of axioms by various architects around the world, including some famous names such as Allison Randal (the lead developer for Parrot), Bill de hÓra (co-editor of the Atom publishing protocol), Michael Nygard (who wrote “Release It! Design and Deploy Production-Ready Software” – a 2008 Jolt Productivity Award winner), Neal Ford (who wrote “The Productive Programmer”) and Rebecca Parsons (ThoughtWorks CTO).

The book is now available for purchase from Amazon. The content of the book is licensed under the Creative Commons Attribution 3 license. You can read it on the wiki if you are not inclined to buy the dead-tree version.

The book is edited by Richard Monson-Haefel and Mike Loukides. Interestingly, the Amazon editorial review quotes the axiom that I contributed (“For the End-user, the interface is the System”). Unfortunately there was a limit of 300 words for the axioms, and no companies or products could be mentioned as the editors wanted a certain timeless quality for the axioms (and rightly so). I plan to explain what I meant in greater detail in a blog post soon.

Some more interesting trivia about the book:
* Currently the book has a 5-star rating from 3 reviewers.
* It has reached #1 in the “Design and Architecture” category and #5 in the “Software Development” category in a short time frame. (It’s been about 2 weeks since it was released.)

The cover has the photos of every author (whose contribution was accepted) on it. My photo is 2nd from the left on the last row :) .
