Posts Tagged ‘Technology’

SANOG XVI Conference in Paro, Bhutan

Last month, I traveled to Bhutan to attend the SANOG conference. Bhutan is a small country nestled in the Himalayas, surrounded by India to the east, west and south and China to the north. It was a good opportunity to meet some like-minded network geeks and also visit an exotic country.

SANOG (South Asian Network Operators Group) is a conference where various stakeholders from the Internet infrastructure ecosystem come together, share operational experiences and learn from each other. SANOG is targeted at the SAARC countries (India, Pakistan, Nepal, Sri Lanka, Bangladesh, Bhutan and Maldives). It is loosely modeled on the APRICOT conference, with 5 days of workshops, 2 days of tutorials and 2 days of conference.

The 16th edition of SANOG was held in Paro, Bhutan. This was the second time that Bhutan hosted SANOG. It was held at the Paro Engineering College, beside the Paro river, in a very picturesque setting. The earlier edition of SANOG in Bhutan was held in the capital, Thimphu. This was the second time I attended SANOG, having attended an earlier edition in Mumbai in 2006.

I attended the workshop on Network Security by Gaurab Raj Upadhyay and Jonny Martin from PCH. It covered the basics of security, specifically for ISPs and large network providers. There were some good discussions on how to manage the different security audit processes as well as an incident management program in case of network security breaches. The hands-on part of the workshop concentrated heavily on securing backbone routers and exchanging route information securely. Some aspects of filtering and verifying network traffic were also covered. The last day had demos of several tools such as Nessus and Nmap. The slides can be downloaded from the SANOG Program page. Also, during the workshop breaks and on one of the workshop days, Devdas and I wrote an improved whois server that is hopefully in production now at the NIC website.

In the tutorials part of SANOG, I gave a half-day tutorial on application-level performance measurement [Slides,PDF]. There was a small but interested crowd in the tutorials. I ended up covering a lot more of the web-facing and measurement tools, as many of the participants were application developers who had written quite a bit of PHP code. It was the first time I was giving a tutorial on this topic and it helped that it was interactive. In addition to the material on the slides, I talked a bit about front-end performance and tools such as YSlow (Yahoo), Page Speed (Google) and WebPagetest (AOL). There was a lot of whiteboarding and I veered a little away from the slides. I also spoke about the network measurement work being done in the IPPM, BMWG and PMOL working groups in the IETF. The feedback was pretty good and I plan to give a longer version of the tutorial at later editions of SANOG/APRICOT. I skipped the second day of the tutorials and went to Chele La pass.

The conference had several talks that I was looking forward to and I was not disappointed. The standout talks were on long distance wireless network deployment by Matt Peterson and the F-root update by Pete Losher. Both had interesting networking insights and interesting traffic data. I also gave a talk on my IETF fellowship experience [Slides,PDF]. Some of the slides were liberally lifted from “The Tao of the IETF” written by Paul Hoffman. As (good) luck would have it, I ran into Paul Hoffman at IETF 78 and told him about it :) . There were a few questions about the fellowship after the conference, so I hope it will inspire more people to apply to the IETF fellowship.

CloudCamp Bangalore 2010 and Hadoop Summit

The 2nd CloudCamp Bangalore was held at Dayananda Sagar College of Engineering. It was co-located with the first Hadoop Summit in India. The Hadoop Summit was interesting and more relevant to me as I am using a Hadoop cluster for analytics at Inmobi. Dave kicked off CloudCamp with his signature “unPanel”. I was on the unPanel this time and answered some questions on mobiles, netbooks and smartphones as access devices for the cloud and on the impact of Google's patent on MapReduce.

The corridor discussions with a bunch of Hadoop committers were insightful. I also found out more about Mahout. Mahout is an Apache project to build scalable machine learning libraries. It is not restricted to Hadoop implementations, but much of the current activity seems to be around Hadoop.

Notes and embedded slides from the sessions I attended follow:

Hadoop Summit Keynote

Data Management on Grid

Notes:

  • Y! uses an HDFS replication factor of 3 (the Hadoop default) in most cases. Exceptions are big clusters with a large number of applications running simultaneously.
  • Y! does not use Avro yet due to large amount of legacy data. Twitter uses Avro.
  • Data ingestion layer uses MapReduce for heavy lifting and format conversion for storage.
  • LZO is used for compression. gzip (not ideal due to non-block-level indexing) and bzip2 are also used. bzip2 decompression is slow, but bzip2 delivers better compression ratios.
  • Data ingestion layer also oversees policy for data retention and purging.
  • The underlying filesystem is rarely a bottleneck for Hadoop. Mostly, the synchronization semantics of HDFS are the bottleneck. A file operation is not successful until all the replicas are in sync.
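
To make the replication factor mentioned above concrete, here is a minimal sketch of the storage arithmetic. The function names are my own, made up for illustration (they are not part of any Hadoop API), and the sketch ignores compression and filesystem overhead.

```python
def raw_storage_bytes(logical_bytes: int, replication: int = 3) -> int:
    """Raw disk consumed across the cluster by a dataset of the given logical size.

    Assumes every block is stored `replication` times and ignores
    compression and filesystem overhead.
    """
    if logical_bytes < 0 or replication < 1:
        raise ValueError("logical_bytes must be >= 0 and replication >= 1")
    return logical_bytes * replication


def usable_capacity_bytes(raw_capacity_bytes: int, replication: int = 3) -> int:
    """Logical data that fits on a cluster with the given raw disk capacity."""
    if raw_capacity_bytes < 0 or replication < 1:
        raise ValueError("raw_capacity_bytes must be >= 0 and replication >= 1")
    return raw_capacity_bytes // replication


TB = 1024 ** 4
print(raw_storage_bytes(10 * TB) // TB)      # 30
print(usable_capacity_bytes(90 * TB) // TB)  # 30
```

In other words, at a replication factor of 3, every terabyte ingested costs roughly three terabytes of raw disk.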

Machine Learning using Hadoop

Notes:

  • There are clear differences between data mining and machine learning.
  • ML is harder to implement efficiently on Hadoop. Improving efficiency is still a research problem.
  • Hadoop creates one map task per block, which creates too many empty files and also many reducers.

Optimizing and Benchmarking Hadoop

Notes:

  • As a rule of thumb, adding as much memory as money can buy is a good idea for Hadoop.
  • Consider network connectivity, as the shuffle stage does heavy network I/O.
  • Solid state disks might make sense at certain price/performance ratios. They are also more power efficient.

Tuning Hadoop To Deliver Performance To Your Application

Notes:

  • There are several parameters to tune Hadoop, but they must be used in conjunction with each other.
  • Set the number of map tasks slightly higher than the number of cores to ensure better utilization. This makes sure that data is processed in waves. It also gives better network utilization (as the shuffle phase runs in parallel with the map phase) along with better CPU scheduling.
  • Choosing a good HDFS block size is important. The number of HDFS blocks is directly proportional to the number of map tasks generated.
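
The relationship between block size, map tasks and waves in the notes above can be sketched as simple arithmetic. This is my own illustration, not something from the talk: it assumes one map task per HDFS block (splittable input, split size equal to block size), and the cluster numbers (4 nodes, 10 map slots each) are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating point rounding."""
    return -(-a // b)


def map_task_count(input_bytes: int, block_bytes: int) -> int:
    """Number of map tasks, assuming one map task per HDFS block."""
    if input_bytes < 0 or block_bytes <= 0:
        raise ValueError("input_bytes must be >= 0 and block_bytes > 0")
    return ceil_div(input_bytes, block_bytes)


def map_waves(tasks: int, total_map_slots: int) -> int:
    """Number of waves needed to run all map tasks on the available slots."""
    if tasks < 0 or total_map_slots <= 0:
        raise ValueError("tasks must be >= 0 and total_map_slots > 0")
    return ceil_div(tasks, total_map_slots)


MB = 1024 ** 2
GB = 1024 ** 3

# Hypothetical cluster: 4 nodes with 8 cores each, and 10 map slots
# per node (slightly more slots than cores, as suggested above).
slots = 4 * 10

for block in (64 * MB, 128 * MB):
    tasks = map_task_count(10 * GB, block)
    print(block // MB, tasks, map_waves(tasks, slots))
# 64 160 4
# 128 80 2
```

Doubling the block size halves the number of map tasks and, on the same cluster, the number of waves. Whether that is actually faster depends on the job, which is why the parameters have to be tuned together.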

Links to all presentations

ACM Compute 2010 and ACM India launch

ACM Compute 2010 concluded yesterday. It is the flagship conference of the ACM Bangalore chapter. This year was the 3rd edition of the conference and more than 500 people attended. The highlight of this year’s conference was the launch of ACM India. ACM wants to increase its reach in India, and the ACM India Council, consisting of 18 leading computer scientists from academia and industry, is heading this initiative.

The ACM India launch was addressed by 3 Turing Award winners – Barbara Liskov, C.A.R. Hoare (Tony Hoare) and Raj Reddy. The ACM Turing Award is “the Nobel Prize of computing” and it is rare to see three Turing Award winners address the audience at any event. Barbara Liskov is the most recent recipient of the Turing Award (the 2nd woman to win it) and she spoke on the power of abstraction. She spoke about the problems early programmers faced when writing large and complex programs. She explained how she tried to solve them using abstractions similar to (what is now called) object-oriented programming. She talked at length about how her insights and experiences with these programming problems led to the design of the CLU language. CLU was the first language to implement iterators and generators (as well as exception handling). Listening to her was a good lesson in computer history. I learned later that she was one of the first women in the United States to get a PhD from a Computer Science department. (Her doctoral advisor was the legendary John McCarthy.) Her presentation and the references mentioned in it make for good reading.

Dr Raj Reddy is the only Indian to have won the Turing Award, which he received for his contributions to the field of Artificial Intelligence. Incidentally, his PhD advisor was also John McCarthy – AI pioneer and Turing Award winner. Dr Raj Reddy spoke about the growth of computing over the years and the challenges of reaching the “bottom of the pyramid”. He explained why there was a need to move from the WIMP paradigm in user interfaces to the SILK (Speech, Image, Language and Knowledge) paradigm to increase the reach of computing. His Turing Award lecture (“To dream the possible dream”) makes for an interesting read as well.

C.A.R. Hoare (Tony Hoare) was the next speaker. He is a living legend in computer science. I was looking forward to hearing him speak as I had studied the Quicksort algorithm (which he invented) and the Communicating Sequential Processes paper in college. He was remarkably witty and his enthusiasm for computer science shone through in his talk. In particular, he spoke about the Verified Software initiative, which he contended was similar in scope and impact (for Computer Science) to the Hubble Telescope and the Human Genome Project.

Over the following 2 days, we had the ACM Compute 2010 conference, with several hands-on tutorials on Cloud Computing, Rich Internet Applications and Web 2.0 apps, Widgets and Mobile Applications. The RIA tutorial was conducted by Mrinal Wadhwa (slides embedded below) and the Facebook Connect tutorial by Prateek Dayal (of Muziboo).

(Disclosure: I am the secretary of the Bangalore Chapter and am on the program committee for ACM Compute 2010.)

Tweetup with Alexis Ohanian – Reddit Cofounder

Alexis Ohanian (kn0thing on Twitter) – the co-founder of reddit (and the creator of the beloved Reddit Alien) was in Mysore for the TED conference. He took a break from the TED conference to meet up with a bunch of redditors. For those who don’t know, he is also the publisher of the XKCD books, and all the proceeds from the book go to building a school in Laos. It was interesting talking to him about startups, startup school, Paul Graham, Y Combinator, traveling in India, the startup scene in India, social media [link to TED Presentation] and of course reddit.

We gave him a sampling of Indian food (Coconut Groove) and sweets (K C Das). Thanks to @dhempe and @pswam for organising this tweetup.

The Vasa – The Titanic of Sweden

Back of the IETF 75 T-shirt featuring the Vasa (Parody of the ISO Model)

The Vasamuseet (“Vasa Museum”) is a maritime museum located on the island of Djurgården. It displays the only almost fully intact 17th century ship that has ever been salvaged. The Vasa sank on her maiden voyage in 1628, just like the Titanic. The difference was that the Vasa sank even before she could leave the Stockholm harbour.

Political decision leads to Engineering failure

The Swedish King Gustavus Adolphus ordered the construction of the Vasa and was impatient to see the ship completed so that it could join the Thirty Years' War with Poland. In the rush to satisfy the king, several poor engineering decisions were made, as no one wanted to incur his ire by pointing out the design blunders. The 64-gun upper deck of the Vasa was not offset by the ballast, which made the ship top-heavy and unstable. The Vasa was initially designed to have only one gundeck but ended up with two, adding further to the instability of the ship (feature creep existed in the middle ages :) ). Also, there was no way to estimate the stability of a ship scientifically. It all depended on the experience of the shipbuilder, leading to the sinking of many ships and wasted effort (Software estimation still isn’t good enough – We are stuck in the middle ages of software development :) ).

The Catastrophic Maiden Voyage

On 10 August 1628, the Vasa set sail on her maiden voyage to the naval station at Älvsnabben. It was a bright, calm, sunny day, with only a light breeze blowing across the sea. At the first gust of wind the Vasa heeled over but was stabilized by the sailors aboard. Soon another gust followed, and this one was fatal: the tilting of the ship let water rush in through the open lower gun ports, which added to the instability of the ship and led to the eventual sinking of the Vasa. As the Vasa was an extraordinary ship, thousands of people were present to witness the maiden voyage, including ambassadors of different countries. The Vasa sank ingloriously in front of the gathered crowd.

The bronze cannons of the ship were salvaged in the 17th century, but the Vasa was forgotten until the 1960s, when the ship was raised from the bottom of the sea in a major operation.
