In my post about devopsdays I left out Matt Rechenburg's presentation on OpenQRM, a tool for provisioning appliances within any kind of virtualisation, or just on bare metal. The company that created QRM stopped developing it just after Matt had convinced the owners to make the product open source. Matt stayed on as coordinator for the open source project. Currently OpenQRM does not have commercial support, but the developers are available on a time-and-materials basis.
Devopsdays was a small conference about a couple of emerging themes combining Development and Operations:
The first theme is the realisation that if you want to build a scalable infrastructure, you need to automate deployment and administration of that infrastructure and the applications that run on it. System configuration becomes just another type of code to be developed, tested, integrated and deployed. Deployment becomes release, configuration becomes development, and the ITIL processes for Incident and Problem Management become debugging. Change management becomes release management.
The second theme is Agile, which has emerged over the last ten years. I've only recently encountered the Agile Development movement, and although it's far from a Silver Bullet, it does appear to address some of the essential issues in software engineering, in the sense of Fred Brooks' original analysis.
Originally intended as a lightweight alternative to the waterfall model of software development, Agile transposes the stages of the waterfall model into concurrent processes, introducing feedback everywhere. Requirements analysis continues long after coding starts. Rapid prototyping, user stories, continuous integration and test-first design are just a few of the methods used in Agile to shorten the feedback loop for developers. But while methods and processes are important, the real focus of the Agile movement is on communication and collaboration, in the end making developers and users jointly responsible for the end result.
The Devops Concept (for want of a better name) is about merging the two approaches: how to apply Agile principles to System Administration, and how to get people in Operations and Development to collaborate on deployment.
At the conference, the two day programme was split in two: Talks and Presentations in the morning, and free-form discussion/presentations in OpenSpace format in the afternoon.
I would say that the talks and discussions focused on three themes:
- (Open Source) Tools for automating IT Operations
- Collaboration between Development and Operations
- Agile methods and principles for Operations.
Lindsay Holmwood explained his work on cucumber-nagios - combining 'cucumber', a tool/language for expressing tests in almost human readable scripts with the nagios monitoring tool, resulting in behaviour-driven monitoring. This was a very fast-paced presentation.
Teyo Tyree of Reductive Labs talked about the principles behind Practical Infrastructure Automation, referring to the "James White" Manifesto on Infrastructure (now up at github). Of course he focused on tools like Cfengine, Chef, and Reductive Labs' own Puppet, but he also sketched the challenges for the big enterprise with a multitude of services, commercial application stacks, and many platforms. He strongly suggested starting with baby steps: implement configuration tools like Puppet in a reporting-only state first, and use the reporting mechanism to create a history of change from within the system. Leverage the legacy CMDB, and work within established change control policies.
Agile coach Rachel Davies focused on Agile principles, methods and tools, in particular user stories and how to use them to identify non-functional requirements (requirements that do not add measurable value to the product, but improve it by reducing risk).
Mattias Skarin presented a case study in which Kanban helped Operations and Development to collaborate more closely. The key to Kanban is twofold: visualise task planning and put a hard limit on the amount of work in progress. The important thing here is that there is no single best design for a Kanban board - the team has to create what's best for them. After a couple of iterations, or sprints, the team may decide to add a category of work, or drop a phase from the progress axis.
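To make the two Kanban rules concrete, here is a minimal sketch in Python. It is not taken from the talk: the class names, the column names and the limit of two are all made up for illustration. The board is just a list of columns (the visualisation), and a column refuses new work once its limit is reached (the hard limit on work in progress).

```python
from dataclasses import dataclass, field
from typing import List, Optional


class WipLimitExceeded(Exception):
    """Raised when a column already holds its maximum number of tasks."""


@dataclass
class Column:
    name: str
    wip_limit: Optional[int] = None  # None means no limit (e.g. backlog, done)
    tasks: List[str] = field(default_factory=list)


@dataclass
class Board:
    # Hypothetical board: the team decides which columns exist.
    columns: List[Column]

    def column(self, name: str) -> Column:
        for col in self.columns:
            if col.name == name:
                return col
        raise KeyError(name)

    def add(self, task: str, column_name: str) -> None:
        col = self.column(column_name)
        # The hard limit: refuse new work instead of silently piling it up.
        if col.wip_limit is not None and len(col.tasks) >= col.wip_limit:
            raise WipLimitExceeded(f"{col.name} is full ({col.wip_limit})")
        col.tasks.append(task)

    def move(self, task: str, src: str, dst: str) -> None:
        source = self.column(src)
        if task not in source.tasks:
            raise ValueError(f"{task!r} is not in {src!r}")
        self.add(task, dst)        # raises before anything is removed
        source.tasks.remove(task)

    def render(self) -> str:
        # The visualisation: one line per column with its load and limit.
        lines = []
        for col in self.columns:
            limit = "-" if col.wip_limit is None else str(col.wip_limit)
            lines.append(f"{col.name} ({len(col.tasks)}/{limit}): {', '.join(col.tasks)}")
        return "\n".join(lines)


if __name__ == "__main__":
    board = Board([Column("todo"), Column("doing", wip_limit=2), Column("done")])
    for task in ("patch db", "rotate logs", "new vhost"):
        board.add(task, "todo")
    board.move("patch db", "todo", "doing")
    board.move("rotate logs", "todo", "doing")
    print(board.render())
```

Because the columns are plain data, adding a category of work or dropping a phase is a one-line change to the list passed to `Board`, which matches the point that each team shapes its own board.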
Chris Read of Thoughtworks told us about Build pipelines, and how to take Continuous Integration several steps further into Continuous Deployment.
During Openspace, Jochen Maes started a hot discussion about the merits of distributed version control - and how to minimise the risk of branching: Even when developing in their own copy of the repository, he expects his developers to check in frequently, and at the same time rebase with the main repository at least once an hour. This way, individual developers run their own tests frequently, and then merge in any updates to the main source tree from other sources into their own copy without polluting the upstream. This requires strict discipline, but the result is that any time a change causes the build to fail, you can always fall back to the previous build. Also, because merging code in a distributed version control system like git or mercurial involves merging the complete history (and not just the current state), it is easy to identify which code change was responsible.
UPDATE: It so happens that George Neville-Neil just posted an article about this in his Kode Vicious column at ACM's Queue: Merge Early, Merge Often.
The next meeting of the NLOSUG (OpenSolaris User Group NL) will take place on Thursday 8 October, when we will once again be guests of Sun in Amersfoort.
To give our host and the caterers a good estimate of the number of attendees, please register by email at firstname.lastname@example.org, mentioning: Aanmelding-NLOSUG-okt2009
- Update on NLOSUG
Operating Systems Ambassador - Sun Microsystems
- What's New in OpenSolaris
Jan E. Kuba van Bijnen
Unix/Solaris system & network consultant
- Confused by Solaris-es
Operating Systems Ambassador - Sun Microsystems
- Contributing to OpenSolaris Repositories
Eric R. Reid
Staff Engineer at ISV engineering - Sun Microsystems
Sun Nederland B.V.,
3824 ME Amersfoort
NL Opensolaris User Group
What did it in? Maybe it was a schedule problem - file systems require a lot of testing - and rewriting all the other bits took precedence. NIH - Not Invented Here - syndrome is another possibility. Or perhaps the uncertainty of Sun’s future led Apple to pull back.
Or maybe they just decided customers wouldn’t know enough to care, so why bother? Whatever the reason it is a major step backwards for the PC industry.
I can think of a few practical reasons myself.
For now, I'll try to focus on one: Apple's not ready for it. And perhaps, neither are the users.
Rule #1: Apple designs and sells systems that are supposed to just work. No hassle, no jumping through hoops, no bells, no whistles. Pure form, pure function.
ZFS was designed to do one thing really well. You give it your storage and your data, and it will go to extreme ends to protect your data. It will need at least two disks in order to do that.
Apple's desktop systems and notebook computers still come with only one disk inside.
Use an external disk for ZFS redundancy? The ultimate Rule #1 violation. The whole point of an external disk is that it can be disconnected. The whole point with ZFS redundancy is that you don't want to even create a hint that one of its disks could be disconnected.
After all, there is only one storage pool, and ZFS will take care of that, thank you kindly, sir. The FireWire/USB/eSATA cable is just the rope that the user needs. Allow them to disconnect the drive and, friendly as Mac OS is, the best you can do is provide enough automation to recognise that the cable was disconnected and show a kind, Apple-like warning that "Mac OS cannot protect your data if you do not reconnect the external volume."
People are just not ready for this yet. You don't want to run ZFS on hardware that can be disconnected on a whim, or purely by accident; that is just asking for trouble. After three or four friendly warnings, people will ignore them. Yes, I know! Stop nagging me! The ease with which ZFS could recover from this will only encourage people to become careless, annoyed, or both.
ZFS will be ready for consumer use when all the volumes in a storage pool reside together in the same device. Detachable storage is great for backups, especially with a notebook, but would have to be redundant itself. So now we're talking about at least four disks: two inside the computer, and two outside to protect against physical loss. Let's just stop there.
My conclusion is that Apple has probably made the right decision business-wise, but I hate them for not having the hardware to support it. Maybe they will get back to it, and I look forward to the day when they have notebooks and iMacs with an even number of disk slots.
So far, I've found the following:
How many languages can you use for product development without things getting too complex?
Some of the languages have a specific focus, like SQL for database queries, sh for system administration (starting and stopping components), and the data description languages XML for SMF and ASN.1 for describing SNMP MIBs.
These tools are essential if you want to scale up deployment of servers, especially now that more services are being hosted on virtual servers, making deployment of the hardware, OS, and application completely independent.
"One senior systems engineer at Digg.com was able to rebuild 60 [virtual] machines from scratch in two hours [using Puppet] that would have taken two full days of work if done manually. 'And I was largely a spectator,' said that engineer, Paul Lathrop, of Digg. 'Now that’s automation.'"
Puppet is not the only game in town; if you're interested in these tools, you also need to look at cfengine, isconf, and bcfg2. Interestingly enough, two of these projects appear to use Trac for release management. These are all open source projects, with various levels of commercial support. The commercial Nova edition of cfengine adds some extra features for a price.
The value of these tools is that they offer automatically enforced policies, with implicit reporting of any exceptions.
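The idea of an enforced policy with implicit exception reporting can be sketched in a few lines of Python. This is not how Puppet, cfengine or bcfg2 are implemented, and none of the names below come from those tools; `converge`, `Change` and the example settings are hypothetical. The sketch only shows the principle: compare the desired state with the actual state, record every deviation, and optionally correct it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Change:
    """One deviation from policy: what was found and what was expected."""
    key: str
    actual: object
    desired: object


def converge(desired: Dict[str, object],
             actual: Dict[str, object],
             enforce: bool = False) -> List[Change]:
    """Compare actual state with the policy and return every deviation.

    With enforce=False (report only) nothing is modified.
    With enforce=True the actual state is updated in place to match.
    A key missing from `actual` is treated as having the value None.
    """
    changes = []
    for key, want in sorted(desired.items()):
        have = actual.get(key)
        if have != want:
            changes.append(Change(key, have, want))
            if enforce:
                actual[key] = want
    return changes


if __name__ == "__main__":
    # Made-up policy and host state, for illustration only.
    policy = {"ntp": "running", "root_login": "no"}
    host = {"ntp": "stopped", "root_login": "no"}

    for change in converge(policy, host):                # report only
        print(f"would change {change.key}: {change.actual} -> {change.desired}")

    converge(policy, host, enforce=True)                 # apply the policy
    print(converge(policy, host, enforce=True))          # nothing left to do
```

The exceptions fall out of the comparison itself, so the report costs nothing extra. Running the same policy a second time finds no deviations, which is what makes it safe to run repeatedly, and a report-only run is a cautious way to try a policy on an existing system before letting it change anything.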
The box is a bit sturdier in design than the usual Home NAS appliance. It has four Gigabit Ethernet ports, 2GB of DDR2 RAM and an Intel Atom N270 processor, clocked at 1.6 GHz. There is room for two hard disks in a RAID 0, RAID 1 or JBOD configuration. The price tag comes to about 700 dollars.
The system runs Linux from 512MB of flash memory. How much effort would it take to replace that with OpenSolaris and ZFS?
Kris Straub has revamped his web comic Starslip - a science fiction comic about a museum starship that's been recommissioned for battle. As he describes on webcomics.com, he has gone through a long and painful process going back and forth about the decision and finally decided to get done with it.
The comic has a new look and a new title, the artwork has been improved, and the storyline is restarting as well. I've long been wanting to add it to my daily web comic routine, and as of today, it is in.
Go have a look!
Best wishes to all for 2009!