
Andrew Fecheyr-Lippens

CAC Activity 5

Andrew Fecheyr-Lippens, CAC: Activity 5, 12 November 2009

OPERATING SYSTEMS

For this comparison of Linux operating systems for use in computer clusters, we will look at five popular distributions: Ubuntu, CentOS, FreeBSD, Gentoo and Debian. I chose these because I have some experience with Gentoo, FreeBSD and Ubuntu, and because Debian and CentOS are often used for servers that require reliability. Two remarkable LiveCD distributions aimed at quickly building clusters are also covered: PelicanHPC and Cluster By Night. A comparison table can be found at the end of this document.

1. Ubuntu

Ubuntu was announced in September 2004. Although a relative newcomer to the Linux distribution scene, the project took off like no other before it, and its mailing lists soon filled with discussions among eager users and enthusiastic developers. In the few years that followed, Ubuntu grew to become the most popular desktop Linux distribution and has contributed greatly to the development of an easy-to-use, free desktop operating system that competes well with the proprietary ones on the market.

The project was created by Mark Shuttleworth, a charismatic South African multimillionaire and former Debian developer, whose company, the Isle of Man-based Canonical Ltd, currently finances the project. Ubuntu learnt from the mistakes of other similar projects and avoided them from the start: it created an excellent web-based infrastructure with wiki-style documentation, a creative bug-reporting facility, and a professional approach to end users.

On the technical side, Ubuntu is based on Debian "Sid" (the unstable branch), but with some prominent packages, such as GNOME, Firefox and OpenOffice.org, updated to their latest versions. It has a predictable 6-month release schedule, with an occasional Long Term Support (LTS) release that receives security updates for 3 to 5 years, depending on the edition (non-LTS releases are supported for 18 months). Other special features of Ubuntu include an installable live CD, creative artwork and desktop themes, a migration assistant for Windows users, support for the latest technologies such as 3D desktop effects, easy installation of proprietary device drivers for ATI and NVIDIA graphics cards and wireless networking, and on-demand support for non-free or patent-encumbered media codecs. Ubuntu uses APT with DEB packages for software management and has a special Server Edition.
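When planning a cluster it helps to know how long a given release will keep receiving updates. The following is a minimal sketch of the support-window arithmetic described above (6-month cycle, 18 months for non-LTS, 3 to 5 years for LTS). The function and dictionary names are illustrative assumptions, the release date of Ubuntu 9.10 is taken from the comparison table, and results are rounded to the month.

```python
from datetime import date

# Support windows quoted in the text, expressed in months.
SUPPORT_MONTHS = {
    "non-LTS": 18,
    "LTS (3-year edition)": 36,
    "LTS (5-year edition)": 60,
}

def add_months(start: date, months: int) -> date:
    """Return the first day of the month that lies `months` after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

def support_ends(release: date, edition: str) -> date:
    """Approximate month in which support for a release ends."""
    return add_months(release, SUPPORT_MONTHS[edition])

release_9_10 = date(2009, 10, 29)  # Ubuntu 9.10, from the comparison table

print(add_months(release_9_10, 6).strftime("%Y-%m"))            # next release: 2010-04
print(support_ends(release_9_10, "non-LTS").strftime("%Y-%m"))  # 2011-04
```

For a cluster that should run unchanged for several years, the same calculation suggests preferring an LTS release over the regular 6-month releases.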

Pros
• Fixed release cycle and support period
• Beginner friendly
• Good documentation

Cons
• Some of Ubuntu's own software (e.g. Rosetta) is proprietary
• Lacks compatibility with Debian


2. CentOS

CentOS is a community-supported, free and open source operating system based on Red Hat Enterprise Linux (RHEL). CentOS is a 100% compatible rebuild of RHEL, in full compliance with Red Hat's redistribution requirements. The only technical difference between the two is branding: CentOS replaces all Red Hat trademarks and logos with its own. It exists to provide a free enterprise-class computing platform and strives to maintain 100% binary compatibility with its upstream distribution. CentOS stands for Community ENTerprise Operating System and is targeted at people who need the stability of an enterprise-class operating system without the cost of certification and support.

Despite its advantages, CentOS might not be the best solution in all deployment scenarios. Users who prefer a distribution with the latest Linux technologies and newest software packages should look elsewhere. Major CentOS versions, which follow RHEL versioning, are only released every 2 to 3 years, while "point" releases (e.g. 5.1) tend to arrive at 6 to 9 month intervals. The point releases do not usually contain any major features (although they sometimes include support for more recent hardware), and only a handful of software packages may get updated to newer versions. The Linux kernel, the base system and most application versions remain unchanged, but occasionally a newer version of an important software package (e.g. OpenOffice.org or Firefox) may be provided on an experimental basis. As a side project, CentOS also builds updated packages for the users of its distributions, but the repositories containing them are not enabled by default, as they may break upstream compatibility.

Pros
• Based on the commercially most successful enterprise distribution
• Extremely well tested
• Stable and reliable

Cons
• Runs behind the latest software
• Patches can take longer to be released than for RHEL
• Slow release cycle

3. FreeBSD

Although FreeBSD is technically not a Linux distribution, it shares the common Unix foundation. FreeBSD is one of the most widely used operating systems for web servers because of its high reliability, stability under load and highly optimized network stack. The aim of FreeBSD is to produce an operating system usable for any purpose. It is intended to run a wide variety of applications, be easy to use, contain cutting-edge features, and be highly scalable on very high load network servers.

FreeBSD is free and open source, and the project prefers the BSD license. However, the developers sometimes accept non-disclosure agreements (NDAs) and include a limited number of closed-source HAL modules for specific device drivers in the source tree, to support the hardware of companies that do not provide purely open source drivers. To maintain a high level of quality and provide good support for "production quality commercial off-the-shelf (COTS) workstation, server, and high-end embedded systems", FreeBSD focuses on a narrow set of architectures. A significant focus of development over the last five years has been fine-grained locking and SMP scalability. Other recent work includes Common Criteria security functionality, such as mandatory access control and security event audit support.

Pros
• Very robust and reliable
• Updating is easy and well tested
• Great documentation

Cons
• Does not support all architectures
• No easy GUI configuration tools

4. Gentoo

Gentoo Linux is a computer operating system built on top of the Linux kernel and based on the Portage package management system. It is distributed as free and open source software. Unlike a conventional software distribution, the user compiles the source code locally according to their chosen configuration. There are normally no precompiled binaries for software, continuing the tradition of the ports collection, although for convenience some software packages are also available as precompiled binaries for various architectures. The development project and its products are named after the Gentoo penguin.

Gentoo package management is designed to be modular, portable, flexible, and optimized for the user's machine. Gentoo describes itself as a meta-distribution, "because of its near-unlimited adaptability". Gentoo uses a BSD ports-like system called Portage. Portage is a package management system that allows great flexibility while installing and maintaining software on a Gentoo system. It provides compile-time option support (through USE flags), conditional dependencies, a pre-installation summary of packages, safe installation (through sandboxing) and uninstallation of software, system profiles, and configuration file protection, amongst several other features.

With Gentoo you can build your entire system from source, using your choice of optimizations. You have complete control over which packages are or are not installed. Gentoo provides you with numerous choices, so you can install Gentoo according to your own preferences, which is why it is called a meta-distribution. Gentoo is actively developed. The entire distribution uses a rapid-pace development style: patches to the packages are quickly integrated into the mainline tree, and documentation is updated on a daily basis.
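To make the idea of USE flags and conditional dependencies more concrete, here is a toy model in Python. It is not Portage's actual implementation, only a sketch of the principle: a dependency is pulled in only when the matching flag is enabled. The function name, the package names and the flag names are chosen purely for illustration.

```python
def resolve_dependencies(always, conditional, use_flags):
    """Return the dependencies for a package under the given USE flags.

    always      -- dependencies that are needed regardless of flags
    conditional -- mapping of flag name to the extra dependencies it enables
    use_flags   -- set of flags the user has enabled
    """
    deps = list(always)
    for flag, packages in conditional.items():
        if flag in use_flags:
            deps.extend(packages)
    return deps

# Hypothetical package with one unconditional and two flag-dependent dependencies.
always = ["sys-libs/zlib"]
conditional = {
    "ssl": ["dev-libs/openssl"],
    "X": ["x11-libs/libX11"],
}

# A headless compute node would typically leave the "X" flag disabled.
print(resolve_dependencies(always, conditional, {"ssl"}))
# ['sys-libs/zlib', 'dev-libs/openssl']
```

On cluster nodes this kind of selection is what keeps graphical libraries and other unneeded software off the machines.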

Pros
• Excellent package manager
• Supports every architecture
• Fully customizable and very flexible
• Latest software is available fast

Cons
• Updating can be hard and brittle
• Occasional instability
• No official binary packages available


5. Debian

Debian GNU/Linux was first announced in 1993. Its founder, Ian Murdock, envisaged the creation of a completely non-commercial project developed by hundreds of volunteer developers in their spare time. With skeptics far outnumbering optimists at the time, it was destined to disintegrate and collapse, but the reality was very different. Debian not only survived, it thrived and, in less than a decade, it became the largest Linux distribution and possibly the largest collaborative software project ever created!

Debian is distributed with access to repositories containing thousands of software packages ready for installation and use. Debian is known for strict adherence to the Unix and free software philosophies as well as for using collaborative software development and testing processes. It can be used as a desktop as well as a server operating system. The actual development takes place in three main branches (or four if one includes the bleeding-edge "experimental" branch) of increasing levels of stability: "unstable" (also known as "sid"), "testing" and "stable". This progressive integration and stabilization of packages and features, together with the project's well-established quality control mechanisms, has earned Debian its reputation as one of the best-tested and most bug-free distributions available today.

Pros
• Very stable
• Remarkable quality control
• Supports almost every architecture

Cons
• Conservative
• Slow release cycle

LiveCD 1: PelicanHPC - a cluster in a few minutes.

PelicanHPC, formerly named ParallelKnoppix, is a Linux live CD image that lets you set up a high performance computing cluster in a few minutes. A Pelican cluster allows you to do parallel computing using MPI. You can run Pelican on a single multi-core machine to use all cores to solve a problem, or you can network multiple computers together to make a cluster. The frontend node (either a real computer or a virtual machine) boots from the CD image. The compute nodes boot by PXE, using the frontend node as the server. All of the nodes of the cluster get their filesystems from the same CD image. Packages can be added to the frontend node on the fly, thanks to aufs. The CD image is created by running a single script, which takes advantage of the Debian Live infrastructure. It is very easy to create a custom version with new packages installed in standard locations by adding package names to the script and then running it.

PelicanHPC is developed by Michael Creel, a professor of economics at the Autonomous University of Barcelona, Spain. Linux Magazine published a cover article on PelicanHPC in June 2009: http://www.linux-magazine.com/w3/issue/103/030-035_pelicanHPC.pdf

• The LAM-MPI and OpenMPI implementations of MPI are installed. Both 32 and 64 bit versions are available. Debian testing (Lenny) is the base for both.
• Contains extensive example programs using GNU Octave and MPITB. Also has the Linpack HPL benchmark.
• You can use any Class C network you like. By default, the cluster is on 10.11.12.*
• Contains xfce4 window manager, konqueror for browsing and file management, ksysguard for monitoring the cluster, kate and nano for editing. As noted, it is very easy to add packages. Pelican is a bare-bones framework for setting up an HPC cluster.
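To get a feeling for what the default 10.11.12.* network means for cluster size, the address range can be inspected with Python's standard `ipaddress` module. This is only a sketch: writing the network as 10.11.12.0/24 is my reading of "Class C", and the helper name `node_addresses` is made up for the example. How PelicanHPC actually hands out addresses to the frontend and compute nodes is not shown here.

```python
import ipaddress
from itertools import islice

# Default PelicanHPC network from the text, written as a /24 network.
cluster_net = ipaddress.ip_network("10.11.12.0/24")

def node_addresses(network, count):
    """Return the first `count` usable host addresses of `network` as strings."""
    return [str(host) for host in islice(network.hosts(), count)]

# Number of usable host addresses (network and broadcast address excluded).
print(len(list(cluster_net.hosts())))   # 254

# Candidate addresses for the first three machines on the cluster network.
print(node_addresses(cluster_net, 3))   # ['10.11.12.1', '10.11.12.2', '10.11.12.3']
```

A single network of this size therefore leaves room for a little over 250 machines, which is far more than a typical classroom or lab cluster needs.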

Pros
• Have a cluster ready in minutes!
• Based on Debian Live: easy to build a custom image
• Active development and forums

Cons
• Requires modifying the network
• No hard disk installation

LiveCD 2: Cluster By Night - a temporary cluster in minutes.

A similar project called “Cluster By Night” can be used on a network of computers without modifying the network (i.e. cluster nodes can receive their IP addresses from the existing DHCP server). It runs Open-MPI 1.3.3 and is distributed as two 14 MB live-booting ISO images. This makes it very useful for turning a few idle computers into a temporary cluster. The website includes a screencast in which the developer turns a couple of university computers into a cluster to run a DNA sequencing program, all in less than eight minutes. Cluster By Night (CBN) is based on Tiny Core Linux to keep it really small; it uses Bash scripts to build the images and Ruby scripts for maintenance and for starting the cluster.

Pros
• Easy to modify to your needs
• Temporarily bundle existing idle systems together as a cluster

Cons
• Does not have many features
• No hard disk installation
• Only one developer


Comparison of Linux distributions

                      Ubuntu            CentOS              Gentoo                 FreeBSD              Debian             PelicanHPC   CBN
Current version       9.10              5.4                 10.1                   7.2                  5.0.3              1.9.2        0.1
Released on           29/10/2009        21/10/2009          10/10/2009             04/05/2009           05/09/2009         19/10/2009   02/09/2009
Kernel version        2.6.31            2.6.18              2.6.30.5               BSD kernel           2.6.26             2.6.26       2.6.29.1
Base distribution     Debian            Red Hat Enterprise  independent            independent          independent        Debian Live  Tiny Core Linux
Package manager       apt               yum / up2date       portage                ports                apt                --           Extensions
Package format        DEB               RPM                 ebuilds                                     DEB                DEB          TCE
Package type          binary            binary              source                 binary and source    binary and source  --           --
Release cycle         Stable: 6 months  Follows RHEL        Unstable / continuous  Unstable: ~6 months  Slow: 1~3 years

Author’s Subjective and Opinionated Scoring

                      Ubuntu   CentOS   Gentoo   FreeBSD  Debian
Support               ++       -        +        +        -
Documentation         +++      +        +        +++      -
Reliable / Stability  ++       ++       -        +++      +++
Flexible / Customize  -        -        +++      +        +
Maintainable          ++       +        --       ++       +
Newest Software       ++       +        ++       -        -
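For readers who want a single number per system, the plus and minus marks of the scoring can be added up. The sketch below is my own convenience calculation and not part of the original scoring: it assumes that each "+" counts as +1 and each "-" as -1, that all six criteria weigh equally, and that the five scores in each row belong to Ubuntu, CentOS, Gentoo, FreeBSD and Debian in that order.

```python
# Scores transcribed from the author's table.
# Assumed column order: Ubuntu, CentOS, Gentoo, FreeBSD, Debian.
DISTROS = ["Ubuntu", "CentOS", "Gentoo", "FreeBSD", "Debian"]
SCORES = {
    "Support":              ["++", "-", "+", "+", "-"],
    "Documentation":        ["+++", "+", "+", "+++", "-"],
    "Reliable / Stability": ["++", "++", "-", "+++", "+++"],
    "Flexible / Customize": ["-", "-", "+++", "+", "+"],
    "Maintainable":         ["++", "+", "--", "++", "+"],
    "Newest Software":      ["++", "+", "++", "-", "-"],
}

def to_number(mark):
    """Convert a mark such as '++' or '--' into a signed integer."""
    return mark.count("+") - mark.count("-")

def totals(scores=SCORES):
    """Sum the numeric value of every criterion for each system."""
    result = {name: 0 for name in DISTROS}
    for marks in scores.values():
        for name, mark in zip(DISTROS, marks):
            result[name] += to_number(mark)
    return result

for name, total in sorted(totals().items(), key=lambda item: -item[1]):
    print(f"{name:8s} {total:+d}")
```

With equal weights Ubuntu and FreeBSD come out on top. For a cluster one would probably weight stability and maintainability more heavily than the availability of the newest software, which can change the ranking.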


Comparison of Linux Distributions  

A Comparison of Linux Distributions for a Clustering Course.
