

Vision Systems Look to ATCA for Power

The magazine of record for the embedded computing industry

July 2007


Multi-Core Rises to Challenge DSPs Linux Muscles in on More Territory

An RTC Group Publication



Small Boards Doing Big Jobs



18 Blade with two 3.4 GHz Xeon processors

Virtex-5 FPGA with ANSI C on a PMC Card

StackableUSB Boards



Editorial: What Is it About Linux?
Industry Insider: Latest developments in the Embedded
Products & Technology: Newest Embedded Technology used by Industry Leaders

Technology in Context: Vision and Inspection Systems
Advanced Vision Platforms Based on AdvancedTCA Architecture
Stephen Huang, Adlink Technology

Solutions Engineering: PC/104
New Stacking for PC/104: USB Grows Up - and Up and Up
Jim Turley, Micro/sys

Data Acquisition
PC/104 for Industrial Applications
Robert A. Burckle, WinSystems

Industry Insight: DSP vs. Multicore Systems
DSP in Processor Architecture and x86 – Getting Past the Hype
Brian Peebles, Dialogic

Software & Development Tools: Linux
Linux and FOSS: End-to-End (and Top-to-Bottom, Too)
Bill Weinberg, LiPS Forum & Linux Pundit

Featured Products
Design Kit Fires Up DSP Development: VMetro Combines Virtex-5 FPGA with ANSI C on a PMC Card
80 DSP Cores on a Single ATCA Blade: RadiSys Packs Quad Cores on Modules with 10 GigE

Cover Photo: With analog input, analog output, and digital I/O combined on a single PC/104 board, WinSystems’ PCM-MIO provides Automated Guided Vehicles (AGV) with the required functions for high-accuracy closed loop motor control. Shown here is an automatic vehicle for handling paper in a printing plant.

Editorial

What Is it About Linux? by Tom Williams, Editor-in-Chief


Here’s a riddle for you to use to amuse your friends: What do Windows CE, Java and Linux all have in common? Answer: Not one of them was originally conceived for use in real-time and embedded systems.

No, Windows CE was originally designed for use in a generation of half-laptop-size machines that never took off in the market. It then went through a couple of versions before it became the very popular embedded operating system we know today. Java, of course, was to be the “write once run anywhere” programming paradigm, which by virtue of its virtual machine seemed to disqualify it from use in embedded and real-time applications. Linux arose out of the frustration of the programming community with the fragmentation and proprietary turf wars around Unix. It was to be “Unix the way we always wanted it to be.” Unix was at the time a very desktop-oriented operating system with a number of disparate attempts to give it real-time characteristics.

Of these three software entities, two are largely controlled by single companies: Microsoft and Sun Microsystems. As such, their evolution has tended to be guided by the interests and perspectives of those companies. Even “independent” Java vendors must be guided by much of what Sun decides. There are variations among kernel code for processor support and even supersets and subsets of APIs, which result in programming paradigms that have a single name but many variants under that one tent.

Linux is different, if not absolutely so. It did not arise from a corporate womb; it was started by a guy who wanted to do it because it seemed a right and useful thing to do. He did not claim to be the single fount of wisdom and therefore did not try to own it all or control every aspect of its development. The result has been a path of evolution of almost biological character with Linus Torvalds as the “intelligent designer” who knew when to keep his mitts off it and when to make pronouncements about it.

That benign shepherding has spawned a community of highly motivated, intelligent and enthusiastic participants who have not only refined Linux but also made it possible for it to migrate from the desktop and mainframe worlds into telecommunications systems, mobile phones and myriad embedded devices, largely with a single code base that spans processor architectures and application domains.

Of course there are differences among commercial distributions, which are actually open source code that has been brought together, verified and tested and is given support by the Red Hats, MontaVistas and SuSes of this world. What you are buying with one of these distributions is not a license to someone’s proprietary code, but the assurance that the code that is available for free has been checked out to work together and that there is a number to call for support, so you don’t have to start from scratch.

Linux has survived dark threats by companies that claimed original rights to it and by others who claimed that they had found their own misappropriated source code in the listings and whose legal action would bring this whole unruly gaggle of software fanatics to heel and impose respectable fees. Harrumph! Still, Linux goes on and is today doubtless so valuable to the growing new world software infrastructure that such efforts will never get anywhere.

Then there’s that penguin. “Serious” marketing people have dismissed it as silly or frivolous or cutesy, but little chubby Tux is the perfect symbol for what the Linux community has been about. Anyone who develops a Linux-based product is free to use the penguin at no charge. The origin of Tux is said to be connected to Linus Torvalds’ affection for penguins. Here again, we have initiation without insistence on control, just influencing direction. Today, Tux is universally recognized as the symbol for Linux, and all this was done without charge by Larry Ewing using GIMP (GNU Image Manipulation Program), which comes with many GNU/Linux distributions.

And yet for all this seemingly free and easy attitude, Linux developers are creating compelling applications and products, many are becoming quite prosperous, and Linux itself is becoming an indispensable part of the world’s technology infrastructure.



Industry Insider


PICMG Firmware Upgrade Capability Supports ATCA, AMC and MicroTCA

PICMG has released a new specification that defines an open mechanism for systems to upgrade the resident management software and firmware on the various components and subsystems. Designated PICMG HPM.1, the specification was developed by the existing PICMG 3.0 subcommittee but released as a separate specification so it can also be applied immediately to other PICMG-defined architectures, specifically Advanced Mezzanine Cards (AMC) and MicroTCA systems. This specification is the first from PICMG to augment the hardware platform management (HPM) layer of all three architectures in a single document.

The Intelligent Platform Management Interface specification, which ATCA uses as the basis for its hardware platform management infrastructure, does not provide any generic mechanism for upgrading management controller firmware. This new specification defines an advanced architecture and corresponding interfaces so that a single upgrade agent can update the firmware in the many management controllers of an entire system, even if the modules in the system come from different vendors.

“HPM.1 adds very useful functionality to ATCA and MicroTCA systems, along with the AMC modules they include,” said Mark Overgaard of Pigeon Point Systems and the Chair of the HPM.1 effort. “This framework will allow system integrators to have a single set of tools for upgrading all the IPM controllers in an entire system. In addition, the HPM.1 architecture ensures that all new compliant field-replaceable units will be automatically supported by the upgrade tools compliant with this specification, and vice versa,” he added.

HPM.1 will be provided free to PICMG members and is available for purchase by non-members. More information, including product listings, can be found at

IBM to Acquire Telelogic, Become 800-Pound UML Gorilla

IBM has entered into a definitive agreement to acquire Telelogic at an offer price of 21 Swedish Kronor per share or approximately US $745 million, subject to regulatory reviews and other customary closing conditions. Telelogic is a public company headquartered in Malmo, Sweden. Upon completion of the acquisition (expected to close Q3 2007), Telelogic will be a business line within the IBM Rational Software unit.

Last year, Telelogic acquired embedded software modeling tool rival I-Logix and has since worked to integrate the company’s Rhapsody product into Telelogic’s existing suite of Application Lifecycle Management (ALM) products, including the popular DOORS requirements management solution. After that acquisition, the competitive landscape featured a growing Telelogic looking to challenge IBM in this segment as both a leading provider of UML tools for embedded software development and a vendor of an integrated suite of ALM tools that could be competitive with IBM’s larger product offering.

According to analysis by Venture Development Corporation (VDC), within the software modeling tools market, IBM’s acquisition of Telelogic will surely change the competitive landscape. The combined company will become the clear market leader in UML tools within the embedded space, and VDC believes that there will be few challengers able to match IBM Rational/Telelogic in terms of revenue, breadth of product offering, global reach and consulting services.

Perhaps more importantly, in addition to securing leadership in the embedded software modeling tools market, the acquisition also strengthens the positioning of the company’s larger ALM offering across both the embedded and enterprise markets. IBM Rational/Telelogic will now have a broad set of complementary market solutions to offer to their diverse customer base and will likely look to leverage solutions and services from both companies across specific target markets that play to each solution’s strength. The acquisition will provide greater opportunity to deliver integrated products to shared customers, especially within the military/aerospace, automotive/transportation and telecom/datacom industries.

Specification for Multiple-Interface Memory Cards

The MultiMediaCard Association (MMCA) has announced a specification for the new miCARD, a 12 mm x 21 mm x 1.95 mm storage card designed for easy data interchange between MMC and USB devices. When used in portable devices such as cameras, smart phones and PDAs, miCARD takes advantage of the low power consumption and high-performance characteristics of the MMC interface. The miCARD then allows consumers to transfer that media-rich content to PCs, printers and home entertainment appliances by inserting the card directly into those devices’ existing USB connectors, without the need for dedicated card slots or separate card readers.

Preserving the performance and ease of use consumers currently experience with USB 2.0-compliant devices, miCARD will transfer data at speeds up to 480 Mbits/s, with full electrical, mechanical and software compatibility. It is the first memory card to combine the features of the MMC System Specification v4.2 and USB 2.0, the most successful interface in the world.

Initially, passive mechanical adaptors will be available to convert the miCARD for use in many of today’s CE products that accept full-size MMC cards. In the future, portable devices will be able to take advantage of miCARD’s smaller size by offering slots that accept miCARD directly. No change is needed for miCARD compatibility with existing USB Type-A ports; consumers can simply plug and play.

Linux to Launch into Space

Wind River Systems has been selected by Honeywell Aerospace to support the development of NASA’s New Millennium Program Space Technology 8 (ST8) Dependable Multiprocessor. The contract marks the first time a Linux platform has been selected by Honeywell for a space mission. Honeywell Aerospace is the prime contractor for NASA’s ST8 Dependable Multiprocessor project. Wind River’s Platform for Network Equipment, Linux Edition, will be the underlying operating system to support the processing of science and experiment data on board the ST8 spacecraft.

The Dependable Multiprocessor will create a new generation of “smart” spacecraft and robotics for future exploration missions conducted by NASA. Composed of a COTS-based supercomputer architecture capable of incorporating both on-chip and FPGA-based algorithmic coprocessors, Dependable Multiprocessor technology can autonomously and adaptively configure the level of fault tolerance applied to the COTS-based computer system in response to constantly changing mission environments and the criticality of the mission application. The Dependable Multiprocessor will allow the spacecraft to process and analyze its own data to make instant decisions about what is observed without having to send the information to Earth and wait for a reply.

Any material put into space is subject to variable accelerations, mechanical shock and vibration, harsh vacuum conditions, extreme temperatures and often intense particle and electromagnetic radiation. Wind River Platform for Network Equipment, Linux Edition, running in conjunction with GoAhead SelfReliant Software, which provides high-availability middleware, and Honeywell’s Dependable Multiprocessing Middleware on Extreme Engineering Solutions’ XPedite6031 boards, will support the demonstration of high-availability and high-reliability operation for the ST8 Dependable Multiprocessor experiment.

The ST8 mission is scheduled for launch in November 2009, with an expected duration of at least seven months consisting of two phases: a one-month commissioning phase and a six-month experiment phase. The mission consists of four independent experiments, including the Dependable Multiprocessor, on a common spacecraft bus being provided by Orbital Sciences Corporation. The Dependable Multiprocessor experiment will validate a computer system architectural approach that allows application flexibility by applying robust control of the high-performance COTS cluster, enhanced software-based Single Event Upset (SEU) tolerance, and user-selectable redundancy only to the level required by the environment and the criticality of the task or computation.
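The idea of applying redundancy "only to the level required" by a task's criticality can be pictured with a small sketch. This is a generic illustration of replicated execution with majority voting, not the ST8 or Honeywell middleware design, which the news item does not describe; the criticality names, replica counts and helper functions below are all hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable

# Hypothetical mapping from task criticality to number of replicated runs.
REPLICAS_BY_CRITICALITY = {"low": 1, "medium": 2, "high": 3}


def run_redundant(task: Callable[[], Hashable], replicas: int) -> Hashable:
    """Run `task` `replicas` times and return the strict-majority result.

    Raises RuntimeError when no result wins a strict majority, which is
    how a disagreement is detected but not corrected (e.g. 2 replicas).
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    results = [task() for _ in range(replicas)]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= replicas // 2:
        raise RuntimeError("no majority among replicas")
    return value


def make_flaky(correct: int, wrong: int, faulty_calls: set) -> Callable[[], int]:
    """Build a task that returns `wrong` on the listed call numbers.

    Stands in for a computation disturbed by a single event upset.
    """
    state = {"calls": 0}

    def task() -> int:
        state["calls"] += 1
        return wrong if state["calls"] in faulty_calls else correct

    return task


if __name__ == "__main__":
    # One upset on the second of three runs is outvoted by the other two.
    task = make_flaky(correct=42, wrong=43, faulty_calls={2})
    print(run_redundant(task, REPLICAS_BY_CRITICALITY["high"]))
```

With three replicas a single corrupted run is masked; with two it can only be detected, and with one it goes unnoticed, which is the cost trade-off that makes a selectable level attractive.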

PICMG Forms Rugged MicroTCA Subcommittee

PICMG has formed the new Rugged MicroTCA subcommittee to investigate and define enhancements to the MicroTCA and AdvancedMC definitions. The enhancements will enable products to be used in markets where environmental requirements may be much harsher than the telecommunications market, the first target of these specifications. The committee is called the Rugged MicroTCA subcommittee but it will also address Advanced Mezzanine Cards, which are the building blocks of MicroTCA systems. The target markets for Rugged MicroTCA are:
• Commercial and military applications including airborne, shipboard and ground mobile equipment
• Telco Industry Customer Premise Equipment and Remote Access (such as roadside or pole mounted, no fans)
• Machine Industry (Rotating machine mounted; no fans; vibration)
• Transport Industry (Railway; truck, ship, aircraft mounted)
• Traffic control (roadside, no fans)
• Security (remote access, no fans)

It is likely that there will not be a single solution for all these markets. The committee expects its deliverables will be “dot” specifications that layer on top of the base MTCA.0 specification and contain the environmental enhancements needed for a specific set of market applications. These are likely to include specifications for air cooling, conduction cooling and shock and vibration enhancements.

“MicroTCA has created a huge buzz. There is a large demand to address these issues so it can be used in many more markets,” said Mike Franco, the chair of the new subcommittee. PICMG will publish backgrounders and other documents from the subcommittee on the “Resources” page of www. as work progresses.

VITA Secures ANSI Re-Accreditation, Modifies Patent Policy

VITA, the VMEbus International Trade Association, has announced that it has been re-accredited by ANSI effective May 22, 2007. In January, VITA had submitted revised patent disclosure policies and standards procedures to ANSI. The revisions ensure that VSO participants disclose patents that are essential to implementing a new standard and that the participants openly declare the most restrictive terms required to license any such patents. The new requirement to declare the most restrictive licensing terms is intended to make the changed policy fair, reasonable and non-discriminatory. ANSI has approved these revised standards procedures with minor modifications.

VITA has been exploring more effective patent disclosure procedures for several years. These explorations led VITA to query the Department of Justice about how to develop better procedures. During the first half of 2006, VITA and its board of directors developed new patent disclosure procedures for use by the VSO. The new patent policy was submitted to the Department of Justice on June 8, 2006, for their review. After numerous meetings to gain clarification, slight changes were made. The Department of Justice issued their positive business review letter on October 30, 2006. The VITA members and board of directors approved the changes in January of this year. ANSI’s approval of the procedure changes now completes the process for ANSI re-accreditation of the VITA standards efforts.

Technology in Context: Vision and Inspection Systems

Advanced Vision Platforms Based on AdvancedTCA Architecture

A solution for advanced machine vision that meets the demand for advanced technologies, customization and application needs uses the ATCA architecture to reduce operating expenses and space consumption.

by Stephen Huang, Adlink Technology


quality assurance, which is top priority no matter what the industry or application. AOI systems involve acquiring images into a computer, converting the images into usable formats, adjusting the images to the desired views, and calculating the appropriate data that represent the images to perform quality inspections.

When specifically applied to a manufacturing environment, an AOI system must satisfy not only the requirements for high speed, high resolution, 24-hour operation and repeatability of measurements, but also for automatic identification, tracking and quality assurance throughout the entire production process. Some featured AOI systems with customized DSPs and FPGAs give basic solutions for translating results and locating defects.

With advanced optical inspection technology, users are able to recover more of the good product (higher yield) and remove a higher percentage of defective product (quality control) than the manual sorting and defect removal methods historically used by many industries. In the wood panel industry, for example, increasing the number of decisions made by a machine vision system can also result in increased throughput, higher yield and more accurate product grading with fewer line workers, aiding a company’s bottom line by cutting costs. AOI systems can add significant value in manufacturing environments where processing is highly variable, by improving the uniformity of finished products.

However useful, without the proper computer architecture and controls, AOI systems run the risk of longer development times and effort for the integration of DSPs and FPGAs across multiple programming languages. Therefore, these solutions require higher software and hardware investments related to the acquisition and adoption of new DSP technologies. A solution for an advanced AOI application involves a platform that delivers an optimum balance between product capacity and cost ratio in relation to processing and input/output support, easy programmability, customization and maintenance.

Figure 2. AVP Blade Functional Block Diagram. Camera interface and camera control are implemented on a PMC module, which interfaces to the ATCA AVP platform. (Block labels: PMC Module; Camera Control; Image Rearrangement; Encoder.Trigger; Customizable Processing FPGA; Encoder I/O FPGA; Control; 4MB Flash; DMA Control; PCI-X Bridge FPGA; DDR2-400 REF/ECC; PCI-X 64b/133M; Intel 6700PXH; Intel 82546GB; PCI-X 64b/66M.)

AVP Platform Meets High-End Requirements

Although primarily designed for next-generation telecom applications, some benefits of Advanced Telecom Computing Architecture (ATCA) are found to be of great advantage in solving the current problems of high-end machine vision systems. The open architecture saves development time and associated costs, while ATCA systems equipped with PMC card expansion enhance system flexibility. By integrating customized PMC cards and utilizing innovative Gigabit Ethernet and shelf management technologies, a machine vision system based on ATCA ably meets all performance requirements in a compact and high-density, multi-blade rackmount system.

A high-performance CPU blade with a PCI-X PMC module forms a single advanced CPU platform. A PMC module can interface with several kinds of camera interface standards including Camera Link, FireWire and the Gigabit Ethernet interface cameras that are becoming increasingly popular. With the proper combination of these technologies and a flexible architecture, a PMC module virtually eliminates the risk of obsolescence by ensuring backward and forward compatibility to keep systems performing at peak as technology advances, assuring users maximum long-term return on their investment.

and storing high-density image data. Such blade servers represent a shift away from traditional proprietary machine vision system integrators by alleviating the need for a DSP and utilizing the power and performance of the ATCA architecture.

For AOI applications, an AVP blade must utilize a high-performance CPU to process data at an optimal level while performing complex morphological operations. With a wide memory bandwidth and capacity, AVP blade technology further aids users by efficiently storing large amounts of image data that can be accurately tracked and debugged if necessary. Another essential item in building a cost-effective and space-saving platform for AVP is a comprehensive I/O density that provides greater flexibility and manageability. This will enable it to support up to four Camera Link connectors for dual-channel output, VGA, GPIO, LAN, and two USB ports on the front panel.

The AVP blade supports a Camera Link or IEEE 1394 interface built on a PMC module and installed on two onboard 64-bit/133 MHz PCI-X slots. With an integrated CompactFlash card slot, developers can conveniently build an OS image to boot the system. A rear transition module provides the extra 2.5-inch SATA HDD storage that integrators need to store images for further analysis.

The onboard PMC module of the AVP blade for a high-performance Camera Link or IEEE 1394 interface supports high-speed image data transfer. Write/read wrappers around the FPGA manufacturer’s IP core were developed so that it can achieve the benefits of a high-capacity system, allow users to define the bus width of each write/read port, and enable calculation efficiency. Through the onboard high-density, customizable, FPGA-based processing core, each channel supports image data transfer rates of up to 640 Mbytes/s with an acquisition pixel clock rate of up to 85 MHz.

A standard platform requires only a customized PMC card for full functionality. Specialized AMC cards are not necessary. The platform shortens application development and system upgrades while allowing system integrators (SIs) to implement their proprietary FPGA know-how for pre- or post-image processing.
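As a rough sanity check on the figures quoted above (an 85 MHz pixel clock, up to 640 Mbytes/s per channel, and 64-bit/133 MHz PCI-X slots), the arithmetic can be sketched as follows. The bytes-per-clock values assume the usual Camera Link Base, Medium and Full configurations (24, 48 and 64 bits per clock), which the article does not state, and the PCI-X figure is a theoretical peak that ignores protocol overhead.

```python
# Rough bandwidth check for one camera channel feeding a PCI-X slot.
# Assumption: bytes-per-clock values follow the usual Camera Link
# Base/Medium/Full configurations; they are not given in the article.
CAMERA_LINK_BYTES_PER_CLOCK = {"base": 3, "medium": 6, "full": 8}

# Per-channel transfer rate quoted in the article, in Mbytes/s.
QUOTED_CHANNEL_MBYTES_PER_S = 640


def camera_link_mbytes_per_s(pixel_clock_mhz: float, config: str) -> float:
    """Peak payload rate in Mbytes/s for a pixel clock and configuration."""
    return pixel_clock_mhz * CAMERA_LINK_BYTES_PER_CLOCK[config]


def pci_x_peak_mbytes_per_s(bus_width_bits: int, bus_clock_mhz: float) -> float:
    """Theoretical PCI-X peak: one bus-width transfer per clock, no overhead."""
    return bus_width_bits / 8 * bus_clock_mhz


def fits(stream_mbytes_per_s: float, bus_mbytes_per_s: float) -> bool:
    """True when the stream rate does not exceed the bus's theoretical peak."""
    return stream_mbytes_per_s <= bus_mbytes_per_s


if __name__ == "__main__":
    bus = pci_x_peak_mbytes_per_s(64, 133)
    for name in CAMERA_LINK_BYTES_PER_CLOCK:
        rate = camera_link_mbytes_per_s(85, name)
        print(f"{name:6s}: {rate:5.0f} Mbytes/s, fits PCI-X 64b/133M: {fits(rate, bus)}")
    print(f"quoted: {QUOTED_CHANNEL_MBYTES_PER_S:5.0f} Mbytes/s, "
          f"fits PCI-X 64b/133M: {fits(QUOTED_CHANNEL_MBYTES_PER_S, bus)}")
```

Under these assumptions a single channel at the quoted rate stays below the slot's theoretical peak, while two such channels sharing one slot would not, which is one way to read the pairing of the PMC module with two onboard PCI-X slots.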

As manufacturing environments evolve, they produce items that are delicate or susceptible to contamination, such as integrated circuits and temperature-sensitive devices, as well as pharmaceutical products that may require measurement methods that are noncontact and nonintrusive. Vision inspection is also advantageous in processing applications where safety is a factor, such as parts made from hazardous materials. Properly fitted AOIs not only add value through improved efficiency in many highly technical processes, but also allow for on-demand statistical collection and the ability for real-time feedback in manufacturing processes. The power, flexibility and connectivity of AVP systems have become almost necessary to achieve 100% inspection with high throughput at relatively low cost.

Adlink Technology, Irvine, CA. (866) 423-5465.

Featured Products

Design Kit Fires Up DSP Development

VMetro Combines Virtex-5 FPGA with ANSI C on a PMC Card

VMetro and Impulse Accelerated Technologies have announced a DSP development kit, the V5+C, for rapid prototyping and algorithm development. The kit, which includes the latest-generation Impulse C-to-VHDL compiler tools and a VMetro PMC module based on the Xilinx Virtex-5 FPGA, allows system developers to hardware-accelerate DSP algorithms and quickly prototype on an FPGA within an ANSI C environment.

The Impulse C compiler helps bring more software developers into FPGA technologies with the ability to automatically analyze, optimize and translate the original C code to run in parallel on an FPGA to take advantage of the inherent parallel capabilities of FPGA devices. The result is an FPGA solution that outperforms most DSPs or processors. Using the Impulse tools in combination with Xilinx FPGA devices, users can expect to achieve 10x acceleration over their existing microprocessor-based solutions.

The VMetro PMC-FPGA05 included in the V5+C kit is a PMC module with a large-capacity Xilinx Virtex-5 XC5VLX110 FPGA and customizable digital front-panel I/O. The FPGA is boosted by multiple banks of QDR and DDR memory. The PMC-FPGA05 was designed for embedded DSP applications where there is a need for flexible, customizable I/O and FPGA processing on the data stream. There are a number of I/O adapter modules available from VMetro that can be plugged onto the PMC-FPGA05. Customers can also develop their own I/O adapter modules to meet their custom I/O requirements.

The V5+C kit extends the ability to develop DSP algorithms in ANSI C by enabling C applications to directly interface to VMetro-provided hardware IP blocks, eliminating the need for developers to use VHDL to interface their hardware-accelerated algorithms to VMetro’s IP blocks for the Virtex-5 FPGA. The Impulse C Platform Support Package (PSP) provides C-callable interfaces to the VMetro PMC-FPGA05 PMC module’s PCI-X bus, the QDR and DDR memories, and more.

The V5+C kit includes a PMC-FPGA05 on a PCI-X carrier card along with the Impulse C tools and PSP. Early adopters of this kit receive special factory training and design support on their first algorithm. The discount price for early adopter customers is $9,995.

VMetro, Houston, TX. (281) 584-0728.
Impulse Accelerated Technologies, Kirkland, WA. (425) 605-9543.

80 DSP Cores on a Single ATCA Blade

RadiSys Packs Quad Cores on Modules with 10 GigE

An Advanced Telecommunications Computing Architecture (ATCA) digital signal processing (DSP) blade is targeted at providing telecommunications equipment manufacturers (TEMs) a way to achieve a low cost-per-port for next-generation VoIP, media processing and media gateway solutions. With an architecture that hosts up to 20 multicore MSC8144 DSPs from Freescale Semiconductor, the Promentum ATCA-9100 Media Resource Module from RadiSys is designed to provide system designers with a time-to-market advantage and ensure their ability to handle future requirements.

Based on a modular design concept that incorporates mezzanines to host DSP “farms,” the ATCA-9100 enables a smooth transition from one generation of DSP to another without a complete overhaul of the blade. RadiSys has developed a proprietary mezzanine form-factor and connector instead of using the AMC form-factor normally associated with ATCA. The current card, for which there are two sites on the ATCA-9100, hosts ten quad-core MSC8144s, for a total of 80 DSP cores on the blade. Upgrading to next-generation DSPs can be done by installing new mezzanines rather than replacing the ATCA blade. RadiSys is working with Freescale and its DSP roadmap to provide an ongoing upgrade path.

Ten Gigabit Ethernet fabric connectivity and direct Ethernet access to DSPs give the Promentum ATCA-9100 superior packet and media processing capabilities. Additionally, the module includes Serial Rapid IO (SRIO) switching and support for easy debugging of DSP code, helping to avoid costly project delays that may result from problems that are difficult to debug. The ATCA-9100 provides a complete solution and eases TEM application development with the inclusion of onboard carrier-grade Linux, switching software, blade management software and other APIs. TEMs can either leverage their existing DSP code from prior generations or leverage off-the-shelf DSP software to address their product DSP software needs.

“The need for high-performance solutions that address the demands of TEMs developing applications requiring media processing continues to grow and is stronger than ever,” said Jeff Timbs, marketing director for Freescale’s Networking System Division. “Working with RadiSys, we’re breaking new ground with multicore technology that reaches an outstanding level of performance density for high-capacity infrastructure applications as well as reducing total system cost, board space and power dissipation.”

The Promentum ATCA-9100, with such high-density capability, reduces the hardware footprint in the central office by over 50 percent while attaining the same capacity of media processing as conventional solutions. Additionally, the ATCA-9100 is fully optimized for the RadiSys Promentum SYS-6010, the company’s 10 Gigabit ATCA platform. The SYS-6010 has been adopted for a multitude of customer applications including Radio Network Controllers, Media Gateways, IPTV media routers, Security Gateways and IMS application servers.

Freescale’s MSC8144 takes single-chip DSP integration to a higher level. Combining four StarCore DSP cores at up to 1 GHz each, the device is designed to deliver gigahertz performance, equivalent to a 4 GHz single-core DSP. Additionally, it integrates one of the industry’s largest embedded memories (at 10.5 Mbytes) in a single package, virtually eliminating the need to attach external memories while maintaining a highly competitive cost and power per channel.

RadiSys, Hillsboro, OR. (503) 615-1100.
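The blade-level totals follow directly from the per-device figures quoted in the item; the short sketch below only multiplies them out. The class and field names are illustrative, the aggregate clock is a simple sum of core clocks in the same spirit as the article's "equivalent to a 4 GHz single-core DSP" comparison, and none of this is a measured performance figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DspBlade:
    """Per-blade figures taken from the article; names are illustrative."""
    mezzanine_sites: int
    dsps_per_mezzanine: int
    cores_per_dsp: int
    max_core_clock_ghz: float
    embedded_memory_per_dsp_mbytes: float

    @property
    def total_dsps(self) -> int:
        return self.mezzanine_sites * self.dsps_per_mezzanine

    @property
    def total_cores(self) -> int:
        return self.total_dsps * self.cores_per_dsp

    @property
    def aggregate_core_clock_ghz(self) -> float:
        # Simple sum of core clocks, not a throughput measurement.
        return self.total_cores * self.max_core_clock_ghz

    @property
    def total_embedded_memory_mbytes(self) -> float:
        return self.total_dsps * self.embedded_memory_per_dsp_mbytes


# Two mezzanine sites, ten quad-core MSC8144s per card, up to 1 GHz per
# core, 10.5 Mbytes of embedded memory per device.
ATCA_9100 = DspBlade(2, 10, 4, 1.0, 10.5)

if __name__ == "__main__":
    print("DSPs per blade:      ", ATCA_9100.total_dsps)
    print("DSP cores per blade: ", ATCA_9100.total_cores)
    print("Summed core clock:   ", ATCA_9100.aggregate_core_clock_ghz, "GHz")
    print("Embedded memory:     ", ATCA_9100.total_embedded_memory_mbytes, "Mbytes")
```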


Solutions Engineering: PC/104

New Stacking for PC/104: USB Grows Up – and Up and Up

Speeding past bus limitations for a proven and popular form-factor, the addition of stackable USB connectivity to PC/104 keeps real estate small, vastly increases bandwidth and makes the system processor-independent.

by Jim Turley, Micro/sys

Figure 1. A stack of StackableUSB boards. A stack can include up to 16 boards.

are hopelessly tied to obsolete microprocessors or interfaces. Apple’s Macintosh has evolved through three different microprocessor families, and PC expansion cards change interfaces every few years.

Out with the Old, In with the New One of the more popular modular board standards is PC/104, with a name that makes its ancestry clear. It’s a 104pin embedded version of the IBM PC’s

old expansion bus. PC/104 boards measure about 3.5 inches on a side and they stack neatly one atop the other like high-rise electronic office buildings. The stack shares a single bus, sort of a vertical implementation of an IBM PC/AT’s motherboard. With its compact size and PC compatibility, PC/104 became very popular indeed among developers who didn’t want to create their own boards from scratch. As useful as it is, PC/104 is also inextricably tied to that old PC/AT motherboard design. Now that the standard is set, it’s counterproductive to change it, even though the PC expansion bus is long gone. Called ISA at the time (for industry-standard architecture), it’s now anything but. PCs and PC silicon have

make it popular with users and developers alike. In a moment of Frankensteinian genius, a group of board makers combined their favorite parts of PC/104 with features of USB to create StackableUSB. The evocative, if uncreative, name describes it pretty well. It’s a version of the USB standard created for modular board-level systems. In fact, it’s mechanically identical to PC/104 but replaces the wheezing PC/AT bus with the more modern USB interface. Same body, different brain. Okay, so USB is hardly brand spanking new, but while ISA has one foot in the grave, USB is still growing. New PCs haven’t included ISA expansion slots for












Figure 2

USB’s star topology uses root hubs, each of which contributes to the total bandwidth of the system.

moved on from ISA to PCI (with a detour at Micro Channel along the way) to AGP and PCI Express. Presumably, there’ll be another new PC expansion “standard” in a few years. In the midst of all this leap-frogging of standards came the ambitiously named USB, the “universal serial bus.” Unlike the motherboard buses, USB has shown real staying power and is now almost as universal as its name implies. USB interfaces show up on everything from PCs and Macs to digital cameras, handheld devices, instruments, data-acquisition systems and storage media. It’s small, it’s fast, and it’s got built-in “plug and play” characteristics that


years. Heck, they don’t even have PCI slots any more. But they all sprout USB ports front and back, usually a half-dozen of them. It’s not hard to see which way the world is going.

Racks, Screws and Electrons

StackableUSB uses the same physical form-factor as PC/104, so the boards will at least look familiar to PC/104 aficionados. More important, USB stacks can use the same enclosures, mounting hardware, cooling plenums, wiring harnesses, and whatever else already exists for PC/104 installations. The stack can have a motherboard at the top or bottom if you wish, or the modules can be used stand-alone. Up to 16 boards will go into a StackableUSB stack (Figure 1) and each board can draw almost 1A from the +5V and +3.3V supplies.

Where StackableUSB differs from PC/104, of course, is in its electrical interface. That’s taken straight from the USB 2.0 specification, so it’s fast when it can be and tolerant of older USB 1.1 devices when it has to be. It’s also compatible with hundreds of USB components, soft IP (intellectual property) cores and peripherals. Peripheral chips aren’t really available with ISA/x86 bus interfaces any more; they generally have USB or PCI interfaces instead. Obviously, this makes it a whole lot easier to find chips for your USB stack than for any ISA-based design.

USB is both faster and smarter than traditional processor buses. For example, it has some built-in error-recovery mechanisms. USB is modestly fault tolerant, in the sense that it accommodates noise on the USB lines, detects transmission errors, and retries operations when necessary. Those are all useful features in noisy or safety-critical embedded applications.


So how fast is StackableUSB? That’s a tricky question because the answer depends on how your boards are designed, what chips you use, and how fast you want them to be. Bandwidth and throughput are all under the control of the system designer. Paradoxically, the two-wire USB bus is faster than the 16-bit ISA bus in most cases. (Actually, PC/104’s bus is a lot wider than that because it includes a gaggle of control and handshake signals in addition to its 16 data lines.) In all, there are more than 100 signal lines passing through every PC/104 board, while StackableUSB has just two per link (plus power and ground). Apart from everything else, this makes the connector a whole lot smaller, so there’s more room for components and less wasted on overhead. Instead of a single shared bus, USB follows a hub-and-spoke (star) topology. It’s more of a network than a bus, so it fans out differently (Figure 2). This is all for the good; you get more total bandwidth with USB and the bandwidth isn’t shared the same way as with PC/104. The standard USB specification defines the concept of root hubs, which are


Processors and Software

One other characteristic of StackableUSB is less obvious: it’s processor independent. Developers don’t have to use x86 processors. They certainly can use Intel or AMD chips, they just don’t have to. Having a choice of microprocessor family is all very well, but a choice of software (including operating system and middleware) is even better. Although x86 processors are popular and well supported, they’re also victims of their own success. Their life cycles are driven by the PC market, not the needs of embedded designers, so Intel and AMD chips tend to be expensive, power-hungry and have short lives. Pentium, Opteron and Core 2 Duo go through product cycles faster than a fashion designer, so getting a stable long-term supply of these chips is problematic. No sooner do you


sort of like bus masters. (In a PC or Mac, the computer itself is the root hub while the keyboard and mouse are client nodes connected to the hub.) You can have several root hubs and each one increases total bandwidth. If they’re older USB 1.1 hubs, they provide 12 Mbits/s of bandwidth to their downstream peripherals. Newer USB 2.0 root hubs, however, can provide a whopping 480 Mbits/s of bandwidth. A fully loaded StackableUSB system could have as many as 16 root hubs, for a total of about 7.7 Gbits/s of aggregate bandwidth (16 × 480 Mbits/s)!

We can also look at this another way. Even though the bandwidth of an individual USB 1.1 channel is less than that of the PC/104 bus, as Figure 3 shows, the bandwidth of the entire USB stack is greater. That means if you’re only connecting two boards together, PC/104 could be faster. But after the third or fourth board in the stack, USB offers more headroom. And if you’re using high-speed USB 2.0 chips, it’s a lot more. The upshot is, StackableUSB bandwidth goes up, not down, as you add boards. Other buses share their bandwidth among boards, so each new board gets a comparatively smaller slice of the bandwidth pie. It also means that one data-hungry bus hog can adversely affect all the other devices. StackableUSB goes the other way: adding boards with hub chips increases bandwidth instead of usurping it from the others.
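The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not anything from the StackableUSB specification: the per-hub rates (12 and 480 Mbits/s) and the 16-hub limit come from the text, while the function names and the shared-bus figure used in the demo are hypothetical placeholders chosen only for illustration.

```python
# Per-hub figures quoted in the article.
USB_1_1_MBITS = 12    # USB 1.1 root hub
USB_2_0_MBITS = 480   # USB 2.0 root hub
MAX_ROOT_HUBS = 16    # fully loaded stack


def usb_aggregate_mbits(root_hubs: int, per_hub_mbits: float) -> float:
    """Total bandwidth when every root hub contributes its own link."""
    if not 1 <= root_hubs <= MAX_ROOT_HUBS:
        raise ValueError("root_hubs must be between 1 and 16")
    return root_hubs * per_hub_mbits


def shared_bus_share_mbits(boards: int, bus_mbits: float) -> float:
    """Average slice each board gets when one bus is shared by all boards."""
    if boards < 1:
        raise ValueError("boards must be at least 1")
    return bus_mbits / boards


def crossover_hubs(per_hub_mbits: float, bus_mbits: float):
    """Smallest hub count whose aggregate exceeds the shared bus, else None."""
    for hubs in range(1, MAX_ROOT_HUBS + 1):
        if usb_aggregate_mbits(hubs, per_hub_mbits) > bus_mbits:
            return hubs
    return None


if __name__ == "__main__":
    # Hypothetical shared-bus throughput, NOT a figure from the article.
    assumed_bus_mbits = 40.0
    for hubs in (1, 2, 4, 8, 16):
        print(hubs,
              usb_aggregate_mbits(hubs, USB_2_0_MBITS),
              shared_bus_share_mbits(hubs, assumed_bus_mbits))
    print("USB 1.1 crossover:", crossover_hubs(USB_1_1_MBITS, assumed_bus_mbits))
```

With 16 USB 2.0 root hubs the aggregate works out to 7,680 Mbits/s. Where the USB 1.1 crossover falls depends entirely on what sustained throughput you assume for the shared bus, which is why it is left as a parameter here.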






Figure 3

PC/104 bandwidth versus StackableUSB bandwidth as a function of board count. StackableUSB bandwidth increases as hubs/boards are added.

design a system around one processor than it becomes obsolete, replaced by a faster, bigger and more expensive version. StackableUSB breaks the connection between bus standard and processor family. It allows boards and systems to be based around PowerPC, ColdFire, ARM, MIPS, or just about any other processor, or combination of processors.

With processor independence comes software independence. Here the software issue becomes a double-edged sword. To reap USB’s plug-and-play benefits you need a lot of software. That “anytime, anywhere” connectivity doesn’t happen by magic. USB drivers are actually pretty complicated because they need to recognize all sorts of potential devices that might be plugged into their bus. They have to configure new devices on the fly without any jumpers, switches or hardware resets. That’s tough to do.

On the other hand, most of that work has already been done. Most USB chips and peripherals either come with USB drivers or will enjoy native operating system support. Windows and MacOS obviously have tons of USB support built in, but many embedded operating systems also understand and support USB. You might have to write the odd driver here and there, but chances are you’ll find off-the-shelf support for many mainstream chips and peripherals.

Progress is a funny thing. On the one hand, we want all the benefits of newer technology: faster speeds, lower power, better integration, and so on. On the other hand, we cling to established standards, products and habits. It’s a combination of innovation and inertia in equal measure. StackableUSB takes advantage of that: a combination of old and new that mates the mechanical popularity of PC/104 with the electrical popularity of USB. It promises to make embedded design easier and more rewarding for developers for many years to come.

Micro/sys Montrose, CA. (818) 244-4600.



SolutionsEngineering PC/104

PC/104 Data Acquisition for Industrial Applications Computer-based measurement and control is based upon analog input and output variables from sensors, representing parameters such as temperature, pressure, acceleration, humidity and others. In the “real world” a variety of analog sources must be accurately digitized for automation and control to be effective. by Robert A. Burckle WinSystems




because of its susceptibility to vibration, humidity, temperature extremes and even the rapid market-driven obsolescence associated with the consumer world. In other cases, it is simply too bulky to package into an instrument or OEM application. Enter PC/104. Not only is the tiny 3.6” x 3.8” industry standard form-factor a great size for compact, highly integrated data acquisition systems, but it is powered by a wide range of PC-compatible CPU modules from 133 MHz to 1 GHz and beyond. These processors run Linux, Windows XP Embedded and other x86-compatible real-time operating systems with networking support. A designer or system integrator can stack two boards (CPU and analog) with an industrial CompactFlash device for data logging, which can be integrated into a single enclosure as small as 4” x 4” and only 2” high. These minuscule systems are designed to tolerate shock, vibration, dust, humidity, and operate over an extended temperature range without a fan, depending on the processor speed. But even with a small, rugged solution like PC/104, there are still important analog signal design issues to be considered. The data acquisition system must











Overall stability and accuracy is achieved by integrating the key elements onto a single chip, in this case the Linear Technology LTC1588.

be configurable to handle a variety of full-scale voltage ranges and be accurate over a broad range to ensure integrity of the data. Ideally, the system should not require user calibration to maintain data integrity. Such a “No-Cal” implementation has great advantages. Old technology boards with trimpots (potentiometers) are prone to time- and temperature-related drift. Unpredictable and untraceable errors render data questionable and perhaps unusable. Additionally, there is the downtime and the cost of a technician required to measure and adjust the system. User-initiated auto-calibration is better, but “No-Cal” solutions ensure the accurate results that are demanded by OEMs.

Precision Analog Input

In order to achieve good accuracy and resolution, a 16-bit analog-to-digital (A/D) converter is desirable. However, such precision comes with a cost. It takes specially designed circuits to mitigate noise and drift along with matching- and leakage-related inaccuracies over temperature. Frequent calibration helps, but





Figure 2

the optimal implementation would feature either automatic onboard calibration or no calibration (No-Cal) at all. The challenge to board and systems designers is to shrink all of this circuitry into a space-saving size while improving operation and cost of ownership over the long haul. To minimize the effects of drift error, analog board designers must approach calibration and drift from the ground up through careful design and component selection. The single most important component in any analog converter design is the analog voltage reference. Reference voltage drift directly affects analog conversion accuracy, expressed as full-scale (gain) error. The reference voltage tends to vary over temperature as well, although well-matched and compensated implementations keep the variations to a minimum (Figure 1). Drift contributions from any other source, whether from the converter itself or signal conditioning circuitry, can affect either zero (offset) or full scale trim (gain). Drift errors are predominantly functions of component drift over time

and how changes in temperature affect components in the system. Previous generation analog design techniques moved to onboard auto-calibration. However, in some manufacturers’ designs, these auto-calibration circuits would drift more over time and temperature than the analog converters they were intended to calibrate. At their best, auto-calibration circuits effectively compensate for drift error between calibration intervals over a limited temperature range on products with highly stable references. The worst auto-calibration circuits do little more than add marketing weight to a datasheet, and will likely reduce the functional accuracy in the field.

The best solution today is to select an integrated data acquisition system on a chip. Linear Technology has integrated the key analog system design elements onto a single die in their “Soft Span” series of ADCs. By doing this they can match and trim each subsystem to compensate for errors introduced in the entire conversion process. With the small PC/104 board stacking architecture in mind, Linear Technology’s LTC1859 is part of a family of analog-to-digital converters (ADCs) that lends itself to 8-channel applications with 16-bit conversions at fast sample rates. The device includes input multiplexer, range select, sample and hold, analog-to-digital converter, voltage reference and associated control logic. Each high-resolution, high input voltage range ADC in this family has an on-chip, temperature compensated, curvature corrected, band gap reference that is factory-trimmed to 2.50V. In addition, the use of precision, laser-trimmed thin-film resistors eliminates the need for user calibration. Zero error, zero error match, full-scale error, full-scale error match, linearity, reference voltage and conversion time are trimmed during production. The LTC1859 series attains the desired No-Cal implementation: no user calibration is needed.

This ADC uses an easy serial interface for configuration, and can be software programmed for 0V to 5V, 0V to 10V, ±5V or ±10V input ranges. The 8-channel multiplexer can be programmed for single-ended inputs or pairs of differential inputs or combinations of both. This replaces external analog switches, amplifiers and attenuators. In addition, all channels are fault protected to ±25V for high reliability and low cost-of-ownership considerations (Figure 2).

Figure 3

WinSystems PC/104 Analog and Digital I/O card, the PCM-MIO, provides high channel count and spans a -40° to +85°C temperature range for industrial applications.

A fault condition on any channel will not affect the conversion result of the selected channel. An onboard high-performance sample-and-hold and precision reference minimize external components.
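To make the four software-selectable input ranges concrete, here is a minimal sketch of what a 16-bit conversion means in volts, and of how reference drift shows up as full-scale (gain) error. The range list and the 16-bit resolution come from the text. The output coding (straight binary for unipolar ranges, two’s complement for bipolar ranges), the function names and the drift numbers in the demo are assumptions made for illustration; check the converter datasheet before relying on any of them.

```python
ADC_BITS = 16
FULL_COUNT = 2 ** ADC_BITS

# Input ranges named in the article, as (low, high) in volts.
RANGES = {
    "0-5V": (0.0, 5.0),
    "0-10V": (0.0, 10.0),
    "+/-5V": (-5.0, 5.0),
    "+/-10V": (-10.0, 10.0),
}


def lsb_volts(range_name: str) -> float:
    """Size of one code step: the input span divided by 2**16."""
    low, high = RANGES[range_name]
    return (high - low) / FULL_COUNT


def code_to_volts(code: int, range_name: str) -> float:
    """Convert a raw 16-bit code to volts.

    Assumed coding: straight binary for unipolar ranges and
    two's complement for bipolar ranges.
    """
    if not 0 <= code < FULL_COUNT:
        raise ValueError("code must fit in 16 bits")
    low, _ = RANGES[range_name]
    if low < 0 and code >= FULL_COUNT // 2:
        code -= FULL_COUNT  # reinterpret as a negative two's complement value
    return code * lsb_volts(range_name)


def reference_drift_lsb(drift_ppm_per_c: float, delta_t_c: float) -> float:
    """Full-scale gain error, in LSBs, caused by reference drift."""
    return drift_ppm_per_c * 1e-6 * delta_t_c * FULL_COUNT


if __name__ == "__main__":
    for name in RANGES:
        print(name, f"{lsb_volts(name) * 1e6:.1f} uV per LSB")
    # Hypothetical drift figures, not taken from any datasheet.
    print(f"{reference_drift_lsb(10.0, 50.0):.1f} LSB of gain error")
```

On the 0V to 5V range one step is about 76 µV, so even a few parts per million of reference drift per degree adds up to several LSBs across an industrial temperature swing. That is the arithmetic behind the emphasis on a stable, factory-trimmed reference.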

Precision Analog Output

Thanks to the highly integrated input conversion circuitry, there is sufficient room available on a PC/104-size board for several output channels of precision analog voltages. Similar to the analog input section, it is important that no calibration be required for the analog output as well. In the past, designing a universal output module was a difficult task since the cost and board space consumed were problematic. However, with the new multiple output range DACs, all of this complexity is unnecessary. As a further example of a highly integrated D/A converter, let’s examine the Linear Technology LTC1588. It is a 12-bit D/A with all the standard industrial output ranges (0V to 5V, 0V to 10V, ±5V, ±10V). All of the ranges are accurate with low drift, fast settling and low glitch operation. The LTC1588 DAC incorporates all the switches and precision resistors. A full implementation is PC/104-friendly, using less than 0.5” x 0.5” of board space including the dual operational amplifier, bypass and compensation. This analog output subsystem can be reconfigured under software control in real time.

An advanced analog I/O module facilitates the migration from PC-centric automation to small self-hosted stand-alone DAQ systems, in the industry standard PC/104 form-factor. This is possible thanks to the availability of No-Cal integrated circuits. Such a high-density analog and digital I/O card can operate from -40° to +85°C. This PC/104-compliant card includes a 16 channel, 16-bit analog-to-digital (A/D) converter, 8 channel, 12-bit digital-to-analog (D/A) converter and 48 lines of digital I/O. Using Linear Technology’s fully integrated A/Ds and D/As eliminates the need for all of the outboard analog circuitry used in older designs, which caused the errors and offsets that led to the former need for calibration. There are no missing codes and the measurements are monotonic over the full temperature range from -40° to +85°C.

An example of such a card, the PCM-MIO from WinSystems shown in Figure 3, has been designed to minimize drift error effects while simply containing ultra low-noise power supplies and Linear Technology SoftSpan A/D and D/A integrated converters. The module optimizes converter accuracy over time and temperature while avoiding the pitfalls of trimpots and other conventional calibration techniques. It is compatible with isolated signal conditioners that will protect, filter and isolate the analog input and output signals from electrical transients for rugged industrial applications. There are many models available from third-party vendors to interface to a wide variety of voltage, current, temperature, position and other analog-based instrumentation.

There was even room left over for 48 lines of digital I/O for a very complete digital acquisition system. Each line is individually programmable for input, output, or output with read-back. Edge detection can also be programmed to generate interrupts for each event change without polling. The lines are TTL-compatible and can source and sink 12 mA, which allows them to connect directly to industry-standard (Dataforth, Opto-22, etc.), optically isolated AC and DC signal conditioners.

Using state-of-the-art low-noise 16-bit A/D and D/A converters with No-Cal auto-calibration gives a shot in the arm to the DAQ market. This clean, simple design yields smaller size, lower cost and much better accuracy by avoiding error-prone manual calibrations. The PC/104 Bus platform provides an efficient and long-lifecycle, 16-bit data path for these converters. The small form-factor is attractive to integrators and OEMs who need to integrate the PC and the DAQ circuitry together for their next designs.

WinSystems Arlington, TX. (817) 274-7553.
Linear Technology Milpitas, CA. (408) 432-1900.



IndustryInsight DSP vs. Multicore Systems

DSP and x86 – Getting Past the Hype in Processor Architecture Matching processors and applications has become increasingly challenging because of the constant change in the types and capabilities of processors available. Recently the x86 general-purpose processor has been shown to outperform some DSPs for particular algorithmic solutions. But is that the whole story? by Brian Peebles Dialogic


where the file streaming that is critical to these functions is supported. Because DSPs are not a particularly efficient solution for file access due to a lack of native interfaces, such as SCSI and SATA, and because of the overhead of mounting a remote file system, a general-purpose CPU is preferable for servicing play and record media functions. Speech recognition and speaker identification previously required the computational performance of a DSP; however, the libraries required to support these algorithms have grown to the point where large amounts of memory are necessary, and such memory capacity is best served by a general-purpose CPU. Audio and video compression and decompression require the large amount of processing of which DSPs are capable, with minimal memory footprint and mass storage dependency (unless we are transcoding streaming files to and from a hard disk). So we see that while some functions can be optimized for dedicated silicon and run optimally on DSPs, others are best run on general-purpose CPUs. Figure 1 illustrates the relative efficiencies of various silicon technologies




used in computational processing as a function of their “usability.” The term usability refers to a device’s overall flexibility in terms of the applications that can be implemented on it as well as the development environment (debug tools, compilers, profilers, etc.) that enables designers to implement those algorithms. The comparisons in Figure 1 are illustrated over time to show how the relative efficiency of the x86 CPU is improving. Key to many recent efficiency improvements is the trend toward multiple processing cores.

When array processors emerged in the early part of this decade, they had ten times the performance of traditional DSPs. This efficiency gap was created because of the array processor’s architecture, which had many, simpler Arithmetic Logic Units (ALUs) and high-performance internal fabrics. These architectures are much more flexible than those of their traditional DSP counterparts both in terms of the types of algorithms they can support and the number of algorithmic instances they can provide. However, the array processor’s usability was impacted by the lack of tools necessary to make the multitude of processing cores perform, and as array processor tools are improving, usability is also improving. At the same time, both the DSP and the x86 CPU have an increased number of processing cores and are more competitive.

In the x86 CPU, mathematical processing is limited to a few integer ALUs and a floating-point engine. These functions are connected via dispatch ports to schedulers, register files and instruction queues in a tightly coupled fashion. Currently, the only way to increase the number of ALUs or floating-point engines is to replicate the entire core. However, this is likely to change for some variations of this kind of processor in the near future. The architecture commonly used for multiple core DSPs is currently much more efficient than that of the x86 CPU.
In DSPs, a separate general-purpose unit (such as an ARM core) is often added to front-end multiple-execution engines. Therefore, the entire general-purpose structure is not replicated whenever an additional execution engine is added. The general-purpose core is responsible for load-balancing, scheduling, overhead processing and other management tasks.

Figure 1

Relative efficiencies of popular processing technologies. Axes: relative ease of use and relative performance efficiency (processing bandwidth per size/power).

However, this architecture has its limitations in that only a few execution engines can be associated with a single general-purpose core before the general-purpose core becomes a bottleneck. Array processors have a more uniform architecture, but many of them still require an external control processor in order to manage their overall operation. Most array processor designs are also somewhat deficient in floating-point arithmetic, so while they are adaptable to many different algorithms and applications (and many can be configured to do floating point), they do not do everything optimally. Despite the control processor, the array processor generally must perform its scheduling and load balancing based upon the configuration selected, and the array must process all overhead (such as protocols), which can keep it from attaining its peak efficiency.

Figure 2

Progression of algorithmic volatility and performance across technologies. Axes: more instances/faster response and algorithmic volatility.

Figure 1 also shows that, in terms of size and power performance, x86 processors are not as efficient as DSPs. The x86 processor is designed for many different types of solutions, ranging from embedded and laptop to desktop and server versions. Higher-performance versions, which could compete with DSPs, require from 35W to 90W of power and often need a chipset that doubles the size of the design and requires another 15W to 20W. A high-end DSP or array processor requires from 2W to 10W and does not require an additional chipset. This means that DSPs have an inherent efficiency edge of 2x in terms of size and 5x to 10x in terms of power. Even if the x86 CPU were to outperform the DSP or array processor in algorithm performance, it would still be far less efficient.

Four key design principles are important when determining which technology to use in media product architectures: scalability, versatility, density and programmability. In telecommunications, the media product architecture may need to support a wide range of applications from entry level (a few dozen channels) to high density (several thousand channels). If a separate hardware design is optimized for several ranges, the best price/performance can often be achieved, but at the expense of maintaining numerous designs, perhaps with significantly different components and, worst of all, different code bases. The challenge is to establish a common code base, and, if possible, a common, modular hardware design. A modular design typically leads to replicating a processor and having 1-N of the same processor on some extensible fabric; however, the other principles must be considered before rushing into this kind of design.

We conclude then that the x86 general-purpose processor is simply not competitive with the DSP in terms of efficiency. As a result, no x86-based architecture can produce the same number of channels for a given algorithm as a DSP can in the same space and with the same electrical power.
Since the maximum number of channels determines the overall dynamic range of a product offering, a design must attain this maximum density to achieve the lowest cost, size and power consumption and lead the industry.
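The power comparison can be worked through with the wattage ranges quoted for each device class. This is a rough sketch only: the watt figures are the ones given in the text, while the function names and the choice to compare the whole x86 range against one fixed DSP power are assumptions made here for illustration.

```python
# Power ranges quoted in the article, as (low, high) in watts.
X86_CPU_W = (35.0, 90.0)
X86_CHIPSET_W = (15.0, 20.0)
DSP_W = (2.0, 10.0)


def x86_total_w():
    """CPU plus chipset power at the low and high ends of the range."""
    return (X86_CPU_W[0] + X86_CHIPSET_W[0],
            X86_CPU_W[1] + X86_CHIPSET_W[1])


def power_ratio_vs_dsp(dsp_w: float):
    """How many times more power the x86 design draws than a DSP at dsp_w."""
    if dsp_w <= 0:
        raise ValueError("dsp_w must be positive")
    low, high = x86_total_w()
    return (low / dsp_w, high / dsp_w)


if __name__ == "__main__":
    print("x86 total:", x86_total_w(), "W")
    print("vs 10 W DSP:", power_ratio_vs_dsp(DSP_W[1]))
    print("vs  2 W DSP:", power_ratio_vs_dsp(DSP_W[0]))
```

Measured against a DSP at the top of its range (10 W), the x86 design draws roughly 5 to 11 times as much power, close to the 5x to 10x figure cited above. Against a DSP at the low end of its range, the gap is wider still.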

But the only constant is change. New variations on old algorithms, new algorithms, varying demand on algorithmic instances (what we refer to as “algorithm volatility”) all require a versatile platform. A versatile platform enables a designer to maintain a market leadership position by quickly introducing new algorithms or new features that differentiate the product line. While DSPs have made some strides in this area, they lack the overall versatility of the general-purpose CPU. Designs that attain longevity achieve it through versatility and therefore require some level of general-purpose functionality.

Changing the code on any processor is always problematic, and writing the code initially is even more of an issue. This is why many DSP manufacturers are now providing algorithmic solutions with optimized code that can be licensed by a designer for a fee. The advantage of this approach is that it saves considerable time-to-market and greatly reduces program risk. The disadvantage of this approach is that it eliminates the designer’s ability to include key differentiating value into products and ties the ability of products to roll out new features to the DSP manufacturer. To add timely, differentiating value to a product, a designer must program the device. To reduce time and risk, a solid set of tools must be made available to the programmer in a development environment that they are familiar with. Virtually every programmer is familiar with the x86 programming environment. It is, in fact, the basis for DSP development. However, optimized code can be produced on the x86 far more easily than it can be developed for most DSPs.

Figure 2 illustrates the applicability of the technologies discussed to algorithmic demands. The best solution appears to be a compromise solution, one that requires a mixture of dedicated silicon (ASICs) and DSP-accelerated x86 CPUs. Functions such as tone processing and echo cancellation can run on dedicated silicon.
Algorithms that are stable and optimized, but require high density, can be provided by DSP acceleration modules connected to the x86 CPU. Algorithms that require more versatility for system interaction (memory, mass storage) or are more volatile (many new enhancements over a short period of time) are best run in the x86 CPU.
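The partitioning guidance above can be written down as a simple decision rule. The sketch below is one possible reading, not a method given in the article: the three targets and the traits (fixed function, volatility, system interaction, density) come from the text, while the class names, field names and the order in which the rules are tested are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    ASIC = "dedicated silicon"
    DSP = "DSP acceleration module"
    X86 = "x86 CPU"


@dataclass(frozen=True)
class Algorithm:
    name: str
    fixed_function: bool    # e.g. tone processing, echo cancellation
    volatile: bool          # many enhancements over a short period
    needs_system_io: bool   # heavy memory or mass-storage interaction
    high_density: bool      # many channels needed per unit of space/power


def place(algo: Algorithm) -> Target:
    """Pick a target; the precedence of these rules is an assumption."""
    if algo.volatile or algo.needs_system_io:
        return Target.X86
    if algo.fixed_function:
        return Target.ASIC
    if algo.high_density:
        return Target.DSP
    return Target.X86


if __name__ == "__main__":
    workloads = [
        Algorithm("echo cancellation", True, False, False, True),
        Algorithm("video transcoding", False, False, False, True),
        Algorithm("speech recognition", False, False, True, False),
        Algorithm("play/record from file", False, False, True, False),
    ]
    for w in workloads:
        print(f"{w.name}: {place(w).value}")
```

A real design would weigh these traits against cost, licensing and latency rather than applying them as strict rules, but writing the rule down makes the trade-offs explicit and easy to revisit as algorithms change.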

The challenge that remains is determining the optimal mixture of these components and configuring them in a system solution that maximizes the principles discussed here. In order to determine the best system solution, the designer must take into account a wide variety of issues including overall cost of the design, the licensing of algorithms (which depends on where they run), intercommunications latency, the overhead processing required for partitioning the architecture, and the overall efficiency of the design in terms of device utilization.

Dialogic Montreal, Quebec. (514) 745-5500.



Software&Development Tools Linux

Linux and FOSS: End-to-End (and Top-to-Bottom, Too) Since its beginnings, Linux has permeated computing systems from enterprise to embedded. While not the only widely used OS, it has potential to achieve end-to-end acceptance with a unified code base and development paradigm. by Bill Weinberg LiPS Forum & Linux Pundit


Beginning in the 1990s, Linux and other Free and Open Source Software (FOSS) began an inexorable march from hobbyist-ware into the enterprise and continued out to a range of embedded and ubiquitous computing applications. The penguin’s progress started modestly at first, waddling into noncritical utility computing roles (departmental file and print servers, and intranet servers). Upon proving itself strikingly reliable, Linux then moved into increasingly crucial enterprise application server roles. On the enterprise desktop, Linux displaced legacy UNIX for technical workstation, documentation and data entry terminals. Free Software actually made its first appearance a decade earlier in embedded applications, with GNU tools gcc and gdb complementing and displacing proprietary compilers and debuggers, followed by the BSD TCP/IP stack, creating a de facto standard for IP networking implementations. Other FOSS components (like BerkeleyDB and Apache httpd) also found their way into larger-scale embedded applications through the mid 1990s. Starting in 1999, Linux began finding its way into a range of edge and access applications. Adoption came from a mix of organic use by developers at TEMs, NEPs and other OEMs familiar with UNIX in management and control plane applications, and from commercial tool kits and services from companies like

RedHat / Cygnus, MontaVista and Metrowerks, which is now part of FreeScale. Today Linux and FOSS experience broad and deep deployment across the entire spectrum of information technology (Figure 1). In the data center, Linux enjoys double-digit market share and rather more modest global desktop deployment in the single digits. 34


Linux and FOSS actually garner an even greater number of embedded applications and constitute the industry’s leading platform: Venture Development Corporation reports up to a third of 32- and 64-bit designs were based on the open source OS in 2006.

Progression from Core to Edge, Deployment End-to-End

The incremental progress Linux made in the course of a decade represents a mix of commercial and community investments. On the technical side, key enablers included CPU and board support, architectural advances in scaling, memory access and storage, and device drivers. These investments came from across the embedded device ecosystem: semiconductor manufacturers, board vendors, systems suppliers, ISVs and community resources. Business- and tech-savvy semiconductor suppliers like Intel, AMD, Freescale and others not only saw the open source OS as a means to “fill sockets” but actually used Linux to bring up their new processors. Board vendors like Advantech, DTI, Kontron, Motorola and RadiSys found they could offer richer board support, faster, by leveraging community-developed kernels and device drivers. And systems houses like Fujitsu, HP, IBM and NEC saw an opportunity to span and consolidate diverse architectures and product lines while improving margins and expanding services offerings. ISVs saw an opportunity to consolidate and migrate legacy UNIX-hosted products onto a single flexible, interoperable host platform. As such, Linux quickly accrued broad and deep hardware and software support and today runs on three dozen processor variants and thousands of SBCs and motherboards, across nearly every enterprise vertical and embedded application type (Figure 2).


End-to-End Candidates

Certainly other application platforms exhibit comparable reach and applicability. Japan’s iTRON and μiTRON run on a similarly broad range of CPUs. Sun’s Java extends from enterprise to desktop to embedded, and Microsoft Windows family OSs span a gamut that reaches from the server room to the desktop to in-car and in-hand applications. What makes the GNU/Linux platform different?

The main difference is that the GNU/Linux OS (kernel, libraries and utilities) constitutes a single, unified code base. Whether compiled to run on an ARM or an IBM S/390, in an SoC or on a server farm, in an MMU-less microcontroller or a 1000+ CPU supercomputer, the same code implements the same functions, everywhere. The Linux kernel source tree carefully segregates and minimizes architectural idiosyncrasies. CPU-specific code constitutes less than 5% of the total.

Contrast other candidates for end-to-end ubiquity. iTRON, μiTRON (and its stillborn enterprise sibling bTRON) are not OSs; they are de facto standard API sets implemented by dozens of different companies with diverse agendas and divergent interpretations of the instruction sets, APIs and protocols. Sun, for pragmatic, application-directed reasons, segmented Java into a range of profiles (J2EE, J2SE, J2ME, MIDP, CLDC, etc.), resulting in fragmentation of class libraries and separate code bases for the major virtual machines (to say nothing about coffee cup clones). Windows operating systems don’t pretend to offer continuity: the server, desktop and embedded OSs support different code bases and API sets.

Open Source vs. Closed Corporate Standards

Standardization is a very good thing. However, standardization combined with a common, community-based implementation trumps standards compliance alone. Andrew Tanenbaum, creator of Minix (on which Linux is loosely based), expressed a key challenge with standardization when he said “The nice thing about standards is that there are so many to choose from.” Individual companies producing point products can usually manage to ensure standards compliance for a handful of standards for their products. Most corporate entities, small or large, are in a poor position to comply with, let alone implement, the alphabet soup of standards and protocols, or to build and maintain the tens of millions of lines of code that implement those standards.

Companies boasting the wherewithal to create and implement standards, and presumably compliant products, also have the unfortunate tendency to improve the standards they help to define and later implement. They optimize and add value and otherwise ladle on their own secret sauces. Intentionally or not, these enhancements impact interoperability and drive vendor lock-in, in precise opposition to the original goals of open standards regimes.

Open Source looks to standards as a source of requirements to guide implementation, to foster interoperability with other OSs and to support legacy code; GNU/Linux implements (among many others) POSIX, ISO/ANSI C/C++, X11, TCP/IP family protocols, and wireless and wire-line networking. The LAMP stack and Linux desktop applications offer the leading and most compliant implementations of derived protocols like HTTP and document formats like HTML, XML, ODF, etc. and myriad other standards and API sets. When code and patches to Linux or other projects omit APIs, re-interpret RFCs or otherwise drift from compliance, a mix of community and corporate interests coalesce to “make things right.”

Figure 1

Progression of Linux and FOSS from enterprise/utility computing outward to infrastructure, mobile and other embedded applications. [Figure labels: Utility Server; Data/Content Store; Application Server; Desktop/Workstation; Infrastructure Server/Blades (DSLAM, Firewall, Gateway, PBX, VPN, Wireless); POS/Kiosk; Retail; Mobile/Wireless; Wireless Access; Multimedia; Home GW; Imaging; SOHO; Today.]

A good example lies in POSIX threads. In the 2.4 kernel timeframe, Linux (and the GNU libraries) supported a sui generis threading scheme, and most Linux programs were process-based. As embedded applications for Linux grew in importance, having a pthreads-compliant scheme emerged as a key requirement (e.g., in Carrier Grade Linux). Initially, community figures saw no need for pthreads and lobbied against implementation and integration. In spite of this resistance, IBM offered up Next Generation POSIX Threads (NGPT), a hybrid user-space and kernel implementation. NGPT met with mixed reviews but actually spurred a community effort toward true pthreads APIs and semantics. The result was the development of the highly compliant Native POSIX Thread Library (NPTL), which today is the mainline 2.6 Linux threading scheme.
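To make the pthreads point concrete, here is a minimal sketch of the kind of code that a compliant threading library lets developers carry unchanged from server to embedded target. It uses only the standard POSIX calls `pthread_create` and `pthread_join`; the names `struct job`, `square_worker`, `sum_squares_threaded` and `MAX_WORKERS` are hypothetical and chosen purely for illustration, not taken from any Linux source.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical upper bound on worker threads, for illustration only. */
#define MAX_WORKERS 16

/* Hypothetical work item: each thread squares its own input. */
struct job {
    long input;
    long result;
};

/* Thread entry point with the signature POSIX requires. */
static void *square_worker(void *arg)
{
    struct job *j = arg;
    j->result = j->input * j->input;
    return NULL;
}

/*
 * Computes 1^2 + 2^2 + ... + n^2 using one thread per term.
 * Returns the sum, or -1 if n is out of range or a thread call fails.
 */
long sum_squares_threaded(int n)
{
    pthread_t tids[MAX_WORKERS];
    struct job jobs[MAX_WORKERS];
    long total = 0;
    int started = 0;
    int failed = 0;

    if (n < 1 || n > MAX_WORKERS)
        return -1;

    /* Start the workers; each gets a private job slot, so no locking is needed. */
    for (int i = 0; i < n; i++) {
        jobs[i].input = i + 1;
        jobs[i].result = 0;
        if (pthread_create(&tids[i], NULL, square_worker, &jobs[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Join every thread that was actually started before reading results. */
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) != 0)
            failed = 1;
        else
            total += jobs[i].result;
    }

    return failed ? -1 : total;
}
```

Called from a program's `main`, `sum_squares_threaded(4)` yields 30. On older toolchains the build typically needs the `-pthread` compiler flag. Nothing in the source names a particular threading implementation, which is the practical payoff of standards compliance: the same file should build against any conforming pthreads library.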

Benefits of a Unified End-to-End Platform

Being able to scale and repurpose a single code base across a continuum of system types and applications yields a range of benefits, some obvious, others less so. In terms of interoperability, the identical implementation of APIs and protocols provides the greatest assurances of interoperability of applications (vs. those based on published standards alone). Complemented by traditional compliance and interoperability testing, developers and users have access to the “best of both worlds”: a standards-based and compliant platform that is also open source, for applications on servers, on the desktop and in embedded applications.

There is also an advantage in unified skill sets. Many organizations run businesses that span horizontally, from enterprise to embedded applications (like telecom carriers and operators, medical services suppliers and governments). Others run vertically integrated businesses (like consumer electronics manufacturers and networking equipment providers). Both types expend huge resources in attempting to level internal technology fragmentation and the training, maintenance and support challenges that fragmentation creates. For decades, these companies have been seeking a strategic end-to-end alternative to a patchwork of legacy platforms.

A more consistent management model is also a boon to organizations. These companies, the ecosystems around them, and the end-users they serve suffer from poor support and quality of service due to disparities in how systems, on and off shared networks, are managed. A single platform with identical system management paradigms and a much smaller range of support issues greatly enhances an organization’s ability to provide quality of service at both system and human interaction levels.


Challenges to Linux and FOSS for End-to-End

Linux and FOSS are not a panacea. They constitute a large, dynamic and, some would say, messy code base and technology cloud. Key areas where Linux and FOSS present challenges to building and maintaining end-to-end applications include the many commercial distributions available, application frameworks and formal testing.

Linux and FOSS are embraced for the freedom of choice and flexibility they offer. But too much choice is not always a good thing, especially when it comes to desktop distributions (Fedora, OpenSUSE, Ubuntu, Xandros et al.), embedded toolkits (MontaVista Linux, Wind River Linux, Open Embedded, etc.), and OEM-derived platforms. Even if the base platform (kernel, libraries, APIs and core functionality) is preserved, differences among distributions, kits and devices can substantially hamper interoperability, especially differences in configuration, provisioning and support. At the very least, the multiplicity of Linux editions complicates the life of ISVs, service providers and IT departments trying to deploy applications and services across them.

Linux still lags in terms of application frameworks. The leading proprietary platforms (Windows and Java) offer developers common development environments and application frameworks (even if the platforms do not interoperate as advertised). The Linux desktop boasts two active and fruitful frameworks (GTK and Qt); emerging equivalents also exist for mobile. Open source Eclipse has become the standard for IDEs, but there exists no single widely accepted programming paradigm that can be applied end-to-end. Certainly there exist multiple excellent JVMs, ORBs, rIPCs, databases and web clients/servers, but no single end-to-end capable framework (although Mono, the open source answer to .NET, is evolving nicely).

The FOSS culture has tended to pay less attention to formal testing regimes. “Many eyes make all bugs shallow,” touts open source philosopher Eric Raymond. Indeed, the breadth and depth of the Linux user base exercises, prods and pokes the FOSS code base in ways and means unavailable to most boutique embedded platforms. However, Linux and FOSS have much to learn from the formal testing regimes of proprietary OS suppliers. Today, most formal testing comes from commercial FOSS-based OSVs (Red Hat, MontaVista and others), but centralized community-based testing is catching up, as with the test projects hosted by the Linux Foundation, the home of the Linux Standard Base.

The intent here has not been to promote Linux and FOSS as a candidate platform for end-to-end infrastructure. Rather, it has been to explain why Linux and FOSS are already attaining the status of a ubiquitous platform, one that spans the continuum from server to desktop to blades to embedded. Certainly viable alternatives to Linux and FOSS exist at each node; readers need only look to the fragmented embedded OS market for examples of this long tail. End-to-end, a few contenders today pretend to bridge those nodes in a unified fashion, but fall short because of a mix of proprietary burdens and fragmented code bases.

This momentum enjoyed by Linux belies the well-known shortcomings of FOSS. Indeed, in many cases, Linux and FOSS are not deployed because of their attributes, but in spite of them. However, only Linux and FOSS have accrued the unity, critical mass and evolutionary velocity to qualify for this strategic platform role.

Figure 2

The long reach of Linux scalability. [Figure labels: Massively Scaled; Robust Enterprise; Client Server; Deeply Embedded.]

Linux Phone Standards (LiPS) Forum. Linux Pundit.




aTCA-6892 from Adlink Technology offers remote setup, and can be configured with the FC links at the front panel via SFP optical transceivers, or to the backplane with Zone 2 connectors, supporting PICMG 3.1 option 4/7 connectivity. When combined with FC switch and storage blades, the aTCA-6892 can directly access integrated, high-performance Storage Area Networks (SAN) in the same chassis without the need for additional fiber optic cabling. By combining the dual 64-bit Low Voltage Intel Xeon processors with up to 16 Gbytes of dual-channel PC3200 DDR2 REG/ECC memory, the aTCA-6892 provides computing power for mission-critical applications. In addition to the Fibre Channel links, the PICMG 3.1-compliant dual 1000BASE-BX GbE fabric interface ports offer gigabit-speed data transport options inside the shelf. The front of the processor blade offers a variety of I/O options including analog UXGA graphics, USB v2.0 ports and a serial console. The blade also offers a PCI/PCI-X 64-bit/133 MHz PMC site for further function expansion as well as reserved resources for a SATA RAID 0/1-enabled RTM interface. OEM pricing starts at $3,490.

Adlink Technology, Irvine, CA. (949) 727-2099.

