RTC magazine


The magazine of record for the embedded computing industry

July 2012

www.rtcmagazine.com

Wi-Fi: Connectivity Backbone for Devices

Tools Take a High-Level Look at Code

An RTC Group Publication

Have We Yet Pulled All the Power out of Multicore?



OpenCL

Unleashes the Power of Parallel Coprocessing

TABLE OF CONTENTS
VOLUME 21, ISSUE 7

58 Rugged Router Runs Cisco IOS Software
60 Ultra Low Power ARM-Based Embedded Computer Designed for 3.5” to 12” HMI
65 3U OpenVPX SBCs Bring 10 Gig Ethernet and PCI Express 3.0

Departments

6  Editorial: The “ASIC Quandary”—How Much Can We Integrate, When and Why?
8  Industry Insider: Latest Developments in the Embedded Marketplace
10 Small Form Factor Forum: The Graying of Embedded
58 Newest Products & Technology: Embedded Technology Used by Industry Leaders

EDITOR'S REPORT: Realizing the Potential of Multicore Processors
12 Retooling Applications to Ride on Multiple Cores (Tom Williams)

TECHNOLOGY IN CONTEXT: The Expanding Roles of Non-Volatile Memory
16 The Expanding Role of Non-Volatile Memory in High-Performance Embedded Architecture (Adrian Proctor, Viking Technology)

TECHNOLOGY IN SYSTEMS: Developing Hybrid Code Using OpenCL
26 OpenCL Programming: Parallel Processing Made Faster and Easier than Ever (Todd Roberts, AMD)
32 Developing Embedded Hybrid Code Using OpenCL (Mark Benson, Logic PD)
38 Parallel Computing with AMD Fusion-Based Computer-on-Modules (John Dockstader, congatec)

TECHNOLOGY DEPLOYED: Code Requirements and Verification
44 Requirements Engineering Today (Jared Fry, LDRA Technology)
48 Out of the Passenger's Seat: Requirements Traceability to Drive the Software Development Process (Marcia Stinson, Visure)
54 Transforming Code Analysis with Visualization (Paul Anderson, GrammaTech)

Digital Subscriptions Available at http://rtcmagazine.com/home/subscribe.php


JULY 2012 Publisher PRESIDENT John Reardon, johnr@rtcgroup.com

Editorial EDITOR-IN-CHIEF Tom Williams, tomw@rtcgroup.com CONTRIBUTING EDITORS Colin McCracken and Paul Rosenfeld MANAGING EDITOR/ASSOCIATE PUBLISHER Sandra Sillion, sandras@rtcgroup.com COPY EDITOR Rochelle Cohn

Art/Production ART DIRECTOR Kirsten Wyatt, kirstenw@rtcgroup.com GRAPHIC DESIGNER Michael Farina, michaelf@rtcgroup.com LEAD WEB DEVELOPER Justin Herter, justinh@rtcgroup.com

Advertising/Web Advertising WESTERN REGIONAL ADVERTISING MANAGER Stacy Mannik, stacym@rtcgroup.com (949) 226-2024 MIDWEST REGIONAL AND INTERNATIONAL ADVERTISING MANAGER Mark Dunaway, markd@rtcgroup.com (949) 226-2023 EASTERN REGIONAL ADVERTISING MANAGER Shandi Ricciotti, shandir@rtcgroup.com (949) 573-7660

Billing Cindy Muir, cmuir@rtcgroup.com (949) 226-2021

To Contact RTC magazine: HOME OFFICE The RTC Group, 905 Calle Amanecer, Suite 250, San Clemente, CA 92673 Phone: (949) 226-2000 Fax: (949) 226-2050, www.rtcgroup.com Editorial Office Tom Williams, Editor-in-Chief 1669 Nelson Road, No. 2, Scotts Valley, CA 95066 Phone: (831) 335-1509

Published by The RTC Group Copyright 2012, The RTC Group. Printed in the United States. All rights reserved. All related graphics are trademarks of The RTC Group. All other brand and product names are the property of their holders.



Higher Performance. Lower Power. Small Form Factor.
Small SuperServers® and Motherboards Optimized for Embedded Appliances

X9SPV-F/LN4F

mini-ITX (6.7” x 6.7”)

• Mobile, 3rd generation Intel® Core™ i7/i5/i3 processors and Intel® QM77 Express Chipset
• 2x / 4x Gigabit LAN Ports, 10x USB Ports
• SATA DOM (Disk on Module) Power connector
• Dual/Quad core 17W, 25W and 35W TDP options
• Up to 16GB DDR3 SO-DIMM Memory with ECC
• 1 PCI-E 3.0 x16 expansion slot
• 2 SATA 6Gb/s and 4 SATA 3Gb/s storage interface with RAID 0/1/5/10

New

• Integrated Graphics (DX11, OpenCL 1.1 and OpenGL 3.1)
• IPMI 2.0 with dedicated LAN for Remote Management

Applications:
• SMB Servers
• Storage Head Node
• HD Video Conferencing
• Network and Security Appliance
• Digital / Network Video Surveillance

SYS-5017P-TF/TLN4F Compact Short Depth System

17.2”W x 1.7”H x 9.8”L

Next Generation Family of Motherboards for Embedded Applications

X9SCV-Q Mini-ITX

C7Q67 Micro-ATX

X9SAE ATX

www.supermicro.com/Embedded © Super Micro Computer, Inc. Specifications subject to change without notice. Intel®, the Intel® logo, Xeon®, and Xeon® Inside, are trademarks or registered trademarks of Intel Corporation in the US and /or other countries. All other brands and names are the property of their respective owners.

X9DRD-EF E-ATX


EDITORIAL JULY 2012

The “ASIC Quandary”—How Much Can We Integrate, When and Why?

Tom Williams, Editor-in-Chief

We've just gotten news of a market survey on graphics shipments that reports a decline in overall graphics shipments for the year. We will not embarrass the company by naming it because while its survey did include graphics shipped in desktops, notebooks, netbooks and PC-based commercial and embedded equipment, it did not include ARM-based tablets or, apparently, ARM-based embedded devices. It did somewhat lamely admit that tablets have changed the PC market. Duh.

If we're talking about shipments of graphics processors, both discrete and integrated onto the same die with a CPU, it makes utterly no sense to ignore the vast and growing number of devices based on ARM processors with high-performance integrated graphics. The report claims that graphics leader Nvidia has dropped shipments—but does not acknowledge shipments of multicore graphics processors integrated with ARM cores on that same company's Tegra CPU products. These processors appear in a wide variety of Android-based tablets. And, of course, there is no mention of the graphics integrated on the Apple-proprietary ARM-based CPUs built into the millions of iPads that are being sold. So what exactly is the point of all this? Go figure.

It was easier when graphics companies sold graphics processors that could be discrete or integrated into graphics add-in modules. All this integration stuff has gotten very messy. Now whether you think you need the high-end graphics capability or not, it comes with your Intel Core i5 or i7; it comes with your low-power Atom CPU, and it is a featured part of your AMD G- or R-Series APU. One of the results has been that developers really are making use of the powerful graphics that come with the chips they need for their designs, and more and more possibilities are opening up for the use of such graphics in embedded consumer and industrial applications.

It turns out, however, that this development with graphics points to a wider phenomenon. It is now possible to integrate practically anything we want onto a single silicon die. Of course, just because we can does not automatically make it a good idea. And that is the core of what we might call “the ASIC quandary.” What, exactly, are the conditions that must be met to put together a given combination of general vs. specialized functionality as hard-wired silicon? What amount of configurability and programmability? What mix of peripherals and on-chip memory of what type can be brought together that make technical, market and economic sense? And how much of it can we really take advantage of?

Of course, if you're able to write a large enough check you can now get an ASIC that integrates practically anything, but that is not the path of the majority of the industry. Interestingly, graphics is something we now understand pretty well, while multicore CPUs are something we are still struggling to fully master (see the Editor's Report in this issue).

Back in the 80s, a company named Silicon Graphics occupied a huge campus in Silicon Valley. They produced a very high-end graphics chip called the Geometry Engine that was used in their line of advanced graphics workstations. They also developed the graphics language that eventually became OpenGL, the most portable and widely used graphics software today and one that can be used for all manner of high-end GPUs. Thus, of the mix of considerations mentioned above, market volume acceptance of high-end graphics has made it a natural for integration on all the latest CPUs. Multicore processors (which also integrate advanced graphics) are certainly another case, and there is definitely enough motivation to exploit their full potential that these will long continue to gain in market acceptance.

But what else can we expect to eventually be massively integrated onto everyday CPUs? Could it be DSP? Well, that appears to be already covered by some of the massively parallel general-purpose graphics processors that are now integrated into CPUs by companies like AMD and Nvidia. There is a language that is emerging to take advantage of their power for numerically intensive processing that—perhaps not accidentally similar to OpenGL—is named OpenCL.

For the present, we are also now seeing the increasing integration not of hard-wired silicon functionality but of silicon configurability and programmability. The initial Programmable System on Chip (PSoC) from Cypress Semiconductor started as an alternative to the dauntingly huge selection of 8-bit processors that offered a dizzying assortment of peripheral combinations. PSoC has since grown to include products based on 32-bit ARM devices. More recently, devices have come out of Xilinx and Altera that combine a 32-bit multicore ARM processor with its normal peripherals on the same die with an FPGA fabric. While the market has yet to issue a final verdict here, this indicates the direction of the ASIC quandary. Add the ability to make choices and later, if a large enough trend is identified, some enterprising company may issue the stuff hardwired. The ability for truly massive integration is here. The direction it takes will depend on forces both technical and economic.



INDUSTRY INSIDER
JULY 2012

Acromag Acquires the Assets of Xembedded

On May 15, 2012, Acromag acquired the assets of Xembedded. This purchase adds the products of Xembedded to Acromag's portfolio, benefitting both groups by combining product lines. The company says that now customers will have access to a complete product solution, ranging from CPU products to I/O solutions. Xembedded products are known for quality and technology. Their focus on VME, VPX and COM Express platforms complements Acromag's FPGA, PMC, XMC and I/O technologies. This move also provides for extending production of the many legacy products that Xembedded, LLC was known for.

The new “Xembedded Group” will now join the Acromag “Embedded Solutions Group” and “Process Instrument Group.” By forming this new group, Acromag hopes to provide uninterrupted service to former Xembedded customers. Along with the products, Acromag has hired all of the employees of Xembedded in order to provide the expertise to design, manufacture and support the many products in the “Xembedded Group” portfolio. The new “Xembedded Group” will work to continue the relationship of the many sales representatives and distributors that have sold these products for years.

IBM and National Instruments Collaborate on Software Quality and Test

Software is more complex than ever. The amount of software in today's luxury car, for example, is expected to increase from almost 10 million lines of code in 2010 to 100 million in 2030. With this increased complexity come higher error rates. To address these challenges, IBM and National Instruments have joined forces to give system engineering departments the ability to trace and improve development end to end, from initial requirements through final systems test, in one single environment.

The integration will include the leading real-time and embedded test software from National Instruments and the Application Lifecycle Management (ALM) capabilities from IBM Rational. Significantly reducing time and cost, this integrated approach eliminates the need to track items manually through spreadsheets and other tools. This offering brings quality and test together, with benefits that apply to engineering departments in virtually any industry including automotive, aerospace and defense, medical and telecommunications.


For instance, an automotive cruise control system must adhere to rigorous safety test standards. Suppose that during testing the set-speed tolerance is violated for longer than the permitted period of time. This error can be flagged in the IBM and National Instruments solution. All test artifacts as well as data needed to reproduce the error are logged in the system automatically. Immediately, this error is traced to software design requirements to understand the impact of the error on customer experience. It can then be determined if this is a safety-critical issue and therefore assigned appropriate priority.

Near Sensor Computing Business Unit for Miniaturized FPGA Solutions

Interconnect Systems (ISI) has announced that it has established a new business unit for near sensor computing. Near sensor computing blends ISI's advanced packaging expertise with FPGA system design to develop and manufacture miniaturized solutions targeted at a wide range of applications, including medical imaging, night vision, industrial inspection, unmanned aerial vehicles (UAVs), helmet cameras and surveillance. Latest generation FPGAs provide small, low-cost engines that are ideal for sensor processing applications. ISI's Near Sensor Computing modules move the processor closer to the sensor, decreasing size, cost, latencies, bandwidth requirements and power consumption across the entire system. These modules are configured to order with processor, memory and I/O resources tailored to each customer's specific requirements.

Current Near Sensor Computing systems in production are the NSC-625 and the NSC-2200. The NSC-625 is a 1-inch square (25 mm) form factor, while the NSC-2200 holds a 40 x 60 mm footprint. Both utilize a miniaturized processor, memory and I/O modules delivered pre-configured to the specific requirements of the customer's near sensor computing application. These FPGA systems perform real-time image enhancement, sensor fusion and object/event recognition for any analog or digital sensor including 4 x HD IR and visible-spectrum video. Output options include 1/10 GigE and GigE Vision, LVDS, Camera Link, Aurora, RS-232, programmable pins and custom formats.

Cellular M2M Connections in the Retail Industry Surpassed 10 Million in 2011

The number of cellular M2M connections in the retail industry reached 10.3 million worldwide in 2011, according to a new research report from Berg Insight. Cellular M2M technology enables devices such as POS terminals, ATMs and ticketing machines to be used at new locations where fixed line connectivity is unavailable or impractical. The technology has a more transformational effect on markets such as vending and parking, where machine operators need to reorganize their operations in order to benefit from the availability of real-time information. Berg Insight forecasts that the number of cellular M2M connections in the global retail industry will grow at a compound annual growth rate (CAGR) of 21.6 percent during the next six years to reach 33.2 million connections in 2017. Shipments of cellular M2M devices for retail applications will at the same time increase at a CAGR of 10.7 percent from 5.2 million units in 2011 to 9.6 million units in 2017. “POS terminals will constitute the lion’s share of cellular M2M connections in the retail sector throughout the forecast period” says Lars Kurkinen, telecom analyst, Berg Insight. “But the penetration rate for cellular connectivity is today actually highest in the multi-space parking meter segment where it is well above 30 percent” he adds. Berg Insight expects that the vending machine segment will present a major opportunity for wireless connectivity in the long term, and the study indicates that vending telemetry solutions will be the fastest growing segment during the next six years.
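As a rough arithmetic check (mine, not Berg Insight's), the quoted endpoints are consistent with the stated compound annual growth rates:

\[
10.3 \times 1.216^{6} \approx 33.3 \ \text{million connections},
\qquad
5.2 \times 1.107^{6} \approx 9.6 \ \text{million units}.
\]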


Paper-Thin, Non-Flammable Battery Technology for Consumer Devices

A new battery technology aims at replacing conventional lithium-ion batteries and redefining the design of smartphones, tablets and other consumer devices by enabling thinner form factors and better performance. Infinite Power Solutions (IPS) has demonstrated advanced energy storage capacity for an organic-free, all-solid-state, thin-film battery. Using single-sided deposition, a paper-thin rechargeable cell achieved a capacity density of 1.25 mAh/cm2. IPS also announced it has already initiated talks with original equipment manufacturers for future adoption of this technology in consumer devices.

IPS has also released a white paper documenting a path to manufacturing this new battery technology using roll-to-roll manufacturing and double-sided deposition in order to achieve a capacity density of 2.5 mAh/cm2, which is equivalent to prismatic lithium-ion cells (but without the risk of thermal runaway), and at comparable production costs. This new solid-state battery could be manufactured in volume for less than $1/Wh and, when fully packaged, provide 700 Wh/liter—a 25 percent increase in energy density over today's lithium-ion cells.

High-capacity cells can be arranged in a variety of configurations, including designing a battery around the form factor of the display of a smartphone or tablet. For example, cells can be safely stacked in a parallel configuration to form a 3.95V all-solid-state, thin-film battery pack about the size and thinness of five standard playing cards, which is three times thinner than today's lithium-ion cell phone batteries of equivalent capacity. Such a battery would have a capacity of approximately 1.4 Ah with a maximum current of 7A and could serve a variety of common handheld consumer devices.
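As a rough consistency check on those figures (using a nominal playing-card footprint of about 50 cm2 and 0.3 mm per card as my assumptions, not numbers from IPS), the claims hang together:

\[
3.95\ \text{V} \times 1.4\ \text{Ah} \approx 5.5\ \text{Wh},
\qquad
\frac{5.5\ \text{Wh}}{700\ \text{Wh/l}} \approx 7.9\ \text{cm}^{3},
\]

which, spread over a 50 cm2 footprint, works out to about 1.6 mm of thickness, i.e., on the order of five stacked playing cards.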

LDRA Commits to Support All Aerospace Platforms for Certification

LDRA, a company in standards compliance, automated software verification, source code analysis and test tools, has automated the interface between its LDRA tool suite and Atmel's AVR Studio. This interface helps ensure that aerospace applications developed using Atmel's AVR ATmega128 are certifiable to DO-178C Level A. Automating the interface provides an efficient way for developers to more easily complete elements of DO-178C qualification that otherwise demand a highly skilled and labor-intensive process.

To achieve compliance with the FAA's software standard DO-178C Level A, certification applicants must prove that their system and its application work as expected not only at the source-code level, but also at the object-code level. Problems determining the cause of failures in the medical and automotive industries are also leading to the same technique becoming increasingly required in IEC 62304 and ISO 26262 certifications. For systems needing such stringent compliance, the LDRA tool suite traces requirements to the underlying assembler code to confirm that no errors have been introduced when the higher-level programming language devolves into low-level object code.

With this interface, the LDRA tool suite drives the test environment to automate the exchange of execution history and test data using Atmel's AVR Studio 4.18 command-line utilities, AVRISP mkII device and an onboard serial port. The collation of the LDRA tool suites for both Assembler and C languages with the AVR-GCC compiler streamlines the interface for the developers, enabling them to efficiently verify that the AVR-GCC-generated object code accurately and completely represents the application's C source code operational requirements. Developers can fully verify the application on target, automating the functionality required to meet the stringent demands of DO-178C Level A.

QLogic Joins the HP ProActive Insight Architecture Alliance

QLogic has announced it is a founding member of the HP ProActive Insight Architecture (PIA) Alliance, selected by HP for its networking solutions developed to the HP standards and interfaces that contribute to the performance and reliability of HP ProLiant Generation 8 (Gen8). HP ProActive Insight Architecture, now available in the new HP ProLiant Gen8 servers, transforms data center economics with built-in intelligence that improves performance and uptime, while lowering operating costs.

As a founding member of the HP PIA Alliance, QLogic will contribute to the elimination of common problems caused by human-to-technology interaction that lead to system downtime and data loss. Through embedded HP ProLiant Gen8 server technology, HP PIA Alliance members, along with HP, can evaluate data and other critical analytics to continually optimize and improve on business results.

Alliance members, drawn from leaders in memory, storage, I/O, power and infrastructure solutions, build technology components to HP ProActive Insight architecture standards and interfaces. These components help to analyze thousands of system parameters and automate processes that consume critical data center resources. Additionally, they provide insight into aspects of application performance and IT infrastructure optimization that had once been nearly impossible to capture and utilize. QLogic will use the HP PIA qualified insignia as visible assurance of meeting HP's stringent quality and performance standards.



SMALL FORM FACTOR FORUM

Colin McCracken & Paul Rosenfeld

The Graying of Embedded

Embedded computing may be heading for some new territory. And success in that territory calls for new strategies and skills. Not that embedded will ever go away, of course. Once focused on unique application I/O and driven by desktop-derived or purpose-built processors, embedded was first swamped by commoditized x86 computing platforms and is now on the cusp of inundation by another consumer platform wave.

Production volumes for applied (a.k.a. embedded) computing are dominated today by long-lifecycle motherboards in kiosks, digital signage, point of sale terminals and the like. Simple adaptations of processor vendor reference designs fueled the Asian Invasion of embedded SBCs and modules at prices even below the bill of materials costs of small form factor (SFF) embedded computers from North American and European manufacturers. Hundreds of medical, industrial, transportation, energy and military system OEMs benefited from pervasive standards-based building blocks for their line-powered, fixed, always-on devices. Though the bulk of system implementation resources shifted toward software, validation and regulatory tasks, there are still a few unsung hardware heroes to give each system its unique application I/O capabilities.

Tablets and smartphones now threaten to claim a significant number of embedded system designs. We've become spoiled by the connectivity and convenience of these consumer and enterprise gadgets. Although still not suitable for applications requiring large-scale displays and high-end graphics, it's hard to deny the economies of scale and overall utility of these lightweight ultra-portable gadgets for embedded applications. These rich open platforms are rapidly replacing purpose-built ruggedized user interface terminals for a variety of instruments and devices in all market segments. In other words, “there's an app for that!”

How can the SFF supplier community capitalize on this trend? Clearly there is very little point in creating the design equivalent of a phone or tablet. Popular x86 and Intel Architecture processors are already available in every conceivable shape and size. So why not repurpose the same cookie cutter reference design recipe for ARM modules? There is already some momentum in this direction, although the breakout of hype versus substance is yet to be seen. Besides low power, if the salient point of using ARM is low cost, then why pay the overhead for COMs with baseboards when a college student can implement the reference design on a single board in a few weeks? The real challenge and cost lies with the software. It's strange that x86 board vendors still need to learn what the RISC and microcontroller custom design community has known for decades.

Perhaps the SFF community is better off enhancing the connectivity of its x86 boards to these ARM-based consumer devices. Porting, testing and certifying operating systems and middleware stacks creates value in saving the system OEM time and money. Partnering or even merging with embedded software companies would lead to a differentiated value proposition. Multicore processors can run the GUI and OS on one core and a real-time operating system (RTOS) on the other. System OEMs prefer suppliers who stand behind their offering rather than pointing the finger at a third party. Maybe the industry is ready for another wave of consolidation, this time involving hardware and software together rather than only one or the other. Is the term “solution” overused yet?

Before settling into a funk, keep in mind that there is still the core block of application-specific I/O functionality of the device to develop or upgrade. Just a few examples include controlling a laser or reading sensors, or capturing and compressing images, or taking a measurement, or monitoring a production process or other event-driven or real-time behavior. And for those design elements, system OEMs still need to rely upon the gray-haired or no-haired engineers among us. Those who remember that embedded development environments once consisted of scopes, logic analyzers, microprocessor emulators (with real bond-out silicon), assemblers and debuggers with disassemblers and instruction trace capability. Code had to be written efficiently to fit into kilobytes of RAM and mask ROM or EPROM. Fortunately, a few old-school designers and firmware engineers are still around who know how to use these tools when exception handlers or race conditions need to be debugged. Of course, the job now is to hide the secret sauce behind pretty APIs for the young buck software engineers.

Who knows, your medical surgery apparatus or weather balloon may have its own following on Facebook or Twitter soon. Every silver lining has a touch of gray.


YOUR OFFICE. OUR ROUTER.

Utilizing the same Cisco IOS software that IT staffs are already trained on, our rugged, SWaP optimized routers are battlefield-ready. The X-ES XPedite5205 PMC router module and the SFFR small form factor router platform, working in conjunction with UHF, VHF, Wi-Fi and other radio platforms create mobile ad hoc networks to provide highly secure data, voice and video communications to stationary and mobile network nodes across both wired and wireless links. Call us today to learn more.

Connected everywhere. That’s Extreme.

Extreme Engineering Solutions 608.833.1155 www.xes-inc.com


Editor's Report: Realizing the Potential of Multicore Processors

Retooling Applications to Ride on Multiple Cores

Taking advantage of the real potentials for performance offered by multicore processors is an ongoing quest. A new tool can dissect programs into multiple threads that can run across multiple cores—with significant performance gains.

by Tom Williams, Editor-in-Chief

It goes without saying that today multicore processors are all the rage. As silicon manufacturers attempted to obtain increased performance by simply flogging the clock faster, they ran up against the very real limits of power consumption and heat dissipation. But advances in process technology and reduced geometries have enabled us to put multiple cores on the same silicon die and—hopefully—parallelize our software so that it effectively runs faster. But how successful has that really been? And have the very perceptible performance gains really been indicative of the potential that is there if it could be fully exploited? We don't really know.

Back in 2008, Steve Jobs, never one to mince words, was quoted by the New York Times as saying, “The way the processor industry is going, is to add more and more cores, but nobody knows how to program those things. I mean, two, yeah; four, not really; eight, forget it.” Of course, Steve, that is not going to stop us from trying. So among the questions are how far have we come in exploiting this potential and where can we still go?

Getting the Juice out of Multicore

First of all, the utilization of multicore architectures is taking place on several levels. Among these are the operating system and a variety of hypervisors, which are used to virtualize the multicore environment so that it can be used by more than one operating system. The next level is just starting to open up and, of course, offers the biggest potential, which is the ability of individual applications to run smoothly across multiple processor cores.

At the operating system level, for example, Windows 7 running on a quad-core Intel Core processor does take advantage of the multiple cores to the extent it can. It can launch an application on one core, drivers on another, and it can assign a whole host of background processes, of which the user is mostly never aware, across other cores. The user has no knowledge of or control over where these processes all run, but the fact that the operating system can do this does significantly increase the performance of the overall system.

Hypervisors, which come in several forms, are basically software entities that manage virtual machines so that multiple guest operating systems can run on the same hardware platform. John Blevins, director of product marketing for LynuxWorks, offers a scheme for classifying hypervisors based on their functionality. There are Type 2 hypervisors that are basically emulation programs that run under a host OS. This would allow the running of a guest operating system such as Windows under a host OS like Linux. Type 1 hypervisors are computer emulation software tightly integrated with an OS to form a single “self-hosted” environment that runs directly on the hardware. The lines between Type 1 and Type 2 hypervisors can become a bit blurred. Therefore, Blevins offers a third designation he calls the Type Zero hypervisor. This type of hypervisor is built to contain only the absolute minimum amount of software components required to implement virtualization. These hypervisors are the most secure type of hypervisor as they are “un-hosted” and are dramatically smaller than Type 1 and 2 hypervisors. Since the Type Zero hypervisor runs directly on the hardware below the operating system, it can assign a given OS to a designated core or cores, giving that OS complete protection from code running on other cores (Figure 1). Thus an OS that is multicore-aware could run on its assigned cores just as if it were on its own hardware platform. Windows 7, for example, would function in such a hypervisor environment exactly as it does on a multicore PC—within its assigned space. None of this, however, addresses the need to spread the actual applications across multiple cores to improve their performance.
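To make the idea of dedicating work to particular cores concrete at the application level, here is a minimal sketch using the standard Linux affinity call. It only illustrates OS-level core assignment; it is not how a Type Zero hypervisor such as LynxSecure partitions hardware, which happens below the operating system, and it assumes the machine actually has a core 2 to pin to.

/* Minimal sketch: restrict the calling process to a single core with the
 * standard Linux affinity API. This only illustrates the general idea of
 * dedicating work to a core; a Type Zero hypervisor such as LynxSecure
 * does its core assignment below the OS, not through this call.
 * Assumes the system has at least three cores so core 2 exists. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(2, &set);                     /* ask for core 2 only */

    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");      /* e.g. core 2 does not exist */
        return 1;
    }

    printf("This process is now restricted to core 2\n");
    /* ... core-local work would run here ... */
    return 0;
}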

Parallelizing the Application

In approaching the goal of actually getting everyday application programs to take advantage of multiple cores, the object will naturally be to first determine which parts of the code can be targeted to run on independent cores while keeping their communication and synchronization with each other and with the application as a whole. Multithreaded code offers itself to analysis and parallelization, but the question of how to approach the task is nontrivial. Steve Jobs was almost certainly correct in believing that nobody understands how to sit down at a computer and start writing multicore code from scratch. You've got to establish the structure of the program and then decide how to approach parallelization through analysis.

This is the approach of the Pareon tool recently announced by Vector Fabrics. A normal application that has not been parallelized will only run on one core. The approach is to take the complete—or fairly complete—written and debugged source code and to understand it well enough to pick out those elements that will actually provide enhanced performance if they are parallelized. That normally means that the performance increase should significantly exceed the amount of overhead involved in parallelizing and such things as delays for communication and synchronization.
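That trade-off can be put in rough Amdahl's-law terms (my formulation, not Vector Fabrics'): if a fraction p of the run time T can be spread across n cores at an added cost T_overhead for spawning, synchronization and communication, the achievable speedup is

\[
S \;=\; \frac{T}{(1-p)\,T + \dfrac{p\,T}{n} + T_{\text{overhead}}},
\]

so parallelizing a region only pays off when the time saved, pT(1 - 1/n), clearly exceeds T_overhead.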



Figure 1: The LynxSecure separation kernel/hypervisor from LynuxWorks is being called a Type Zero hypervisor because it runs directly on the hardware below the operating systems. It can assign a given OS to a core or cores and ensure that it runs exclusively on its assigned hardware. (The diagram shows partitions 0 through n running the LynxOS-SE RTOS, Windows 7, Windows XP and a secure device server above the LynxSecure separation kernel and embedded hypervisor, with direct, virtualized/shared and physical device assignment for Ethernet, graphics, USB and SATA devices.)

This is fundamentally different from what a hypervisor does. A hypervisor does not touch the code but provides virtual contexts in which it can run. Currently, Pareon supports the x86 architecture and the ARM Cortex A-9, mostly in a Linux environment. It can work with C and C++ programs including programs that use binary libraries or access databases. It can handle programs that include non-C/C++ code such as assembler, but can only parallelize those sections written in C/C++.

The process starts by building the program and running it on a model of the target hardware architecture. This allows insight into deep details of behavior including such things as bottlenecks, cache hits and misses, bus bandwidth and traffic, memory access times and more. After running this executable on the model, which is intended to give deep understanding of how the processor architecture and the code interact, Pareon performs dynamic analysis to gain information that will support deciding how best to go about an optimal parallelization. These are the initial steps of a three-stage process consisting of what Vector Fabrics describes as “insight, investigation and implementation.”

Having done the basic analysis, Pareon moves the user to the investigation stage, which actually involves more deliberate analysis by the user of the data gathered during the insight phase. For example, coverage analysis is needed to ensure that the execution profile of the application is as complete as possible so as not to miss any parts that would gain significantly from parallelization. In addition, a profiling analysis keeps track of where the actual compute cycles go. This can reveal hotspots and provide a basis to explore and find the best parallelization strategy. For example, the user can select a given loop and investigate the parallelization options.

Figure 2: This is a relatively complex loop with a number of dependencies, including communication, but it would gain significantly from parallelization. In the bar above, the red shows the overhead that would be introduced compared to the execution time. The speedup is indicated on the left.



Figure 3: While this loop could be parallelized, the view shows that the threads mostly wait on each other and therefore would offer very little performance advantage if parallelized.


The tool will alert the user to possible dependencies that would restrict partitioning. Otherwise, it presents options for loop parallelization from which the user can choose, such as the number of threads. From this the tool can show the overall performance increase, taking into account overhead involved with such things as spawning and synchronizing tasks or the effects of possible memory contention and data communication (Figure 2).

Of course, not all loops are good candidates for parallelization. If threads have to wait on each other so that one thread must complete before the next can start, parallelization is not going to be a great benefit. So a good deal of the developer's effort is devoted to understanding the code and making informed choices about what to parallelize and what not to (Figure 3). To help with such decisions there is a performance prediction feature that immediately shows the impact of parallelization on program performance.

Once the developer has gone through and examined the parallelization options, made his or her choices and gotten a view of the overall improvement, it is time to refactor the code to implement the chosen improvements. Pareon keeps track of where the developer has chosen to add parallelism, and once the strategy has been selected presents the user with detailed step-by-step instructions on how to implement that parallelism. Thus, there may not be an automated button, but every step needed to properly modify the source code is given.

Once the code is refactored, it can be recompiled using a normal C or C++ compiler. In fact, Pareon integrates with standard development tools so that familiar debuggers, profilers, text editors and other tools are available for subsequent tweaks and examination of the code. Such steps will probably be needed if the code is intended for embedded applications where timing constraints, interrupt latencies and other issues affecting deterministic behavior are critical. In fact, Vector Fabrics is not currently targeting the tool at hard real-time applications, but more at the tablet, handheld mobile and smartphone arena. The temptation to use it for embedded applications will no doubt be strong and it seems clear that this avenue will also be pursued by other players if not by Vector. The push to truly take advantage of the possibilities offered by multicore architectures is increasing—as is the number of cores per die—and we can expect further innovation in this dynamic arena.
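By way of illustration only (this is not Pareon output, and the tool's step-by-step refactoring instructions are its own), the kind of loop-level parallelism being described often ends up expressed through a standard threading mechanism such as OpenMP, which an ordinary C compiler can build with a single flag (gcc -fopenmp for the sketch below):

/* Sketch of loop-level parallelization of the kind discussed above,
 * using OpenMP for brevity. NOT Pareon output; just a generic example.
 * Each iteration is independent, so threads never wait on one another --
 * the profile of a good candidate loop. */
#include <stdio.h>
#include <omp.h>

#define N 1000000

static double a[N], b[N], c[N];

int main(void)
{
    for (int i = 0; i < N; i++) {         /* set up some input data */
        a[i] = i * 0.5;
        b[i] = i * 0.25;
    }

    /* Serial form: for (int i = 0; i < N; i++) c[i] = a[i] * b[i] + a[i];
     * The pragma spawns a team of threads and splits the iteration space
     * across cores; there is no loop-carried dependency to serialize it. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] * b[i] + a[i];

    /* By contrast, a loop such as
     *     for (int i = 1; i < N; i++) c[i] = c[i - 1] + a[i];
     * carries a dependency from one iteration to the next, so threads
     * would mostly wait on each other -- the situation shown in Figure 3. */
    printf("threads used: %d, c[1234] = %f\n", omp_get_max_threads(), c[1234]);
    return 0;
}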

LynuxWorks, San Jose, CA. (408) 979-3900. [www.lynuxworks.com].

Vector Fabrics, Eindhoven, Netherlands. +31 40 820-0960. [www.vectorfabrics.com].


USB Modules & Data Acquisition Showcase
Featuring the latest in USB Modules & Data Acquisition technologies

USB-104-HUB
• Rugged, industrialized, four-port USB hub
• Extended temperature operation (-40°C to +85°C)
• Supports bus-powered and self-powered modes
• Three power input connectors (power jack, screw terminals, or 3.5” drive Berg power connector)
• USB/104 form-factor for OEM embedded applications
• OEM version (board only) features PC/104 module size and mounting compatibility
• Includes micro-fit embedded USB header connectors
ACCES I/O Products, Inc. Phone: (858) 550-9559 Fax: (858) 550-7322 E-mail: contactus@accesio.com Web: www.accesio.com

Multifunction DAQ-PACK System Series
• Up to 128 channels, 16-bit single-ended or differential analog inputs
• Multifunction DAQ with sustained sampling speeds up to 500 kHz
• Flexible, software-configured functionality
• Wide variety of input ranges, unipolar and bipolar, programmable per 8-channel bank
• Extensive range of flexible signal conditioning types
ACCES I/O Products, Inc. Phone: (858) 550-9559 Fax: (858) 550-7322 E-mail: contactus@accesio.com Web: www.accesio.com

Get Connected with ADLINK USB2401/1901/1902/1903 DAQ Modules!
• 16-bit, 16-CH, 250 kS/s (USB-190x)
• 24-bit, 4-CH, 2 kS/s (USB-2401)
• Built-in signal conditioning enabling direct measurement of current signals, bridge-based sensors, TC and RTD (USB-2401)
• USB powered
• Easy device connection with removable screw-down terminal module
• Free U-test utility, delivering full function testing with no programming required
ADLINK Technology, Inc. Phone: (408) 360-0200 Fax: (408) 360-0222 E-mail: info@adlinktech.com Web: www.adlinktech.com

Data Acquisition Handbook
The Data Acquisition Handbook includes practical DAQ applications as well as theoretical measurement issues. Each chapter details basic DAQ and signal conditioning principles including sensor operation and the need for careful system interconnections. Download your FREE copy today!
Measurement Computing Phone: (508) 946-5100 Fax: (508) 946-9500 E-mail: info@mccdaq.com Web: www.mccdaq.com

USB Wi-Fi Modules 802.11b/g/n Compliant
• USB 2.0 hot-swappable interface
• Compatible with USB 1.1 and USB 2.0 host controllers
• Up to 300 Mbps receive and 150 Mbps transmit rate using 40 MHz bandwidth
• Up to 150 Mbps receive and 75 Mbps transmit rate using 20 MHz bandwidth
• 1 x 2 MIMO technology for exceptional reception and throughput
• 2 U.FL TX/RX antenna ports
• Wi-Fi security using WEP, WPA, WPA2
• Compact size: 1.0” x 1.0” x 0.25” (modules)
Radicom Research, Inc. Phone: (408) 383-9006 Fax: (408) 383-9007 E-mail: sales@radi.com Web: www.radi.com

Best Scope Selection
Saelig has the widest, unique selection of scopes to suit any budget and application: low-end PC scope adapters up to high-end 16-bit 12 GHz units. Standalone, remarkably economical LCD scopes, 7-in-1 toolbox essentials, up to bargain-priced 4-channel benchtop workhorses with FFT and auto-test capabilities. See our website for a great selection of spectrum analyzers, RF signal sources, and waveform generators not available anywhere else! The best selection and lowest prices are always found at www.saelig.com
Saelig Company, Inc. Phone: (585) 385-1750 Fax: (585) 385-1768 E-mail: sales@saelig.com Web: www.saelig.com



Technology in Context

The Expanding Roles of Non-Volatile Memory

The Expanding Role of Non-Volatile Memory in High-Performance Embedded Architecture

The answer to the big question, “What will be the next non-volatile memory to replace NAND flash?” is currently unclear, but due to the limitations of NAND flash (endurance, reliability, speed) when compared to DRAM, it is likely a new non-volatile memory technology will evolve.

by Adrian Proctor, Viking Technology

Over the past 20+ years, there have been numerous memory technologies brought to market with varying degrees of commercial success. Among these are static RAM (SRAM), pseudo static RAM, NOR flash, EPROM, EEPROM, DRAM and NAND flash. Generally speaking, these “memory” technologies can be split into two categories, volatile and non-volatile. Volatile memory will not retain data when power is turned off; conversely, non-volatile memory will. The two dominating memory technologies in the industry today are DRAM (volatile) and NAND flash (non-volatile). Figure 1 summarizes memory technologies as emerging, niche and those in mass production.

The memory technologies that dominate the computing industry today are DRAM and NAND flash, but it should be noted that both of these technologies have their pros and cons (Table 1). DRAM delivers the highest performance (latency/speed), with practically infinite endurance, yet it is volatile and has much lower capacity points than other memories such as NAND flash. On the other hand, NAND flash scales to high capacity, is non-volatile and relatively cheap ($/Gbit), but it is significantly slower than DRAM. Additionally, endurance and data retention are getting worse as process geometries continue to shrink, meaning that for write-intensive enterprise applications, NAND flash, in the long term, may not be an optimal memory technology.

There is much discussion in the industry as to what new universal memory technology or technologies will materialize as real contenders to displace either or both NAND flash and DRAM. Some of these newer emerging technologies include: Magnetic RAM (MRAM), Ferroelectric RAM (FRAM), Phase Change Memory (PCM), Spin-Transfer Torque RAM (STT-RAM) and Resistive RAM (ReRAM) or memristor. See sidebar “Emerging Memory Technologies Overview.” FRAM, MRAM and PCM are currently in commercial production, but still, relative to DRAM and NAND flash, remain limited to niche applications. There is a view that MRAM, STT-RAM and ReRAM are the most promising emerging technologies, but they are still many years away from competing for industry adoption.



Sharpen your engineering skills

with Intel® at RTECC Real-Time & Embedded Computing Conference August 21, 2012 Irvine, CA

Morning & Afternoon Sessions Plus Hands-On Lab

Seating Is Limited

Intel® Boot Loader Development Kit (BLDK) for Embedded Systems

Complete Agenda Available Online

Start with an overview of BLDK and complete your training with a hands-on lab. In the hands-on lab you will learn how to:
• Create a Boot Loader Development Kit (BLDK) Project
• Build a Firmware Image Using Windows Hosted Tools
• Boot an E6XX System to UEFI Shell & Explore the Various Options
• Update E6XX Firmware from UEFI Shell

Register today at www.rtecc.com

See what’s on the schedule at www.rtecc.com

August 21, 2012 Irvine, CA

Attendees who complete the class will be entered in a drawing for an Intel® Atom™ Processor E6xx System (a $300 value)



Portwell

LIGHTNING FAST BOOT: take advantage of the new Intel® Boot Loader Development Kit (Intel® BLDK)

The Intel® Boot Loader Development Kit (Intel® BLDK) is a software toolkit that allows creation of customized and optimized initialization firmware solutions for embedded Intel® processor-based platforms. The Intel BLDK enables rapid development of firmware for fixed-function embedded designs—those requiring basic initialization and functionality rather than the full capabilities delivered with a traditional BIOS.

NANO-6040 Nano-ITX Board
• Support Intel® BLDK technology
• Intel® Atom™ E6X0T processor
• Dual display: VGA and 18/24-bit LVDS
• 1 x SD socket, 1 x Mini PCI-E and 1 x PCI-E expansion slot
• Support up to 2GB DDR2 SDRAM
• Small form factor: 120(L) x 120(W) mm

PQ7-M105IT Qseven Module
• Support Intel® BLDK technology
• Intel® Atom™ E6X0T processor
• Single channel display with LVDS
• 4 x PCI Express x1 expansion slots
• CAN bus interface supported
• Support up to 2GB DDR2 SDRAM
• Small form factor: 70(L) x 70(W) mm

Portwell Industrial Embedded Solutions
• Wide selection of Mini-ITX, Nano-ITX, 3.5” ECX, COM Express and Qseven modules
• The latest Intel® Core™ i5/i7 and single-/dual-core Atom™ CPUs
• Custom carrier board design and manufacturing
• Quick time-to-market solutions
• Extended temp and ECC memory support
• Small Form Factor

ISO 9001:2008

ISO 13485:2003



Emerging Memory Technologies Overview


MRAM: Magnetic RAM MRAM is a non-volatile memory. Unlike DRAM, the data is not stored in electric charge flows, but by magnetic storage elements. The storage elements are formed by two ferromagnetic plates, each of which can hold a magnetic field, separated by a thin insulating layer. One of the two plates is a permanent magnet set to a particular polarity; the other’s field can be changed to match that of an external field to store memory. Vendors: Everspin (Freescale spin-off), Crocus Technology

STT-RAM: Spin-Transfer Torque RAM

STT-RAM is an MRAM, which is non-volatile, but with better scalability over traditional magnetic RAM. STT is an effect in which the orientation of a magnetic layer in a magnetic tunnel junction or spin valve can be modified using a spin-polarized current. Spin-transfer torque technology has the potential to make possible MRAM devices combining low current requirements and reduced cost. However, the amount of current needed to reorient the magnetization is at present too high for most commercial applications. Vendors: Samsung, SK-Hynix, Renesas, Toshiba, Everspin, Crocus Technology

PCM: Phase Change Memory PCM is a non-volatile random access memory. It utilizes the unique behavior of chalcogenide— a material that has been used to manufacture CDs—whereby the heat produced by the passage of an electric current switches this material between two states. The different states have different electrical resistance, which can be used to store data. It is expected PCM will have better scalability than other emerging technologies. Vendors: Micron, Samsung

ReRAM: Resistive RAM ReRAM is a non-volatile memory that is similar to PCM. The technology concept is that a dielectric, which is normally insulating, can be made to conduct through a filament or conduction path formed after application of a sufficiently high voltage. Arguably, this is a memristor technology and should be considered as potentially a strong candidate to challenge NAND flash. Vendors: SK-Hynix, HP, NEC, Panasonic, Samsung


Figure 1: Memories categorized as mass production, niche application or emerging technology. Volatile memories in mass production include DRAM, SRAM and pseudo SRAM, with floating body as an emerging technology; non-volatile memories in mass production include NAND, NOR, EPROM and EEPROM, with MRAM, FRAM and PCM serving niche applications and ReRAM, STT-RAM and racetrack memory as emerging technologies.

Any new technology must be able to deliver most, if not all, of the following attributes in order to drive industry adoption on a mass scale: scalability of the technology, speed of the device, power consumption better than existing memories, endurance, densities better than existing technologies, and finally cost. If the emerging technology can only manage one or two of these attributes, then, at best, it is likely to be relegated to niche applications.

So the answer to the question, “What will be the next non-volatile memory to replace NAND flash?” is almost certainly that a new non-volatile memory technology will evolve. However, it probably will not replace the current mainstream memories for at least the next 5 to 7 years. In the meantime, non-volatile DIMMs, such as the ArxCis-NV from Viking Technology, can enable increased application performance and far improved power failure / system recovery, when compared to current implementations (Figure 2).

What Is a Non-Volatile DIMM?

A non-volatile DIMM is a module that can be integrated into the main memory of high-performance embedded compute platforms, such as AdvancedTCA Blades; perform workloads at DRAM speeds; yet be persistent and provide data retention in the event of a power failure or system crash. A non-volatile DIMM is a memory subsystem that combines the speed and endurance of DRAM together with the non-volatile data retention properties of NAND flash. This marriage of DRAM and NAND technology delivers a high-speed and low-latency “non-volatile / persistent” memory module. Designed from the ground up to support unlimited read/write activity, it performs at fast DDR3 speeds and can sustain itself in the face of host power failure or a system crash.



Figure 2: The Viking ArxCis-NV fits into a socket in main memory but has power backup via a supercapacitor, making it a non-volatile DRAM.

What makes these modules different from standard DRAM modules is that in the event of a power failure or system crash, the data in the NV-DIMM is securely preserved and available almost immediately upon power being restored to the host system, much as in the case of suspend/resume. If performance is critical to business success and if minimizing downtime is an important issue, then NV-DIMMs will be extremely valuable wherever the application bottleneck is storage and I/O, and where downtime costs money.

With the recent surge in use of SSDs in high-performance compute environments, and those architectures also utilizing caching software (auto-tiering), many applications have enjoyed significant performance improvements. Therefore the SSD and software bundle has significantly improved the memory/storage gap, which has helped alleviate I/O bottlenecks (Figure 3). However, the fact remains that NAND flash SSDs are best suited for read applications, not write-intensive ones. Thus, these intelligent caching software solutions, when paired with SSDs, will utilize other system resources to ensure performance & reliability, i.e., CPU, DRAM and HDDs. Indeed most SSD caching software will utilize the host system's standard DRAM for intensive write activity to preserve the SSD and also keep the write cache in DRAM. That means this critical data is susceptible to loss in the event of a system crash or power failure. A common work-around to protect the data is to “checkpoint” the write buffer out into slower block-based storage, namely disk, at regular intervals. But “checkpointing” naturally has a negative impact on the I/O performance.

Figure 3: Server performance gap between technologies. Plotted as performance versus time (1970 to 2012), server CPU speeds have pulled far ahead of HDD performance; SSDs implemented as a higher tier of storage (Tier 0) improve I/O performance, and caching software with SSDs does a better job of bridging the I/O gap.

With NV-DIMMs integrated into the host system, data-centric and write-intensive applications will enjoy a significant increase in performance. In addition to the benefits of increased application performance, recovery is another area of significant benefit. Should the system experience a power outage, one of two scenarios will occur. Either the power failure will cause a catastrophic loss of the “in-memory state,” or the backup power supplies will enable the appliances to transfer this data held in main memory out to disk. The entire state—which could be hundreds of gigabytes of DRAM—must be saved to a storage back end. Both “saving” and “recovering/reconstructing” this amount of data, across multiple servers, will be extremely slow and place a very heavy load on the storage infrastructure, resulting in severe I/O bottlenecks.

When utilizing non-volatile DIMMs, in the event of power loss or if the system crashes unexpectedly, the NV-DIMMs allow the system to recover their “in-memory state” almost instantaneously without putting any load on the storage back end—in a sense, making the failure appear as a suspend/resume event. This means that business-critical applications such as online transaction processing (OLTP) can be up and running again in a matter of minutes rather than hours and without the need for uninterruptible power supply (UPS) intervention.
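To make the “checkpointing” work-around concrete, here is a deliberately simplified sketch (not any vendor's caching software) of a DRAM-resident write buffer that is periodically flushed to a file on disk. The pwrite()/fsync() pair in the loop is exactly the kind of blocking storage I/O that, as described above, an NV-DIMM lets the application avoid:

/* Toy illustration of the "checkpoint" work-around described above:
 * dirty data accumulates in a DRAM buffer and is periodically written
 * out to block storage so a crash loses at most one interval's worth.
 * Not any vendor's caching software; just a sketch of the pattern. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define BUF_SIZE (1 << 20)            /* 1 MiB DRAM-resident write buffer */
#define CHECKPOINT_INTERVAL_SEC 5     /* assumed flush period */

static char write_buffer[BUF_SIZE];

int main(void)
{
    int fd = open("checkpoint.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (int epoch = 0; epoch < 3; epoch++) {
        /* ... application fills write_buffer at DRAM speed here ... */
        memset(write_buffer, epoch, sizeof(write_buffer));
        sleep(CHECKPOINT_INTERVAL_SEC);

        /* The checkpoint itself: the application stalls behind disk I/O. */
        if (pwrite(fd, write_buffer, sizeof(write_buffer), 0) < 0)
            perror("pwrite");
        if (fsync(fd) != 0)           /* force the data onto the medium */
            perror("fsync");
    }

    close(fd);
    return 0;
}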

Table 1: Pros and cons of DRAM and NAND flash technology. The two are compared on speed, power, capacity (GB), endurance, data retention (non-volatility), cost/GB, reliability, and technical improvements with geometry shrinks; as discussed in the text, DRAM leads on speed, endurance and reliability, while NAND flash leads on capacity, cost per GB and non-volatile data retention.


The Role of the Supercapacitor

In order for a non-volatile DIMM to perform its task, a small energy source is required to ensure 100% data security on system failure. Supercapacitors are an appropriate technology for use in this environment, primarily because they provide a superior solution when compared to batteries. Any technology, when new and relatively unknown, will encounter questions about long-term reliability and capabilities. There are a number of reasons why supercapacitors are suitable for use in this type of application.

Supercapacitors are highly efficient components with high current capability. Their efficiency—defined as the total charge removed divided by the total charge added to replenish the charge removed—is greater than 99%, even at very high currents, meaning that little charge is lost when charging and discharging the supercapacitor. Since supercapacitors are designed with a very low equivalent series resistance (ESR), they are able to deliver and absorb very high current. The inherent characteristics of the supercapacitor allow it to be charged and discharged at the same rates, something no battery can tolerate. In battery-based systems, you can only charge as fast as the battery will accept the charge.

Since supercapacitors operate without relying on chemical reactions, they can operate over a wide range of temperatures. On the high side, they can operate up to 65°C, and withstand storage up to 85°C, without risk of thermal runaway. On the low side, they can deliver power at temperatures as low as -40°C.

Determining battery state of charge (SOC) and state of health (SOH) is a significant consideration for robust battery systems, requiring sophisticated data acquisition, complex algorithms and long-term data integration. In comparison, it is very simple to determine the SOC and SOH of supercapacitors. At the same time, the energy storage mechanism of a supercapacitor is capable of hundreds of thousands of complete cycles with minimal change in performance. They can be cycled infrequently, where they may only be discharged a few times a year, or they may be cycled very frequently.

Life cycle and maintenance are also important factors. The energy storage mechanism of a supercapacitor is a very stable process. It is capable of many years of continuous duty with minimal change in performance. In most cases, supercapacitors are installed for the life of the system. In addition, supercapacitors cannot be overcharged or over-discharged, and can be held at any voltage at or below their rating. If kept within their wide operating ranges of voltage and temperature, they require no maintenance.

The architecture of NV-DIMMs provides a full interconnect on-module that independently allows transfer of data between the DRAM and the flash without contention for other I/O or CPU resources. Ultimately, applications can rely on the high-speed memory (DRAM) to be "persistent" and need not slow down to "checkpoint" or consume other system resources. The NV-DIMM delivers value that far surpasses a simple DRAM DIMM and SSD architecture; it is greater than the sum of the two technologies used.

Viking Technology Foothill Ranch, CA. (949) 643-7255. [www.vikingtechnology.com].



technology in systems

Developing Hybrid Code Using OpenCL

OpenCL Programming: Parallel Processing Made Faster and Easier than Ever

Newer processors may not only have multiple cores of the same architecture, they may also integrate heterogeneous computing elements. Programming such devices with a single code base has just gotten easier.

by Todd Roberts, AMD

Parallel processing isn't really new. It has been around in one form or another since the early days of computing. As traditional CPUs have become multicore parallel processors, with many cores in a socket, it has become more important for developers to embrace parallel processing architectures as a means to realize significant system performance improvements. This move toward parallel processing has been complicated by the diversity and heterogeneity of the various parallel architectures that are now available. A heterogeneous system is made up of different processors, each with specialized capabilities.

Over the last several years GPUs have been targeted as yet another source of computing power in the system. GPUs, which have always been very parallel, counting hundreds of parallel execution units on a single die, have now become increasingly programmable, to the point that it is now often useful to think of GPUs as many-core processors instead of special purpose accelerators.


All of this diversity has been reflected in a wide array of tools and programming models required for programming these architectures. This has created a dilemma for developers. In order to write high-performance code they have had to write their code specifically for a particular architecture and give up the flexibility of being able to run on different platforms. In order for programs to take advantage of increases in parallel processing power, however, they must be written in a scalable fashion. Developers need the ability to write code that can be run on a wide range of systems without having to rewrite everything for each system.

OpenCL for Unified, Portable Source Code

OpenCL, the first open and royalty-free programming standard for general-purpose parallel computations on heterogeneous systems, is quickly growing in popularity as a means for developers to preserve their expensive source code investments and easily target multicore CPUs and GPUs.

OpenCL is maintained by the Khronos Group, a not-for-profit industry consortium that creates open standards for the authoring and acceleration of parallel computing, graphics, dynamic media, computer vision and sensor processing on a wide variety of platforms and devices. Developed in an open standards committee with representatives from major industry vendors, OpenCL affords users a cross-vendor, nonproprietary solution for accelerating their applications across mainstream processing platforms, and provides the means to tackle major development challenges, such as maximizing parallel compute utilization, efficiently handling data movement and minimizing dependencies across cores. Ultimately, OpenCL enables developers to focus on applications, not just chip architectures, via a single, portable source code base. When using OpenCL, developers can use a unified tool chain and language to target all of the parallel processors currently in use. This is done by presenting the developer with an abstract platform model that



conceptualizes all of these architectures in a similar way, as well as an execution model supporting data and task parallelism across heterogeneous architectures.


Key Concepts and Workflows

OpenCL has a flexible execution model that incorporates both task and data parallelism (see sidebar "Task Parallelism vs. Data Parallelism"). Tasks themselves are comprised of data-parallel kernels, which apply a single function over a range of data elements in parallel. Data movements between the host and compute devices, as well as OpenCL tasks, are coordinated via command queues. Where the concept of a kernel usually refers to the fundamental level of an operating system, here the term identifies a piece of code that executes on a given processing element.

An OpenCL command queue is created by the developer through an API call, and associated with a specific compute device. To execute a kernel, the kernel is pushed onto a particular command queue. Enqueueing a kernel can be done asynchronously, so that the host program may enqueue many different kernels without waiting for any of them to complete. When enqueueing a kernel, the developer optionally specifies a list of events that must occur before the kernel executes. If a developer wishes to target multiple OpenCL compute devices simultaneously, the developer would create multiple command queues.

Command queues provide a general way of specifying relationships between tasks, ensuring that tasks are executed in an order that satisfies the natural dependences in the computation. The OpenCL runtime is free to execute tasks in parallel if their dependencies are satisfied, which provides a general-purpose task parallel execution model.

Figure 1 Task parallelism within a command queue: a task graph of buffer writes (Write A, B, C), kernels (Kernel A through D) and buffer reads (Read A, B), with arrows indicating dependencies between tasks.

Events are generated by kernel completion, as well as memory read, write and copy commands. This allows the developer to specify a dependence graph between kernel executions and memory transfers in a particular command queue or between command queues themselves, which the OpenCL runtime will traverse during execution. Figure 1 shows a task graph illustrating the power of this approach, where arrows indicate dependencies between tasks. For example, Kernel A will not execute until Write A and Write B have finished, and Kernel D will not execute until Kernel B and Kernel C have finished.

The ability to construct arbitrary task graphs is a powerful way of constructing task-parallel applications. The OpenCL runtime has the freedom to execute the task graph in parallel, as long as it respects the dependencies encoded in the task graph. Task graphs are general enough to represent the kinds of parallelism useful across the spectrum of hardware architectures, from CPUs to GPUs.
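To make the idea concrete, here is a rough host-side sketch (not from the original article; kernelA, the buffer names and the sizes are hypothetical, and error checking is omitted) of how the Write A/Write B to Kernel A dependency in Figure 1 could be expressed with standard OpenCL event wait lists:

cl_event write_a, write_b, kernel_a_done;
cl_event deps[2];

/* Enqueue the two buffer writes asynchronously; each returns an event. */
clEnqueueWriteBuffer( queue, buf_a, CL_FALSE, 0, size_a, host_a, 0, NULL, &write_a );
clEnqueueWriteBuffer( queue, buf_b, CL_FALSE, 0, size_b, host_b, 0, NULL, &write_b );

/* Kernel A may not start until both writes have completed. */
deps[0] = write_a;
deps[1] = write_b;
clEnqueueNDRangeKernel( queue, kernelA, 1, NULL, &global_size, NULL,
                        2, deps, &kernel_a_done );

/* Later enqueues (for example, Kernel B) can in turn wait on kernel_a_done. */

The runtime is then free to reorder or overlap the enqueued work in any way that still satisfies those events.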



Task Parallelism vs. Data Parallelism

OpenCL supports task-parallel and data-parallel programming models, each optimized for different types of problems and processing platforms.

Task parallelism is the simultaneous execution on multiple cores of many different functions across the same or different datasets, and is ideally suited for multicore CPUs. In this model, an instance of code is executed on a device independent of other operations being executed on other devices. Traditionally this has been in the form of a thread of code running on a CPU core. In OpenCL this can be a kernel executing on the CPU or GPU. Parallelism occurs when multiple threads or kernels are executing at the same time.

Data parallelism is the simultaneous execution on multiple cores of the same function across the elements of a dataset, and is ideally suited for GPUs. In the data-parallel programming model, a computation is defined in terms of a sequence of instructions executed on multiple elements of a memory object. These elements are typically arranged in an index space, which defines how the execution maps onto the work items.

Besides the task-parallel constructs provided in OpenCL, which allow synchronization and communication between kernels, OpenCL supports local barrier synchronizations within a work group. This mechanism allows work items to coordinate and share data in the local memory space using only very lightweight and efficient barriers. Work items in different work groups should never try to synchronize or share data, since the runtime provides no guarantee that all work items are concurrently executing, and such synchronization easily introduces deadlocks. Developers are also free to construct multiple command queues, either for parallelizing an application across multiple compute devices, or for expressing more parallelism via completely independent streams of computation. OpenCL's ability to use both data and task parallelism simultaneously is a great benefit to parallel application developers, regardless of their intended hardware target.

Kernels

As mentioned, OpenCL kernels provide data parallelism.



Figure 2 Executing kernels - work groups and work items.

The kernel execution model is based on a hierarchical abstraction of the computation being performed. OpenCL kernels are executed over an index space, which can be 1, 2 or 3 dimensional. In Figure 2, we see an example of a 2-dimensional index space, which has Gx * Gy elements. For every element of the kernel index space, a work item will be executed. All work items execute the same program, although their execution may differ due to branching based on data characteristics or the index assigned to each work item.

The index space is regularly subdivided into work groups, which are tilings of the entire index space. In Figure 2, we see a work group of size Sx * Sy elements. Each work item in the work group receives a work group ID, labeled (wx, wy) in the figure, as well as a local ID, labeled (sx, sy) in the figure. Each work item also receives a global ID, which can be derived from its work group and local IDs.

Work items in different work groups may coordinate execution through the use of atomic memory transactions, which are an OpenCL extension supported by some OpenCL runtimes. For example, work items may append variable numbers of results to a shared queue in global memory. However, it is good practice for work items not to attempt to communicate directly in general, because without careful design, scalability and deadlock can become difficult problems.

void trad_mul(int n, const float *a, const float *b, float *c)
{
    int i;
    for (i = 0; i < n; i++)
        c[i] = a[i] * b[i];
}

Figure 3 Example of traditional loop (scalar).

The hierarchy of synchronization and communication provided by OpenCL is a good fit for many of today's parallel architectures, while still providing developers the ability to write efficient code, even for parallel computations with non-trivial synchronization and communication patterns. The work items may only communicate and synchronize locally, within a work group, via a barrier mechanism. This provides scalability, traditionally the bane of parallel programming. Because communication and synchronization at the finest granularity are restricted in scope, the OpenCL runtime has great freedom in how work items are scheduled and executed.
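As an illustration of work items, local IDs and work-group barriers (a sketch only, not from the original article; the kernel and buffer names are invented), a per-work-group reduction might look like this in OpenCL C:

__kernel void group_sum(__global const float *in,
                        __global float *partial_sums,
                        __local float *scratch)
{
    size_t gid = get_global_id(0);    /* index into the global NDRange  */
    size_t lid = get_local_id(0);     /* index within this work group   */
    size_t lsz = get_local_size(0);   /* number of work items per group */

    /* Each work item stages one element in fast, group-shared local memory. */
    scratch[lid] = in[gid];

    /* Every work item in the group must arrive here before any of them
       reads a value written by a neighbor. */
    barrier(CLK_LOCAL_MEM_FENCE);

    /* Tree reduction within the work group (assumes lsz is a power of two). */
    for (size_t offset = lsz / 2; offset > 0; offset /= 2) {
        if (lid < offset)
            scratch[lid] += scratch[lid + offset];
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    /* One partial result per work group; no synchronization across groups. */
    if (lid == 0)
        partial_sums[get_group_id(0)] = scratch[0];
}

The host would size the __local argument with clSetKernelArg(kernel, 2, local_size * sizeof(float), NULL) and combine the per-group partial sums afterward.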

A Typical OpenCL Kernel

As already discussed, the core programming goal of OpenCL is to provide programmers with a data-parallel execution model. In practical terms this means that programmers can define a set of instructions that will be executed on a large number of data items at the same time. The most obvious example is to replace loops with functions (kernels) executing at each point in a problem domain.

Referring to Figures 3 and 4, let's say you wanted to process a 1024 x 1024 image (your global problem dimension). You would initiate one kernel execution per pixel (1024 x 1024 = 1,048,576 kernel executions). Figure 3 shows sample scalar code for processing an image. If you were writing very simple C code you would write a simple for loop, and in this for loop you would go from 0 to N-1 and then perform your computation. An alternate way to do this would be in a data-parallel fashion (Figure 4): in this case you logically read one element in parallel from all of a (*a), multiply it by an element of b in parallel and write it to your output. You'll notice that in Figure 4 there is no for loop—you get an ID value, read a value from a, multiply it by a value from b and then write the output.

As stated above, a properly written OpenCL application will operate correctly on a wide range of systems. While this is true, it should be noted that each system and compute device available to OpenCL may have different resources and characteristics that allow, and sometimes require, some level of tuning to achieve optimal performance. For example, OpenCL memory object types and sizes can impact performance. In most cases key parameters can be gathered from the OpenCL runtime to tune the operation of the application. In addition, each vendor may choose to provide extensions that offer more options to tune your application. In most cases these are parameters used with the OpenCL API and should not require extensive rewrite of the algorithms.
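For instance (a sketch, not from the article; it assumes a device handle obtained as in Code Block 1, and the variable names are illustrative), a host program can query a few of these parameters before choosing work-group sizes:

size_t max_wg_size;
cl_uint compute_units;
cl_ulong local_mem_bytes;

/* Limits that typically guide work-group sizing and local-memory usage. */
clGetDeviceInfo( device, CL_DEVICE_MAX_WORK_GROUP_SIZE,
                 sizeof(max_wg_size), &max_wg_size, NULL );
clGetDeviceInfo( device, CL_DEVICE_MAX_COMPUTE_UNITS,
                 sizeof(compute_units), &compute_units, NULL );
clGetDeviceInfo( device, CL_DEVICE_LOCAL_MEM_SIZE,
                 sizeof(local_mem_bytes), &local_mem_bytes, NULL );

printf("max work-group size %zu, %u compute units, %llu bytes local memory\n",
       max_wg_size, compute_units, (unsigned long long)local_mem_bytes);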

Building an OpenCL Application

An OpenCL application is built by first querying the runtime to determine which platforms are present. There can be any number of different OpenCL implementations installed on a single system.

A minimalist OpenCL program

#include <CL/cl.h>
#include <stdio.h>

#define NWITEMS 512

// A simple memset kernel
const char *source =
"__kernel void memset( __global uint *dst )        \n"
"{                                                  \n"
"    dst[get_global_id(0)] = get_global_id(0);      \n"
"}                                                  \n";

int main(int argc, char **argv)
{
    // 1. Get a platform.
    cl_platform_id platform;
    clGetPlatformIDs( 1, &platform, NULL );

    // 2. Find a GPU device.
    cl_device_id device;
    clGetDeviceIDs( platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL );

    // 3. Create a context and command queue on that device.
    cl_context context = clCreateContext( NULL, 1, &device, NULL, NULL, NULL );
    cl_command_queue queue = clCreateCommandQueue( context, device, 0, NULL );

    // 4. Perform runtime source compilation, and obtain kernel entry point.
    cl_program program = clCreateProgramWithSource( context, 1, &source, NULL, NULL );
    clBuildProgram( program, 1, &device, NULL, NULL, NULL );
    cl_kernel kernel = clCreateKernel( program, "memset", NULL );

    // 5. Create a data buffer.
    cl_mem buffer = clCreateBuffer( context, CL_MEM_WRITE_ONLY,
                                    NWITEMS * sizeof(cl_uint), NULL, NULL );

    // 6. Launch the kernel. Let OpenCL pick the local work size.
    size_t global_work_size = NWITEMS;
    clSetKernelArg( kernel, 0, sizeof(buffer), (void *)&buffer );
    clEnqueueNDRangeKernel( queue, kernel, 1, NULL, &global_work_size,
                            NULL, 0, NULL, NULL );
    clFinish( queue );

    // 7. Look at the results via a synchronous buffer map.
    cl_uint *ptr;
    ptr = (cl_uint *) clEnqueueMapBuffer( queue, buffer, CL_TRUE, CL_MAP_READ,
                                          0, NWITEMS * sizeof(cl_uint),
                                          0, NULL, NULL, NULL );
    int i;
    for (i = 0; i < NWITEMS; i++)
        printf("%d %d\n", i, ptr[i]);

    return 0;
}

CODE BLOCK 1


kernel void dp_mul(global const float *a,
                   global const float *b,
                   global float *c)
{
    int id = get_global_id(0);
    c[id] = a[id] * b[id];
}
// execute over "n" work-items

Figure 4 Data parallel OpenCL.

The desired OpenCL platform can be selected by matching the platform vendor string to the desired vendor name, such as "Advanced Micro Devices, Inc." The next step is to create a context. An OpenCL context has associated with it a number of compute devices (for example, CPU or GPU devices).
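As a brief sketch of that vendor-string match (hypothetical host code, not from the article; it assumes the same headers as Code Block 1 plus <string.h>):

cl_uint num_platforms = 0;
cl_platform_id platforms[8];
cl_platform_id chosen = NULL;

clGetPlatformIDs( 8, platforms, &num_platforms );

for (cl_uint i = 0; i < num_platforms && i < 8; i++) {
    char vendor[128];
    clGetPlatformInfo( platforms[i], CL_PLATFORM_VENDOR,
                       sizeof(vendor), vendor, NULL );
    /* Select the implementation whose vendor string matches. */
    if (strstr(vendor, "Advanced Micro Devices") != NULL) {
        chosen = platforms[i];
        break;
    }
}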

Within a context, OpenCL guarantees a relaxed consistency between these devices. This means that memory objects, such as buffers or images, are allocated per context; but changes made by one device are only guaranteed to be visible to another device at well-defined synchronization points. For this, OpenCL provides events, with the ability to synchronize on a given event to enforce the correct order of execution.
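For example (a sketch with invented names, building on the queue-creation calls shown in Code Block 1), an application using two devices in one context can gate a CPU-side kernel on the completion of a GPU-side kernel via an event:

cl_command_queue gpu_queue = clCreateCommandQueue( context, gpu_device, 0, NULL );
cl_command_queue cpu_queue = clCreateCommandQueue( context, cpu_device, 0, NULL );

cl_event produced;

/* Producer kernel runs on the GPU and signals an event when done. */
clEnqueueNDRangeKernel( gpu_queue, produce_kernel, 1, NULL,
                        &global_size, NULL, 0, NULL, &produced );
clFlush( gpu_queue );   /* make sure the GPU work is submitted */

/* Consumer kernel on the CPU waits for that event, so the shared buffer
   contents are visible at a well-defined synchronization point. */
clEnqueueNDRangeKernel( cpu_queue, consume_kernel, 1, NULL,
                        &global_size, NULL, 1, &produced, NULL );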

Most OpenCL programs follow the same pattern. Given a specific platform, select a device or devices to create a context, allocate memory, create device-specific command queues, and perform data transfers and computations. Generally, the platform is the gateway to accessing specific devices; given these devices and a corresponding context, the application is independent of the platform. Given a context, the application can:

• Create one or more command queues.
• Create programs to run on one or more associated devices.
• Create kernels within those programs.
• Allocate memory buffers or images, either on the host or on the device(s). Memory can be copied between the host and device.
• Write data to the device.
• Submit the kernel (with appropriate arguments) to the command queue for execution.
• Read data back to the host from the device.


The relationship between context(s), device(s), buffer(s), program(s), kernel(s) and command queue(s) is best seen by looking at sample code.

Example Program – Simple Buffer Write

Here is a simple programming example—a simple buffer write—with explanatory comments. This code sample shows a minimalist OpenCL C program that sets a given buffer to some value. It illustrates the basic programming steps with a minimum amount of code. This sample contains no error checks and the code is not generalized. Yet, many simple test programs might look very similar. The entire code for this sample is provided in Code Block 1.

1. The host program must select a platform, which is an abstraction for a given OpenCL implementation. Implementations by multiple vendors can coexist on a host, and the sample uses the first one available.
2. A device ID for a GPU device is requested. A CPU device could be requested by using CL_DEVICE_TYPE_CPU instead. The device can be a physical device, such as a given GPU, or an abstracted device, such as the collection of all CPU cores on the host.
3. On the selected device, an OpenCL context is created. A context ties together a device, memory buffers related to that device, OpenCL programs and command queues. Note that buffers related to a device can reside on either the host or the device. Many OpenCL programs have only a single context, program and command queue.
4. Before an OpenCL kernel can be launched, its program source is compiled, and a handle to the kernel is created.
5. A memory buffer is allocated on the device.
6. The kernel is launched. While it is necessary to specify the global work size, OpenCL determines a good local work size for this device. Since the kernel was launched asynchronously, clFinish() is used to wait for completion.
7. The data is mapped to the host for examination. Calling clEnqueueMapBuffer ensures the visibility of the buffer on the host, which in this case probably includes a physical transfer. Alternatively, we could use clEnqueueReadBuffer(), which requires a pre-allocated host-side buffer.
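For completeness, that alternative (a sketch reusing the queue, buffer and NWITEMS names from Code Block 1) would look roughly like this:

cl_uint results[NWITEMS];

/* Blocking read: copies the device buffer into a pre-allocated host array. */
clEnqueueReadBuffer( queue, buffer, CL_TRUE, 0,
                     NWITEMS * sizeof(cl_uint), results, 0, NULL, NULL );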

OpenCL affords developers an elegant, non-proprietary programming platform to accelerate parallel processing performance for compute-intensive applications. With the ability to develop and maintain a single source code base that can be applied to CPUs, GPUs and APUs with equal ease, developers can achieve significant programming efficiency gains, reduce development costs, and speed their time-to-market.

Advanced Micro Devices Sunnyvale, CA. (408) 749-4000. [www.amd.com].



technology in systems

Developing Hybrid Code Using OpenCL

Developing Embedded Hybrid Code Using OpenCL

Open Computing Language (OpenCL) is a specification and a programming framework for managing heterogeneous computing cores such as CPUs and graphics processing units (GPUs) to accelerate computationally intensive algorithms.

by Mark Benson, Director of Software Strategy, Logic PD

In recent years, the mechanism by which incremental computational performance has been achieved has shifted from clock speed to a proliferation of processing cores. This shift, driven primarily by undesirable quantum effects at higher signaling speeds and practical limits on the rates at which we can dissipate heat, has caused an acceleration of new software techniques. These techniques allow us to leverage not only homogeneous multicore CPUs, but also graphics accelerators, digital signal processors (DSPs) and field-programmable gate arrays (FPGAs) as general-purpose computing blocks to accelerate algorithms hungry for ever-higher computational performance.

Proposed by Apple and maintained by the Khronos Group, OpenCL was created to provide a portable, open programming framework that enables software to take advantage of both multicore CPUs and specialized processing cores, most notably GPUs, for non-graphical processing purposes in a highly parallel way. OpenCL is similar to OpenGL in that it is a device-agnostic open standard that anyone can adopt and use to create a custom implementation. OpenCL was also designed to work with OpenGL in that data can be shared between the frameworks—data can be crunched with OpenCL and subsequently displayed using OpenGL.

Figure 1 An OpenCL NDRange (N=2): an index space of size Gx by Gy tiled into work-groups of size Sx by Sy, each containing individual work-items identified by local coordinates (sx, sy).

The OpenCL specification was developed by a working group formed in 2008, chaired by Nvidia, and edited by Apple. Since then, backward-compatible revisions of the OpenCL specification have been released along with a set of conformance tests that can be used to demonstrate compliance. Conformant implementations of OpenCL for a given processor are available primarily from the silicon vendor (Altera, AMD, ARM, Freescale, Imagination Technologies, Intel, Nvidia, Texas Instruments, Xilinx, etc.). An OpenCL driver

from these vendors is required in order for the OpenCL framework to run on top of it. OpenCL is similar to Nvidia's CUDA, Brook from Stanford and Microsoft DirectCompute. In relation to these, OpenCL has a reputation of being open, portable, lower-level, closer to the hardware, and in some ways harder to use. Think of OpenCL as a portable hardware abstraction layer that supports parallel programming on heterogeneous cores.



OpenCL also comes with a language that is based on a subset of C99 with some additional features that support two different models of programming for parallelism: task parallelism and data parallelism.

Task parallelism is a model with which embedded engineers are most familiar. Task parallelism is commonly achieved with a multithreading OS, and leveraged so that different threads of execution can operate at the same time. When threads need to access common resources, mutexes, semaphores, or other types of locking mechanisms are used. OpenCL supports this model of programming but it is not its greatest strength.

Data parallelism is used in algorithms that use the same operation across many sets of data. In a data-parallel model, one type of operation, such as a box filter, can be parallelized such that the same micro-algorithm can be run multiple times in parallel, but each instantiation of this algorithm operates on its own subset of the data—hence the data is parallelized. This is the model of programming that OpenCL is best suited to support.

Five compatible and intersecting models of OpenCL will help explain the concepts it embodies. These are framework, platform, execution, memory and programming.

The OpenCL framework consists of a platform layer, a runtime and a compiler. The platform layer allows a host program to query available devices and to create contexts. The runtime allows a host program to manipulate contexts. The compiler creates program executables and is based on a subset of C99 with some additional language features to support parallel programming. In order for silicon vendors to provide OpenCL conformance, they need to provide an OpenCL driver that enables the framework to operate.

The platform is defined by a host that is connected to one or more devices, for example, a GPU. Each device is divided into one or more compute units, i.e., cores. Each compute unit is divided into one or more processing elements.

Execution within an OpenCL program occurs in two places: kernels that execute on devices—most commonly GPUs—and a host program that executes on a host device—most commonly a CPU. To understand the execution model, it's best to focus on how kernels execute. When a kernel is scheduled for execution by the host, an index space is defined. An instance (work item) of the kernel executes for each item in this index space. In OpenCL, the index space is represented by something called an NDRange. An NDRange is a 1-, 2- or 3-dimensional index space. A graphical representation of an NDRange is shown in Figure 1.

The host defines a context for the kernels to use. A context includes a list of devices, kernels, source code and memory objects. The context originates and is maintained by the host. Additionally, the host creates a data structure, using the OpenCL API, called a command-queue. The host, via the command-queue, schedules kernels to be executed on devices. Commands that can be placed in the command-queue include kernel execution commands, memory management commands and synchronization commands. The latter are used for constraining the order of execution of other commands. By placing commands in OpenCL command-queues, the runtime then manages scheduling those commands to completion in parallel on devices within the system.

Work items executing a kernel have access to the following types of memory (a short kernel sketch below illustrates these address spaces):

• Global memory—available to all work items in all work groups.
• Constant memory—initialized by the host, this memory remains constant through the life of the kernel.
• Local memory—memory shared by a work group.
• Private memory—memory private to a single work item.

As already mentioned, OpenCL supports two main types of programming models: data-parallel, where each processor performs the same task on different pieces of distributed data; and task-parallel, where multiple tasks operate on a common set of data. In any type of parallel programming, synchronization between parallel threads of execution must be considered. OpenCL offers three main ways to control synchronization between parallel processing activities. First, there are barriers to constrain certain work items within an index space to operate in sequence. Second, there are barriers to constrain the order of commands within the command-queue. And finally, there are events generated by commands within the command-queue.
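As a rough illustration of those four address spaces (a hypothetical kernel, not from the article; the argument names are invented), an OpenCL C kernel declares them with qualifiers:

__kernel void scale(__global const float *input,   /* global: visible to all work items    */
                    __constant float *coeff,       /* constant: set by the host, read-only */
                    __global float *output,
                    __local float *tile)           /* local: shared within one work group  */
{
    float acc;                                      /* private: one copy per work item      */
    size_t gid = get_global_id(0);
    size_t lid = get_local_id(0);

    tile[lid] = input[gid] * coeff[0];
    barrier(CLK_LOCAL_MEM_FENCE);   /* synchronize the work group before sharing tile[] */

    acc = tile[lid];
    output[gid] = acc;
}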



These events can be responded to in a way that enforces sequential operation.

Using tools like OpenCL is great for photo/video editing applications, AI systems, modeling frameworks, game physics, Hollywood rendering and augmented reality, to name a few. However, there is also an embedded profile for OpenCL defined in the specification that consists of a subset of the full OpenCL specification, targeted at embedded mobile devices. Here are some highlights of what the OpenCL Embedded Profile contains (a short query sketch appears below, after the roadmap items):

• 64-bit integers are optional
• Support for 3D images is optional
• Relaxation of rounding rules for floating point calculations
• Precision of conversions on an embedded device is clarified
• Built-in atomic functions are optional

Looking forward, the OpenCL roadmap contains a number of initiatives to take it to the next level of relevance.

High-Level Model (OpenCL-HLM): OpenCL is currently exploring ways to unify device and host execution environments via language constructs so that it is easier to use OpenCL. The hope is that by doing this, OpenCL will become even more widely adopted.

Long-Term Core Roadmap: OpenCL is continuing to look at ways to enhance the memory and execution models to take advantage of emerging hardware capabilities. Also, there are efforts underway to make the task-parallel programming model more robust with better synchronization tools within the OpenCL environment.

WebCL: OpenCL has a vision to bring parallel computation to the web via JavaScript bindings.

Standard Parallel Intermediate Representation (OpenCL-SPIR): OpenCL wants to get out of the business of creating compilers and tools and language bindings. By creating a standardized intermediate representation, bindings to new languages can be created by people outside of the OpenCL core team, enabling broader adoption and allowing the OpenCL intermediate representation to be a target of any compiler in existence, now or in the future.
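Relating back to the embedded profile above, a host program can check which profile a device reports before relying on optional features. This is a sketch only (the variable names are invented; it assumes a valid device handle and the standard C headers <stdio.h> and <string.h>):

char profile[64];
char extensions[1024];

/* CL_DEVICE_PROFILE returns "FULL_PROFILE" or "EMBEDDED_PROFILE". */
clGetDeviceInfo( device, CL_DEVICE_PROFILE, sizeof(profile), profile, NULL );
clGetDeviceInfo( device, CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL );

if (strcmp(profile, "EMBEDDED_PROFILE") == 0) {
    /* On the embedded profile, optional features such as 64-bit integers
       or built-in atomics must be confirmed via the extensions string. */
    printf("Embedded-profile device; extensions: %s\n", extensions);
}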

OpenCL has a bright future, but it has some hurdles to overcome, many of which are being addressed by current initiatives within the working group. In the next decade of computing, as we continue to see a proliferation of processing cores, both homogeneous CPUs and heterogeneous CPUs/GPUs, we will continue to have increasing needs for sophisticated software frameworks that help us take advantage of all of the hardware computing power that is available on our systems. As this trend continues, OpenCL is positioned strongly

as an open, free and maturing standard that has strong industry support and a bright future.

Logic PD Eden Prairie, MN. (952) 941-8071. [www.logicpd.com].

Khronos Group [www.khronos.org].



technology in systems

Developing Hybrid Code Using OpenCL

Parallel Computing with AMD Fusion-Based Computer-on-Modules

The integration of powerful graphics processors on the same die with multicore x86 processors is opening new areas of compute-intensive embedded applications.

by John Dockstader, congatec

Embedded computing tasks are getting more and more demanding across all applications. The same applies to the processors, which must be flexible and customizable in order to encode or decode a variety of media signals and data formats such as JPEG, MP3 and MPEG2. Depending on the specific requirements, a choice of processor types is available. If the application is highly specific and individual, a digital signal processor (DSP) is a common choice. If the application is basic enough to be handled by an x86 architecture type processor, the use of a General Purpose Computing on Graphics Processing Unit (GPGPU) can enhance performance. AMD Fusion-based Computer-on-Modules, which include AMD's integrated GPGPU, are now appearing on the market and provide compute capabilities beyond the traditional x86 performance scope (Figure 1).

For a long time CPUs have been required to offer dedicated and often parallel performance for the processing of complex algorithms on top of high generic, mostly serial, processing power. This is necessary, for instance, when encoding or decoding high definition video, processing raw data—such as in industrial image processing—or performing complex vector calculations in diagnostic medical imaging procedures.


Figure 1 Computer-on-Module Concept with AMD Fusion.

Until now, if processed in an x86 design, these tasks required high computing performance with high clock frequencies, resulting in high energy consumption and heat dissipation. While multicore technology and continuous efficiency improvements in processor technology can address these issues to a certain degree, the fact remains that speeding up the clock rate alone is not enough to meet all application requirements. For example, high 3D performance is required for appealing animation, visualization and smooth playback of HD content. The graphics core also needs to support the CPU when decoding HD videos—something that

is of particular importance in medical technology, as in 4D ultrasound or endoscopy, and also in infotainment applications. The closer the embedded application gets to the consumer sector, the higher the user expectations.

For this reason, AMD has combined both technologies in one package with the release of the embedded G-Series and R-Series platforms. Users can now take advantage of an extremely powerful graphics unit with highly scalable processor performance. The so-called Accelerated Processing Unit (APU) combines the serial processing power of the processor cores with the parallel processing power of the graphics card. This signals an end to the previous software-based division of tasks between the processor and the graphics unit. Simply put, this means the processor cores can offload parallel tasks to the graphics unit, thereby increasing the overall performance of the system far beyond what has previously been possible.

Driven by the consumer market, the performance of graphics cores has steadily increased. In particular, the 3D representation of virtual worlds has pushed the specialization of graphics cards and created a demand for high parallel processing capacity. Due to the variety of graphics data, such as the calculation of texture, volume and 3D modeling for collision detection and vertex shaders for



Figure 2 The AMD R-Series integrates two to four x86 cores along with an SIMD parallel processing engine originally designed for high-end graphics, but which can also be used for numerically intensive parallel operations. (The block diagram shows the x86 cores, the SIMD engine array, the UVD, VCE and SAMU video blocks, a dual-channel DDR3 SO-DIMM memory controller, and platform interfaces (PCI Express x16, PCIe, SATA, LPC, SPI, USB, HDMI, DVI, VGA and HD audio) reached through the unified media interface and controller hub.)

geometry calculations, the functions are no longer firmly cast in hardware, but are freely programmable. As a consequence, advanced graphics units provide an enormous and highly flexible performance potential. With the help of GPGPUs, this potential can be used not just for the calculation and representation of graphics, but also for data processing.

Possible uses include the calculation of 3D ultrasound images in medical applications, face recognition in the security sector, industrial image processing and data encryption or decryption. Certain types of data—such as from sensors, transducers, transceivers and video cameras—can be processed faster and more efficiently with dedicated processing cores than with


the generic serial computing power of x86 processors. This is due to the fact that with a GPGPU it is irrelevant whether the data generated by the program code is purely virtual or whether it is supplied via an external source. So it makes good sense to unite the CPU and GPU in an APU for an even stronger team (Figure 2). It is not so much the CPU but the APU performance that is important. This means OEMs and users need to say goodbye to the phrase “excellent CPU performance,” because processing power is no longer defined by the CPU alone. These days the graphics unit plays a crucial role as well. In addition to the pure representation of graphics, it is already used in mass applications

such as filtering algorithms of photo editing programs like Photoshop, programs for encoding and converting video data, and Adobe Flash Player.

In the past, developers struggled with the fact that traditional CPU architectures and programming tools were of limited use for vector-oriented data models with parallel multi-threading. With the introduction of AMD Fusion technology, that hurdle has been overcome. Easy-to-use APIs such as Microsoft DirectCompute or OpenCL, which are supported by AMD Fusion technology, enable application developers to efficiently harness the power of the graphics core of the APU for a variety of tasks beyond imaging—provided, of course, that the graphics core supports it. The AMD embedded G-Series and R-Series platforms, with integrated graphics, do exactly this, and AMD offers software development kits for them. This makes moving to a new type of data processing easier than ever before.

In signal processing, a GPU covers a specific application area. Even though there are fewer graphics engines compared with a DSP processor, the GPU comes up trumps on programmability. The individual engines can be used flexibly and can be allocated to different tasks. For example, it is possible to use 30 engines in parallel for fast Fourier transform (FFT), 20 engines for JPG and another 30 for MPEG2 encoding, for a total of up to 80 engines in use. For specific tasks, a GPU is therefore more efficient than a DSP. In general, applications with less data and simple algorithms are better suited in order to avoid overloading the system and memory bus. Good examples from the medical industry are portable ultrasound devices with low imaging rates or image analysis equipment. Another very exciting application is the use in multiple security testing processes to validate the authenticity of banknotes. In these applications, the developer is not tied to existing algorithms, but can program his or her own security mechanisms.

A classic DSP is often used for smaller applications such as seamless processing of digital audio or video signals. A distinction is primarily made between floating and fixed point DSPs. The DSP is optimized for a single operation, massively parallelized and achieves a fast execution speed. Typical applications include mixing consoles for sound manipulation, hard drives or speaker crossovers.
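From the software side, harnessing the APU's graphics core through OpenCL follows the usual host-code pattern. The sketch below is illustrative only (it is not from the article, assumes a platform handle has already been obtained, and omits error handling): the application asks for the integrated GPU and falls back to the x86 cores if none is exposed, so the same code base covers both ends of the performance spectrum.

cl_device_id device;
cl_int err;

/* Prefer the APU's graphics core for data-parallel work... */
err = clGetDeviceIDs( platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL );

/* ...but fall back to the x86 cores so the same code base still runs. */
if (err != CL_SUCCESS)
    err = clGetDeviceIDs( platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL );

cl_context context = clCreateContext( NULL, 1, &device, NULL, NULL, &err );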


Figure 3 AMD Fusion GPU architecture: the Fusion GPU shares system memory with the CPU. In addition to the integrated GPGPU, an external graphics processor with its own GPU memory, or a DSP, can be attached for specialized tasks.

In the future, GPGPUs will be able to fulfill even more of the classic functions of DSPs. But it is also clear that a pure DSP application will not be replaced by a GPGPU (Figure 3). For a GPGPU to perform digital signal processing effectively, the application has to support typical computing features. GPGPUs also work for “simple” embedded computing tasks. AMD Fusion technology is not exclusively positioned for specialized applications. On the contrary, the Computer-on-Module standard COM Express from congatec with AMD Fusion can be used across the entire embedded computing spectrum. Thanks to high scalability—ranging from single core processors to quad core processors based on the AMD R-Series—the new AMD platform covers approximately 80% of all application requirements in the embedded market; from low power right through to high performance applications. Breaking down the performance spectrum to known standards, we can also say that the AMD embedded G-Series platform is scalable for solutions requiring anything between an Intel Atom

and an Intel Core i5 dual core processor. It is important to note that this comparison does not take into account the superior graphics performance, which thanks to the GPGPU can also be used for other embedded computing tasks. So depending on the application, the performance potential may even be much higher.

OEMs can therefore implement their entire product range on the basis of a single processor architecture, regardless of the specific sector. This not only reduces development time, but also simplifies supply chain and lifecycle management and reduces associated costs. For OEMs and developers who prefer to use core computing components without much design effort, and who strive to optimize their supply chain management by using highly flexible COTS platforms, Computer-on-Modules are the appropriate solution.

congatec San Diego, CA. (858) 457-2600. [www.congatec.com].





technology deployed

Code Requirements and Verification

Requirements Engineering Today

Clearly defining and understanding requirements before embarking on a large and complex software development project is key to success. The ability to establish and then fully implement all requirements demands a systematic method supported by comprehensive tools.

by Marcia Stinson, Visure

Requirements engineering is the process of determining user expectations for a new or modified product. That sounds so simple, doesn't it? Of course, those who are in the field of requirements engineering—also called requirements analysis—understand the difficulty and complexity hidden within that simple statement. Anyone on a development team who has labored to deliver a project and has discovered that it wasn't, after all, what the user really wanted, understands the difficulty and complexity even more. Requirements engineering provides a powerful tool for engineering teams to streamline development, simplify project management, and deliver results on time, on budget, and with the least possible headaches.

According to a survey by the Standish Group, about half of all software projects fail to be delivered on time and within budget. Even worse, many of those that are delivered on time are not accepted by the users and require additional rework. The most common culprit? Incomplete and changing requirements. This problem appears to be an even bigger issue than challenges like lack of user involvement and lack of resources. Especially in complex projects, there are many layers of requirements that must be understood from the


very start of development. Poor requirements management can lead to the development of systems that are expensive, late to market, and missing key features. Getting the user expectations described correctly is the key to everything else. This is where requirements engineering can help. To understand the basic concept of requirements engineering, consider the V-model in Figure 1. This model has been around for years and is still recognized and used today to understand the concept of a requirements hierarchy and a related testing hierarchy. Requirements begin at a very high level of understanding and evolve into more and more technical details. Testing, on the other hand, begins at the lowest level and builds up to finally verify that the system is providing what the users have asked for and needed. Formally, requirements engineering activity is divided into requirements development and requirements management. Requirements development is composed of elicitation, analysis, specification, and validation. Requirements management is the control of the entire requirements process, especially handling any change in requirements. On the other hand, some practitioners just label the whole activity as requirements analysis.

Why Is This Important to Embedded Systems?

In the development of most systems today, the focus is on functionality. If the functionality meets the user needs, then there is usually some acceptance on the part of the users that some of the non-functional requirements may not be completely satisfied. In most systems, as long as the functionality is met, some adjustments to the non-functional requirements are more readily accepted.

Products are systems that consist of subsystems and their interfaces. Each subsystem is considered an independent system with interfaces to other subsystems. Each subsystem has its own system requirements, but because it is associated with a larger system, it will be subject to certain restrictions imposed by the system as a whole. Embedded systems are subsystems of the overall system, so their requirements are derived from system requirements. The individual component merits its own requirements analysis, modeling, and further decomposition of requirements, but these activities cannot be done in isolation—they rely on the requirements they have been given. The subsystem must meet these requirements in order to fit effectively into the overall system.



Figure 1 (diagram): the systems engineering V. Concept of operations, system requirements, high-level design and detailed design proceed down the left side of the V (definition and decomposition) to software/hardware development and field installation at the bottom; unit/device testing, subsystem verification, system verification and deployment, and system validation climb the right side (integration and recomposition), followed by operations and maintenance, changes and upgrades, and retirement/replacement. Each definition stage on the left is linked to its corresponding test plan (unit/device test plan, subsystem verification plan, system verification plan, system validation plan) along the implementation time line.

Figure 1 V-curve of requirements engineering illustrates the relationship between a requirements hierarchy and related testing hierarchy.

These requirements can be classified as functional and non-functional. Functional requirements specify capabilities, functions or actions that the system must be able to perform; that is, what the system shall do. An example of a functional requirement would be "The missile shall hit a moving target," or "The ground station shall be able to establish the missile's target." Non-functional requirements specify the qualities that the product or its functions must have. This means that non-functional requirements not only apply constraints to the system to be built but also establish its quality and actually play a key part in the development. An example of the quality aspect of a function would be "The missile shall hit a moving target at a range of 100 miles," or "The missile shall hit a moving target at a range of 4,000 miles." Thus, a single non-functional aspect may not only influence the

complete development of a system, but every constraint also has an associated cost. Therefore, requirements containing quality aspects are categorized as nonfunctional requirements, whereas requirements describing capabilities are categorized as functional requirements. Non-functional requirements that are allocated to embedded systems normally impact the entire product, including how the system is developed and maintained. Embedded systems must be designed within the context of the environment in which they will be running since the environment imposes restrictions on how the product will behave. In embedded system development, there must be an increased focus on handling and recovering from errors that may occur. Other typical nonfunctional software requirements for embedded systems include synchronous vs. non-synchronous execution, safety and

reliability, resource constraints, and autonomy, or the ability to operate without human intervention. While functionalities are normally prioritized by means of cost-benefit analysis, constraints are divided into two types of non-functional requirements: hard non-functional and soft non-functional. Hard non-functional requirements are non-negotiable and must be met or the system is considered to have failed. Soft non-functional requirements are negotiable with the customer, and failure to fulfill them will not necessarily cause the system to fail testing. In embedded systems, most of the non-functional requirements are hard requirements. There may be some room for negotiation, but normally that requires a series of trade-offs with other components of the system, not




[Figure 2 diagram, "Testing Requirements at Various Levels": "The missile shall hit the target" is verified by firing a test missile at a target; "The missile shall update its position every 30 seconds" is verified by testing in the lab that the missile updates its position every 30 seconds; "The altimeter shall read terrain data in 1 second" is verified by testing that the altimeter reads terrain data in 1 second.]

Figure 2 To ensure a successful project, requirements must be tested at various levels.

with the end users. Usually the overall system requirement is a hard requirement that has to be met. Consider a control system that is embedded in a cruise missile. The missile has to read terrain data, process the data, and update the missile's course. The quality aspect of this capability is to do so in 30 seconds. The 30 second limit is a hard non-functional requirement. There is no wiggle room in meeting it—if the requirement is not met, the missile will likely miss an update, become lost and miss the target. On the other hand, if we establish that the update shall occur every millisecond, the system might become unaffordable or not feasible with current technology—and add no value compared to the 30 second update. Knowing this will change the way developers design the software. They may break the 30 seconds down into more detailed processes, allocating a portion of the time to each: 10 seconds to read the terrain data, 15 seconds to process the data, and 5 seconds to update the missile's course, for example. There may be some negotiating among the three processes on how the 30 seconds is allocated, but in the end the update must be completed in 30 seconds in order to keep the missile on track. These requirements may change the way developers look at designing and implementing the code


required to make this happen. Getting the real requirements defined right and allocated to the individual components is a key factor and requires solid requirements engineering practices.
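To make the idea of a decomposed timing budget concrete, the sketch below shows one way a developer might enforce such sub-budgets in software. It is only an illustration of the practice described above: the phase functions, the 10/15/5 second split and the use of POSIX monotonic clocks are assumptions made for the example, not part of any actual missile software, which would rely on its own RTOS timing services.

    /* Minimal sketch: enforcing a 30 second update budget split across three phases. */
    #include <stdio.h>
    #include <time.h>

    /* Placeholder phase implementations; real ones would do the actual work. */
    static void read_terrain_data(void)    { /* ... */ }
    static void process_terrain_data(void) { /* ... */ }
    static void update_course(void)        { /* ... */ }

    /* Seconds elapsed since 'start', using a monotonic clock. */
    static double elapsed_s(struct timespec start)
    {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - start.tv_sec) + (now.tv_nsec - start.tv_nsec) / 1e9;
    }

    /* Returns 0 if every phase met its allocation, -1 on any budget overrun. */
    int run_update_cycle(void)
    {
        struct timespec start;
        clock_gettime(CLOCK_MONOTONIC, &start);

        read_terrain_data();                       /* allocated 10 s */
        if (elapsed_s(start) > 10.0) return -1;

        process_terrain_data();                    /* allocated 15 s */
        if (elapsed_s(start) > 25.0) return -1;

        update_course();                           /* allocated  5 s */
        if (elapsed_s(start) > 30.0) return -1;    /* hard 30 s requirement */

        return 0;
    }

    int main(void)
    {
        return run_update_cycle() == 0 ? 0 : 1;
    }

Checking each phase against its own allocation, rather than only the 30 second total, makes it clear which component owes the time when the budget is renegotiated.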

Collecting and Managing Requirements

To develop a successful project, collecting and managing requirements must begin on day one. It starts with the request that initiates the project and continues throughout deployment. During the initial phase of a project, requirements gathering begins as requirements analysts communicate with users and spend the time required to understand the problem they are trying to solve. Understanding the problem correctly is essential to building something that will meet the needs of the end user. Misunderstanding here will result in a project that does not function as desired from the perspective of the users. User requirements should be documented, along with business requirements (business rules) and non-functional requirements. Key to these activities are good listening and communication skills and a concerted effort to understand the system from the user's perspective. These requirements are often modeled in a use case structure to help understand the flow of the requirements. Requirements must be analyzed for completeness, correctness, and feasibility. As a simple example of this, there may be a requirement

for a missile to read terrain and update its location within two seconds, with the updates occurring every 30 seconds. These requirements are probably based on analysis that indicates this is the minimum time required to keep the missile on track. Obviously this is a requirement that must be met, and it will drive the development effort to focus on these kinds of performance requirements. Once the user requirements are defined, the process begins to detail the steps the system must provide to meet the user needs. The system requirements are derived from the user requirements. This requires a much more technical analysis and an understanding of the system that is to be built, as well as other systems that may interact with it. The requirements may be developed in phases. Some may be reused for several different systems. These kinds of situations lead to a very complex requirements model that must be thought through carefully. Attributes are usually applied to the requirements to help understand the variants and releases of the systems in which they are used. If we refer to the missile example once again, let's assume that a new missile is built that weighs less and goes even faster. This results in the need to update the location within one second, with the updates occurring every 15 seconds. This will be a variant of the missile that will have different requirements associated with it. After system requirements are developed, they must be allocated to components of the system. This is typically where an embedded system picks up requirements. For example, the first missile will go through three steps to update its location. Each step will have a time associated with it, and the total allocation of the time will be two seconds or less. First, the missile must read the terrain data, which is a requirement for the altimeter. The mission software then stores the data and compares it to the onboard map to determine whether an update is required. The missile must then perform the update. Each step will have a performance requirement associated with it, the total of which is no more than the required two seconds.



Table 1 shows a sample requirements traceability matrix (RTM) for a few requirements in a system. Now multiply that matrix by hundreds and imagine the complexity of managing that traceability. Without a requirements management tool to manage these relationships throughout the system, it is impossible to understand their effect upon one another and the impact of any changes. Think about the traceability relationships required to manage a complex system that has several layers of requirements (i.e. many subsystems). Traceability ensures that all user needs are being satisfied by the system and that no system features are added at a phase in the project that are not derived from a user need. Traceability also allows engineers to see what requirements might be impacted by a proposed change. If you consider not only the traceability, but also attributes (additional information for requirements) that must be maintained, the task grows even more complex.
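As a rough illustration of what a traceability tool automates, the sketch below encodes a handful of trace links like those in Table 1 and answers the question of which downstream system requirements and design documents are affected if a given user requirement changes. The identifiers and the flat-table representation are invented for the example; a real requirements management tool stores far richer link and attribute data.

    #include <stdio.h>
    #include <string.h>

    /* One row of a (toy) requirements traceability matrix. */
    struct trace_link {
        const char *user_req;    /* e.g. "UR1" */
        const char *system_req;  /* e.g. "SR1" */
        const char *design_doc;  /* e.g. "DD1" */
    };

    /* Links corresponding roughly to Table 1 (identifiers are illustrative). */
    static const struct trace_link rtm[] = {
        { "UR1", "SR1", "DD1" },
        { "UR1", "SR2", "DD1" },
        { "UR1", "SR3", "DD2" },
        { "UR2", "SR4", "DD3" },
        { "UR3", "SR5", "DD3" },
        { "UR3", "SR1", "DD1" },
    };

    /* List every system requirement and design document downstream of a user requirement. */
    static void impact_of(const char *user_req)
    {
        printf("Impact of a change to %s:\n", user_req);
        for (size_t i = 0; i < sizeof rtm / sizeof rtm[0]; i++) {
            if (strcmp(rtm[i].user_req, user_req) == 0)
                printf("  %s -> %s\n", rtm[i].system_req, rtm[i].design_doc);
        }
    }

    int main(void)
    {
        impact_of("UR1");   /* prints SR1/SR2/SR3 and the documents they trace to */
        return 0;
    }

Even this trivial query hints at why tool support matters: with hundreds of requirements and several layers of links, the same traversal must run in both directions and stay consistent as every artifact changes.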

Requirements and Other Development Activities

Requirements form the basis of the project information. Everything done on the project is based on the requirements. Models provide a useful tool for design teams, enabling them to better understand the system and develop requirements. A model is a representation of something real, produced to represent at least one aspect of an entity or system to be investigated by reducing complexity. Models may also be produced for testing ideas. A model may take a physical form or may be expressed in the form of a specification. Select the type of modeling you are going to do on the project carefully. Keep in mind that a single model does not represent all aspects of the project, but usually just a single view. Usually multiple models are used to describe very complex systems and each model has a specific purpose. Software developers rely on requirements engineers to give them the information they need to design and develop the system. Collaboration with the developers early in the process will help ensure technical issues are resolved as early as possible and then potential solutions are vetted with the users. In most

Requirements Traceability Matrix (RTM)

Business Requirement | User Requirement | System Requirement | Design Document
BR1.                 | UR1.             | SR1.               | DD1.
                     |                  | SR2.               | DD1.
                     |                  | SR3.               | DD2.
                     | UR2.             | SR4.               | DD3.
                     | UR3.             | SR5.               | DD3.
                     |                  | SR1.               | DD1.

Table 1 Requirements traceability matrix shows the complexity and interconnectedness of the process.

organizations, there is a gap in traceability between system requirements and the design models. It is important to trace requirements to design (textual or model-based) to ensure all requirements are being considered in the solution. This is also necessary to do a comprehensive impact analysis when changes are proposed to the requirements. Although this takes time and effort, the benefits of maintaining some kind of traceability to design artifacts are worth it. Requirements also form the basis for all testing that must take place. Part of the requirements engineering responsibility is to ensure that all requirements are tested and that the system has passed these tests. In a traceability model, all requirements should link to some kind of test to ensure that all requirements are evaluated at the correct level (Figure 2). As you can see from this example, testing at the lowest level can be done quickly. The altimeter is tested outside of the missile for performance. The next level of testing is done in a laboratory environment that simulates the missile's activity. The final test, actually firing a missile, is very costly and difficult to do. The hope is that by testing individual subsystems first, the majority of bugs will be found at this level and not at the end when the missile is actually launched.

Requirements traceability provides an essential tool to ensure project success, not just in terms of performance but also in meeting time-to-market goals. With requirements traceability, everything on the project is traced back to a requirement, ensuring that no unnecessary effort is expended. If an argument arises on the project about who is doing what, or why, or on whose direction, the team can always return to the requirements. Without a clear set of requirements to drive the direction of a project, we are like a ship floating in the ocean with no course. Visure, Montreal, Quebec. (514) 944-0154. [www.visuresolutions.com].



technology deployed Code Requirements and Verification

Out of the Passenger's Seat: Requirements Traceability to Drive the Software Development Process

Static requirements traceability has its strong and weak points. Dynamic traceability can serve as a way to adapt the requirements traceability matrix as its component parts change. The RTM can be shown not only as a way to connect many parts of a whole, but also as a way to drive the development process itself.

by Jared Fry, LDRA Technology

Requirements traceability has become ubiquitous in the software development process. While useful in many environments, it is especially so in regard to safety-critical systems. The requirements traceability matrix (RTM) is a crucial artifact for the verification process and provides insight into the many interconnected aspects of the development ecosystem. An RTM attempts to represent a complex, dynamic environment within a static context. This conversion introduces weakness into the traceability.

Static Requirements Traceability and its Drawbacks

During the software development process, many artifacts are generated with various links to one another. These artifacts are wide-ranging in scope. Everything from high- and low-level requirements, to models, designs, source code and test cases may be linked. This connection between artifacts gives a view into how each object


is decomposed into the others. A high-level requirement may be represented by several lower-level requirements, each with several models, multiple lines of source code, and numerous tests associated with them. These “many-to-many” relationships create a complex web of bonds that can be difficult to grasp (Figure 1). The use of an RTM can help to unravel that web. The matrix provides an organized structure in which these complex relationships can be understood. From any point within the RTM, developers should be able to determine what it took to get to that point and where they can go from that point. An example of this is the verification of safety-critical systems. The need arises to prove that requirements have been implemented and behave properly. The RTM can show where high-level requirements decompose into lower-level requirements, source code and associated tests. Only when these tests are completed and passed can the requirement be considered fulfilled.

Despite the many benefits of requirements traceability, it does have some drawbacks, the majority of which stem from a limited ability to represent the RTM in a sufficiently useful way. Often the matrix is kept as an addition to the end of each document or as a spreadsheet. This flat representation must be constantly maintained to stay up to date with changing artifacts. Due to the complex nature of what the RTM represents, this maintenance must be performed diligently. Unfortunately, this is not always the case. The RTM can easily be outpaced by the development of its component parts. This will usually have the effect of rendering the RTM incorrect or completely obsolete. Updates to the traceability are often held until the time of verification or artifact release. This delay introduces the potential of traceability errors. Meanwhile, the sheer complexity of the RTM makes finding and discovering errors more difficult. The challenge of finding a single requirement that is not mapped correctly or a test case that is linked to the wrong source code is compounded when there are hundreds or thousands of these artifacts. Developers often rely on the RTM to determine the cost and risk involved in making changes to specific objects. If a requirement is being considered for revision, the impact will not fall to that artifact alone—all upstream and downstream associations will be affected. Determining the scope of this impact will require the use of the RTM. If it is out of date, not maintained properly, or difficult to follow, these estimates may not correlate properly to the real world, causing potential catastrophe to the development process.

Traceability Tools to Create a Dynamic Matrix

Modern requirements traceability tools can help to alleviate some of these weaknesses. These tools have been designed to access each artifact of the development process. This includes contact with other tools such as requirements management tools, modeling tools and even testing tools. Traceability can then be created among these objects to form the RTM. Often this traceability can exist only within the tool and does not require any modification to the objects themselves.




[Figure 1 diagram: requirements artifacts (system, high-level and low-level requirements), development artifacts (SW architecture/design, source code, executable object code) and verification artifacts (test procedures, test cases, test results, review and analysis results) linked to one another in many-to-many relationships.]

Figure 1 The complicated relationships between development artifacts can make a requirements traceability matrix (RTM) difficult to understand and maintain.

Figure 2 Bidirectional traceability within a requirements traceability matrix is dynamic and immediately indicates the upstream and downstream impact of changes within the software development lifecycle. TBmanager is the component in the LDRA tool suite offering this capability.

This linking provides an easily accessible version of the RTM that replaces the previous means of storing the RTM as a static object. Ease of access is not the only advantage, though. Once the RTM has been developed, traceability tools provide some powerful features. A graphical representation of the traceability can be generated that greatly reduces the complexity inherent in the matrix. The "many-to-many" relationships can be quickly viewed and understood rather than remaining abstract to the user. Making the RTM easier to understand and view is a significant advantage for users attempting to implement requirements traceability in a dynamic way. An easier-to-digest RTM


improves the chances of finding errors in the linkage between objects. In fact, many tools will indicate to the user potential problems with the traceability, including unlinked objects and incorrect or outdated links. As the complexity of traceability becomes more transparent, the task of assessing the impact of changes is simplified. The upstream and downstream impact of a modification can now be quickly evaluated. If a requirement is to be updated, a simple analysis of the RTM can demonstrate its influence:
• Decomposed requirements or models may need to be updated.
• Source code that implements the requirement may need to be re-written.



[Figure 3 diagram: project managers manage requirements and assign verification and debug tasks; software engineers implement requirements and verify the design against the code base; test engineers verify requirements against test cases; development and build engineers map requirements to design and source code, all coordinated through the central requirements traceability matrix (RTM), which is fed by software requirements and defect reports.]

Figure 3 User roles enable verification tasks to be distributed appropriately and managed through the RTM.

• Tests that were developed to verify the old requirement may no longer be valid.
• Cost and risk can be applied to these potential changes to create a true estimate that better mirrors the real world.
When linked objects are modified, the dynamic aspects of modern traceability tools are brought into focus. The traceability utility connects to artifacts external to itself. It can detect when changes have been made to those artifacts and report to the user. Potential impacts from these changes can immediately be investigated and acted upon appropriately. This can be as simple as re-running test cases to verify that updated code still meets the linked requirement. It may also become complex enough to touch each linked artifact connected to the changed object. Regardless of complexity, a dynamic RTM can provide the necessary information in real time as changes are made. This alleviates the problem of the RTM becoming outdated or obsolete; instead, the most current version of each object will be represented in the traceability. Any impact from any changes detected will be made visible to the user, who can take the actions needed to keep the RTM in balance (Figure 2).
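A highly simplified model of that change detection is sketched below: each trace link records a fingerprint of the artifact at the time it was last verified, and a mismatch marks the linked requirement's tests as needing to be re-run. The structures, the toy hash and the artifact names are assumptions made purely for illustration; commercial traceability tools track far more state than this and integrate directly with the testing tools.

    #include <stdio.h>
    #include <string.h>

    /* Trivial content fingerprint; a real tool would use timestamps or a proper digest. */
    static unsigned long toy_hash(const char *content)
    {
        unsigned long h = 5381;
        while (*content)
            h = h * 33 + (unsigned char)*content++;
        return h;
    }

    /* A link from a requirement to the source artifact that implements it. */
    struct link {
        const char *requirement;
        const char *artifact_name;
        unsigned long verified_hash;   /* fingerprint recorded when tests last passed */
    };

    /* Flag every requirement whose artifact no longer matches its verified fingerprint. */
    static void report_stale(const struct link *links, int n,
                             const char *artifact_name, const char *current_content)
    {
        unsigned long current = toy_hash(current_content);
        for (int i = 0; i < n; i++) {
            if (strcmp(links[i].artifact_name, artifact_name) == 0 &&
                links[i].verified_hash != current) {
                printf("Re-run tests for %s: %s has changed\n",
                       links[i].requirement, artifact_name);
            }
        }
    }

    int main(void)
    {
        struct link links[] = {
            { "HLR-12", "guidance.c", toy_hash("old contents") },
        };
        /* Simulate the artifact being edited after its last verification. */
        report_stale(links, 1, "guidance.c", "new contents");
        return 0;
    }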

Traceability Matrix as a Driver of the Development Process

The next step in the evolution of requirements traceability is for it to grow from a simple artifact of the development process into the main driver of the process itself. The foundation for this expansion has already been laid. The RTM consists of all major artifacts generated by the software development process and the relationships between them. Once these links are identified, the associated tasks can be identified as well. The addition of these tasks to the RTM is the starting point of the transformation. Cutting-edge requirements tools are already implementing the necessary features to allow the RTM to reach its highest potential. For example, each implementation of a requirement in source code will require analysis and testing. These will generally be expanded on in a software testing plan that is included as an artifact in the RTM. Each test outlined in this document becomes a task that must be completed to validate the associated requirements. These tests can then be identified when validation of the requirement is needed, such as when linked source code has been modified. The ability of the traceability tools to connect with the software testing tools provides a simple and automated way to execute or update test cases when needed, and to access the results of those tests. The concept of users and roles is one of great importance. A user can be defined and assigned a role within the development




team. Each role is given certain permissions with respect to how its members may interact with the traceability (Figure 3). For example, project managers may be given full control of the RTM and its respective objects and also have the ability to assign tasks, while system designers may only be given the option of updating models, other design artifacts and their associated tasks, and software developers may only have access to the source code. Once these roles are established, the assignment of tasks can begin.

The results of work performed on these tasks are immediately made available to the traceability managers, giving an up-to-date view of the software development progress as tasks are completed. If changes are made within the RTM, then new tasks may be generated or old tasks repeated to verify that everything still works as expected. The generation of tasks within an RTM is not always a single-step process.

A change to one object in the traceability may cause a cascade of tasks across several users in multiple roles. Take an update to a requirement as an example. Updating that single requirement is a task in itself that will often fall to the manager or technical writer role. Any decomposed requirements may also need to be updated. Design models that are associated with this requirement may need to be modified by a system designer. A software engineer may need to analyze and potentially re-write any source code that was implemented for that requirement. Lastly, a design engineer may need to update test cases or create new tests that fully verify that the implementation behaves as expected. A workflow must be created and enforced to manage the many paths that are involved in the software process. Each step can be defined and assigned to a user/role. When a task in the workflow is completed, the next step in the workflow is initiated. This includes informing the assigned users that they have tasks to complete. The cycle repeats itself until all required tasks have been accomplished. By using a requirements traceability tool that can enforce a workflow in the development environment, each user becomes aware of the tasks that are assigned to them at a given time. Managers gain visibility into bottlenecks that slow down the development process and can redistribute tasks accordingly if users are overloaded with responsibilities. Requirements traceability has long been a key element in the software development lifecycle. Despite its inherent importance, it has often been lightly regarded and occasionally mistreated. By itself, a static RTM can be viewed as merely another artifact, one that has the constant potential to be obsolete or contain hard-to-diagnose errors. However, buried within the lowly RTM are the makings of a potent tool. By harnessing the best in traceability tools, the RTM can become a dynamic powerhouse that not only links artifacts, but can enforce team- or project-specific workflow to drive the entire software development process. LDRA Technology, San Bruno, CA. (650) 583-8880. [www.ldra.com].





technology deployed Code Requirements and Verification

Transforming Code Analysis with Visualization

As the volume of code becomes ever larger and more complex, more efficient methods and tools are needed to analyze it to find and correct defects. A newly emerging approach of graphical navigation can help engineers find their way through the thicket of complex interdependencies.

by Paul Anderson, GrammaTech

Code analysis tools for finding programming defects in large code bases have proven very popular in recent years because they are effective at improving software quality. These tools work by finding paths through the code that may trigger risky, undefined, or unwanted behavior. For a large code base, tools may generate many warnings, so it is important to be able to process these efficiently. Each report must be inspected by an engineer to determine if it constitutes a real problem and whether it should be corrected. If a fix is proposed, then the engineer will want to know what other parts of the code may be affected by the change. The process of inspecting a warning to determine if it warrants action is known as triage. Some studies have shown that it takes an average of ten minutes to triage a warning report, but there is a large variance. Many reports can be dealt with in a few seconds, but the more complex ones can take significant effort. They may involve unusual control flow along paths that go through several procedures located in different compilation


units, and can depend subtly on different variables. The path may be feasible and risky in some contexts, but infeasible or harmless in others. Consequently it can be tricky and time-consuming for engineers to fully understand reports. The process of remediation can be similarly difficult because a proposed fix can have wide-ranging and unexpected consequences. A small change to a single procedure can potentially affect all functionality that depends on calls to that procedure. To be efficient at deploying the fix, an engineer will want to understand the other components of the software that are most strongly dependent on the change, so that validation activities can be prioritized to focus on those components first. The process of identifying the affected components is sometimes referred to as

ripple-effect analysis or impact analysis. To deal with complexity, programs are usually designed so they can be thought about at different levels of abstraction, and implemented so that those levels are apparent in the source code. This is usually helpful but can sometimes be misleading because the implementation may diverge from the design and the boundaries of the clean abstractions may be violated. The essence of the issue is that programs can be large, complicated beasts with subtle dependences between their components. New tools are emerging that help engineers penetrate this fog of complexity. Program visualization tools are proving especially useful at helping engineers gain insight into the subtleties of their program. When used appropriately they can amplify the effectiveness of a code analysis tool. An important property of these tools that makes them effective is that the visualization is completely and automatically generated directly from the code itself. Thus the engineer can see exactly what is in the code instead of an idealized representation that may hide too many essential details. The code can be shown at different levels of abstraction from high-level modules down through compilation units, then individual procedures and finally as the text of the code itself. Until fairly recently, code visualization tools have been limited in the amount of information they can display. However, two trends have converged to make it possible to have tools that can show very large quantities of information, yet still be responsive to user actions. First, new techniques have emerged for automatically eliding information depending on the zoom level. Secondly, powerful video cards with hardware acceleration features for rendering detailed scenes have become ubiquitous, and the tools are now able to take advantage of this. The combination of these factors means that powerful new visualization techniques are feasible.


Figure 1 A visualization of the call graph showing the immediate neighbors of the function in which the bug occurs.




[Figure 2 diagram: call paths from cmd_book, cmd_pgnload, BookPGNReadFromFile, PGNReadFromFile, yylex and append_comment converge on append_str, return_append_str and strcpy.]

Figure 2 A larger fragment of the call graph. Functions in red are involved in a path that is likely to trigger the defect.

Figure 3 Top-down visualization of the source code for a medium-sized program.

Let's look at some examples of how visualization can be used to help an engineer interpret the results of a code analysis tool.

Bottom-Up Visualization

Imagine a static analysis tool has reported a potential buffer overrun. The engineer responsible for triaging this warning must ask the following questions:
• Is the warning a real defect? Static analysis tools make approximations that can cause false positives, so it is important to determine this first.
• Is the bug likely to show up in the field? Some buffer overruns are harmless, but others may cause crashes or may be critical security vulnerabilities.
• What are the consequences of this bug being triggered?
• Where was the error that gave rise to this bug? The point at which the buffer overrun occurs is seldom the exact point where the programmer erred. The error may be where the buffer was allocated or where an index into the buffer was calculated.
• How should the defect be fixed?
• Finally, are there other defects like this in other parts of the code?
These questions are all best answered by starting from the point where the error occurs and working backward and forward through the low-level components of the code. Take for example a buffer overrun found in an open-source project. The offending code appears in a function named return_append_str as shown here:

    if (!dest) {
        newloc = (char *) malloc(strlen(s))+1;
        strcpy(newloc, s);
        return newloc;
    }

In this case it is easy to confirm that this is unquestionably a real bug—the +1 is in the wrong place (it should be between the parentheses), so the call to strcpy on the following line will always overflow the buffer by two bytes.
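For reference, the corrected allocation simply moves the +1 inside the call to malloc so the buffer has room for the terminating NUL. The snippet below is shown in isolation, with the same variables as the original fragment and, like the original, without handling a failed allocation:

    if (!dest) {
        newloc = (char *) malloc(strlen(s) + 1);  /* +1 now reserves space for the NUL */
        strcpy(newloc, s);                        /* copies strlen(s) + 1 bytes, which now fits */
        return newloc;
    }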

The next question is to determine if the defect is likely to show up in the field. Note that the defect is only triggered if the true branch of the conditional is taken. Perhaps this code is never deployed in an environment where that can happen. To answer this question, it is important to consider the many ways in which the function can be called. This is where a visualization tool can begin to be helpful. Figure 1 shows a visualization of the subset of the call graph in the vicinity of the defect. In this figure, functions on the left call functions on the right. From this it can be seen that the only call to return_append_str is from the function named append_str. The user can then expand the view by working backward in the call tree to show more detail. Once enough detail has been revealed to understand the context, it becomes evident that there are several different ways in which the function containing the bug can be called. The next question is whether some or all of these are dangerous. Figure 2 shows how this can be seen in the visualization. In this case the user has asked the analysis engine to determine which of the




paths leading to return_append_str are dangerous. The red path indicates that the defect is likely to be triggered if that sequence of calls occurs. From here it is possible to show a textual representation of the call path from which it is easy to find the point of error and begin to plan a fix.

Top-Down Visualization

Not all code analysis tasks are suited to a bottom-up approach. Sometimes engineers want to take a high-level view of the code.

[Figure 4 diagram: a call graph spanning the source files of a chess program (search.c, swap.c, quiesce.c, atak.c, eval.c, iterate.c, test.c, hung.c, cmd.c and inlines.h), with an annotation noting that high modified cyclomatic complexity may indicate a function is difficult to test properly.]

Figure 4 A visualization of the call graph where the colorization indicates potentially risky values of a metric.




Large programs can contain hundreds of thousands of procedures, and there may be millions of calls between procedures. Clearly it is infeasible to display all of these at once, so visualization tool designers have developed representations that summarize that information when the program is viewed from a high level, yet allow more detail to be revealed as the user drills down to lower levels. From a code analysis point of view, a common use case is for a manager to see which high-level modules have the highest density of warnings, and to be able to drill down through sub-modules to low-level components and finally to the code itself. Figure 3 shows a sequence of screenshots from a visualization tool (this is from CodeSonar) that demonstrates top-down visualization. Here the module hierarchy of the code is derived automatically from the file and directory structure. The leftmost part shows a fully zoomed-out view of the program. When zoomed out, the low-level calling relationships between procedures are projected onto the higher levels. As the user zooms in, more details start to emerge—first subdirectories, then source files, then individual procedures. The rightmost part shows how the visualization can lead directly to the textual representation of the code. Here the layout of nodes is chosen automatically by a graph-layout algorithm. Tools usually offer users a choice of different layout strategies. For the top-level view, a "cluster" layout where link direction is indicated by tapered lines, as in Figure 3, is often the most appropriate. A left-to-right layout is commonly more useful when showing a small number of nodes, such as when operating in a bottom-up mode. Operations on the elements of these views can be used to help an engineer plan a fix to a bug. The user can select a function and with a single command can select all other functions that are transitively reachable from that function. These will be in components that may be affected by the proposed fix, so testing activities should prioritize those parts first.
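A minimal sketch of that reachability query is shown below: given a call-graph adjacency matrix, a depth-first search marks every function transitively reachable from the selected one. The graph, the function names and the selection are invented for illustration; a visualization tool would of course run the same kind of traversal over its own internal program representation.

    #include <stdio.h>

    #define NFUNCS 5

    /* Toy call graph: calls[i][j] != 0 means function i calls function j. */
    static const int calls[NFUNCS][NFUNCS] = {
        /* main     */ { 0, 1, 1, 0, 0 },
        /* parse    */ { 0, 0, 0, 1, 0 },
        /* evaluate */ { 0, 0, 0, 1, 1 },
        /* append   */ { 0, 0, 0, 0, 1 },
        /* report   */ { 0, 0, 0, 0, 0 },
    };

    static const char *names[NFUNCS] = { "main", "parse", "evaluate", "append", "report" };

    /* Depth-first search marking everything transitively reachable from function f. */
    static void mark_reachable(int f, int visited[NFUNCS])
    {
        visited[f] = 1;
        for (int callee = 0; callee < NFUNCS; callee++)
            if (calls[f][callee] && !visited[callee])
                mark_reachable(callee, visited);
    }

    int main(void)
    {
        int visited[NFUNCS] = { 0 };
        mark_reachable(1, visited);   /* functions reachable from parse() */
        for (int i = 0; i < NFUNCS; i++)
            if (visited[i] && i != 1)
                printf("%s may be affected by a change to parse\n", names[i]);
        return 0;
    }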

Additional data can be overlaid on the visualization to help users understand what parts of the code warrant attention. The warning density metric mentioned above is appropriate, but standard source code metrics may be useful too. Figure 4 shows a visualization of part of a small program where components containing functions with increasing cyclomatic complexity are highlighted in deeper shades of red. This helps users quickly see risky parts of the code. Visual representations of structures and relationships are well known to be helpful for users wishing to gain an understanding of complex systems.


Tools that generate visualizations of software systems are particularly useful when used in conjunction with code analysis tools. Together they can make the process of improving software quality much more efficient. GrammaTech, Davis, CA. (800) 329-4932. [www.grammatech.com].



PRODUCTS & TECHNOLOGY

Rugged Router Runs Cisco IOS Software
If you try to pronounce SFFR, it will probably come out "safer," which is exactly what the so-named SFFR router from Extreme Engineering Solutions purports to offer: safer, secure, encrypted communications, courtesy of the company's hardware, running Cisco IOS IP Routing software. At less than 72 cubic inches and 3.5 pounds, the SFFR router provides mobile ad hoc networking for military, heavy industry and emergency response, extending the Cisco enterprise infrastructure beyond the reach of traditional fixed-network infrastructure. The X-ES SFFR router incorporates Cisco IOS IP Routing Software with Cisco Mobile Ready Net capabilities to provide highly secure data, voice and video communications to stationary and mobile network nodes across both wired and wireless links. Combining the SFFR with UHF, VHF, Wi-Fi and other radio platforms enables integrators to create mobile, wireless, ad hoc networks without requiring a connection to central infrastructure. This rugged router, available in natural convection-cooled, conduction-cooled, or forced air-cooled enclosures, can be installed tomorrow in any vehicle or aircraft and/or deployed in the harshest of environments. Unique features of SFFR include the rugged router running Cisco IOS Release 15.2GC, Cisco Unified Communications Manager Express (CME) support and Cisco Mobile Ready Net, which allows for mobile ad hoc networking and Radio Aware Routing (RAR) with Dynamic Link Exchange Protocol (DLEP). In addition there is integrated threat control using Cisco IOS Firewall, Cisco IOS Zone-based Firewall, Cisco IOS Intrusion Prevention System (IPS) and Cisco IOS Content Filtering. Identity management is supported using authentication, authorization and accounting (AAA) and public key infrastructure. Hardware features include hardware acceleration and hardware encryption, and four integrated 10/100/1000 Ethernet ports available via commercial RJ-45, industrial IP66/67, or military D38999 connectors. The mini-system footprint is only 4.88 in. (W) x 1.90 in. (H) x 7.70 in. (L) = 71.4 in3. Included is an integrated MIL-STD-704 28V DC power supply with MIL-STD-461 E/F EMI filtering, and the whole system meets MIL-STD-810F environmental and MIL-STD-461E EMI specifications.
Extreme Engineering Solutions, Middleton, WI. (608) 833-1155. [www.xes-inc.com].

FEATURED PRODUCT

Family of 12 Volt Integrated Power IC Solutions Delivers High Power Density
While many small embedded modules may integrate a range of devices with differing power input requirements, it is desirable to have one power input to the module itself. This, then, entails regulating the power inputs to serve the different onboard needs—a job that can become complex and space-hungry. A family of fully integrated 12 volt DC/DC converters is implemented in power MOSFET technology with double the power density over alternative solutions to address this challenge.

[Figure: master/slave converter schematic. Up to 4x 15A EN23F0 devices can be paralleled for high load applications.]


According to Enpirion, which is emphasizing its focus on miniaturizing DC/DC power systems in applications such as telecommunication, enterprise, industrial, embedded computing and storage systems, the addition of the EN2300 family seeks to address this challenge with a line that includes the EN2340QI 4 amp, EN2360QI 6 amp, EN2390QI 9 amp and EN23F0QI 15 amp devices. These devices capitalize on Enpirion's PowerSoC technology, which integrates the controller, power MOSFETs, high frequency input capacitors, compensation network and inductor. The EN2300 family offers small solution sizes with highly efficient performance, high reliability and a dramatic reduction in time-to-market. Customers have already validated these benefits with more than 50 design wins ahead of the official market release. Enpirion's proprietary high-speed transistor structure is implemented in a 0.18u LDMOS process and excels at operating at high frequency while reducing switching losses. This is consistent with the focus on driving high-speed, low-loss power MOSFET technology as the key enabler for delivering the highest efficiency solutions with leading power density. The EN2300 devices offer compact solution footprints from 4 amp at 190 mm2 to 15 amp at 308 mm2, which represents up to a sixty percent area reduction versus competing alternatives at comparable performance. The devices support an input voltage range of 4.5 to 14V and an output voltage range of 0.6 to 5V. Enpirion's EN2300 devices are available now. The EN2340QI is priced at $3.56, the EN2360QI at $4.41, the EN2390QI at $6.60, and the EN23F0QI at $9.00 in volumes of 1k units. Enpirion, Hampton, NJ. (908) 894-6000. [www.enpirion.com].



Power Efficient Dual Core Processing in a Ruggedized Chassis System
An ultra-compact, fanless system is designed around the tiny VIA EPIA-P900 Pico-ITX board. The VIA AMOS-3002 from Via Technologies leverages the digital performance of the combined 1.0 GHz VIA Eden X2 dual core processor and the VIA VX900H media system processor (MSP) on the VIA EPIA-P900 board. The VIA AMOS-3002 offers a powerful, rugged and HD-ready industrial-class PC that combines 64-bit computing in an ultra-compact system. The highly integrated, all-in-one VIA VX900H boasts hardware acceleration of the most demanding codecs, including MPEG-2, WMV9 and H.264, in resolutions up to 1080p across the latest display connectivity standards, including native HDMI support, for next generation multimedia-intensive applications. The system operates completely fanlessly within a robust chassis measuring 19.7 cm x 10.4 cm x 4.9 cm (WxDxH). The VIA AMOS-3002 has a certified operating temperature of -20 to 60 degrees C, vibration tolerance of up to 5 Grms and a shock tolerance of up to 50G. The VIA AMOS-3002 is also available with the VIA EPIA-P830 featuring a 1.0GHz Nano E-Series processor, offering an operating temperature of -20 to 70 degrees C. Storage is provided through a Cfast slot for a SATA interface Flash drive, while an optional storage subsystem expansion chassis offers support for a standard 2.5" SATA drive. Comprehensive I/O functions on front and rear panels include two COM ports, six USB 2.0 ports (two of which are lockable for increased ruggedization), line-in/out, one DIO port, one VGA and one HDMI port for display connectivity and two GLAN ports for dual Gigabit networking. Optional Wi-Fi and 3G networking are available through a MiniPCIe expansion slot.


VIA Technologies, Fremont, CA. (510) 683 3300. [www.via.com.tw].

Industry-Standard Modular System Pre-Validated for Intelligent Digital Signage A new open pluggable specification (OPS)-compliant modular solution is designed to make digital signage applications more connected, intelligent and secure, which results in devices that are easier to install, use and maintain. The KOPS800 from Kontron is based on the OPS specification created by Intel to help standardize the design and development of digital signage applications that use LCD, touch screens or projector display technologies. As an industry-standard system solution that can be docked into any OPS-compliant display, the KOPS800 simplifies development, reduces implementation costs and speeds time-to-market of a wide variety of enhanced functionality and graphics-intensive digital signage that help in delivering a rich user experience for information and retail systems that will be part of the retrofit of discrete legacy systems worldwide. The Kontron KOPS800 is based on the Intel Core i7 processor architecture and the Intel 6 Series HM65 / QM67 chipset. It features a comprehensive range of externally accessible I/O including Gigabit Ethernet RJ45, two USB 3.0 ports, a HDMI connector and audio jack. The Kontron KOPS800 also supports OPS JAE interconnect I/O such as HDMI, DisplayPort and USB 2.0 and 3.0. For added security, it supports Intel vPro with Intel Active Management Technology and features 802.11 a/b/g/n Wi-Fi for wireless connectivity. It also offers up to 8 Gbyte of dual channel DDR3-1600 non-ECC system memory and 80 Gbyte mSATA HD integrated storage. The system is pre-validated for use with a Microsoft Windows Embedded OS (such as WES7 Pro 64-bit) and Intel Audience Impression Metric (Intel AIM Suite) technology based on Anonymous Viewer Analytics (AVA) software. Digital signage systems employing the Kontron KOPS800 with the AIM Suite and running a content management system (CMS) can simultaneously play high-definition video while gathering valuable viewer demographics without invading their privacy to push custom-tailored messaging to the target audience, which results in delivering a rich, immersive user experience that can offer significant infrastructure cost savings. Kontron, Poway, CA. (888) 294-4558. [www.kontron.com].

Mini-ITX Module with Third Generation Intel Core Processors
A new Mini-ITX form factor utilizes the third generation Intel Core processors. The new WADE-8013 from American Portwell is designed to provide high performance and flexibility for functional expansion, and is suitable for applications in gaming, kiosk, digital signage, medical/healthcare, defense and industrial automation and control. The third generation Intel Core processors are manufactured on the latest 22nm process technology and are the most powerful and energy efficient CPUs from Intel to date. Portwell has used this technology to create a series of products that provide smart security, intelligent cost saving management and performance for industrial platforms. Furthermore, Portwell's WADE-8013 is based on the Intel Q77 Express chipset and its third generation Core processors in an LGA1155 socket, which have integrated the memory and PCI Express controllers supporting two-channel DDR3 long DIMMs and PCI Express 3.0 to provide great graphics performance. The WADE-8013 has an extensive feature set, including SATA storage at up to 6 Gbit/s on four SATA interfaces (two SATA III and two SATA II). It also provides support for RAID 0/1/5 and 10 modes, and the latest PCIe 3.0 (one PCI Express x16 slot) to support devices at double the speed and bandwidth, which enhances system performance. The primary interfaces of the WADE-8013 include the latest USB 3.0 high-speed transmission technology, which supports 10 USB ports (four USB 3.0 ports on rear I/O and six USB 2.0 pin headers on board), two long-DIMM memory slots for DDR3 SDRAM up to 16 Gbyte, and three display outputs: VGA, HDMI and DVI-D. In addition, the WADE-8013 is equipped with dual Gigabit Ethernet connectors and an mSATA socket via Mini-PCIe, which provides customers with the ability to choose flexible devices for their storage-demanding applications.
American Portwell, Fremont, CA. (510) 403-3399. [www.portwell.com].




Ultra Low Power ARM-Based Embedded Computer Designed for 3.5" to 12" HMI


Announced by Artila Electronics, the M-606 is an ARM9 WinCE 6.0 single board computer in a standard 3.5� form factor. It is powered by a 400 MHz Atmel AT91SAM9G45 ARM9 processor and equipped with 128 Mbyte DDR2 RAM, 128 Mbyte NAND Flash and 2 Mbyte DataFlash. The M-606 provides one 10/100 Mbit/s Ethernet, four USB 2.0 hosts, three RS-232 ports, one RS-422/485 port, audio, microSD socket and LCD TTL/ LVDS interface. The advanced internal 133 MHz multi-layer bus matrix and 64 Kbyte SRAM, which can be configured as a tightly coupled memory (TCM), sustain the bandwidth required by LCD with resolution up to 1280x860. The resolution of LCD can be configured by using the LCD configuration utility included in the pre-installed WinCE 6.0. The M-606 can drive 5 VDC and 12 VDC backlight of the LCD with up to 3A current output and PWM brightness control. The M-606 supports .NET framework 2.0 and user’s application can be developed by VB .Net, C# and C/C++. In addition, a remote display control utility that provides a graphic interface for the remote user to control M-606 is also included in the WinCE 6.0. Artila Electronics, New Taipei City, Taiwan. +886.2.86.67.23.40. [www.artila.com].


PC/104-Plus Dual Channel Gigabit Ethernet Module




A two-channel, Gigabit Ethernet LAN module is designed to offer flexible, high-performance networking connectivity for industrial embedded applications. The PPM-GIGE-2 from WinSystems offers self-stacking I/O expansion on PC/104, EPIC and EBX SBCs based on the industry-standard PC/104-Plus form factor. This add-in module uses standard RJ-45 connectors to plug into 10/100/1000 Mbit/s networks using standard Category 5 (CAT5) unshielded twisted pair (UTP) copper cables. Two Realtek RTL8110s are the Ethernet controllers used by the PPMGIGE-2. They are supported by a wide range of operating systems including Windows, Linux and other x86 realtime operating systems. These onboard Gigabit Ethernet controllers combine a triple-speed, IEEE 802.3-compliant Media Access Controller (MAC) with a triple-speed Ethernet transceiver, 32-bit PCI bus controller and embedded memory. With state of-the-art DSP technology and mixed-mode signal technology, it offers high-speed transmission over CAT 5 UTP cable. Functions such as crossover detection and auto-correction, polarity correction, adaptive equalization, cross-talk cancellation, echo cancellation, timing recovery and error correction are implemented to provide robust transmission and reception capability at gigabit data speeds. The PPM-GIGE-2 requires only +5 volts at 500 mA (2.5W). It will operate from -40°C to +85°C. The module measures 90 mm x 96 mm (3.6â€? x 3.8â€?) and weighs only 88 grams (3 ounces). WinSystems also offers a single channel version of this board called the PPM-GIGE-1. Quantity one pricing for the dual channel PPM-GIGE-2 is $199, and $149 for the single channel PPM-GIGE-1. WinSystems, Arlington, TX. (817) 274-7553. [www.winsystems.com].

9/9/11 6:36:24 PM


PRODUCTS & TECHNOLOGY

EBX SBC Gains Power Thanks to Third Generation Intel Core Processor Powered by a third Generation Intel Core processor, a new EBX single board computer boasts a high performance level, combined with a highspeed PCIe expansion site that enables the integration of complex high-bandwidth functions, such as Digital Signal Processing and video processing. These applications have historically been performed with large chassis-based systems and custom hardware. The Copperhead from VersaLogic offers dual- or quad-core performance that allows high-end compute-bound and video-bound applications to now be tackled with just a single embedded computer board, not a set of boards in a rack. This opens new opportunities for automating high-end applications that need to be more portable, rugged, or lower cost than previous CPU architectures allowed. Based on the industry-standard EBX format of 5.75 x 8 inches, the Copperhead features onboard data acquisition via sixteen analog inputs, eight analog outputs and sixteen digital I/O lines and up to 16 Gbyte of DDR3 RAM. System I/O includes dual Gigabit Ethernet with network boot capability, two USB 3.0 ports, ten USB 2.0 ports, four serial ports and HD audio. Dual SATA 3 and SATA 6 interfaces support Intel Rapid Storage Manager with RAID 0, 1, 5, and 10 capabilities (SATA 6 ports only). Flash storage is provided via an mSATA socket, eUSB interface and a Mini PCIe socket. The Mini PCIe socket also accommodates plug-in Wi-Fi modems, GPS receivers, MIL-STD-1553, Ethernet channels and other plug-in mini cards. The Copperhead supports an optional TPM (Trusted Platform Module) chip for applications that require enhanced hardware-level security functions. The Copperhead offers models that support either PCIe/104 Type 1 or SUMIT expansion. The onboard expansion site provides plug-in access to a wide variety of expansion modules. The PCIe/104 Type 1 interface includes a PCIe x16 lane to support expansion with extremely high bandwidth devices. The SPX expansion interface provides additional plug-in expansion for low-cost analog, digital and CANbus I/O. Available in both standard (0° to +60°C) and industrial temperature (-40° to +85°C) versions, the rugged Copperhead boards meet MIL-STD202G specifications for mechanical shock and vibration. The high tolerance +12V power input allows the board to operate with power ranging from 9 to 15 volts. This eliminates expensive precision supplies and makes the Copperhead ideal for automotive applications. Optional high-reliability Ethernet connectors provide additional ruggedization for use in extremely harsh environments. Thermal solutions include heat sink, heat sink with fan and heat plate. For extremely high-reliability applications, IPC-A-610 Class 3 versions are available. Copperhead is RoHS compliant VersaLogic, Eugene, OR. (541) 485-8575. [www.versalogic.com].

Get ARM’d with Cogent. Visit our Website @ WWW.COGCOMP.COM

CSB1724-Armada 300: 1.6 GHz 88F6282 Sheeva Core • 1 GByte 16-bit wide DDR3-800 • Dual PCIe (x1) • Dual SATA Gen 2

CSB1726-Armada XP: 1.33 GHz quad core (MV78460) • 2 GByte 64-bit wide DDR3-1333 • PCIe: one x4 and two x1 ports • Dual SATA Gen 2

CSB1730B-Armada 500: 800 MHz 88AP510 processor • 1 GByte 32-bit wide DDR3-800 • Dual PCIe (x1) • SATA Gen 2

Designed & Manufactured In House!! 17 Industrial Dr., Smithfield, RI 02917
24 hr email support!! Sales@cogcomp.com





Wireless Sensors Receive Global Frequency Support
Wireless sensor hardware is now available from Monnit in an added range of frequencies, in both the 868 MHz and 433 MHz ISM radio frequency bands. These radio frequencies are available in addition to the standard 900 MHz wireless sensor hardware released by Monnit in 2010. The availability of these additional radio frequencies ensures that wireless sensors can be used for global applications, with the 868 MHz band primarily used in Europe, the Middle East and Africa (EMEA) and the 433 MHz band used in Asia and South America. “We have made our entire offering of wireless sensors, gateways and accessories available in 900, 868 and 433 MHz operating frequencies to address the immediate demands of our ever-growing customer base. These additional radio frequencies allow our sensors to be used worldwide, while ensuring reliable low-power and long-range operation,” said Brad Walters, founder and CEO of Monnit. Key features of Monnit wireless sensors include support for the 900, 868 and 433 MHz wireless frequencies. Cellular, Ethernet and USB gateways are also available. The wireless hardware is optimized for reliable, low-power and long-range operation, and free online sensor data storage, configuration, monitoring and alerting are available. Monnit currently provides 28 different types of wireless sensors used to detect and monitor functions that are critical to business or personal life, including temperature, humidity, water, light, access, movement and much more. Monnit’s wireless gateways transmit data between local sensor networks and the iMonnit online sensor monitoring and notification system. All Monnit wireless sensors include free basic iMonnit online sensor monitoring with SMS text and email alerting. Monnit, Kaysville, UT. (801) 561-5555. [www.monnit.com].
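The alerting flow described above reduces to comparing each reported reading against a configured limit. The short C sketch below shows only that idea; the struct, field names and threshold values are invented for illustration and are not part of Monnit's iMonnit service or API.

/* Illustrative threshold check of the kind a sensor-monitoring service
 * performs before sending an SMS or email alert. Names are hypothetical. */
#include <stdbool.h>
#include <stdio.h>

struct reading {
    const char *sensor;   /* e.g., "walk-in cooler temperature" */
    double      value;    /* reported measurement               */
    double      limit;    /* configured alert threshold         */
};

static bool needs_alert(const struct reading *r)
{
    return r->value > r->limit;
}

int main(void)
{
    struct reading r = { "walk-in cooler temperature", 8.5, 5.0 };
    if (needs_alert(&r))
        printf("ALERT: %s is %.1f (limit %.1f)\n", r.sensor, r.value, r.limit);
    return 0;
}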

Secure Hypervisor Offers Increased Endpoint and Server Protection
A new version of a secure hypervisor has been designed to offer military-grade protection for the latest generation of laptops, desktops, servers and embedded systems, helping protect these connected devices from malicious cyber attacks. With its small footprint, high performance and flexible virtualization support on the latest generation of Intel multicore processors, LynxSecure 5.1 from LynuxWorks brings the benefits of virtualization to users that have not previously had the opportunity to use virtualization because of size or security issues. LynxSecure offers a truly secure multi-domain platform, empowering users to have separate virtual machines for browsing and corporate functions on endpoints, and also giving the ability to securely host multiple tenants on a single blade for cloud and server implementations. The latest LynxSecure version, as demonstrated on one of the latest generation of endpoints, a Dell XPS13 Ultrabook, showcases new key features such as focus-based processor optimization, a new secure console for user interaction with the secure platform, and new virtual network support for increased inter-virtual machine communication. LynxSecure 5.1 is also the first version of LynxSecure that supports the 3rd generation Intel Core processor family. LynxSecure provides one of the most flexible secure virtualization solutions for use in Intel architecture-based embedded and IT computer systems. Designed to maintain the highest levels of military security and built from the ground up to achieve it, LynxSecure 5.1 now offers an industry-leading combination of security and functionality, allowing developers and integrators to use the latest software and hardware technologies to build complex multi-operating system (OS)-based systems. LynxSecure 5.1 offers two types of device virtualization: direct assignment of physical devices to an individual guest OS for maximum security, or secure device sharing across selected guest OS for maximum functionality in resource-constrained endpoints such as laptops. LynxSecure also offers two OS virtualization schemes: para-virtualized guest OS such as Linux, offering maximum performance; and fully virtualized guest OS such as Windows, Solaris, Chromium, LynxOS-178 and LynxOS-SE, requiring no changes to the guest OS. Another key performance feature that LynxSecure offers is the ability to run both fully virtualized and para-virtualized guest OS with Symmetric Multi-Processing (SMP) capabilities across multiple cores.
[Diagram: the LynxSecure separation kernel and embedded hypervisor hosting partitions 0 through n (LynxOS-SE RTOS, Windows 7, Windows XP and a secure device server), with virtualized shared devices, direct device assignment and physical device assignment to Ethernet, graphics, USB and SATA devices.]
LynuxWorks, San Jose, CA. (408) 979-3900. [www.lynuxworks.com].
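To make the partitioning model in the diagram concrete, the C sketch below represents it as a static table of guest partitions and device-access modes. The types, names and entries are illustrative assumptions only; they are not LynxSecure's actual configuration format.

/* Illustrative model of the partition layout shown in the diagram.
 * These types and names are hypothetical, not the LynxSecure format. */
#include <stdio.h>

enum dev_access { DIRECT_ASSIGNMENT, VIRTUALIZED_SHARED };

struct partition {
    const char     *guest_os;   /* guest hosted in this partition */
    enum dev_access devices;    /* how its devices are provided   */
};

static const struct partition layout[] = {
    { "LynxOS-SE RTOS",       DIRECT_ASSIGNMENT  },
    { "Windows 7",            VIRTUALIZED_SHARED },
    { "Windows XP",           VIRTUALIZED_SHARED },
    { "Secure device server", DIRECT_ASSIGNMENT  },
};

int main(void)
{
    for (size_t i = 0; i < sizeof layout / sizeof layout[0]; i++)
        printf("partition %zu: %s\n", i, layout[i].guest_os);
    return 0;
}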



Rugged COM Express with Third Generation Intel Core i7 Supports USB 3.0 and PCI Express Gen 3
A rugged COM Express module is targeted at airborne and vehicle-mounted military computers and human machine interface (HMI) applications required to function in harsh environments. The Express-IBR from Adlink Technology is a COM Express Type 6 module that supports the quad-core and dual-core third generation Intel Core i7 processors and the Mobile Intel QM77 Express chipset. Following Adlink's Rugged By Design methodology, the Express-IBR is suitable for use in environments prone to severe shock, vibration, humidity and extended temperature ranges. The Express-IBR is powered by a quad- or dual-core third generation Intel Core processor and provides support for SuperSpeed USB 3.0, PCI Express (PCIe) Gen 3 and up to three independent displays. The COM Express module offers up to 16 Gbyte of ECC 1333 MHz DDR3 memory in two SODIMM sockets. Three Digital Display Interfaces can be independently configured for DisplayPort, HDMI or DVI. A PCIe x16 (Gen 3) interface can serve external graphics or general-purpose PCIe (optionally configured as two x8 or one x8 plus two x4), and the module also provides two SATA 6 Gbit/s, two SATA 3 Gbit/s, Gigabit Ethernet and eight USB 2.0 interfaces. The Express-IBR with dual-core processor is validated for reliable performance in extended temperatures ranging from -40° to +85°C and features a 50% thicker printed circuit board (PCB) for high vibration tolerance. The Express-IBR is a modular, power-efficient solution for applications running in space-constrained, extremely rugged environments. It is compatible with the COM Express COM.0 Revision 2.0 Type 6 pinout, which is based on the popular Type 2 pinout but with legacy functions replaced by Digital Display Interfaces (DDI), additional PCI Express lanes and reserved pins for future technologies. The new Type 6 pinout also supports the SuperSpeed USB 3.0 interface, which was unavailable in COM.0 Rev. 1.0. ADLINK Technology, San Jose, CA. (408) 360-0200. [www.adlinktech.com].


REAL-TIME & EMBEDDED COMPUTING CONFERENCE (WWW.RTECC.COM)

COME TO RTECC. REGISTER: IT'S COMPLIMENTARY! AND MORE AWESOME THAN WORK!
COMING TO: IRVINE, CA ON AUG. 21; SAN DIEGO ON AUG. 23

TAKE A DAY TO LEARN ABOUT THE NEWEST IDEAS IN THE EMBEDDED INDUSTRY. CHECK OUT THE LATEST DEMOS. LISTEN TO TALKS FROM THE EXPERTS. GET OUT OF YOUR OFFICE. RETURN WITH INSIGHT ABOUT THE FUTURE OF THE INDUSTRY.



PICMG 1.3 SHB Features the Latest 22nm Intel Processors
A full-size PICMG 1.3 system host board (SHB) provides high-performance graphics and flexible PCI Express expansion, and is suitable for a wide range of applications across several fields including factory automation, image processing, kiosks, medical and military. The ROBO-8111VG2AR from American Portwell is based on the third generation Intel Core processors and the latest Intel Xeon processors manufactured on 22nm process technology with an energy-efficient architecture. The board features two-channel DDR3 long DIMMs up to 16 Gbyte with ECC to support the Xeon processor E3-1275v2 and the Xeon processor E3-1225v2. PCI Express 3.0 from the Xeon processors provides three flexible combinations: one PCIe x16, two PCIe x8, or one PCIe x8 plus two PCIe x4 lanes for versatile applications. The Xeon processors on the LGA 1155 socket are paired with the Intel C216 chipset. The board is also offered with the third generation Core processor family with an integrated, enhanced graphics engine, which provides significant 3D performance, up to DirectX 11 and OpenGL 3.1, for a broad range of embedded applications. Supporting optimized Intel Turbo Boost Technology and Intel Hyper-Threading Technology, the third generation Intel Core processor family provides higher performance and increased processing efficiency. The Core processors are paired with the Intel Q77 Express chipset. The Portwell ROBO-8111VG2AR integrates dual Intel Gigabit Ethernet LAN chips capable of supporting Intel Active Management Technology 8.0 and also features four SATA ports, which support RAID 0, 1, 5 and 10 modes (two ports at 6 Gbit/s and two ports at 3 Gbit/s). With two serial ports (one RS-232 and one RS-232/422/485 selectable), the ROBO-8111VG2AR supports legacy devices, and it also provides one parallel port for traditional factory automation applications. USB 3.0 support raises bandwidth from 480 Mbit/s to 5 Gbit/s, greatly reducing data transfer times.

American Portwell, Fremont, CA. (510) 403-3399. [www.portwell.com].

Entry-Level Module for COM Express Type 2 with New Atom Dual-Core Processors
An entry-level COM Express Type 2 pin-out module is available with three variants of the new Intel Atom dual-core processor generation, which are manufactured in 32nm technology. The conga-CCA from congatec is available with the Atom N2600 processor (1M cache, 1.6 GHz) with only 3.5W TDP; the Atom N2800 processor (1M cache, 1.86 GHz) with 6.5W TDP; or the Atom D2700 processor (1M cache, 2.13 GHz) with 10W TDP, and up to 4 Gbyte of single-channel DDR3 memory (1066 MHz). The module, which is based on the Intel NM10 chipset, provides improved memory, graphics and display functionality plus intelligent performance and greater energy efficiency. The highlight of the COM Express module is the graphics performance of the integrated Intel GMA 3650 graphics chip. With a clock rate of 640 MHz, it is twice as fast as the GPU of the previous Atom generation. In addition to VGA and LVDS, it has two digital display interfaces that can be configured for DisplayPort, HDMI or DVI. Four PCI Express x1 lanes, two SATA 2.0, eight USB 2.0 and a Gigabit Ethernet interface enable fast and flexible system extensions. Fan control, an LPC bus for easy integration of legacy I/O interfaces, and Intel High Definition Audio round off the feature set. The conga-CCA module is equipped with the new UEFI embedded firmware solution. The congatec board controller provides an extensive embedded PC feature set. Independence from the x86 processor means that functions such as system monitoring or the I2C bus can be executed faster and more reliably, even if the system is in standby mode. A matching evaluation carrier board for COM Express Type 2 is also available. The conga-CCA is priced starting at less than $225 in evaluation quantities. congatec, San Diego, CA. (858) 457-2600. [www.congatec.com].
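The system-monitoring and I2C traffic that a board controller like congatec's offloads is the kind of housekeeping shown in this sketch, which reads a hypothetical temperature sensor through the standard Linux i2c-dev interface; the bus path, slave address and register are assumptions for illustration, not part of the conga-CCA documentation.

/* Sketch: read a 16-bit value from a hypothetical I2C temperature sensor
 * (address 0x48, register 0x00) via the Linux i2c-dev interface. */
#include <fcntl.h>
#include <linux/i2c-dev.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/i2c-1", O_RDWR);           /* bus number is an assumption */
    if (fd < 0 || ioctl(fd, I2C_SLAVE, 0x48) < 0)  /* select the sensor address   */
        return 1;

    unsigned char reg = 0x00, buf[2];
    if (write(fd, &reg, 1) != 1 || read(fd, buf, 2) != 2)  /* point, then read    */
        return 1;

    printf("raw temperature register: 0x%02x%02x\n", buf[0], buf[1]);
    close(fd);
    return 0;
}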

Demand-Response Monitoring Unit to Manage Electrical Loads on Demand
A compact unit for monitoring energy usage at commercial and industrial facilities such as factories, warehouses, retail stores and office buildings can be connected to utility meters, plant equipment and facility systems. The OptoEMU Sensor DR from Opto 22 gathers real-time energy consumption and demand data and delivers that data to enterprise business and control systems and web-based applications for monitoring and analysis. In addition, the OptoEMU Sensor DR helps businesses take advantage of lucrative demand-response (DR) programs from their local utilities. In response to a request from the utility to reduce power use, the Sensor DR can signal electrical equipment to shed load. DR programs can provide revenue to businesses in three ways: first, from discounts for simply agreeing to shed load; second, from actual reductions in use; and third, from selling electricity back to the utility or energy provider. The OptoEMU Sensor DR first gathers energy data from up to two utility meters or submeters that emit a standard pulsing signal. Each pulse emitted corresponds to an amount of energy used, and by counting pulses the OptoEMU Sensor DR can track the total amount of energy used as well as demand. The OptoEMU Sensor DR can also receive power usage and other data from a variety of devices using the widely adopted Modbus communication protocol. Using Modbus over an Ethernet or serial network, the sensor can communicate with devices such as temperature sensors and flow meters, Modbus-enabled current transformers (CTs) and power analyzers, as well as larger facility systems such as plant equipment, building management systems and HVAC systems. Once gathered by the OptoEMU Sensor DR, real-time energy data is sent to web-based "software-as-a-service" (SaaS) energy management applications and enterprise business systems, where it can be viewed and analyzed to develop effective energy management strategies that reduce costs. The OptoEMU Sensor DR is available in two models, one for use on both wireless and wired Ethernet networks, and one for use on wired Ethernet networks only. Pricing is $1,095 for the OPTOEMU-SNR-DR1 and $895 for the OPTOEMU-SNR-DR2. Opto 22, Temecula, CA. (951) 695-3000. [www.opto22.com].
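As a rough illustration of the pulse arithmetic described above, a meter that emits one pulse per watt-hour can be converted to energy and average demand as in the C sketch below. The pulse weight, interval length and pulse count are assumptions for illustration, not Opto 22 specifications; real meters publish their own pulse constants.

/* Sketch: convert meter pulses to energy (kWh) and average demand (kW).
 * Assumes 1 pulse = 1 Wh and a 15-minute demand interval (example values). */
#include <stdio.h>

#define WH_PER_PULSE      1.0    /* assumed pulse weight            */
#define INTERVAL_SECONDS  900.0  /* assumed 15-minute demand window */

int main(void)
{
    unsigned long pulses = 1250;  /* pulses counted in one interval (example) */
    double energy_kwh = pulses * WH_PER_PULSE / 1000.0;
    double demand_kw  = energy_kwh * 3600.0 / INTERVAL_SECONDS;

    printf("energy used: %.3f kWh, average demand: %.3f kW\n",
           energy_kwh, demand_kw);
    return 0;
}

With these example values, 1,250 pulses in 15 minutes works out to 1.25 kWh of energy and an average demand of 5 kW.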






Two Controller Kits Tap into High-Speed PCI Express Bus
Two new controller kits for the ScanWorks platform for embedded instruments from Asset InterTech accelerate test throughput by plugging into the high-speed PCI Express bus of the personal computer where ScanWorks is running. The new controller kits can apply ScanWorks non-intrusive board tests (NBT) to a circuit board. The PCIe-1000 is a single-TAP controller used for cost-effective JTAG testing. The PCIe-410 controller kit has a four-port interface pod that can test as many as four circuit boards simultaneously, reducing test times in high-volume manufacturing applications. Both the PCIe-1000 and the PCIe-410 controllers can also program memory or logic devices that have already been soldered to a circuit board. Because both controllers take advantage of the high speed of the PCI Express bus in the PC hosting ScanWorks, ScanWorks itself executes faster. When the tests are applied to the unit under test (UUT), ScanWorks can execute JTAG tests at the speed of the processor, the FPGA or the boundary-scan devices on the board, up to the maximum speed supported by the controller. An additional advantage of the PCIe-410 controller is that it supports parallel test and programming operations via JTAG. Pricing starts at $4,995. ASSET InterTech, Richardson, TX. (888) 694-6250. [www.asset-intertech.com].

3U OpenVPX SBCs Bring 10 Gig Ethernet and PCI Express 3.0
A pair of third generation 3U OpenVPX single board computers (SBCs) supports the latest interface technology based on the third generation Intel Core i7 processors. The two 3U OpenVPX SBCs from Kontron have native support for 10 Gigabit Ethernet and PCI Express 3.0 to meet the high bandwidth demands of network-centric military, aerospace and transportation applications. The Kontron VX3042 and VX3044 are specifically designed to provide the appropriate combination of leading-edge performance, power efficiency and bandwidth for long-lifecycle applications. The Kontron VX3042 is based on the 2.2 GHz dual-core Intel Core i7-3517UE processor with a configurable TDP between 14W and 25W. It offers up to 16 Gbyte of soldered ECC DDR3 SDRAM and one XMC site to enable application-specific customization by populating the XMC slot with additional specialized XMCs, including I/O, fieldbus and storage modules. Specifically designed for high-performance embedded computing (HPEC), the leading-edge Kontron VX3044 integrates the 2.1 GHz Intel Core i7-3612QE quad-core processor with up to 16 Gbyte of soldered ECC DDR3 SDRAM. Combined with its powerful I/O backbone, multiple Kontron VX3044 boards enable HPEC systems with unprecedented computing density in the compact 3U form factor. Common to both SBCs are comprehensive Ethernet connectivity with 10GBASE-KR, 1000BASE-T and 1000BASE-BX, eight lanes of PCI Express Gen 3.0 and one lane (x1) of PCI Express Gen 2.0, one USB 3.0 port and four USB 2.0 ports. Storage media can be connected via two SATA 3 and two SATA 2 interfaces, both with RAID 0/1/5/10 support. As an option, onboard soldered SATA flash is available to host the OS and application code. Three DisplayPort interfaces deliver the increased graphics power of the integrated Intel HD Graphics 4000 to three independent monitors. Kontron Smart Technologies round out the boards: VXFabric provides Ethernet TCP/IP over PCI Express, VXControl handles monitoring and control of critical parameters, and the PBIT system test solution gives OEMs the ability to simplify and accelerate the development of optimized, highly reliable applications. A pin-out that is 100 percent backward and upward compatible with all Kontron 3U VPX SBCs enables system upgrades without a redesign of the backplane. OEMs also profit on the software side: code written once runs across the complete Kontron OpenVPX product range, enabling true drop-in replacements. Kontron, Poway, CA. (888) 294-4558. [www.kontron.com].

Small Isolated Digital I/O Signal Conditioning Boards

Two new isolated digital signal conditioning and termination boards for OEM embedded computer system designers are aimed at applications requiring signal isolation between a computer and its monitoring/control points. The ISM-TRM-RELAY has 16 independent Single Pole Double Throw (SPDT) relays. The ISM-TRM-COMBO (pictured) provides a combination of eight optically isolated inputs, eight optically isolated outputs, and eight Form C relays on one board. To ensure reliable connection with easy removal and insertion of field wiring, industry-standard 3.5 mm pluggable connectors are used. The ISM-TRM-RELAY has 16 independent Form C relays on board. There are two signal lines, Normally Open (NO) and Normally Closed (NC), plus a Common associated with each relay. The contacts can handle 6A @ 250 VAC / 24 VDC for applications requiring medium current capacity plus isolation. The ISM-TRM-COMBO offers three different signal conditioning circuits for maximum configuration flexibility. There are eight isolated inputs that have the capability to support either an active high or active low signal from 5 to 30 volts. Every input line is optically isolated from the others and from the computer interface circuits with an isolation voltage rating that exceeds 2500V. Each input line is then wired to a contact bounce eliminator to remove extraneous level changes that result when interfacing with mechanical contacts such as switches or relays. There are eight isolated outputs for applications requiring medium current capacity. Each output has an NPN Darlington transistor pair with an integral clamp diode for switching inductive loads and transient suppression. The collector-emitter voltage can withstand up to 30 volts and each output is capable of sinking 500 mA of current. The isolation voltage rating exceeds 2500V. The ISM-TRM-COMBO also has eight independent SPDT Form C relays that can handle 6A @ 250 VAC / 24 VDC for applications requiring medium current capacity plus isolation. Both modules are RoHS-compliant and can operate over an industrial temperature range of -40° to +85°C. Special configurations can be populated for OEM applications. Quantity one pricing for the ISM-TRM-RELAY is $299 and for the ISM-TRM-COMBO $255.


WinSystems, Arlington, TX. (817) 274-7553. [www.winsystems.com].
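The contact bounce eliminator mentioned above does in hardware what a counter-based debounce routine does in software. The C sketch below illustrates only the general technique; the sample count and names are assumptions for illustration, not a description of the board's circuit.

/* Sketch of counter-based debouncing: a raw input must hold the same
 * level for DEBOUNCE_SAMPLES consecutive polls before the reported
 * state changes. Constants are illustrative. */
#include <stdbool.h>

#define DEBOUNCE_SAMPLES 5

bool debounce(bool raw, bool *stable)
{
    static unsigned count = 0;

    if (raw != *stable) {
        if (++count >= DEBOUNCE_SAMPLES) {  /* level held long enough */
            *stable = raw;
            count = 0;
        }
    } else {
        count = 0;                          /* level bounced back     */
    }
    return *stable;
}

Called at a fixed polling rate, the routine reports a new input state only after the signal has been steady for several consecutive samples, which is the same effect the onboard eliminator provides without consuming host CPU time.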





Advertiser Index

Company ............................................................. Page .......... Website
ACCES I/O Products, Inc. ..................................... 7 ................ www.accesio.com
Advanced Micro Devices, Inc. ............................... 68 .............. www.amd.com/embedded
American Portwell ............................................... 21 .............. www.portwell.com
Calculex .............................................................. 67 .............. www.calculex.com
Cogent Computer Systems, Inc. ............................ 61 .............. www.cogcomp.com
Commell ............................................................. 50 .............. www.commell.com.tw
Datakey Electronics ............................................. 24 .............. www.ruggedrive.com
Dolphin Interconnect Solutions ............................. 4 ................ www.dolphinics.com
ECCN.com .......................................................... 42, 43 ......... www.eccn.com
Elma Electronic, Inc. ............................................ 2 ................ www.elma.com
Extreme Engineering Solutions, Inc. ...................... 11 .............. www.xes-inc.com
Flash Memory Summit ......................................... 53 .............. www.flashmemorysummit.com
Inforce Computing, Inc. ........................................ 17 .............. www.inforcecomputing.com
Innovative Integration .......................................... 52 .............. www.innovative-dsp.com
Intel Corporation ................................................. 18, 19 ......... www.intel.com
Intelligent Systems Source ................................... 39 .............. www.intelligentsystemssource.com
JK Microsystems, Inc. .......................................... 60 .............. www.jkmicro.com
Logic Supply, Inc. ................................................ 56 .............. www.logicsupply.com
Measurement Computing Corporation ................... 14 .............. www.mccdaq.com
MEDS ................................................................. 33 .............. www.medsevents.com
Men Micro, Inc. ................................................... 41 .............. www.menmicro.com
Microsemi Corporation ........................................ 25 .............. www.microsemi.com
MSC Embedded, Inc. ........................................... 36 .............. www.mscembedded.com
Nallatech ............................................................ 57 .............. www.nallatech.com
Ocean Server Technology, Inc. .............................. 51 .............. www.ocean-server.com
One Stop Systems, Inc. ........................................ 49 .............. www.onestopsystems.com
Phoenix International ........................................... 60 .............. www.phenxint.com
Phoenix Technologies, Ltd. ................................... 20 .............. www.phoenix.com
Real-Time & Embedded Computing Conference ..... 63 .............. www.rtecc.com
RTD Embedded Technologies, Inc. ........................ 34, 35 ......... www.rtd.com
Schroff ............................................................... 30 .............. www.schroff.us
Super Micro Computer, Inc. .................................. 5 ................ www.supermicro.com
Themis Computer ................................................ 37 .............. www.themis.com
USB Modules & Data Acquisition Showcase ........... 15
Xembedded ........................................................ 31 .............. www.xembedded.com

ARE YOU a seasoned embedded technology professional? Experienced in the industrial and military procurement process? Interested in a career in writing? CONTACT SANDRA SILLION AT THE RTC GROUP TO EXPLORE AN OPPORTUNITY: sandras@rtcgroup.com
RTC (Issn#1092-1524) magazine is published monthly at 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673. Periodical postage paid at San Clemente and at additional mailing offices. POSTMASTER: Send address changes to RTC, 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673.




THE OTHER GUY’S HO-HUM SOLID STATE RECORDER
CALCULEX’s MONSSTR® SOLID STATE RECORDER ON STEROIDS!!!™
Recorder Integrated Processor and Router

For details, contact:
CALCULEX®, 132 W. Las Cruces Avenue, Las Cruces, NM 88001, 575-525-0131
www.calculex.com


