
Industrial Systems Go Modular for Efficiency and Flexibility
OpenGL Speeds High-End Graphics for On-Chip GPUs
Power Conversion Boosts Renewables on the Smart Grid
Real World Connected Systems Magazine. Produced by Intelligent Systems Source


Vol 16 / No 11 / NOV 2015

FPGAs Speed Interfaces for Solid State Storage

An RTC Group Publication

Critical Recording in Any Arena When You Can’t Afford to Miss a Beat!


Introducing Pentek’s expanded line of Talon COTS, rugged, portable and lab-based recorders. Built to capture wideband SIGINT, radar and communication signals right out of the box:
• Analog RF/IF, 10 GbE, LVDS, sFPDP solutions
• Real-time sustained recording to 4 GB/sec
• Recording and playback operation
• Analog signal bandwidths to 1.6 GHz
• Shock- and vibration-resistant solid state drives
• GPS time and position stamping
• Hot-swappable storage to Windows® NTFS RAIDs
• Remote operation & multi-system synchronization
• SystemFlow® API & GUI with Signal Analyzer
• Complete documentation & lifetime support

Pentek’s rugged turn-key recorders are built and tested for fast, reliable and secure operation in your environment. Call 201-818-5900 or go to for your FREE High-Speed Recording Systems Handbook and Talon Recording Systems Catalog.

Pentek, Inc., One Park Way, Upper Saddle River, NJ 07458 • Phone: 201.818.5900 • Fax: 201.818.5904 • • Worldwide Distribution & Support, Copyright © 2013 Pentek, Inc. Pentek, Talon and SystemFlow are trademarks of Pentek, Inc. Other trademarks are properties of their respective owners.


The Magazine of Record for the Embedded Computing Industry

10 “Platforms” Make the Foundation for the Future of Embedded Development
by Tom Williams, Editor-in-Chief

14 Modular Platform Approach Enables Predictive Productivity
by Maria Hansson, Kontron

Flash Management and FPGAs Pave Way for Reconfigurable SSD
by Robert Pierce, Altera; Conor Ryan and Joe Sullivan, NVMdurance

22 NVMe Over Fabric Technology Enables New Levels of Storage Efficiency in Today’s Data Centers
by Shreyas Shah, Xilinx

Confused by Embedded SSDs? Don’t Be
by Scott Phillips, Virtium

28 C and Its Offspring: OpenGL
by Sean Harmer, KDAB

32 Help Wanted: Wind, Solar, and On-Grid Battery Storage
by Brett Burger, National Instruments

DEPARTMENTS
06 How Ever-Smaller Things Get Really Big
Latest Developments in the Embedded Marketplace
Newest Embedded Technology Used by Industry Leaders

RTC Magazine NOVEMBER 2015 | 3



Qseven IoT Gateway Development Kit
A complete starter set for the rapid prototyping of embedded IoT applications.

WE ASSURE YOU HIT A BULLSEYE EVERY TIME... WITH JUST A COUPLE OF CLICKS.
• See Instructional Videos • Shop Boards Online • Read Articles & More • Request a Quote

PUBLISHER
President John Reardon; Vice President Aaron Foellmi

EDITORIAL
Editor-In-Chief Tom Williams; Senior Editor John Koon
Contributing Editors Colin McCracken and Paul Rosenfeld

ART/PRODUCTION
Art Director Jim Bell; Graphic Designer Hugo Ricardo
6262 Ferris Square | San Diego CA 92121 | 858-457-2600

ADVERTISING/WEB ADVERTISING
Western Regional Sales Manager Mark Dunaway, (949) 226-2023
Eastern U.S. and EMEA Sales Manager Ruby Brower, (949) 226-2004
Vice President of Finance Cindy Muir, (949) 226-2021

TO CONTACT RTC MAGAZINE:
Home Office: The RTC Group, 905 Calle Amanecer, Suite 150, San Clemente, CA 92673; Phone: (949) 226-2000; Fax: (949) 226-2050
Editorial Office: Tom Williams, Editor-in-Chief, 1669 Nelson Road, No. 2, Scotts Valley, CA 95066; Phone: (831) 335-1509

Published by The RTC Group. Copyright 2015, The RTC Group. Printed in the United States. All rights reserved. All related graphics are trademarks of The RTC Group. All other brand and product names are the property of their holders.

NEW Low Power PC/104 SBC with Long Term Availability

Rugged Intel® Quad-Core Computer with Two Ethernet and USB 3.0

Industrial Quad-Core Freescale i.MX 6Q Cortex A9 ARM® SBC

Single Board Computers COM Express Solutions Power Supplies I/O Modules Panel PCs

Rugged Products for Your Critical Mission

When the success of your mission is on the line, you can’t afford to waste time and money on poor technical support or products with long lead times. With over 30 years as a key supplier to the MIL/COTS industry, WinSystems understands your need for products that withstand harsh environments, deliver continuous uptime and offer long-term availability. Our engineers are ready to guide you through product selection, customization, implementation and lifelong support. Considered by our clients to be The Embedded Systems Authority, WinSystems focuses on the latest rugged embedded technology and provides a consultative approach to your systems design.

New Functionality, Exclusive Content, Fresh Design The NEW 715 Stadium Drive I Arlington, Texas 76011 Phone: 817-274-7553 I Fax: 817-548-1358


How Ever-Smaller Things Get Really Big by Tom Williams, Editor-In-Chief

Time was you could put together an embedded controller on a single board with a 32-bit microcontroller, some memory, an I/O device like a UART and a couple of lights and switches. Then it would sit there and maintain the temperature and monitor the volume of your kettle full of whatever. Today you can still develop an embedded system on a single board, some even smaller than before, but the landscape looks like a vast metropolis of functions and connectivity.

The central unit today is often more than just a multicore (most commonly, quad-core) 32-bit processor. It at least includes integrated graphics and a variety of cache and other assorted on-chip memory and memory controllers. Increasingly, such central units are SoCs with additional processing units such as DSPs and often programmable logic—FPGA fabrics connected with a high-speed internal bus. There are on-chip components for system services like power management, debug interface and security. There are internal buses connecting to on-chip peripherals like PCIe, UART, SPI, USB and others. There are network interfaces like 10Gb Ethernet. The lists go on. Wi-Fi and other connectivity abounds. Then there’s video, audio, maybe other smaller cores for dedicated purposes, timers and all kinds of other I/O. The small boards themselves are loaded with memory and flash, and many have interfaces for SSDs. Often, a large portion of the board space is taken up simply by the connectors needed—unless there is a COM-like interface to another carrier board. Need I continue?

The other shoe to drop is, of course, the software needed to even get started with development of these complex devices. At the very least, developers selecting them expect to be supplied with support libraries with proven drivers for all on-chip peripherals and communication


modes, communication between on-chip cores and basic board support utilities. That alone can be a tall order for the manufacturer. I was once interviewing a well-known silicon vendor about the introduction of one of these microcontroller-based SoC families when I asked, “What portion of your employees are software engineers?” They wouldn’t tell me.

But it is becoming increasingly incumbent on silicon companies to offer more ready-to-go development support in the form of pre-qualified operating systems, network stacks and protocols, compilers and functional algorithm libraries. In addition, such developers need access to development tool chains with editors, debuggers, profilers, analysis tools and the like in order to start adding value and win the competitive battle for time-to-market. The trend is also growing to supply evaluation boards and/or board-based SDKs or platforms. And the pressure is on for that semiconductor vendor to be the single point of support for all this, to avoid the well-known cross finger-pointing among various vendors when there is a problem—all of which contributes to frustration and adds to time-to-market. Selection of a central unit for a planned new design will rest heavily on the answers vendors can supply to questions about these issues.

The explosion of integrated functionality and the co-explosion of system and application software have been accompanied by a parallel implosion of size and power consumption. At the same time, we hardly need mention again that the nature and scope of what we refer to as embedded systems has changed radically as well, with the obvious examples of tablets and smartphones but also with the explosion of all manner of mobile and connected devices—both consumer and industrial—on the Internet of Things.

It is almost a natural law that software functionality expands to eventually pose a burden on existing hardware to the point where the hardware must be improved. Again, this has been and is the case for modern mobile, connected consumer and industrial devices. These devices are in the process of coalescing on graphical user interfaces, with the user experiences of iOS and Android reaching a level of commonality. As Windows 10 carves out a bigger niche in the IoT, that commonality of experience can be expected to grow (Was that Bill Gates sticking a pin in a wax doll?). It thus seems inevitable that operating systems like iOS, Android and Windows 10, along with the underlying Linux and Java capabilities offered by Android, will become dominant in the world of embedded systems and the IoT, with RTOSs playing an important but subordinate role.

The performance of today’s processors has become so fast that many of what were once considered “real-time” demands can be adequately serviced by something like the present Linux kernel. For those applications and devices that still have hard and deterministic real-time requirements, an actual RTOS with interrupts, scheduling and so on can often be assigned to one of the multiple processor cores, given its own I/O and memory, and operate under the veil of an OS such as Android within the context of whatever other apps are running, some of which will require these real-time services. Indeed, the world has gotten much bigger in the process of getting smaller.
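As a minimal sketch of that idea (purely illustrative: the core number, the priority and the worker body are assumptions, not from the column), the following Linux/pthreads fragment pins a worker thread to a single core and requests SCHED_FIFO real-time scheduling, falling back to normal scheduling when the process lacks real-time privileges:

```c
/* Sketch: dedicating one core of a multicore SoC to a deterministic
   task under a general-purpose kernel. Linux-specific (GNU pthread
   extensions); error handling is simplified for brevity. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <string.h>

static void *control_loop(void *arg) {
    /* Deterministic work (e.g., a motor-control loop) would run here. */
    int *done = (int *)arg;
    *done = 1;
    return NULL;
}

/* Create a worker pinned to `core`; returns 0 on success. */
int start_pinned_rt_thread(pthread_t *t, int core, int *done_flag) {
    pthread_attr_t attr;
    pthread_attr_init(&attr);

    /* Pin the thread to one core, leaving the rest to the general OS. */
    cpu_set_t cpus;
    CPU_ZERO(&cpus);
    CPU_SET(core, &cpus);
    pthread_attr_setaffinity_np(&attr, sizeof(cpus), &cpus);

    /* Request real-time FIFO scheduling (normally requires privileges). */
    struct sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = 80;
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

    if (pthread_create(t, &attr, control_loop, done_flag) != 0) {
        /* No real-time privileges: retry with the default policy,
           keeping the core affinity. */
        pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED);
        return pthread_create(t, &attr, control_loop, done_flag);
    }
    return 0;
}
```

A real design would also reserve the chosen core from the kernel’s scheduler (e.g., via `isolcpus`) and give the thread its own I/O, as the column describes.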


Nwave Launches Free Weightless-N IoT SDK The Weightless SIG has announced the launch of a complete Weightless-N development environment for low-power wide-area network connectivity in IoT projects. The development kits include a desktop base station with the same functionality as a commercial grade base station packaged in a non-ruggedized casing, a base station antenna, an end product module mounted on a development carrier board with external connections, a module antenna and a complete set of cabling. The SDK incorporates an ARM Cortex-M3 MCU with 128KB of flash memory. The kit also includes supporting software tools, Simplicity Studio and a GNU C Compiler (GCC), together with a complete user guide. Weightless CEO, William Webb, commented “Designers have been keenly anticipating the launch of the Weightless SDK - that wait is now over” adding “We’re keen to see LPWAN projects commence rapidly so to celebrate the launch we’re making it easy to engage with Weightless technology by offering a limited number of kits available for free.” Nwave Technologies is making the development kit available at cost and the Weightless SIG is paying this cost making SDKs free to developers. A refund of the membership fee is also being offered by the Weightless SIG to Associate Members that submit product for certification. Weightless SIG Members get access to all of the open standard specifications, the test specifications, the test and certification programme and the right to sell product using the technology on a royalty free basis. Weightless technologies offer unique open standard access to IoT connectivity IP with substantive competitive advantage over alternative proprietary LPWAN technologies.

Zeidman Technologies and Codasip Collaborate on IoT Designs Zeidman Technologies, a provider of tools for building custom operating systems, and Codasip, a provider of application-specific instruction-set processor (ASIP) design tools and IP, have announced that Zeidman Technologies has joined the ASIP Design Network. This collaboration will allow joint customers to benefit from complementary processor and operating system design tools that help drive efficient performance of IoT devices. Combining application-specific processors and application-specific OSs can drastically improve performance as well as reduce hardware resource and memory requirements. “Together, Zeidman Technologies and Codasip will help customers meet the need for lean, customized and highly efficient embedded systems to drive billions of IoT devices,” said Bob Zeidman, president of Zeidman Technologies. “Developers need the application-specific processor and operating system design resources that our companies provide to optimize the performance of their products.” The ASIP Design Network (ADN) brings together a rich ecosystem of companies spanning service providers, IP companies and embedded software suppliers. ADN member companies are working together to accelerate adoption of ASIPs for IoT and system-on-chip (SoC) designs. The two organizations will provide introductions to their respective customers, who are providers of IoT devices that perform specific tasks as efficiently as possible while occupying the smallest possible footprint.

Icon Labs and Renesas Team up on Security for IoT and Industrial Automation Icon Labs has announced the integration of Icon Labs’ Floodgate security products with Renesas’ R-IN32M3 industrial network controller ICs and the Renesas Synergy Platform. The integrated solution creates a secure platform for IoT and industrial automation and extends the Internet of Secure Things initiative into industrial control systems. Icon Labs’ Internet of Secure Things Initiative defines a platform for developing secure, connected devices. The platform ensures that security is intrinsic to the architecture of the device itself and incorporates security management, visibility, device hardening, data protection and secure communications. These capabilities provide the foundation for the Industrial Internet of Secure Things. Natively securing the devices simplifies protection, audit and compliance, independent of the secure perimeter, reducing the need for expensive and complicated security appliances. “Security has become a critical requirement for our customers in all segments, especially in industrial and IoT applications. Partnering with Icon Labs allows us to provide a complete security solution that is fully integrated with our hardware platforms,” stated Semir Haddad, Senior Marketing Director, MCU and MPU Products and Solutions, Renesas Electronics America. “Icon Labs Floodgate product family provides a comprehensive security platform for developing secure, embedded devices using the new Renesas Synergy™ Platform for IoT or the R-IN32M3 ICs for Industrial Automation.” The integration of Icon Labs’ Floodgate products and Renesas’ hardware platforms provides an integrated embedded firewall, Modbus protocol filtering and intrusion detection in addition to secure communication and authentication. They also provide small-footprint crypto libraries, integration with security management systems and security policy management with event and command audit log reporting.


LTE Predicted to Become the Leading Technology for Cellular IoT Devices in 2019 A new research report from Berg Insight predicts that LTE will become the leading technology for cellular IoT devices in 2019. Berg Insight forecasts that global shipments of cellular IoT devices will grow at a compound annual growth rate (CAGR) of 20.1 percent to reach 239.7 million units in 2020. LTE device shipments started to take off in 2015 and are expected to surpass GPRS devices in four years’ time. “2G is still growing rapidly in emerging markets and has a clear cost advantage in Europe. The economics of 4G is however dramatically improved with LTE Cat-0 and the upcoming LTE-M standard. Once these are in place there will be no more significant barriers left against migration from 2G,” says Tobias Ryberg, Senior Analyst, Berg Insight and author of the report. As a result of the direct move from 2G to 4G, Berg Insight believes that 3G will only serve as an interim technology in cellular IoT. Annual shipments of 3G cellular IoT devices are predicted to peak in 2018. Instead, the main alternative to 4G cellular technologies will be Low Power Wide Area (LPWA) networking technologies. Berg Insight believes that the 3GPP’s recent initiative to define a new narrowband radio technology for IoT (NB-IoT) is highly significant and creates a unique opportunity for the mobile industry to include a new set of applications into its domain. “A global universal standard for lightweight IoT communication on public networks is essential for driving the market forward,” Ryberg concludes.

SITRI Launches in Silicon Valley to Accelerate Innovation in “More than Moore” The Shanghai Industrial µTechnology Research Institute (SITRI), the innovation center for accelerating the development and commercialization of “More than Moore” (MtM) solutions to power the Internet of Things (IoT), has announced the opening of SITRI Innovations in Belmont, California. SITRI Innovations addresses a gap that exists in the current “More than Moore” and IoT innovation ecosystem and provides a path for new entrepreneurs in the hardware space to bring their ideas to fruition. “More than Moore” is the next wave of semiconductor innovations such as MEMS, sensors, optoelectronics, RF, bio and micro-energy that do not depend on feature-size-driven CMOS scaling (Moore’s Law). The first of its kind for “More than Moore” and IoT hardware startups, SITRI provides entrepreneurs a full spectrum of services and resources designed to help them succeed in their development and commercialization phases. “The Internet of Things represents a vast opportunity and ‘More than Moore’ technologies are at the heart of it,” said Charles Yang, CEO of SITRI Group. “However, the MtM silicon innovations needed require a fusion of multi-disciplinary technologies, which raises a new set of challenges in engineering and manufacturing, leaving the market open to only the largest and most sophisticated companies. SITRI Innovations addresses this by speeding up MtM innovation and commercialization, opening the IoT market to a much broader range of players and their ideas.” By tapping into the global ecosystem for the MtM industry, SITRI Innovations can provide startups with the resources of large corporations to access the R&D platform and critical supply chain partners needed to achieve high efficiency and fast time to market. SITRI’s unique 360-degree platform offers support to the startups in all areas, from proof of concept to engineering to fab to market studies and industry supply chain.

Altera Partners with Intrinsic-ID to Develop Secure High-end FPGA Altera and Intrinsic-ID have announced their collaboration on the integration of advanced security solutions into Altera’s Stratix 10 FPGAs and SoCs. Physically Unclonable Function (PUF)-based key storage is a new requirement for many defense and infrastructure applications today to secure and bind software to hardware functions and prevent the cloning of systems. The integration of Intrinsic-ID’s PUF technology within Stratix 10 FPGAs and SoCs will greatly enhance the security capabilities of the devices, addressing the growing need for security for all components used in systems. Today’s FPGAs and SoC FPGAs are sophisticated, multi-function components that demand the latest advancements in hardware security as a defense against greater adversarial challenges. Intrinsic-ID’s PUF security solution adds strong anti-tamper protection to Stratix 10 FPGA-based systems by binding proprietary and sensitive design information to the unique physics of each individual device. Binding hardware functions and software to a PUF provides a very strong device authentication method and protection against cloning. The inclusion of PUF technology and the use of a Secure Device Manager (SDM) for security management make Stratix 10 FPGAs and SoCs an ideal solution for use in military, cloud security and IoT infrastructure, where multi-layered security and partitioned IP protection are paramount. The partnership between Altera and Intrinsic-ID enables users of Stratix 10 FPGAs and SoCs to license Intrinsic-ID’s PUF technology for a variety of security use cases in their designs. Customer and user support will be enabled by Intrinsic-ID and by their support partner EndoSec for U.S. customers. The Stratix 10 FPGA and SoC device family features a new Secure Device Manager (SDM) available in all densities and family variants.
Serving as the central command center for the entire FPGA, the Secure Device Manager controls key operations such as configuration, device security, single event upset (SEU) responses and power management. The Secure Device Manager creates a unified, secure management system for the entire device, including the FPGA fabric, hard processor system (HPS) in SoC devices, embedded hard IP blocks, and I/O blocks.

Why Should Researching SBCs Be More Difficult Than Car Shopping? Today’s systems combine an array of very complex elements from multiple manufacturers. To assist with these complex architectures, ISS has built a simple tool that will source products from an array of companies for a side-by-side comparison and provide purchase support. INTELLIGENTSYSTEMSSOURCE.COM is a purchasing tool for design engineers looking for custom and off-the-shelf SBCs and system modules.


“Platforms” Make the Foundation for the Future of Embedded Development
Complexity of design and the need to shield developers from time-consuming low-level detail are leading to new generations of development platforms that bring together hardware and software and provide quicker access to adding unique application value.
by Tom Williams, Editor-in-Chief

Abstraction is the tool the human mind uses to deal with increasing complexity. We have certainly seen this tool at work in technology and in embedded development over the years. A simple example is the board support package—a set of pre-installed software components such as an RTOS, hardware abstraction layer and drivers that help a developer get started adding value at a level that does not require excruciatingly detailed work simply to get started. With the increasing pervasiveness and complexity of embedded systems, most of which are now connected to the Internet of Things, the days when we could make do with a board support package are long gone.

In order to continue progress in systems development, we are now in the era of the “platform.” And the mere word platform really does not do justice to what is now contained in that concept. The recent Renesas DevCon held in Anaheim, CA, revealed what the future holds for development, with a range of development platforms making up the Synergy family and also an entire running automobile designed to serve as a development platform for engineers to create external sensor systems, motor control, infotainment and body control systems, ranging up to the ambitious goal of the autonomous vehicle. The Renesas Synergy platform is still evolving but has reached a stage of completion that is already yielding successful designs.

Selecting, qualifying, integrating and verifying a range of different hardware and software components—some open source and others from a variety of different vendors—has always been a time-consuming and involved process that designers have had to go through before they can even start thinking about the true added value of the application they are trying to build. One thing that this new generation of platforms must provide is a single “buck stops here” vendor who stands behind the integration of the different components in the platform and who is the single source of technical support for all aspects of it. This is the message that Renesas is sending with regard to Synergy.

Make no mistake. Renesas is in its core nature a silicon manufacturer. But the amount of software support and integration needed to make those silicon products attractive is to a great extent based on the user’s recognition that he can select those basic core processors and MCUs and know that they come with a hardware and software infrastructure that can speed time to market, help with needed certifications, offer confidence in licensing, provide a strong tool environment and much more. It is a game that only a few major manufacturers will be able to play successfully.

Figure 1
The microcontrollers included with the Synergy platform present a scalable range of performance and power consumption that also offer a high degree of software compatibility.

[Figure 2 diagram: Synergy multi-layer API access. End application code (Main()) and its threads—network, audio, display, control, motor and waveform—run over the ThreadX RTOS. Beneath the software APIs sit the application framework (e.g., audio), functional libraries (e.g., security), the HAL drivers (EMAC, timing & control, GLCDC and more), custom drivers and the Board Support Package (BSP) on the Synergy microcontrollers. API access levels: 1 top API, 2 application framework, 3 HAL driver, 4 custom drivers, 5 MCU registers.]

Figure 2 The Synergy software offers API access from the application layer to the functional elements as well as down to the register level of the MCU, should that be needed for such things as tight real-time control.
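The layering the caption describes can be made concrete with a small, purely hypothetical C sketch: a framework-level call delegates to a HAL driver, which in turn manipulates a (here simulated) MCU register. None of these names belong to the actual Synergy SSP API; they only illustrate the access levels.

```c
/* Illustrative sketch of Figure 2's access levels; all names are
   hypothetical, not the Synergy SSP API. */
#include <stdint.h>

/* Level 5: the "MCU register", simulated by a variable for this sketch. */
static volatile uint32_t GPIO_ODR;            /* output data register */

/* Level 3: a HAL driver hides the register layout behind a function. */
void hal_gpio_write(unsigned pin, int level) {
    if (level) GPIO_ODR |=  (1u << pin);
    else       GPIO_ODR &= ~(1u << pin);
}

/* Levels 1-2: a framework-style API expresses intent, not mechanism,
   and simply delegates to the HAL underneath. */
void fw_led_set(unsigned led, int on) {
    hal_gpio_write(led, on);
}
```

An application is free to call `fw_led_set()`, to drop down to `hal_gpio_write()`, or, as level 5 allows, to mask bits in the register directly when tight real-time control demands it.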

Renesas Synergy starts off with a selection of ARM-based microcontrollers that fall into four classes—S1, S3, S5 and S7—and into the four clock frequency ranges shown in Figure 1. The S1 series is optimized for ultra-low power, with active power consumption of 77 µA per MHz, and the S1 MCUs use a Cortex-M0+ CPU core. The S3, S5 and S7 MCUs use Cortex-M4 cores with increasing complements of on-chip flash and increasing clock frequencies and power consumption—up to 4MB of flash in the S7. The MCUs are also built such that the on-chip peripherals in the S1 and S3 classes, for example, are compatible, with the S3 having additional features. Likewise for the S5 and S7 devices, so that software can be ported to the higher class with little modification and then adapted to the enhanced available features. In addition, care has been taken to preserve pin compatibility to the extent possible when scaling up. Even when moving from 100 to 144 pins, the power and ground pins are compatible, minimizing board rework.

The MCUs, of course, only provide the foundation for the platform. The rest of the structure is built of software. The Renesas Synergy Software Package (SSP) includes quite a bit, because the complexity of hardware and the functions now demanded of embedded systems have grown tremendously. Renesas has partnered with Express Logic to integrate not only that company’s ThreadX RTOS but also its FileX file system, NetX TCP/IP stack, GUIX GUI toolkit and USBX host/device USB embedded stack. The SSP also includes an application framework with functional code for such things as audio, SPI and touch sensing along with functional libraries. Beyond that are sets of hardware abstraction layer drivers and even direct access to MCU registers so that developers can access these lower-level hardware functions if needed. They are not shut out from that but are provided with an API layer that allows access to all these levels of the platform from the developer’s application code, because these components are integrated with the RTOS to manage conflicts and to arbitrate among multiple threads. Mostly, access will be to the SSP and functional libraries, where much of the low-level detail has already been solved and verified. In addition there is provision for the inclusion of qualified and/or verified software add-ons. Qualified software is selected, serviced and maintained by Renesas, while verified third-party software goes through a rigorous documentation and verification process in order to be included and supported as a part of Synergy.

Renesas provides its e2 studio toolkit, which includes a ThreadX debugger, Smart Manual, compiler and configuration tools. In addition, IAR Systems has partnered with Renesas to mate its Embedded Workbench development tools to Synergy, bringing a complete embedded development platform with editor, project manager, compiler, linker, assembler, etc. We can expect to see this level of platform concept start to appear in other areas due to the need for pre-integration and abstraction in so many areas of embedded development.

Full Range of Displays Including Single Panel and Tri-Fold • Designed and Built to Meet MIL-STDs • Complete Revision Control • Designed and Built in the USA

your fast, flexible and responsive partner.

13469 Middle Canyon Rd., Carmel Valley, CA 93924 •

Figure 3
The Renesas Skyline Fleet offers a complete platform with sensors, processors and software for high-level development of a host of automotive applications up to the autonomous vehicle.

The Platform on the Road

And surprise, surprise—a development platform has recently appeared in the form of a complete drivable automobile. The Renesas Skyline Fleet, in cooperation with Harbrick’s PolySync robotics software, is a complete modified Cadillac with built-in sensors and a host of control units and built-in development boards (Figure 3). Among these are the H2 development boards that support the Renesas R-Car processors, a family of ARM-based processors. The first generation is based on the quad-core Cortex-A9 and the second generation on the quad-core Cortex-A15. These are SoCs with other processor cores and functions such as HD video integrated on-chip. The car includes built-in long- and short-range RADAR along with LIDAR sensors, GPS and a vehicle-to-infrastructure (V2X) radio. In addition, a number of other microcontrollers are distributed throughout the vehicle, including the RH850 for things like power train, instrumentation and safety, and the RL78 for body control (e.g., windows, wipers, etc.).

The car is not meant as a demo system but as a development platform. For example, the dashboard display brings in signals from RADAR, LIDAR, camera and other sensors, but in the raw product they just show up as signals on the display. It is up to the developer to design the visual character of how that data will be presented. Presumably, a developer could even decide to bring in a display or dashboard system from another vendor to customize a design.

The Skyline Fleet is aimed at a range of development choices from driver assistance systems on up to developing a fully autonomous vehicle. To that end the V2X radio will also be important, since autonomous vehicles will need to interact with elements of the road and traffic infrastructure as well as with other vehicles. For example, even now stoplights are gaining intelligence, such as the ability to sense the number of cars waiting at a stop and adjust their timing to current conditions. Now such lights can be fitted with transmitters to signal approaching cars that they are about to change and initiate braking automatically.

Obviously, the subject of intelligent and autonomous vehicles involves much more than just the design of the electronics inside the car. It must tailor those electronics to the overall driver experience based on style, the interior, driving feel and more. It must also bring in infotainment, driver distraction, Internet connection and other communication issues with the surroundings. The fact that in this area a “platform” is based on a whole automobile—and even includes test tracks such as the 33-acre outdoor lab at the University of Michigan—speaks to the enormity of the task and the need for high levels of abstraction and of real-world interaction.

Renesas Electronics America
Santa Clara, CA
(408) 588-6000

RTC Magazine NOVEMBER 2015 | 13


Modular Platform Approach Enables Predictive Productivity As manufacturing becomes more automated and distributed, it also needs to be configurable and reliable so that manufacturers can react quickly to changing conditions yet protect their investments in systems that are increasingly connected to the IoT. by Maria Hansson, Kontron

Figure 1 The world's factories are growing smarter and more connected, driven by developments in industrial computer platforms, the central components for the control, interaction and connection of machines and processes. IoT-ready industrial computer platforms enable automation system developers to prepare production facilities for Industry 4.0, paving the way for highly efficient and flexible production capabilities.


Factories need to become smarter and more connected in order to increase productivity. In conjunction, more manufacturing facilities are implementing Industry 4.0 ("Smart Automation"), enabling more efficient production processes that allow changes to be made quickly and easily while keeping downtime to a minimum. Industry 4.0 is the industrial Internet of Things (IoT) and plays an important role in defining how individual devices need to be more intelligent, manageable and connected to support predictable productivity. But what is really needed from IoT-ready industrial computer platforms to meet all these and future requirements? What features and capabilities are essential in embedded hardware solutions to ensure that industrial automation developers have the solid foundation required for the control and connection of machines and processes?

Implementing Smart Automation

Interoperability is seen as a key challenge for industrial system developers implementing Industry 4.0. Not only must they securely connect and communicate with each device, it is also essential that individual devices are able to access data in real time. Going a step further, Industry 4.0 brings with it the expectation that these same devices may operate autonomously. Embedded computing platforms have advanced to the point of solving many of these issues, acting as gateways and making predictive productivity possible by controlling and connecting machines and sensors.

Next-generation embedded platforms provide a modular, building-block approach that gives developers the flexible resources to easily deploy system upgrades. Platforms with a modular design pave the way for developers to maximize their innovation potential and offer new system benefits by breaking CPU and other related technology obsolescence barriers. Module-based solutions allow the customer to migrate to the latest processor technology by replacing only the module rather than redesigning the full system. They also extend the life of legacy systems. In addition, building-block platforms such as those based on COM Express offer the ability to easily handle thermal management requirements without mechanical changes by employing a thermal transfer plate, so that cooling can be achieved in the same space.

To take advantage of remote management functionality, corresponding software resources are very helpful. Developers can streamline software implementation using application program interface (API) middleware. For example, Kontron offers its wide range of systems and boards with the option of adding the company's Kontron Embedded Application Programming Interface (KEAPI), which standardizes the access and control of hardware resources for embedded applications from its rich and sophisticated library of API functions.
This cross-platform resource enables OEMs to simplify the development of applications such as monitoring processors, controlling internal temperature or accessing the I²C bus. Using

Figure 2 The Kontron KBox A-202 Box PC is an industrial gateway solution based on the low power (memory down) concept, making it a virtually maintenance-free design for the industrial environment.

the Cloud, dependable BIOS updates that employ an additional security layer to protect against unauthorized access are possible with KEAPI. By unifying the way all Kontron embedded platforms are handled, independent of form factor or OS, KEAPI significantly reduces development time and costs, giving industrial OEMs a more expedient way to meet their system integration objectives (Figure 1).
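KEAPI's actual function names and signatures are not reproduced here; the Python sketch below merely illustrates the shape of the cross-platform health monitoring such middleware enables, with a stubbed temperature read standing in for the real API call. The threshold value is an assumption for illustration.

```python
# Hypothetical sketch of middleware-style health monitoring.
# read_cpu_temperature() stands in for a real middleware call
# (e.g., a KEAPI-like function); here it is stubbed for illustration.

WARN_C = 85.0  # warning threshold in degrees Celsius (assumed value)

def read_cpu_temperature():
    """Stub: a real system would query the hardware via the API."""
    return 72.5

def check_thermal_status(read_temp=read_cpu_temperature, warn_at=WARN_C):
    """Return (temperature, status) so a remote manager can act on it."""
    temp = read_temp()
    status = "warning" if temp >= warn_at else "ok"
    return temp, status

if __name__ == "__main__":
    temp, status = check_thermal_status()
    print(f"CPU temperature: {temp:.1f} C -> {status}")
```

The point of such a wrapper is that the same application code runs unchanged across form factors; only the middleware layer underneath changes.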

IoT-Ready Capabilities

Making industrial systems IoT-ready requires the ability to efficiently handle data in a unified approach that turns individual devices into connected, intelligent systems. To facilitate intelligent industrial applications, devices at the edge (devices distributed in the field close to client machines providing operational technology) need to be connected to the larger network. Connecting these endpoints into the larger realm of the IoT requires two tiers of technology. The first is gateways that can connect multiple end points, providing data aggregation and formatting, protocol conversion, security, and other services. Effective connectivity must be able to support a variety of edge devices. Anticipating these needs, the latest platforms offer multiple connectivity possibilities – from working solely as a gateway, to providing machine control (Figure 2). The second tier of technology is infrastructure solutions that bring together data from gateways and enterprise sources. Connectivity enabled products provide needed functions such as provisioning and management, analytics, and linking of IoT data to mobile and enterprise applications. These platforms transform data into actionable information that can be used throughout an organization.
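As a sketch of the first tier described above, the snippet below shows a hypothetical gateway aggregating raw sensor samples and reformatting them into a single JSON payload for the infrastructure tier. The field names and structure are illustrative assumptions, not part of any standard.

```python
import json

# Hypothetical edge-gateway sketch: aggregate raw sensor samples from
# one edge device (protocol conversion is reduced to dict -> JSON here).

def aggregate(samples):
    """Summarize a list of raw readings from one edge device."""
    values = [s["value"] for s in samples]
    return {
        "device_id": samples[0]["device_id"],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }

def to_payload(samples):
    """Format aggregated data for the infrastructure tier."""
    return json.dumps(aggregate(samples), sort_keys=True)

if __name__ == "__main__":
    raw = [{"device_id": "press-01", "value": v} for v in (4.0, 6.0, 5.0)]
    print(to_payload(raw))
```

Aggregating at the gateway rather than forwarding every raw sample is what keeps upstream bandwidth and analytics load manageable as the number of edge devices grows.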

Ensuring Maintenance-Free Reliability

Predictive productivity can be severely hampered by unscheduled downtime. Many large industrial operations need to manage a considerable number of installed systems distributed across multiple facilities. These distributed installations have a tendency to cause maintenance issues. Very sizeable and, in many cases, nationwide or international networks are the reality



Figure 3 Kontron's KBox family is based on low-power CPUs and offers high performance in a compact format with up to 4th generation Intel Core processors. Designed for a variety of industrial applications, KBox computing solutions are suitable as a controller platform; their advanced graphics capabilities match the needs of HMI and MES applications, and the built-in communication options and environmental specifications make them suitable for remote monitoring applications.

today. In these instances, maintenance involves travel that can quickly become very costly. Having a higher level of system dependability also means that service technicians can concentrate on crucial service calls rather than using their valuable time for spares requests. These benefits reinforce why remote management and maintenance-free reliability are becoming mandatory

expectations of computing platforms used in the industrial environment.

A primary requirement of maintenance-free systems is that they be free of moving or rotating parts, such as fans or HDD storage, which tend to wear out more easily. Moving parts are also vulnerable to shock and vibration, which in harsh industrial environments can increase the risk of failure. Consequently, fanless industrial computer platforms with flash-based mass storage offer the appropriate solution. Energy efficiency is an important characteristic as well. Components such as the latest processors offer optimal performance-per-watt ratios, generating minimal heat and eliminating the need for active fan cooling (Figure 3).

A continuous power supply ensures that system settings and the real-time clock do not require resetting, which would otherwise force a manual reboot. Battery-free operation is the answer. Wear-free double-layer capacitors, also called gold caps, do not require replacement and can ensure a continuous power supply to the BIOS or EFI memory and the internal clock even when the system is switched off and disconnected. It is said that a system can only be as good as its weakest component, so selecting platforms that feature long-life components with a high MTBF is critical. In addition, all the components


and parts of the embedded computing solution such as CPUs and memory modules must be specified to withstand ambient temperatures and temperature fluctuations over the life of the system.
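As a worked illustration of why high MTBF matters for 24/7 operation, assume a constant failure rate (the usual exponential-lifetime simplification): a platform with an MTBF of 135,000 hours then has roughly a 6% probability of failing within a year of continuous operation. The conversion, sketched in Python:

```python
import math

# Annualized failure probability from MTBF, assuming a constant
# failure rate (exponential lifetime model) -- a common simplification.

def annual_failure_probability(mtbf_hours, hours_per_year=8760):
    """P(failure within one year) = 1 - exp(-t / MTBF)."""
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)

if __name__ == "__main__":
    p = annual_failure_probability(135_000)
    print(f"135,000 h MTBF -> ~{p:.1%} chance of failure in a year")
```

The exponential model ignores wear-out effects, so it is only a first-order estimate, but it makes clear why six-figure MTBF values are the entry point for unattended industrial deployment.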

Making Predictive Productivity Possible

The right industrial computing platform can make all the difference in helping factory operators meet productivity goals. Maintenance-free reliability is a must, so the best platforms remove moving parts in favor of fanless operation, flash memory and gold caps. High MTBF rates are needed to meet industrial automation's 24/7 and multi-year lifecycle requirements, as demonstrated by Kontron's KBox C-101 Box PC, which delivers an MTBF of more than 135,000 hours at an ambient temperature of 30°C. Industrial-grade quality and long-term availability are also crucial, enabling customers to obtain systems in an identical configuration for years to come. Working with a supplier who controls the entire process, starting with the board design and continuing through production of the complete platform, furthers trust in automation system reliability and longevity. Today's powerful computing technologies not only make industrial automation systems smarter and more connected, they also help reduce operating costs and minimize production downtime to prevent unnecessary profit loss. Seeking out maintenance-free industrial computer platforms provides a strong foundation for predictive productivity and supports the Industry 4.0 evolution. Kontron, Poway, CA (888) 294-4558



Flash Management and FPGAs Pave Way for Reconfigurable SSD Increasing the life expectancy of the NAND array reduces the total cost of ownership of the system. Also, the ability to reconfigure the controller to implement improved algorithms, or to adapt to a new generation of flash chips, extends the system life. by Robert Pierce, Altera, Conor Ryan and Joe Sullivan, NVMdurance

NAND Flash has revolutionized the computing markets by improving the ease of access to and availability of data, from the data center to your mobile device. Although the rate of geometry shrink (reduction in the size of structures in silicon) has slowed, the industry continues to find innovative ways to increase capacity and reduce cost, but often at the expense of reliability, which is particularly problematic at the enterprise level. Solid State Disk (SSD) controller design has lagged behind and has been unable to solve the issue of reliability without introducing other limitations to the system as a whole. The life of NAND flash can be extended, and new viability for SSDs can come from today's FPGA-based implementations, which have overcome the limitations of controllers by permitting virtually all their operations to be conducted in hardware. The reconfigurable nature of FPGAs enables manufacturers to change and tune their controllers on the fly in hardware.

SSDs are superior to traditional hard disk drives (HDDs) in almost every way. They are much faster and smaller, consume less power, and produce less heat and noise, as they have no moving parts. However, an area in which they are not superior to HDDs is average lifespan; flash memory is the cornerstone technology in SSDs but wears rapidly through usage. This is known as the endurance problem. Similarly, there is a retention issue: although flash is non-volatile, it isn't permanent storage, and data effectively "leaks out" over time.

Flash Memory

At a high level, NAND is divided into three classes, in order of development: SLC, MLC and TLC. SLC stores a single bit of data per memory cell (one bit per cell), where the presence or absence of charge represents the data. MLC, on the other hand, stores two bits per cell; rather than the presence or absence of charge, it is the quantity of charge stored that determines which of four states (00, 01, 10 and 11) is stored in the memory element. The maximum data density is provided by TLC (three bits per cell), which differentiates between eight different voltage states. In a perfect world, the NAND vendors could just continue cramming

Table 1 A comparison of the different classes of flash memory.

more bits into each cell, but increasing the number of bits per cell increases the access time and reduces data reliability. The constant pressure to produce flash with smaller and smaller geometries compounds the wear-out issue. Smaller geometries mean that more bits can be packed into the same area, but this leads to faster wear-out. Table 1 summarizes the characteristics of each class.

The endurance and retention issues lead to errors in the data. The flash controller typically uses Error Correcting Codes (ECC) to identify and correct these errors. The number of errors that can be handled in this way is directly related to the amount of extra data written and the amount of processing time spent handling the errors. Until recently, particularly with SLC and MLC devices, the most common error correction used was the Bose-Chaudhuri-Hocquenghem (BCH) method. BCH error correction worked quite well in most cases, offering a predictable operating time, or latency, and it is not particularly difficult to implement in hardware for solutions requiring up to 50 bits of error correction per chunk of data. However, it has scaling issues. Above 50 bits of error, BCH begins to consume hardware resources at an alarming rate and, in many TLC implementations, BCH has been abandoned in favor of the more powerful Low Density Parity Check (LDPC) approach which, while considerably more expensive in terms of resources, scales more gracefully than BCH as the level of ECC required increases. LDPC operates using both hard and soft information decode. Hard decode is analogous to BCH, while soft decode can be used

to add extra correction capacity in the event that hard decode cannot correct the errors. Soft decode pays for this by using hints extracted by characterizing the degradation of the NAND chip at the foundry. These hints are coded into a read-retry mechanism, where the new read is tailored to the degraded state of the NAND device. This is a neat trick, but these re-reads introduce several issues for the NAND channel, as they occur randomly and increasingly with age. As each re-read causes a delay, it can be difficult to predict exactly how long a read will take; this is particularly problematic when the reads need to be re-inserted into a structured pipeline with many overlapping operations. Given that in most cases a file is dispersed across many NAND devices, or even many SSDs, this can introduce unacceptable delays for file retrieval (known as tail latency). And if a portion of a file is delayed, the whole file will be delayed. This is especially troublesome in striped applications and can hasten the end of life because the SSD has become too slow at delivering cleaned-up data.

It would be much better if this characterization effort were used to reduce the error rate in the data stream in the first place, rather than to correct the errors after they have occurred. To understand how the controller can minimize error creation, we should first understand what kinds of errors are generated:

• Read Disturb: These errors affect a single read and may clear when the same location is read again. The more reads that occur without refreshing the data, the more likely they are to occur.

• Program/Erase Cycling: In changing the state of a cell, electrons are forced through an insulator that keeps the charge on the memory element (the floating gate). The effect of changing state many times is akin to punching holes in the insulator and trapping electrons in the insulator media, like firing ball bearings through the side of a bucket in order to fill it up; the medium that needs to be penetrated is the same one that is required to keep the balls in.

• Retention: Over time, the electrons will drift off and the stored voltage will change, especially once there are a lot of holes in the insulator.

The rate at which flash wears out is directly related to the stress of the writes, or, from our earlier example, the number of ball bearings put into the bucket. Higher stress, in the form of higher voltages or longer write times, leads to more holes and implantation and faster wear-out. However, if not enough electrons are passed onto the gate, the flash will suffer from retention issues, and data will be lost. This implies that retention and endurance are somewhat interchangeable: by relaxing the retention requirements, higher endurance levels may be enjoyed, and vice versa. It also implies that early in life, when there is little damage, relatively low stress (fewer ball bearings) could be used to write the flash, while later in life, when the flash has endured thousands of cycles, considerably higher stress could be used

to ensure the retention constraint. However, flash doesn't have a way to actively respond like this, so the worst-case scenario, high-stress writes, must be used, hastening the chip's demise.

Discovering the values for the internal control registers that give the best (or even just a tolerable) trade-off between retention and endurance is known as trimming. This involves discovering and setting key operational parameters such as voltages, write times, read thresholds, and so on. The factory usually trims the devices to meet the industry standard (JEDEC) specifications, which may or may not match the requirements of a particular application. In general, the trims are static and never change during the lifetime of the flash/SSD, even though, particularly early in life, quite different sets could be used. Furthermore, trading endurance for retention may be useful in data centers, where data is rarely kept on SSDs for long periods of time, as so-called cold data (which is rarely accessed) is usually moved to HDDs. If the retention constraint were reduced, less stress could be used, and higher levels of endurance could be enjoyed. The solution must involve active management, based on the health of the memory elements, using the SSD controller.

Active Management of Flash

Active management of flash takes the approach of dynamically varying the register values throughout the lifetime of the device. In particular, early on in life, when relatively low stress can safely be used, wear can be minimized so that the least possible amount of damage is being done, while still safely satisfying retention. Similarly, later in life, as the flash requires higher levels of stress to ensure reliability, the values of the registers can slowly be increased, but, because so much less wear was accumulated early on in life, the eventual end-of-life of the part occurs much later. Once an SSD is capable of performing this sort of active management, there are all sorts of use cases available. In particular, SSD manufacturers can now tailor specific flash parts to certain use cases; for example, one SSD model might require twelve months’ retention, while a hyper-scale customer might require an SSD to be re-tasked to retention of just a single week on the fly. The SSD is capable of using the same flash in both configurations simply using different management techniques.
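The stage-stepping idea can be sketched as a lookup keyed on accumulated program/erase cycles. The stage boundaries and register values below are invented purely for illustration; real sets come from characterization of the actual NAND, not from a table like this.

```python
# Illustrative sketch of life-stage trim selection (all values invented).
# Each stage maps a P/E-cycle threshold to a register set; stress rises
# with wear so retention stays safe while early-life damage is minimized.

STAGES = [
    # (P/E cycles at which stage ends, {register: value})
    (1000, {"prog_voltage": 14, "prog_time_us": 900}),   # gentle, early life
    (3000, {"prog_voltage": 16, "prog_time_us": 1100}),  # mid life
    (None, {"prog_voltage": 18, "prog_time_us": 1400}),  # end of life
]

def select_registers(pe_cycles):
    """Pick the register set matching the block's current wear level."""
    for limit, regs in STAGES:
        if limit is None or pe_cycles < limit:
            return regs

if __name__ == "__main__":
    for cycles in (250, 2500, 50_000):
        print(cycles, "->", select_registers(cycles))
```

In a real controller the wear signal would come from monitored bit error rates rather than raw cycle counts alone, but the selection logic has this same shape.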

Figure 1 Managing Flash with NVMdurance Navigator.



Figure 2 Automatically discovering parameter sets with machine learning.

To treat the flash in this manner, the SSD controller needs to run special flash management software that monitors the degradation of the flash and decides at what point it is necessary to change parameter sets. One such method is NVMdurance Navigator, which constantly monitors flash wear. Actively managing flash can be a delicate balancing act: too aggressive early on and the flash will wear more quickly than it needs to, but not aggressive enough and retention may be compromised. Furthermore, not all flash cells, even those contained on the same die, wear at the same rate. So it is entirely possible that a set of registers that will safely see one cell to 500 cycles and 12 months' retention, for example, will only get a nearby cell to 400 cycles and 12 months' retention. This sort of variation is usually dealt with by "guard-banding" the flash, that is, ensuring that even the weaker cells can attain the specified endurance, even if that means specifying a lower endurance than many blocks are capable of, e.g., 400 cycles instead of 500 cycles. This sort of derating of cells results in wasted cycles. Using sufficiently powerful active management techniques, however, one can exploit these spare cycles. By managing outlier blocks, that is, those that are less likely to make it to the target cycling level, one can temporarily rest them and instead spread the load across the rest of the blocks. Thus, there are two different ways in which extra cycles can be wrung from the flash: first, by using weaker stress early in life, and second, by ensuring that any outlier blocks on the devices are identified and dealt with early in life.
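The outlier-resting idea can be sketched as a write-allocation policy: blocks whose measured bit error rate sits well above the population median are skipped for a while, steering wear onto healthier blocks. The threshold and data layout here are illustrative assumptions, not NVMdurance's actual algorithm.

```python
from statistics import median

# Illustrative wear-spreading sketch: rest "outlier" blocks whose bit
# error rate (BER) is far above the median, steering writes elsewhere.

def choose_block(blocks, outlier_factor=2.0):
    """blocks: list of dicts with 'id', 'pe_cycles', 'ber'.
    Pick the least-worn block that is not an outlier; fall back to the
    least-worn block overall if every block counts as an outlier."""
    med = median(b["ber"] for b in blocks)
    healthy = [b for b in blocks if b["ber"] <= outlier_factor * med]
    pool = healthy or blocks
    return min(pool, key=lambda b: b["pe_cycles"])["id"]

if __name__ == "__main__":
    pool = [
        {"id": 0, "pe_cycles": 100, "ber": 1e-4},
        {"id": 1, "pe_cycles": 90,  "ber": 9e-4},  # outlier: rested
        {"id": 2, "pe_cycles": 120, "ber": 1.2e-4},
    ]
    print("next write goes to block", choose_block(pool))
```

Note that block 1 has the lowest wear but is skipped anyway because its error rate marks it as an outlier; plain wear leveling would have chosen it.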

Parameter Discovery

Key to the entire enterprise of actively managing parameters is ensuring that the parameters in use at any given time are either optimal or near optimal. Current-generation devices have anywhere from fifty to three hundred control registers, and this number is only likely to grow with time. In flash factories today, highly skilled and experienced engineers are relied upon to produce these sets of register values through a mixture of engineering experience and massive characterization efforts, often basing new sets on the most recent generation of device most similar to the next generation. This is a slow and expensive process that requires months of testing and costs millions of dollars to produce a single set. The complexity of current and next-generation devices means that this undertaking is becoming unmanageable, and manufacturers are looking for ways to formalize and simplify the process.


The problem becomes nearly impossible to solve manually for actively managed flash because the characterization load is five to ten times that of single-set factory methods, and each change of trim settings results in a new characterization run. The enabling technology for actively managed flash is machine learning. NVMdurance Pathfinder uses machine learning and model building to automate register value discovery and testing. The machine-learning engine experiments with variations in register values and monitors the effects both in hardware and in simulation models. In this way, hundreds of millions of permutations can be trialed, with the results of the hardware trials continually improving the software simulations. Unlike the task faced by the manufacturers, however, NVMdurance isn't trying to produce a single set of registers that can guarantee every cell of every die ever made; rather, the task is to produce a set of registers that, when actively managed by NVMdurance Navigator, will permit the flash to last substantially longer. In particular, because this is done through different stages of life, each stage is relatively independent; thus, a register set that works well early in life would not be expected to perform well at the end of life (as its stress values would be too low). This is like moving to a higher ECC level in LDPC (with the associated tail latency and re-reads) when the current noise floor makes correction impossible, but instead of correcting the errors, we are preventing them from occurring in the first place.
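NVMdurance's actual engine is proprietary; as a toy stand-in, the sketch below runs a random search over two trim parameters against a made-up scoring function, which is the general shape of "trial a permutation, score it in a model, keep the best." The sweet spot and ranges are invented.

```python
import random

# Toy stand-in for model-guided register search. The scoring function is
# invented; a real flow scores candidates on hardware and in continually
# refined simulation models rather than against a known optimum.

def simulated_score(prog_voltage, prog_time_us):
    """Made-up model: penalize distance from a fictitious sweet spot."""
    return -((prog_voltage - 16.2) ** 2 + ((prog_time_us - 1000) / 200) ** 2)

def random_search(trials=5000, seed=42):
    """Try random candidates, keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        cand = (rng.uniform(12, 20), rng.uniform(500, 1500))
        score = simulated_score(*cand)
        if score > best_score:
            best, best_score = cand, score
    return best

if __name__ == "__main__":
    v, t = random_search()
    print(f"best candidate: {v:.2f} V, {t:.0f} us")
```

Real discovery engines use far more sophisticated search than this, but the loop structure, candidate generation, model scoring, and keep-the-best selection, is the common core.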

Run-Time Considerations

Armed with close-to-optimal register values for particular times of life, and a way to navigate between them, it is now possible to achieve close to as many Program/Erase cycles as the NAND chip can deliver for a particular retention scenario (i.e., a determination of how much retention will be required). Similarly, by monitoring and identifying outlier blocks and ensuring that none are over-worked, no data is ever prematurely lost. In fact, the parameters can be further tuned when more information is available about the use case, for example, information about how various operation times can be varied.

There are costs associated with the active management of flash. The greater the degree of management, the higher the cost, both in terms of activity monitoring (e.g., bit error rate, timing, etc.) and controller resources. Tight integration between the SSD controller and the active management system is essential to make sure that the host is never kept waiting while the flash is being managed. This is achieved by only involving the active management system when absolutely necessary, and by ensuring that any operations it performs are carried out in the background. Conservative estimates put the extra cost incurred by NVMdurance Navigator at substantially less than 1% of the total processing conducted by the controller. Thus, the system is virtually unnoticeable, and simply sits in the background observing and learning.

Memory, in particular RAM, is often a scarce resource on SSDs, and the amount of RAM required for a process can often be estimated in terms of the number of blocks it will be operating on. A full-featured NVMdurance Navigator implementation requires 300 bytes per block (typically a block is on the order of 4-8 megabytes), so the RAM overhead is less than 0.01% of the size of the SSD. Many of the benefits can still be gained using a more cut-down version, which doesn't require RAM to store data about the flash.
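That overhead claim is easy to sanity-check: 300 bytes of per-block metadata against even the smaller 4 MB block size is well under 0.01% of capacity. A quick sketch of the arithmetic:

```python
# Metadata overhead of per-block management state:
# 300 bytes of RAM per block, for blocks of 4-8 MB.

BYTES_PER_BLOCK_META = 300

def overhead_pct(block_bytes):
    """RAM overhead as a percentage of block (and hence drive) size."""
    return 100.0 * BYTES_PER_BLOCK_META / block_bytes

if __name__ == "__main__":
    for mb in (4, 8):
        pct = overhead_pct(mb * 1024 * 1024)
        print(f"{mb} MB block: {pct:.4f}% overhead")
```

Because the metadata scales with block count, the ratio is the same whether the drive holds a thousand blocks or a million.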

Hardware Realization

Altera Corporation has recently released a reference design that enables active management of flash in a highly configurable and upgradeable FPGA-based system. The growing complexity of flash memory management algorithms has made controller designs complex, and has impacted both performance and the diversity of NAND devices that a single controller can support. The NAND array has become the dominating factor in the cost of the drive, and increasing its life expectancy obviously reduces the total cost of ownership of the system.

Performance predictability and throughput are key concerns for data-center managers. Throughput must be predictable not only from transaction to transaction, but throughout the life of the drive. With conventional controllers, however, performance drops off with age. There are many reasons for this: tail latency from block reclaim, large writes, and multi-cycle error correction, for example. Conventional controllers cannot overcome these problems.

Altera, NVMdurance and Mobiveil have come together to create a new kind of flash memory controller on a single FPGA SoC chip. This device will be field reconfigurable and upgradeable; not only can the NVMdurance software extend the lifetime of the flash, but the design of the system paves the way for a new class of field-reconfigurable SSDs. Not only can the flash simply be removed when it has worn out, truly commoditizing the part, the controller itself can be reconfigured to deal with different use cases. For example, a highly write-intensive application could be treated differently from an application focused more on reads, because the hardware can be changed to suit the data. It is this ability to so easily modify the hardware that makes this an ideal approach for actively managing flash. Altera San Jose, CA (408) 544-7000 NVMdurance Limerick, Ireland +353 87 223 5462



NVMe Over Fabric Technology Enables New Levels Of Storage Efficiency In Today’s Data Centers The explosion of Big Data from the IoT is posing challenges for conventional storage, be it on hard drives or SSDs. A new storage architecture, Non-Volatile Memory Express, implemented in hardware and SSDs is now able to radically increase performance. by Shreyas Shah, Xilinx


In today's connected world, the volume and variety of data being generated is enormous, putting a tremendous burden on storage. The rise of Cloud computing, big data, social media and the IoT makes the storage problem even worse. The costs of acquisition (capital expenses) and management of data (operating expenses) are skyrocketing for Cloud service providers. Computational power alone is not enough to make sense of relevant data, which can get lost within the rapidly growing sea of information. Big data analytics can be a big nightmare for Cloud service providers attempting to quickly monetize data. Even when information is relevant and has economic value, the data loses that value over time, so it's imperative to extract value from relevant data almost in real time. Current data center operators and enterprises are scratching their heads to solve their immense storage needs and are attempting to monetize the data stored by applying big data analytics technologies.

The old storage architecture has challenges with regard to performance, power consumption, management and monetizing data while scaling systems to zettabytes and beyond. Figure 1 shows how old storage architectures implemented storage services in software running on a processing subsystem. The older generic system architecture uses the most powerful processors (2 x86s), a switch, and I/O cards—with support for a variety of protocols like FC, FCoE, Infiniband and Ethernet at various speeds—to connect to the fabric of data centers. Expanders connect to Serial Attached SCSI (SAS)/Serial ATA (SATA) hard disk drives (HDDs) and solid state drives (SSDs), and all storage services run in software on the processors. Even when SSDs replace HDDs, the older architectures are still limited to ~50K IOPS, versus Non-Volatile Memory Express (NVMe) SSDs with performance measured in excess of 1M IOPS. NVMe is a new data bus that supports memory-based storage. SSDs with ~1M+ IOPS (input/output operations per second)

Figure 1 Generic storage system architecture

Figure 2 Vertically integrated, scalable NVMe over Fabrics with storage in FPGA

at less than half the latency of traditional SSDs are creating huge waves in the market. In a case study performed by the Storage Networking Industry Association (SNIA), the performance advantages of NVMe SSDs compared with traditional SAS- and SATA-based SSDs came out at half the latency and three times the IOPS. The NVMe over Fabrics subcommittee created a standard for scale-out architectures offering higher-performance SSDs connected over the switching fabric with large-capacity storage. These high-capacity (zettabytes and beyond) storage devices, coupled with real-time analytics, allow end customers to extract efficiency from their storage systems and optimize management of those systems.

FPGAs enable the implementation of higher-level storage services in NVMe over Fabrics systems, including compression/decompression, security, de-duplication, hashing and erasure coding, thus delivering significant system-level performance benefits. Programmable logic is, indeed, a key component in reducing data center power consumption and accelerating computation. FPGAs can be used as hardware accelerators and can be reconfigured, as in the shell and role model, thus significantly increasing their value in the data center. The Xilinx SDAccel™ Development Environment for data center workload acceleration can be used to reconfigure FPGAs to be purpose-built while supporting different applications on the same hardware. The new NVMe over Fabric architecture shown in Figure 2 is scalable and optimized for a 3x-5x increase in performance and half the latency, with services implemented in FPGA-based hardware acceleration. The implementation of these services in Xilinx's Multiprocessor System on Chip (MPSoC) has resulted in a 30x latency improvement compared to a standard x86 (for compression of files). The new storage architectures are evolving around scale-out storage, aka fabric-attached storage.
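The size of the jump from the legacy figures quoted above is worth spelling out: taking the article's round numbers of ~50K IOPS for the older architecture and ~1M IOPS for NVMe SSDs gives a 20x throughput factor, on top of the roughly halved latency. A trivial sketch of the comparison:

```python
# Back-of-envelope comparison using the article's round numbers:
# legacy SAS/SATA path ~50K IOPS vs. NVMe SSDs at ~1M+ IOPS.

LEGACY_IOPS = 50_000
NVME_IOPS = 1_000_000

def speedup(new, old):
    """Throughput improvement factor."""
    return new / old

if __name__ == "__main__":
    print(f"IOPS factor: {speedup(NVME_IOPS, LEGACY_IOPS):.0f}x")
```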
The storage servers are distributed across multiple servers with NVMe-based all-flash storage devices, all connected via the fabric. This scalable architecture supports multiple data centers as a single storage domain, scaling storage needs across the globe. The advantage for users is that they can independently scale the network-attached storage (NAS) heads, and additional storage can be attached without a forklift upgrade of the NAS heads.

Figure 3 Storage architecture with NVMe over Fabrics

The initiative that started around NVMe over Fabrics has developed into hardware-based accelerators in storage systems. These hardware-based accelerators implement functions such as matrix multiplication for machine learning, caching, de-duplication, compression/decompression, storage security, hashing, erasure codecs, key-value stores and more. As shown in Figure 3, the NVMe over Fabrics architecture supports Network File System (NFS), Common Internet File System (CIFS) or block storage over Internet Wide Area RDMA Protocol (iWARP) or RDMA over Converged Ethernet (RoCEv2) to transfer data from application servers to storage servers. The storage servers are distributed across multiple servers in a scale-out architecture, and the storage devices are connected to the storage servers via the fabric. The fabric technology is implementation-dependent and could be PCIe, Ethernet, Converged Ethernet, Fibre Channel or InfiniBand. The storage services can run on the storage devices (aka target devices) to be accelerated in hardware. The services are configurable per end user, and the capacity of each service is use-case dependent.

Figure 4 SDAccel with Vivado and partial reconfiguration with storage services

24 | RTC Magazine NOVEMBER 2015

Xilinx provides its SDAccel Development Environment, which accepts C/C++/OpenCL as input. The toolset converts this input to Register Transfer Level (RTL) code in Verilog or VHDL, and Xilinx's Vivado Design Suite converts that into a bit stream that is downloaded into the FPGA, where it configures the logic functions. Figure 4 shows the SDAccel and Vivado toolset with its partial reconfiguration flow, along with the shell and role model for configuring/reconfiguring storage services in Xilinx FPGAs. The purpose of the partial reconfiguration flow is to implement and reconfigure a portion of the FPGA on the fly while the rest of the FPGA keeps running other functions. This partial reconfiguration flow supports the industry-wide shell and role model for configurability, where the shell includes connectivity such as PCIe, NVMe controllers, the DDR memory controller, the NVMe over Fabrics module, etc. The shell is always on, whereas the role of the FPGA is design-dependent. In this case, the role implements various storage services such as hashing, compression/decompression, erasure codecs, storage security, de-duplication, etc. The role has standard AXI interfaces so that IP can come from different sources. This type of hardware acceleration has shown performance benefits in excess of 30x-50x compared to storage services implemented in software on processors.

Figure 5 Industry-wide shell and role model in NVMe over Fabrics

Figure 5 depicts one of these storage services, compression/decompression, with the shell and role model. Initially, the PCIe, memory controller and NVMe controllers are configured in the FPGA from flash-based PROM within the PCIe time limit of 120 ms, after which the PCIe links get enumerated by the processor. Once this task is completed, the processor can download a function or set of functions via PCIe. This partial reconfiguration of the FPGA allows the user to purpose-build the accelerator. In this example, we built the compression algorithm in C and compared it with the same algorithm running on an x86 processor. The compression performed in software for a 100-GB file took 2.5 hours; the FPGA hardware took 4 minutes to compress the same file.

Figure 6 Performance comparison of X86 with FPGA

The experiment shows that the hardware accelerator far outstrips the software implementation. The algorithms are implemented in C/C++/OpenCL as shown in the SDAccel tool flow with partial reconfiguration (Figure 6). Xilinx's SDAccel flow gives C/C++/OpenCL programmers the ability to code their algorithms in their preferred language and download the bit stream to FPGAs without much knowledge of the hardware. Hardware-based accelerators are becoming more popular as software hits the wall and as programming languages shift from Verilog/VHDL to higher-level languages like C/C++/OpenCL. Xilinx's NVMe over Fabrics implementation, along with partial reconfiguration and the SDAccel Development Environment, provides various services including compression/decompression, security, hashing, erasure coding, LDPC error correction, caching, etc., as shown in Figure 7.

Figure 7 Xilinx’s implementation of NVMe over Fabrics solution

As shown earlier, compression/decompression was the first service implemented inside the solution. The RNIC inside a Xilinx MPSoC could include low-latency Ethernet MACs with PFC, with IP and TCP terminated in hardware. The RDMA portion, whether iWARP or RoCEv2, can also be terminated inside the FPGA fabric. The host bridge for various NVMe drives is implemented in the solution in Figure 7. The advantage of this architecture is that it supports AXI interfaces on all IP, making it easy to add/delete/shrink/increase the capacity of the storage services on the fly with the SDAccel toolset. The services can be configured and reconfigured based on end-customer requirements. With the SDAccel flow, coupled with partial reconfiguration and connectivity interfaces like PCIe, memory controllers, Ethernet MACs and NVMe controllers, you can use the shell and role reconfigurability model with FPGAs to implement hardware accelerators in the NVMe over Fabrics architecture. This fabric-attached storage, with scalable performance and a highly efficient platform for analytics, provides a much lower total cost of ownership for cloud service providers. Future articles will provide updates on porting other services – security, matrix multiplication, Spark machine-learning (ML) library acceleration for analytics, and de-duplication – to hardware accelerators in these same architectures. Xilinx San Jose, CA (408) 559-7778


Confused by Embedded SSDs? Don’t Be Next time someone tells you they’re confused by SSDs – the multitude of interfaces, form factors, standards, and acronyms – tell them to sit back, relax and just marvel at what SSDs are bringing to embedded systems. by Scott Phillips, Virtium

As a vast array of solid-state storage products steadily penetrates the embedded systems space, it is not uncommon for designers to get a little overwhelmed by flash-storage options. SLC vs. MLC SSDs; form factors such as 2.5”, 1.8”, M.2, Slim SATA, mSATA, CompactFlash, CFast, and eUSB; interfaces like SATA, PCIe and PATA; and AHCI, NVMe and other aspiring standards – these all make for quite an acronym stew and a dizzying array of selections designers must make. Making the most appropriate selections is critical to achieving an optimal balance of an embedded system’s functionality, reliability, environmental tolerance, and budget. For such designs, SSDs developed specifically for embedded and industrial applications are emerging as key difference-makers, yet the form, function and standards they adopt vary widely. If you’re confused by the seemingly countless SSD options out there and the alphabet soup of acronyms surrounding them, don’t be. Let’s look at how today’s embedded-focused SSDs are adopting new protocols, reliability techniques and form factors. Of course, we also must acknowledge their heritage in the older, tried-and-true storage technologies that remain important to the embedded market – a space in which designs lean more toward stability and reliability than trendiness and cutting-edge speed.

To SLC or MLC: That is The (Bit) Question

Figure 1 Embedded, industrial-grade SSDs come in a variety of form factors.

The fundamental distinction between single-level cell (SLC) and multi-level cell (MLC) flash is that SLC, storing one bit per cell, is widely considered more reliable than MLC, which stores twice as many bits. However, SLC is more expensive, on a per-bit basis, than MLC. For the embedded-system designer, therefore, the decision between SLC- and MLC-flash SSDs is not about density, reliability or cost alone, but rather a combination of them all. Power-fail protection is essential to retain critical data in the event of a power failure, and that protection challenge is more significant in MLC designs than with SLC because of a phenomenon known as paired page writes. When an SSD writes to MLC NAND flash, there are two blocks open at the same time, and the data in those blocks is subject to corruption if the power goes out during a write. SLC doesn’t use paired pages and, therefore, won’t face this challenge. Some SSD manufacturers seem to believe that power-fail protection via a hardware solution alone is enough, but this is often not the case. Because of the inherent challenges in many industrial system designs, power-fail is a complex problem involving varying voltages, power supplies, and communication between host and storage. This multi-faceted challenge requires a multi-faceted approach to power-fail protection that includes both hardware- and firmware-mitigation techniques. Additionally, some performance-enhancement techniques, including “early acknowledgement,” actually increase the risk of data loss: they take a shortcut by indicating to the host system that they “have the data” when in fact that data has not yet been committed to the flash. So, if power is lost, so too is the data.

Therefore, embedded-system developers are strongly advised to design with SSDs whose power-fail protection is integrated into hardware and firmware, whether the system uses SLC or MLC. In terms of both data security and total cost of ownership, this level of integration is a smart investment.

Form Factors Galore

The SLC/MLC selection appears substantially less complex in comparison to the choices designers must make from among the myriad SSD form factors. Embedded designs typically favor smaller SSDs able to fit into dense, “set it and forget it” spaces; the systems where the drives reside are usually deployed in tightly packed and/or often inaccessible environments. And while 2.5-inch and even 1.8-inch SSDs may seem minuscule enough – indeed, for many embedded designs they are – there’s a substantial portion of systems that require even smaller drives. It is for this very segment that vendors such as Virtium provide SSDs in form factors such as M.2, Slim SATA, mSATA, CompactFlash, CFast, and eUSB (Figure 1). The good news is that each of these form factors is based on, and has evolved significantly from, tried-and-true industry standards – some dating back several decades.

(Quick quiz: Can anyone recall the origins of SATA? It emerged from ATA, which itself came from the AT, the basic architecture of PCs of the 1980s.)

Solid-State Driving in the Express Lane

Originally developed to replace racks of hard drives, enterprise-class SSDs over time adopted a number of high-speed interfaces to eliminate the throughput and data-integrity limitations in systems’ storage. SAS, for example, became SSDs’ interface of choice for storing mission-critical enterprise data. With its dual-port modes, error-correcting features and other data-integrity enhancements, SAS showed its muscle, delivering greater performance and higher reliability than SATA. However, the SAS-SSD “marriage” quickly brought designers to the realization that traditional hard drive interfaces, while perhaps cost-effective, still posed a performance bottleneck, sparking a search for even greater interface speeds. All roads on that quest led to PCIe, now the interface of choice for today’s most demanding applications and deployed throughout the ecosystem – computing, communications, networking, and storage. PCIe didn’t just unleash the performance potential of SSDs; it enabled a slew of new form factors ideal for embedded and industrial uses. It’s given rise to M.2, Mini Card and NVMe SSDs. And speaking of NVMe…

Let’s Not Forget the Protocols

Some PCIe-based SSDs, including selected models from Virtium, will support the Advanced Host Controller Interface (AHCI), the protocol supported by the peripheral controller hubs that connect to chipsets from Intel, AMD and others. Since AHCI is based on ATA, the software commands are the same as for SATA and, to an extent, even PATA. So using AHCI instead of NVMe on a particular PCIe interface is a trade-off of performance vs. software familiarity. Of course, embedded systems benefit from NVMe only if the capacities are high enough, and higher capacities are rarely needed for industrial embedded designs. Furthermore, although NVMe is supported in Windows 8.1, Mac OS and some versions of Linux, designers building embedded systems with custom OSes may have a big challenge implementing NVMe. Expect these challenges to smooth out as NVMe takes hold in the industrial/embedded market. Virtium, for example, is currently developing its own approach to NVMe so it can fully support this protocol once it’s ready for embedded-system primetime. One more point about the vast array of SSD options for embedded systems: they give designers the opportunity to optimize with SSDs featuring the most appropriate form factor, interface, protocols, data protection, and capacity. That last item is significant because many, if not most, embedded designs don’t require the high capacities that, say, data centers demand; a 128MB embedded boot drive will be far more reasonably priced than a 4TB enterprise drive. So lower-capacity SSDs give embedded systems a distinct budget-friendliness. Virtium Rancho Santa Margarita, CA (949) 888-2444



C and Its Offspring: OpenGL OpenGL offers an API for meeting the demands of users and engineers to improve the visual quality and computational throughput of systems. Driving OpenGL requires a sound mental model, which in turn enables the creation of applications and allows them to be integrated with the rest of your system. by Sean Harmer, KDAB

The advent of the iPhone era has ushered in a step change in the paradigms of visual display and user experience, from desktop through mobile and down to embedded applications. This change is being driven by users’ expectations and by the capabilities of modern hardware. No longer will users accept mechanical buttons as the primary method of interaction. Even on machine shop and factory floors, users are demanding visually pleasing, fluid and intuitive user interfaces. At the same time, today’s hardware is capable of so much more than that of yesteryear, both in terms of graphical output and compute processing.

OpenGL for Graphics and Compute

OpenGL has been around in one shape or another for 22 years. Over this time it has always had a dedicated set of followers, but among the wider technical audience, those who have dabbled with OpenGL have often been left with a sense of confusion, wild-eyed wonderment or perhaps even fear. To a large extent this level of impenetrability was caused by the mental model of engineers not matching the reality of what OpenGL was executing under the hood. This is no fault of the engineers placed in this situation. Legacy OpenGL was a beast: lots of global state, an archaic bind-to-edit object model, and cruft gathered as graphics hardware evolved into its current form, which is vastly different from when OpenGL was first conceived. Silicon Graphics started developing OpenGL in 1991, and since 2006 it has been further developed by the non-profit consortium, the Khronos Group. Since then, OpenGL has become very popular in the fields of CAD, virtual reality, scientific visualization, information visualization, flight simulation and video games. Fortunately, modern OpenGL is much more approachable, flexible and higher-performing than the original. It is, however, still necessary to have a good mental model of how OpenGL operates. The key to this is understanding the so-called pipeline and how it interacts with the OpenGL C API. The pipeline describes the flow of data through OpenGL, and it can be configured in numerous ways to achieve all manner of rendering algorithms such as environment-mapped reflections, stylized

28 | RTC Magazine NOVEMBER 2015

shading (toon, ink, pencil), shadows, global illumination and many more. Before we can learn about such higher-level algorithms, we need to understand the basic pipeline that forms the fundamental building block. So take a deep breath and let’s dive in.

The OpenGL Graphics Pipeline

Figure 1 Simplified overview of the OpenGL graphics pipeline. Data flows from the top left to the bottom right. To begin with, the data is geometric in nature (with accompanying data). The vertex shader stage performs coordinate transformations. In the rasterization stage the geometric primitives are converted into fragments and are later given a color by the fragment shader. Those fragments that successfully pass a set of tests eventually get displayed on the render target.

Figure 1 shows a simplified schematic view of the OpenGL pipeline. It begins with data being fed in from the CPU (we will see how shortly). The data usually boils down to a set of vertex positions and their associated attributes (color, normal vector, texture coordinates, etc.), but this data can be anything we can encode into a few floats, booleans or integers. Modern OpenGL allows us to be flexible; no longer are we tied to what the designers of the original OpenGL thought we should be using. Each vertex and its attributes are passed into the vertex shader – a programmable piece of logic. We’ll find out later why a shader is so named. A modern GPU may allocate many cores to processing vertices in parallel, but each instantiation of the vertex shader can only operate on a single vertex at a time. The typical task a vertex shader performs is coordinate system transformation. This may be to transform from model space to eye space for lighting calculations; to world space for environment mapping; to tangent space for normal or parallax mapping; or one of many other possibilities. One thing a vertex shader must do, however, is output the vertex position in clip space, as this is used as input to the rasterizer. Strictly, it is the final stage before rasterization that must output the clip-space coordinates: the geometry shader if present, otherwise the tessellation evaluation shader if present, otherwise the vertex shader, as in the simplified pipeline introduced here. Understanding coordinate systems and the transformations between them is key to making effective use of OpenGL. Trying to shortcut this only leads to misery down the line. Be sure you understand the important coordinate systems, when each one is of use, and how to get your data into that coordinate system. As the transformed vertices pop out of the vertex shader, they are processed by the first piece of fixed functionality in the
pipeline – primitive assembly and clipping. This is where the individual vertices that make up a graphical primitive (point, line or triangle, usually) get pulled together into a logical entity. This construct is then clipped against the volume that is eventually mapped to the current render target (usually the back buffer of a native window surface, or a texture). The OpenGL graphics pipeline consists of several programmable and fixed-function stages. The programmable stages are controlled by writing short programs, in the OpenGL Shading Language (GLSL), that execute directly on the GPU. Where such flexibility is not required, the pipeline uses blocks of fixed functionality implemented directly in silicon. Some of these fixed stages can be tweaked to some extent by calling OpenGL API functions from C/C++. Think of these as levers and dials on a machine that change how the machine operates. Armed with the clipped primitives, the rasterizer is then able to generate fragments for each primitive. Think of a fragment as a pixel in training. Our nascent fragments still have a long journey ahead of them before they may graduate to become fully-fledged pixels. To help them on their way, each fragment contains not only its position but also potentially a host of other data. Recall the attributes that we associated with each of our vertices. Each of these attributes is interpolated across the primitive by the rasterizer. As an example, imagine the simple case of the three vertices shown in Figure 2. The three vertices have a position and a color attribute: red, green and blue respectively.



For each fragment generated by the rasterizer, these three colors are interpolated to give a color at the position of that fragment. At the precise center of the resulting triangle (assuming the center is conveniently aligned to the pixel grid) there will be a fragment whose color consists of equal amounts of red, green and blue – perfectly grey. Depending upon the detail of the geometry sent into the pipeline relative to the projected sizes of the rasterized primitives, you will likely find that at this stage of the pipeline there is somewhat of a data explosion. Each of those rasterized fragments must be lovingly crafted by the next programmable stage – the fragment shader. It was the fragment shader that gave rise to the general term shader, because the fragment shader’s prime responsibility is determining what color, or shade, the fragment should be given on its way to becoming a pixel. Just as with the vertex shader, each instantiation of the fragment shader executes in isolation from all others. This allows many cores on the GPU to process fragments in parallel without data dependencies between them – remember, there are a lot of fragments to churn through. Actually, that is a bit of a generalization. It is possible to get limited amounts of information about neighbouring fragments into a fragment shader. This is often achieved via the GLSL functions dFdx and dFdy, which provide gradients between fragments. This is possible because the GPU processes blocks of fragments together and in lock-step, which allows peeking into the registers for neighbouring fragments. Given the expressive power and flexibility of the GLSL language, a skilled developer can craft all manner of effects in the fragment shader. For some convincing examples of what can be achieved with a fragment shader and a full-window quad (two triangles, since quads are now relegated to the annals of history), take a look at the impressive examples at https://

Although the order in which vertices are fed into the pipeline is well defined, it is sometimes useful for a shader stage to sample from a chunk of data in an arbitrary manner. To enable this, data can be exposed to the pipeline in the form of textures, images and special types of buffer object (uniform buffers and shader storage buffers). These can be accessed from any shader stage. Access to textures can also optionally include quite sophisticated sampling and filtering implemented in hardware. The fragments exiting the fragment shader, sporting a (hopefully intended) color, now go into another piece of fixed functionality that performs a number of tests that must be passed if our fragment hopes to graduate to pixeldom. Two common examples are the depth test and the stencil test. The depth test is often referred to as z-testing due to the key role played by the z component in this test. Both of these tests operate by comparing the data in each fragment to the data in another buffer – the depth buffer or stencil buffer respectively. Exactly how the data gets into these additional buffers is beyond the scope of this article, but suffice it to say that it is very common and very easy to populate the depth buffer.

Figure 2 The three vertices of a triangle are submitted to OpenGL. After transformation the vertices are assembled into a triangle. The rasterizer performs a scan-line conversion of the triangle and any additional attributes associated with the vertices are interpolated across the surface of the triangle to create fragments. The fragments are then fed into the fragment shader to be processed further.

The incoming fragment and the data at the corresponding position in the buffer are compared, using a user-specified comparison operator. If the comparison is true, the fragment passes the test and is allowed to carry on. If the fragment fails, it is thrown away. If blending is disabled, that is the end of the story. The successful fragments get written to the render target and eventually get displayed on the screen, or used as input to a subsequent render pass. If blending is enabled, then the incoming fragments get combined with any fragments that went before them at the same pixel location by way of a user-specified blending operation. At this time, blending is still classified as a fixed-function, but configurable, pipeline stage. Who knows, perhaps in time blending will also evolve into a full-blown programmable stage. KDAB Houston, TX (866) 777-5322



Help Wanted: Wind, Solar, and On-Grid Battery Storage

The previous article in this series examined how the platform design concept, as used by smart phones, is needed to help bring more engineering expertise to the grid problems of today. Here we take a closer look at an application, power conversion, that directly impacts some of the biggest challenges facing the grid: renewables. by Brett Burger, National Instruments

The benefits of a design platform extend beyond the first development cycle. Looking again at the smart phone, upgrading software is a fairly common and simple task. About once a year, with the push of a virtual button, the entire operating system is upgraded, fixing bugs, adding features and tweaking the UI. About once every two years a new hardware upgrade is available, and the software stack is migrated from a backup. The system picks up right where it left off, but with more storage space, a faster processor, new co-processing units, a better camera, better speakers and so on. This demonstrates the value of building on a platform: abstraction of the hardware technology through a layer of software. From a hardware standpoint, changing processor architectures and other new hardware is not a trivial update, but to the end users that have adopted the platform it becomes as trivial as plugging in the new phone and performing a “sync” operation. Instant technology refresh. This is happening with software platforms for the grid as well.

One specific application in need of platform adoption is that of power conversion, or power electronics. Power conversion happens in inverters, the hardware nodes that enable DC power sources, like renewables, to connect to an AC grid. Inverters are one of the problems referenced in the previous article because they add harmonic noise to the grid and, by their digital nature, have no inertia to help absorb grid disturbances. In fact, as their penetration grows, they reduce the overall grid inertia that comes from spinning base load. Wind turbines, solar arrays and battery storage systems like Tesla’s Powerwall all need inverters. At the core of the inverter is an “intelligent power module” that consists of a processing control board, analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) for connection to sensors, and insulated-gate bipolar transistors (IGBTs). With given set points, the control board drives the IGBTs to digitally create a sine wave of voltage potential (AC) from the constant output of DC potential (Figure 1).

Figure 1 3-Phase Inverter Diagram. Upgrading the control system of an inverter leads to efficiency gains and new application use cases, but traditional design makes it difficult to keep up with market trends and silicon-level technology.

Most inverters today are the

product of design teams composed of experts in digital hardware design, board layout, analog front-end design, hardware description language (HDL) coding, signal processing and, of course, power and inverter control theory. All of these engineers work in concert to design inverters. This design process is lengthy, and much of the expertise is spent solving problems that are not core to the task at hand, such as glue logic for mating ADCs to an FPGA, writing middleware for application logic to interface with the hardware, and verification of the hardware layout. Time spent on these tasks is time that isn’t spent on improving inverter control algorithms, responding to customer requests, and differentiating the inverter in a growing market. There is a better way to approach design that will let inverter manufacturers focus more on core business competencies. Platform-based design is the better path for development and will benefit not only inverter manufacturers, but their end customers and utility companies as well. Looking back to the smart phone analogy, switching to platform development democratizes system design and lets experts focus on their core strength. iOS and Android enable a wealth of application developers who know little to nothing about hardware, operating systems and middleware development to bring unique products to market that would otherwise live only as a “what if” idea. The same concept applies to inverter design. Traditionally, inverter and control experts would have their software simulation tools and then pass the results “over the wall” to

the embedded experts to implement. This takes time, and the handoff is prone to toolchain breaks. With a platform approach, engineers can focus more of their effort on new inverter design techniques and let companies that are experts in control board layout and embedded system platforms innovate on the under-the-hood components. The inverter experts interface with the platform API at a level that abstracts away the low-level complexities. This approach eliminates some of the walls that exist with traditional design methods, shortens the design cycle, and abstracts away many of the hardware dependencies that often cause lengthy redesigns. Inverter design is currently pressured from both sides of the industry. From the market side, applications are becoming more diverse. Wind and solar arrays are growing in size and demand more efficient inverters to meet the requirements of generation owners, along with cleaner outputs to satisfy the requirements of system operators. On-grid storage systems, mostly batteries, are growing in demand to help smooth the dynamic generation from wind and solar systems. Wind, solar, storage…there is not necessarily a one-size-fits-all inverter that covers all of these applications and accounts for the unique requests that always emerge from end users. How does time spent laying out the front-end analog circuitry or FPGA glue logic help a company compete and win business in this evolving inverter market? It doesn't. The inverter market is just one source of pressure. The other is the rapid advance of silicon-level technology. Newer, more advanced hardware technology is rarely, if ever, a drop-in physical upgrade to the previous generation.

Figure 2 Heterogeneous System-on-a-Chip (HSoC). New chip technologies that incorporate multiple processing elements (FPGA, CPU, DSP) on a single die are more difficult to design into a system and require specialized skills and tools. Traditional design manufacturers must devote more and more resources just to keep up with existing technology.

RTC Magazine NOVEMBER 2015 | 33

INDUSTRY WATCH: BUILDING OUT THE SMART GRID

DIP chips, socketed processors, and quad-flat packages are giving way to ball grid array (BGA) packages, a significantly more complex component to integrate into a board-level design. Companies that take on the effort of a custom-built electronics package, despite their core differentiator being the inverter and control features, must now spend more time and effort updating their in-house knowledge and tools to deal with these complex form factors. Beyond the physical form factor is the technology trend of heterogeneous processing elements (Figure 2). New components like the Zynq system-on-chip (SoC) from Xilinx blend DSP cores, FPGA logic fabric, and microprocessors on a single die and offer great value in performance, flexibility, and computations per dollar. Should an inverter design company spend resources to redesign around the latest processing elements, or continue to focus on its market differentiator? The latest chip available on the open market does not a differentiator make. Full in-house design can be attractive from the "control" standpoint, in that the manufacturer controls all of the design, but with this comes added responsibility and burden. Components like flash memory and non-volatile storage have shorter market cycles and require a large last-time-buy purchase, a board spin, or at least another round of testing to validate a replacement. Companies that stick with traditional design will see their resource allotment to ancillary and overhead in-house services grow, and may see new competitors enter the market with niche solutions they could otherwise have covered had they taken a more democratized design approach. Dynapower used a platform design approach for an inverter targeted at advanced-carbon-battery on-grid storage solutions.
Power engineers programmed the new product without needing embedded software engineers in the middle at every step of the process. "The key to this design was the ability for our power engineers to directly program their product without a software engineer in the middle. This new platform and method of development changed our development time from 72 weeks to 24 weeks….These tools take design to the next level. We can have a 90 percent confidence factor in a first design and minimize hardware iterations during the prototype stage," said Kyle Clark of Dynapower. The platform used for this new inverter was the LabVIEW RIO architecture from National Instruments. Specifically, a Single-Board RIO control board is mated to an I/O board with circuitry designed to control IGBTs. The Single-Board RIO general purpose inverter controller (GPIC) used has a Freescale PowerPC processor and a Xilinx Spartan-6 FPGA. NI recently released an update to the Single-Board RIO that mates to the GPIC. For future designs, Dynapower can port its existing software IP from the 400 MHz PowerPC processing board to a dual-core ARM Cortex-A9 processing board with Artix-7 FPGA fabric. This means an immediate performance upgrade with no hardware redesign required. Staying current with technology while focusing on the core problem is the power of a platform. It also represents an enabling technology that is going to help engineers make the next


Figure 3 Image of the sbRIO/GPIC from three years ago (a) next to an almost identical module with the new sbZYNQ product (b), which plugs directly onto the existing platform. Designers that use a platform can focus on their market differentiator and let the platform vendor focus on implementing new technology. The two modules above use completely different processor architectures from two different silicon vendors, but because they are on the same platform, the upgrade to the end designer is a software port.

generation grid more accommodating to wind, solar, on-grid storage, and any other technology that requires conversion between DC and AC.

National Instruments, Austin, TX. (512) 683-0100.
Dynapower, South Burlington, VT. (802) 860-7200.
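To make the article's "digitally create a sine wave" concrete, here is a minimal sketch of three-phase sinusoidal PWM, the common scheme in which each inverter leg's duty cycle tracks a sine reference and is compared with a triangular carrier to gate the IGBTs. The function names, 50 Hz grid frequency, 10 kHz carrier, and 0.8 modulation index are illustrative assumptions, not any vendor's API:

```python
import math

def spwm_duty(t, f_grid=50.0, m=0.8):
    """Duty cycles (0..1) for the three half-bridge legs of a 3-phase
    inverter using sinusoidal PWM; m is the modulation index and the
    phases are offset by 120 degrees."""
    return [0.5 + 0.5 * m * math.sin(2 * math.pi * f_grid * t - k * 2 * math.pi / 3)
            for k in range(3)]

def gate_high_side(t, f_carrier=10e3, **kw):
    """Compare each duty cycle against a triangular carrier to decide
    whether the high-side IGBT of each leg conducts at instant t."""
    phase = (t * f_carrier) % 1.0
    carrier = 2 * phase if phase < 0.5 else 2 * (1 - phase)  # triangle, 0..1
    return [d > carrier for d in spwm_duty(t, **kw)]
```

Filtered by the inverter's output inductors, the switched leg voltages average out to the sinusoidal references; a real control board closes current and voltage loops around this core, typically in FPGA fabric.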

FIND the products featured in this section and more at

¾-Length PCIe Board Supports Dual-Core ARM Hard Processor System and 100/40/10 GigE Connectivity

A ¾-length PCIe board based on Altera's Arria 10 FPGAs and SoCs integrates a wide variety of features. The A10P3S board from BittWare supports a range of applications such as network processing and high-performance computing in financial services, data centers, and cyber security/signal intelligence. The board offers flexible memory configurations supporting over 48 GB of memory (featuring the latest DDR4 and QDR options), sophisticated clocking and timing options, and four front-panel QSFP cages that support 100 Gbps (including 100 GigE) optical transceivers. A comprehensive board management controller (BMC) with host software support for advanced system monitoring greatly simplifies platform management. The board also supports Altera's SDK for OpenCL. Built on 20nm process technology, Arria 10 FPGAs and SoCs boast higher densities, higher performance, and a more power-efficient FPGA fabric than previous generations. They also integrate a richer feature set of embedded peripherals, high-speed transceivers up to 28 Gbps, hard memory controllers, and protocol controllers. Arria 10 FPGAs are the first FPGAs to integrate hardened floating-point (IEEE 754-compliant) DSP blocks, delivering floating-point performance of up to 1.5 TFLOPS. Arria 10 SoCs are also the industry's only 20nm FPGAs to integrate a dual-core ARM® Cortex™-A9 MPCore™ hard processor system (HPS). The A10P3S is BittWare's first high-performance offering that fully supports the SoC, combining leading-edge FPGA technology with an embedded dual-core ARM hard processor system. As an FPGA board, it delivers strong performance and power efficiency with a unique SODIMM-based memory configuration, high-speed I/O capability, and OpenCL support.
As an SoC board, it opens up a new world of options for complete, stand-alone distributed processing with little to no host traffic or intervention, which is attractive for security, network, and financial applications.


AMD R-Series 1U platform with up to 24 GbE LAN Ports

A 1U rackmount hardware platform is designed for network service applications. The PL-80660 from WIN Enterprises supports a network-specific AMD R-Series CPU (formerly codenamed eTrinity). Quad- and dual-core processing options are provided for customer flexibility in addressing different market segments. The second-generation AMD Embedded R-Series APU supports Heterogeneous System Architecture for increased processing performance, power efficiency and multimedia immersion. The platform supports two DDR3 1600MHz unbuffered/non-ECC SODIMM sockets that provide up to 16GB of memory. The device offers storage interfaces supporting 2.5”/3.5” SATA 3.0 6Gbps hard drives and CompactFlash™. To further enhance network performance, the PL-80660 is built with 8 onboard GbE ports and an option of expanding to 24 GbE ports via PCIe. To prevent network problems during an unexpected shutdown, the PL-80660 supports two segments of LAN bypass function through WDT and GPIO pin definitions. For local system management, maintenance and diagnostics, the front panel is equipped with dual USB ports, one RJ-45 console port and LED indicators that monitor power and storage device activities. Additionally, the PL-80660 provides one standard golden-finger connection via PCIe x8 for add-on modules. WIN Enterprises will work with customers to modify this COTS design to more specific requirements when ordered in standard OEM or above quantities; the design can be streamlined and manufactured for up to seven years. WIN Enterprises, North Andover, MA (978) 688-2000.

BittWare, Concord, NH (603) 226-0404.




XMC Module Family Features User-Programmable FPGA

Two Switched Mezzanine Card (XMC) compatible modules provide a user-programmable Xilinx Spartan-6 FPGA, either the XC6SLX45T-2 or the XC6SLX100T-2. The TXMC633 and TXMC635 from TEWS Technologies are designed for industrial, COTS, and transportation applications where specialized I/O or long-term availability is required. They provide a number of advantages, including a customizable interface for unique customer applications and an FPGA-based design for long-term product lifecycle management. The TXMC633 module versions are available with 64 ESD-protected TTL lines; or 32 differential I/O with EIA-422/EIA-485-compatible, ESD-protected line transceivers; or 32 TTL I/O and 16 differential I/O with multipoint-LVDS transceivers. The TXMC635 module features 48 TTL I/O; 8 channels of single-ended, 16-bit analog output with up to ±10.8V output voltage range; and 32 single-ended or 16 differential 16-bit analog inputs with a full-scale input voltage range of up to ±24.576V. For customer-specific I/O extension or inter-board communication, the TXMC633 and TXMC635 provide 64 FPGA I/O lines on P14 and 3 FPGA multi-gigabit transceivers on P16. The P14 I/O lines can be configured as a 64-line single-ended LVCMOS33 interface or as a 32-pair differential LVDS33 interface. The user FPGA is connected to 128 Mbytes of 16-bit-wide DDR3 SDRAM; the SDRAM interface uses a hardwired internal memory controller block of the Spartan-6. The user FPGA is configured by a platform SPI flash or via PCIe download, and the flash device is in-system programmable. An in-circuit debugging option is available via a JTAG header for readback and real-time debugging of the FPGA design (using Xilinx ChipScope). User applications for the modules with the XC6SLX45T-2 FPGA can be developed using the ISE Project Navigator design software and the Embedded Development Kit (EDK); the supported IDE version is 14.7, and licenses for both design tools are required. TEWS offers a well-documented basic FPGA example application design.
It includes a .ucf file with all necessary pin assignments and basic timing constraints. The example design covers the main functionalities of the modules: it implements the local bus interface to the local bridge device, register mapping, DDR3 memory access, and basic I/O. It comes as a Xilinx ISE project with source code and as a ready-to-download bit stream. TEWS Technologies, Reno, NV (775) 850-5830.
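For readers unfamiliar with ISE-era constraint files, a .ucf of the kind mentioned pairs pin assignments with timing constraints. A minimal, hypothetical fragment might look like the following; the net names and pin locations are invented for illustration and are not taken from the TEWS design:

```
# Clock pin, I/O standard, and an 8 ns (125 MHz) period constraint
NET "clk_in"      LOC = "A10" | IOSTANDARD = LVCMOS33;
NET "clk_in"      TNM_NET = "clk_in";
TIMESPEC "TS_clk_in" = PERIOD "clk_in" 8 ns HIGH 50%;
# One of the P14 front I/O lines
NET "p14_io<0>"   LOC = "B12" | IOSTANDARD = LVCMOS33;
```

Shipping such a file with the example design spares the user from rediscovering the module's fixed pinout by hand.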


Wireless ‘PingPong’ IoT Edge Node Platform Connects Field Devices to the Cloud

A new wireless IoT edge node, called the PingPong, is a flexible and powerful hardware platform for connecting field devices to the cloud. The module from Round Solutions is a small form factor board supported by an RTOS running on a Microchip PIC32MZ 32-bit MCU and a high-speed cellular Telit module. It is based on a modular hardware design principle that simplifies the integration of customer-specific applications and communication standards into a single solution platform. Target applications range from cloud-connected oil tanks and intelligent waste bins up to cloud-connected gateway systems for manufacturing robots. The PingPong platform offers software engineers an application-ready, pre-validated PingPong data exchange mechanism that eases the IoT integration of any field device. With the supplied Round Solutions open source PingPong software development kits, the platform can be configured for nearly all IoT/M2M applications, such as sensor reading, asset tracking, routers, measurement technology, telemetry and security control. Thus, Round Solutions' PingPong offers companies wanting to transfer their data to IoT cloud servers a complete solution including software libraries and source code. The PingPong hardware platform offers high-speed cellular modules for IoT connectivity as well as numerous interfaces to the field, which can also be controlled via the cloud. The standard interfaces include, for example, Ethernet, USB and CAN as well as high-precision GNSS (Global Navigation Satellite System). Developers can add various expansion cards to create a nearly unlimited number of application scenarios. Application-ready expansion cards are available for WLAN, Bluetooth, I/O, Iridium satellite communications, ISM/RF, SigFox, NFC/RFID and camera connectivity. All functions are ready for use as soon as the expansion cards are attached to the motherboard. Round Solutions, Neu-Isenburg, Germany. +49 (0) 6102 799 28 0.



Intel Braswell-Based Nano-ITX for Robust Systems

A new Nano-ITX embedded system board is based on the latest Intel Celeron and Pentium processor N3000 product families (4W~10W), formerly codenamed Braswell. The NANO-6061 from American Portwell is an extremely low-power, high-performance single board computer (SBC) with embedded qualities, designed in the popular Nano-ITX form factor measuring 120mm x 120mm. It is designed for 24/7 operation and supports industrial features as well as guaranteed long-term availability of at least 7 years. The flat design, measuring 27mm in height with the I/O shield, allows space-saving installation in display and panel PCs, making the realization of digital signage and control solutions for industry and business applications a quick and easy task. Portwell NANO-6061-based systems are also ideal for passively cooled and hermetically sealed systems that can be used in various environments. The powerful integrated graphics eliminates any need to compromise on ease of use, something that is particularly important for point-of-sale (POS) and retail applications, such as kiosk systems. Three independent displays (VGA, LVDS, and DisplayPort) at high resolutions make it possible to realize sophisticated user interfaces, such as touch solutions. The NANO-6061 is based on the Intel Celeron and Pentium processor N3000 series, and its one DDR3L 1333/1600 MT/s SODIMM socket can be equipped with up to 8GB of DDR3L memory. It also integrates Intel Gen8 graphics supporting DirectX 11.1, OpenGL 4.2 and OpenCL 1.2, plus high-performance, flexible hardware decoding to decode multiple high-resolution full HD videos in parallel. In addition, it supports up to 3840 x 2160 pixels via DisplayPort, 2560 x 1600 pixels via VGA, and 2x24-bit LVDS up to 1920 x 1200 pixels, with the flexibility to connect up to three independent display interfaces.
Two USB 3.0 SuperSpeed ports ensure fast data transmission with low power consumption, and one additional USB 2.0 port is available. Two 5 Gb/s PCI Express 2.0 lanes can be used as one half-size mPCIe (shared with PCIe x1) and one M.2 slot. Three SATA 3.0 interfaces with up to 6 Gb/s (one of them also available as mSATA and two ports for M.2) allow quick and flexible system expansion. An Intel I211AT Gigabit Ethernet controller provides dual Gigabit Ethernet LAN access via the two RJ45 ports.

Fully Managed 3U VPX Rugged Ethernet Switch with Switch Management Software A fully managed 3U VPX Layer 2/3 Ethernet switch offers a variety of combinations of Gigabit Ethernet and 10 Gigabit Ethernet connectivity. The NETernity GBX411 from GE Energy Management's Intelligent Platforms gives customers a range of options to meet their network requirements. Its GE Rugged design enables it to be deployed with confidence on air, ground and sea platforms in applications such as surveillance, reconnaissance, radar, sonar and imaging. The GBX411, which supports precision time protocol (IEEE 1588), is characterized by significant flexibility through its use of GE's OpenWare switch management software, offering comprehensive and powerful management features for Layer 2/3 switching and routing. A wide range of networking protocols and management features is supported, together with extensive capabilities for multicast, quality of service, VLANs, and differentiated services. OpenWare can also be customized to meet a range of customer requirements. Supported access methods include Telnet, SSH, serial console, SNMP and a Web interface. The new switch also responds to the growing demand for high security with its access control, authorization and declassification features. Compliance with the US Army's VICTORY initiative and its specifications for an Ethernet switch is built into the GBX411's capabilities. GE Intelligent Platforms, Charlottesville, VA. +44 (0) 1327 322821.

American Portwell Technology, Fremont, CA. (510) 403-3399.



All-in-one Embedded Box PCs for Infotainment Industry

A new industrial PC product line has specialized features for retail and gaming applications. The new line from Adlink Technology is initially being launched with two highly integrated box-PC models that include the necessary interfaces for the most common peripheral devices, as well as intelligent API middleware to simplify application development. The API middleware allows application development without dependencies on the peripheral devices, and a vast library of commonly used peripherals is available. The ADi-SA1X and ADi-SA2X integrated box PCs are based on AMD and Intel processors and offer both state-of-the-art graphics performance and an onboard GPU option. The ADi-SA1X and ADi-SA2X support up to eight independent displays and are equipped with external, field-removable storage, Wi-Fi, two GbE ports, four USB 3.0 and three USB 2.0 ports, 4x DP on board (4096x2160, 60 fps), 1x PCIe x16 Gen3, 1x PCIe x1 Gen2, 2x Mini PCIe slots, 4x RS232 + 1x RS485/RS422, and 7.1-channel audio. In addition, the new ADi-SA1X and ADi-SA2X deliver multiple hardware and software security options suited for retail, vending, and gaming applications and are designed to meet GLI (Gaming Laboratories International) standards. Adlink has also developed several special peripherals, including the ADi-SIOG, an intelligent USB I/O controller hub for the Class III/II, VLT/SBG, AWP, POS/POI and retail markets. The controller is equipped with 1MB of battery-backed NVRAM and a crypto and authentication security chip that combines power-off monitoring and date/time stamping features to enhance security. The ADi-SIOG has 32 inputs and outputs to support retail/gaming-specific peripherals. Adlink has also introduced the ADi-BSEC intelligent security and NVRAM PCI Express card, a high-speed PCIe card with up to 16MB of NVRAM. It also offers a crypto and authentication security chip featuring SHA-256, RNG, UID, EEPROM, and OTP.
Adlink’s iAPI intelligent middleware simplifies the development of new applications. By offering standard APIs for development, iAPI provides a hardware abstraction layer (HAL) and a protocol translation layer (PTL), and supports the ccTalk, ID003, EBDS, TCL, and SSP protocols. Combined with Adlink’s iAPI intelligent middleware, the ADi-SIOG and ADi-BSEC accessories deliver an extensive device library out of the box.
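The HAL-plus-PTL split described above is a common middleware pattern. The sketch below is a generic illustration of that pattern with invented class and method names; it is not Adlink's actual iAPI. The application talks only to the abstraction layer, while a translator object turns generic commands into device-specific frames:

```python
class ProtocolTranslator:
    """PTL: maps generic commands onto a device-specific wire protocol."""
    def __init__(self, protocol):
        self.protocol = protocol  # e.g. "ccTalk" or "SSP", per the article

    def to_wire(self, command):
        # A real translator would build protocol-specific frames;
        # here we just tag the command for illustration.
        return f"{self.protocol}:{command}"


class PeripheralHAL:
    """HAL: lets applications drive a peripheral without knowing its protocol."""
    def __init__(self, translator, transport):
        self.translator = translator
        self.transport = transport  # callable that ships a frame to the device

    def accept_credit(self):
        self.transport(self.translator.to_wire("ACCEPT"))


# Swapping protocols never touches the application-level call:
frames = []
hal = PeripheralHAL(ProtocolTranslator("ccTalk"), frames.append)
hal.accept_credit()
print(frames)  # ['ccTalk:ACCEPT']
```

The value of the pattern is exactly what the brief claims: a bill validator speaking ccTalk and one speaking SSP look identical to the application, so the device library can grow without application changes.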

ADLINK Technology, San Jose, CA. (408) 360-0200.


Extending VMEbus Shipments Beyond 2020

A 6U processor board extends the choice of long-life VMEbus products designed to be available beyond 2020 without significant end-of-life component issues. In particular, the VP F1x/msd from Concurrent Technologies uses the same tried and tested VME64 bridge device as the two other new VME boards announced in 2015 by Concurrent Technologies. The VP F1x/msd can host two local PMC/XMC modules plus two additional PMC modules mounted on an expansion carrier. PMC modules are still widely used to provide simple I/O connectivity, and the VP F1x/msd enables customers in the military, aerospace and other similar markets with long life-cycle programs to more easily transition to a newer generation of processor. Two processor variants are offered: a low-power-consumption 2-core Intel Core i5-4422E processor (3M cache, up to 2.9GHz) for simple technology transitions, and a 4-core Intel Core i7-4700EQ processor (6M cache, up to 3.4GHz) for higher performance applications. To enable upgrades, configurations are offered with backwards-compatible rear I/O pinouts for the VP 717/x8x and VP 91x/x1x boards, introduced in 2010 and 2012 respectively. In addition, Concurrent Technologies' enhanced VME device drivers provide a consistent application programming interface (API) to simplify the upgrade process. Glen Fawcett, CEO, Concurrent Technologies, commented, “Having announced three new VMEbus boards in 2015, we have demonstrated our commitment to support VME customers needing long life-cycle boards based on Intel processors. We continue to enhance our portfolio with dependable hardware products that can be deployed in a range of environmental conditions, complemented with software and firmware packages to enhance security, speed up boot times and simplify integration.” Air-cooled VP F1x/msd boards are available for customer shipment in Q4 2015, with rugged conduction-cooled variants in 2016. Concurrent Technologies, Woburn, MA (781) 933-5900.


2U Rackmount Appliance with Haswell E3, 1x PCIe x4 Slot and 4x 3.5” SATA HDD

The new PL-80640 from WIN Enterprises is a 2U rackmount hardware platform designed for network service applications such as general Internet and data center support, SPAM and security filtering, and file serving. Built with Intel embedded IA components that are guaranteed for long product life, the device supports Intel 22nm Haswell (codename) Core i3/i5/i7/Pentium/Celeron and E3-1200 v3 processors, which provide high performance with operating efficiency. The platform supports four unbuffered non-ECC or ECC DDR3 1333/1600 MHz DIMM sockets with a maximum memory capacity of up to 32 GB. Key features include support for Intel Haswell Core i7/i5/i3/Pentium/Celeron and E3-1200 v3 series processors in an LGA1150 socket; a maximum of 32GB of DDR3 1333/1600 MHz system memory; a maximum of 24 GbE LAN ports (4 standard) with optional copper and SFP removable PCIe expansion; and four removable 3.5” SATA HDDs. Storage interfaces include four 3.5” SATA HDDs and one CompactFlash SSD. The device comes standard with four GbE LAN ports but features a PCIe x4 expansion slot to enable LAN expansion to a maximum of 24 copper and/or fiber ports. The front panel of the appliance has one USB 2.0 port, one RJ-45 console port and LED indicators to monitor power and storage device activities for local system management, maintenance and diagnostics. The PL-80640 is RoHS, FCC and CE compliant. WIN Enterprises will work with solutions manufacturers to modify this COTS design to more specific requirements when ordered in standard OEM quantities. WIN Enterprises, North Andover, MA (978) 688-2000.


New 14 nm Intel Pentium and Celeron Processors Integrated on COM Express Mini

The new 14 nm Intel Pentium and Celeron processors (codenamed Braswell) have now been integrated on COM Express Mini modules. The new conga-MA4 module from congatec further enhances the computing and graphics performance of its predecessors and offers a new class of performance density. Despite the increase in performance, heat dissipation has been lowered to a scenario design power (SDP) of 4 Watts, enabling very compact, passively cooled system designs. Comprehensive Microsoft Windows 10 support makes the new COM Express Mini modules attractive for system designs with state-of-the-art operating systems. The modules also make it easy to connect two 4k screens thanks to the powerful new Intel Gen 8 graphics with up to 16 execution units. The high level of computing and graphics performance is complemented by a full set of COM Express Type 10-compliant interfaces. This makes the new COM Express Mini modules a good choice for graphics-intense, low-power applications requiring a small footprint. The conga-MA4 COM Express Mini Type 10 modules are equipped with 14 nm Intel Pentium and Celeron processors with a 4 Watt SDP or 6 Watt thermal design power (TDP), as well as up to 8 GB of fast dual-channel DDR3L 1600 RAM. The integrated Intel Gen 8 graphics enables applications with high-quality visuals at resolutions of up to 4k, with the latest 3D features such as DirectX 11.1 and OpenGL 4.2, on two screens. The video engine provides judder-free decoding of H.265/HEVC compressed videos while offering maximum CPU offload, plus real-time encoding of two 1080p H.264 video streams at 60 Hz. Innovative, interactive applications with facial recognition and/or gesture control can also be supported through the optional feature connector, which allows direct connection of two CSI-2 cameras.
The new congatec computer modules support COM Express Type 10 pin-out with 3x PCI Express Gen 2.0 Lanes, 1x Gigabit Ethernet, 2x SATA 3.0, 2x USB 3.0, 8x USB 2.0, 2x UART along with I²C, SPI, LPC and HD Audio. congatec’s operating system support covers Linux distributions and Microsoft Windows variants - including Microsoft Windows 10. An extensive range of accessories that simplify design-in, such as heatsinks, carrier boards and starter kits as well as SMART battery management modules, round off congatec’s offering. congatec, San Diego, CA (858) 457-2600.



Robust 4x 4K Digital Signage Player with Hardware EDID Emulation Function

A four-output HDMI digital signage player supports up to 4096 x 2160 4K resolution on every display channel. The SI-304 from IBase Technology is suitable not only for 2x2 video walls, but also for menu boards in restaurants and for electronic displays in banks, airports and shopping malls, conveying dynamic information and targeted promotions. The SI-304 is powered by a second-generation AMD Embedded R-Series accelerated processing unit (APU) with Radeon HD 9000 graphics to deliver graphics performance and power efficiency. It has rich I/O connectivity with 2x USB 3.0, 1x USB 2.0, and 3x RJ45 for 2x Gigabit LAN and 1x RS232, with expansion via 2x mPCIe and 1x UIM/SIM card slot for Wi-Fi, Bluetooth or 3G/LTE functions. In addition, the player is built with hardware EDID emulation, a feature that lets the system troubleshoot a variety of screen configuration, power and signal problems without the need to reset the unit. The SI-304 is designed with segregated ventilation to keep contaminants out and enhance system stability. It also supports Eyefinity™ technology, which allows users to easily configure multiple displays. Measuring 269mm by 193mm by 29.5mm, this compact player is available with up to 32GB of DDR3 SO-DIMM 2133 memory, an M.2 64GB solid state drive (SSD), and a 150W power adapter. IBASE Technology, Taipei, Taiwan +886-2-2655-7588.


New AMD Embedded R-Series SOC Now on COM Express Module

A set of new COM Express Basic modules is being introduced in parallel with AMD's launch of its new generation of high-end embedded processors. The new conga-TR3 modules from congatec, with dual- or quad-core AMD Embedded R-Series SoCs, offer not only a much more broadly scalable TDP (12 to 35 watts) and significantly improved performance per watt over earlier modules, but also two prominent new features: extremely high-performance AMD Radeon graphics and full support of the Heterogeneous System Architecture (HSA) specification 1.0. The new conga-TR3 COM Express Basic modules with Type 6 pinout are equipped with highly integrated AMD Embedded R-Series SoC processors and support up to 32 GB of fast DDR4 RAM with optional ECC. The AMD Radeon GPU is based on AMD's Graphics Core Next (GCN) Generation 3 architecture and drives up to three independent 4k displays at 60 Hz via eDP, DisplayPort 1.2, and HDMI 2.0. OpenGL 4.0 and DirectX 12 are also supported for especially fast, Windows 10-based 3D graphics. The integrated hardware accelerators allow energy-efficient streaming of HEVC videos in both directions. Thanks to HSA 1.0 and OpenCL 2.0 support, workloads can be immediately allocated to the most effective processing unit. In security-critical applications, the AMD Secure Processor provides hardware-accelerated encryption and decryption for RSA, SHA, and AES. Together with the optional Trusted Platform Module, the conga-TR3 thus offers high security for IoT, big data, and telecommunications applications. The new congatec computer modules support the COM Express Type 6 pinout with PEG 3.0 x8, Gigabit Ethernet, 4x USB 3.0/2.0, 4x USB 2.0, SPI, LPC as well as I²C, SDIO, and 2x UART. Operating system support is offered for Linux and Microsoft Windows 10, 8.1, and optionally Windows 7. congatec, San Diego, CA. (858) 457-2600.



New R-Series Processors Deliver Superior Graphics, First DDR4 Memory Support

AMD has announced its new Embedded R-Series SoC processors, which target a range of embedded markets including digital and retail signage, medical imaging, electronic gaming, media storage, and communications and networking. Designed for demanding embedded workloads, the new processors combine AMD's newest 64-bit x86 CPU core ("Excavator") with third-generation Graphics Core Next (GCN) GPU architecture and state-of-the-art power management for reduced energy consumption. The single-chip system-on-chip (SoC) architecture enables simplified, small form factor board and system designs from AMD customers and a number of third-party development platform providers, while delivering strong graphics and multimedia performance, including hardware-accelerated decode of 4K video playback.

Features include a robust suite of peripheral support and interface options, high-end AMD Radeon graphics, the industry's first Heterogeneous System Architecture (HSA) 1.0 certification, and support for the latest DDR4 memory. With the latest-generation AMD Radeon graphics and multimedia technology integrated on-chip, the Embedded R-Series SoC provides enhanced GPU performance and support for High Efficiency Video Coding (HEVC) for full 4K decode and DirectX 12. The new AMD Embedded R-Series SoCs offer 22 percent better GPU performance than the 2nd Generation AMD Embedded R-Series APU and a 58 percent advantage over the Intel Broadwell Core i7 on graphics-intensive benchmarks. Specifications for the integrated AMD Radeon graphics include up to eight compute units and two rendering blocks, with GPU clock speeds up to 800 MHz yielding 819 GFLOPS, along with DirectX 12 support.

HSA is a standardized platform design that unlocks the performance and power efficiency of the GPU as a parallel compute engine. It allows developers to apply the hardware resources in today's SoCs more easily and efficiently, enabling applications to run faster and at lower power across a range of computing platforms. The Embedded R-Series platform incorporates a full HSA implementation that balances performance between the CPU and GPU. Its heterogeneous Unified Memory Architecture (hUMA) reduces latencies and gives both the CPU and GPU full access to system memory to increase performance.

The AMD Embedded R-Series SoC was designed with embedded customers in mind and includes features such as industrial temperature support, dual-channel DDR3 or DDR4 support with ECC (Error Correction Code), Secure Boot, and a broad range of processor options to meet an array of embedded needs. Additionally, configurable thermal design power (cTDP) allows designers to adjust the TDP from 12W to 35W in 1W increments for greater flexibility. The SoC also has a 35 percent smaller footprint than the 2nd Generation AMD Embedded R-Series APU, making it an excellent choice for small form factor applications. The processors support Microsoft Windows 7, Windows Embedded 7 and 8 Standard, Windows 8.1, Windows 10, and AMD's all-open Linux driver, including Mentor Embedded Linux from Mentor Graphics with its Sourcery CodeBench IDE development tools.

Advanced Micro Devices, Sunnyvale, CA. (408) 749-4000.
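The quoted 819 GFLOPS figure follows directly from the published specs under standard GCN assumptions. As a sketch (the 64 shader ALUs per compute unit and the 2 FLOPs per fused multiply-add cycle are GCN conventions, not figures stated in the announcement):

```python
# Back-of-envelope check of the quoted peak-GFLOPS figure.
# Assumptions (GCN conventions, not stated in the article):
# each compute unit packs 64 shader ALUs, and each ALU retires
# one fused multiply-add (counted as 2 FLOPs) per clock cycle.
SHADERS_PER_CU = 64
FLOPS_PER_SHADER_PER_CYCLE = 2  # one FMA = 2 FLOPs

def peak_gflops(compute_units: int, clock_mhz: int) -> float:
    """Theoretical single-precision peak throughput in GFLOPS."""
    shaders = compute_units * SHADERS_PER_CU
    return shaders * FLOPS_PER_SHADER_PER_CYCLE * clock_mhz / 1000.0

# Eight compute units at 800 MHz, as specified for the R-Series SoC:
print(peak_gflops(8, 800))  # 819.2
```

The result, 819.2 GFLOPS, matches the 819 GFLOPS cited for the eight-CU, 800 MHz configuration.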

RTC Magazine NOVEMBER 2015 | 41

ADVERTISER INDEX

GET CONNECTED WITH INTELLIGENT SYSTEMS SOURCE AND PURCHASABLE SOLUTIONS NOW

Intelligent Systems Source is a resource that gives you the power to compare, review and even purchase embedded computing products intelligently. To help you research SBCs, SOMs, COMs, systems, or I/O boards, the Intelligent Systems Source website provides products, articles, and whitepapers from industry-leading manufacturers, and it is connected to the top five distributors. Go to Intelligent Systems Source now to locate, compare, and purchase the right product for your needs.

Company.............................................................Page
Adlink
congatec, Inc.
Dolphin................................................................16
Innovative Integration........................................17
Intelligent Systems Source...................................4
La Men
Middle Canyon....................................................11
One Stop Systems...............................................16
Pentek...................................................................2
Product Gallery..................................................30
Super Micro Computers
WinSystems..........................................................5

RTC (ISSN #1092-1524) magazine is published monthly at 905 Calle Amanecer, Ste. 150, San Clemente, CA 92673. Periodical postage paid at San Clemente and at additional mailing offices. POSTMASTER: Send address changes to The RTC Group, 905 Calle Amanecer, Ste. 150, San Clemente, CA 92673.


Embedded/IoT Solutions Connecting the Intelligent World from Devices to the Cloud Long Life Cycle · High-Efficiency · Compact Form Factor · High Performance · Global Services · IoT

IoT Gateway Solutions

Compact Embedded Server Appliance

Network, Security Appliances

High Performance / IPC Solution



SYS-5018A-FTN4 (Front I/O)

SYS-6018R-TD (Rear I/O)

Cold Storage

4U Top-Loading 60-Bay Server and 90-Bay Dual Expander JBODs

Front and Rear Views SYS-5018A-AR12L

SC946ED (shown) SC846S

• Low Power Intel® Quark™, Intel® Core™ processor family, and High Performance Intel® Xeon® processors
• Standard Form Factor and High Performance Motherboards
• Optimized Short-Depth Industrial Rackmount Platforms
• Energy Efficient Titanium - Gold Level Power Supplies
• Fully Optimized SuperServers Ready to Deploy Solutions
• Remote Management by IPMI or Intel® AMT
• Worldwide Service with Extended Product Life Cycle Support
• Optimized for Embedded Applications

Learn more at

© Super Micro Computer, Inc. Specifications subject to change without notice. Intel, the Intel logo, Intel Core, Intel Quark, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation in the U.S. and/or other countries. All other brands and names are the property of their respective owners.
