RTC Magazine

Page 19

Sensoary_RTC_May2014_Ad.pdf

1

4/28/2014

TECHNOLOGY CORE

P RO D U C T S P OTLIG HT not be ported to a GPU. With HSA, this is handled transparently.

HSAIL and Workflow Efficiency

HSA exposes the parallel nature of GPUs through the HSA Intermediate Language (HSAIL). HSAIL is translated onto the underlying hardware’s instruction set architecture (ISA). While HSA GPUs are often embedded in powerful graphics engines, the HSAIL language is focused purely on compute and does not expose graphics-specific instructions in the base instruction set (HSAIL extensions may target additional accelerators in the future). The underlying hardware executes the translated ISA without awareness of HSAIL. The smallest unit of execution in HSAIL is called a work-item. A work-item has its own set of registers, can access assorted system-generated values, and can access private (work-item local) memory. Work-items use regular loads and stores to access private memory, which resides in a special private data memory segment. Work-items are organized into cooperating teamsz called work-groups. Workgroups can share data through group memory, again using normal loads and stores. Memory shared across a work-group is identified by address. Each work-item in a work-group has a unique identifier called its local ID. Each work-group executes on a single compute unit, and HSA provides special synchronization primitives for use within a work-group. A work-group is part of a larger group called an n-dimensional range (NDRange). Each work-group in an NDRange has a unique work-group identifier called its global ID, available to any work-item within the NDRange. Workitems within an NDRange can communicate with the CPU through memory, because the address space (excluding group and private data) is shared across all workitems and is coherent with the CPU. Figure 3 shows an NDRange, a work-group and a work-item. The finalizer translates HSAIL into the underlying hardware ISA at runtime. The finalizer also enforces HSA virtual machine semantics as part of the translation to ISA. Because of that, the underly-

ing hardware architecture does not have to strictly adhere to the HSA Virtual Machine. The HSA Virtual Machine can also be implemented on a CPU, by having the finalizer convert HSAIL to the CPU’s ISA. HSAIL has a unified memory view. The virtual address, rather than a special instruction encoding, determines whether a load or store address is private, shared among work-items in a work-group, or globally visible. This can relieve the programmer of much of the burden of memory management. Memory between the CPU and GPU cores is coherent. The HSA Memory Model is based on a relaxed consistency model and is consistent with the memory models defined for C++11, .NET and Java. Because the entire address space of the CPU is available to the GPU, the programmer can handle large data sets without special code to stream data into and out of the GPU. The current state of GPU high-performance computing is not flexible enough for many of today’s computational problems. HSA helps meet this challenge by unifying CPUs and GPUs into a single system with common computing concepts, allowing the developer to solve a greater variety of complex problems more easily. This ultimately empowers software developers to easily innovate and unleash new levels of performance and functionality on modern computing devices, leading to powerful new experiences such as visually rich, intuitive, human-like interactivity. The HSA Foundation (HSAF) was formed as an open industry standards body to unify the computing industry around this common approach. The foundation’s ultimate goal is to provide a unified install-base across all platforms and devices, which will simplify the software development process by enabling software developers to “write once and run everywhere.”

C

M

Y

CM

MY

CY

CMY

K

Advanced Micro Devices Sunnyvale, CA. (408) 749-4000. www.amd.com www.hsafoundation.com

SENSORAY embedded electronics experts Made in USA

Specializing in the development of off-the-shelf and custom OEM embedded electronics for medical, industrial and security applications.

Model 953-ET

Rugged, Industrial Strength PCIe/104 A/V Codec

• 4 NTSC/PAL video input/outputs • 4 stereo audio inputs/outputs • H.264 HP@L3, MPEG-4 ASP, MJPEG MJPEG video; AAC, G.711, PCM audio • Ultra-low latency video preview concurrent w/compressed capture • Full duplex hardware encode/decode • Text overlay, GPIO

www.SENSORAY.com/RT05/953 Model 2253P Ideal for video pipeline inspection, radar processing and video surveillance

A/V H.264 Codec with GPS and Incremental Encoder Interfaces

• Simultaneous encode/decode • Low preview latency; Text overlay • H.264 HP@L3, MPEG-4 ASP, MJPEG video compression • GPS receiver and two incremental encoder interfaces/dual GPIO • Pause/resume capture of a video stream • Encoder counts and GPS data can be overlaid onto video

www.SENSORAY.com/RT05/2253P

2.8” 3.9”

Model 2224 | Small Form Factor

USB H.264 HD-SDI A/V Encoder

• High Profile H.264 HD/SD encoding • Transport Stream Format • HD (1080p/1080i, 720p), SD (NTSC, PAL) • Audio from SDI or external • Blu-ray compatible • HD/SD uncompressed frame capture • Full screen 16-bit color graphics/text overlays with transparency • Low latency uncompressed preview

www.SENSORAY.com/RT05/2224

SENSORAY. com | 503.684.8005 RTC MAGAZINE MAY 2014

19


Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.