Xcell Software Journal issue 1

Page 32

XCELL SOFTWARE JOURNAL: XCELLENCE WITH SDACCEL

for (int y=0; y < height; y++) { int offset = y * width; int prev = offset - width; int next = offset + width;

The compiler in the SDAccel development environment provides three features that help programmers tackle those challenges: automatic extraction of parallelism among statements within a loop and across loop iterations, automatic memory architecture inference based on read and write patterns to arrays, and architectural awareness of the type and quantity of basic logic elements inside a given FPGA device. We can illustrate the importance of these three features with regard to source code for a median filter (Figure 2). The median filter operation is expressed as a series of nested loops with two main sections. The first section fetches data from an array in external memory called input and stores the values into a local array RGB. The second section of the algorithm is the “for” loop around the getMedian function; getMedian is where the computation takes place. By analyzing the code in Figure 2, the SDAccel environment understands that there are no loop-car-

for (int x=0; x < width; x++) { // Get pixels within 3x3 aperture uint rgb[SIZE]; rgb[0] = input[prev + x - 1]; rgb[1] = input[prev + x]; rgb[2] = input[prev + x + 1]; rgb[3] = input[offset + x - 1]; rgb[4] = input[offset + x]; rgb[5] = input[offset + x + 1]; rgb[6] = input[next + x - 1]; rgb[7] = input[next + x]; rgb[8] = input[next + x + 1]; uint result = 0; // Iterate over all color channels or (int channel = 0; channel < 3; f channel++) { result |= getMedian(channel, rgb); }

}

// Store result into memory output[offset + x] = result;

Figure 2 — Median filter code

Figure 1 — Median filter operation

ried dependencies on the array RGB. Each loop iteration has a private RGB copy, which can be stored on different physical resources. The other main characteristic that the SDAccel environment can derive from this code is the independent nature of calls to the getMedian function. 32

The version of the algorithm in Figure 2 executes the getMedian function inside a “for” loop with a fixed bound. Depending on the performance target for the filter and the FPGA selected, the SDAccel environment can either reuse the compute resources across all three channels or allocate more logic to enable spatial parallelism and run all channels at the same time. This decision, in turn, affects how memory storage for the array RGB is implemented. From an application programmer’s perspective, the steps described above are transparent and can be thought of as –O1 to –O3 optimizations in the GNU Compiler Collection (GCC).


Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.