

Application Profiling Best Practices
A BZ Media Publication

VOLUME 5 • ISSUE 4 • APRIL 2008

Tinker With Tests to Crank Up Reports • Turn the Tables On Your Bug Lists • A Sound Approach: Trust But Verify

If SOA Were Like Audio, The Tuner Could Be You


TAKE THE HANDCUFFS OFF QUALITY ASSURANCE

Empirix gives you the freedom to test your way. Tired of being held captive by proprietary scripting? Empirix offers a suite of testing solutions that allow you to take your QA initiatives wherever you like. Download our white paper, “Lowering Switching Costs for Load Testing Software,” and let Empirix set you free.

www.empirix.com/freedom




VOLUME 5 • ISSUE 4 • APRIL 2008

Contents

14

A BZ Media Publication

COVER STORY

Fine-Tuning SOA in Search Of Sweet Harmony

Like your home entertainment system, an SOA’s modular apps combine independent services from varied sources. This three-pronged testing approach lets you tune into and eliminate SOA trouble spots. By Hon Wong

20

Whip Those Tests Into Shape

When your tests are up for review by management, you need to find a thoroughbred solution. Learn to build smarter, more modular tests to deliver more meaningful and informative results. By Glenn Stout

Depar t ments

A Bug-List Rx Can Turn The Tables on Testing

26

What if your defect records could reveal more? With a therapeutic analysis of your product defect data, you can uncover gaps and fix your processes before they sicken your system. By Colleen Voelschow

7 • Editorial What’s with all the konked-out kiosks? These untrapped errors are driving me crazy.

8 • Contributors Get to know this month’s experts and the best practices they preach.

Trust, But Verify: Close-Up On Tester Objectivity

31

No matter how thorough your testers are, there’s something missing: objectivity. That’s why independent verification and validation works wonders to make bugs easier to find—and to fix. By Sreenivasa Pisupati

9 • Feedback It’s your chance to tell us where to go.

11 • Out of the Box New products for testers.

36 • Best Practices Application profiling is important—but is it becoming obsolete? By Geoff Koch

38 • Future Test Finally! Here to liberate the masses, it’s the Software Tester’s Bill of Rights. By I.B. Phoolen




Ed Notes VOLUME 5 • ISSUE 4 • APRIL 2008 Editor Edward J. Correia +1-631-421-4158 x100 ecorreia@bzmedia.com

EDITORIAL Editorial Director Alan Zeichick +1-650-359-4763 alan@bzmedia.com

Copy Editor Laurie O’Connell loconnell@bzmedia.com

Contributing Editor Geoff Koch koch.geoff@gmail.com

ART & PRODUCTION Art Director LuAnn T. Palazzo lpalazzo@bzmedia.com SALES & MARKETING Publisher

Ted Bahr +1-631-421-4158 x101 ted@bzmedia.com
Associate Publisher David Karp +1-631-421-4158 x102 dkarp@bzmedia.com
Advertising Traffic Phyllis Oakes +1-631-421-4158 x115 poakes@bzmedia.com
Director of Marketing Marilyn Daly +1-631-421-4158 x118 mdaly@bzmedia.com
List Services Lisa Fiske +1-631-479-2977 lfiske@bzmedia.com
Reprints Lisa Abelson +1-516-379-7097 labelson@bzmedia.com
Accounting Viena Ludewig +1-631-421-4158 x110 vludewig@bzmedia.com

READER SERVICE Director of Circulation

Agnes Vanek +1-631-443-4158 avanek@bzmedia.com

Customer Service/ Subscriptions

+1-847-763-9692 stpmag@halldata.com

Audio equipment on this month’s cover appears courtesy of Pioneer Electronics, USA

President Ted Bahr Executive Vice President Alan Zeichick

BZ Media LLC 7 High Street, Suite 407 Huntington, NY 11743 +1-631-421-4158 fax +1-631-421-4130 www.bzmedia.com info@bzmedia.com

Software Test & Performance (ISSN- #1548-3460) is published monthly by BZ Media LLC, 7 High Street, Suite 407, Huntington, NY, 11743. Periodicals postage paid at Huntington, NY and additional offices. Software Test & Performance is a registered trademark of BZ Media LLC. All contents copyrighted 2008 BZ Media LLC. All rights reserved. The price of a one year subscription is US $49.95, $69.95 in Canada, $99.95 elsewhere. POSTMASTER: Send changes of address to Software Test & Performance, PO Box 2169, Skokie, IL 60076. Software Test & Performance Subscribers Services may be reached at stpmag@halldata.com or by calling 1-847-763-9692.


Caught With Its UI Pants Down
By Edward J. Correia

Some things are just a matter of time. While walking through an airport a while back, I came across a large flat-panel monitor that I believe was supposed to be displaying some sort of advertisement. Instead, the screen was awash in that special azure hue that we in the computer generation have come to know as the blue screen of death.

In a small corner of the screen, dwarfed by the enormity of the cerulean real estate, was a message in white courier that included the name of the operating system and what looked to me like gibberish. I didn't have to understand the cryptic message—I knew exactly what happened there: A kiosk application threw an exception that developers failed to handle and testers failed to reproduce.

Am I the only one who notices? It's surprising how often I see this type of untrapped error. I've seen these "active billboards" fail once or twice in movie theaters, but most times they're in airports, where I suppose the high traffic justifies the use of this expensive medium and its requisite development and maintenance.

Is it just me? Too often, I've walked through airports and other public places and seen a blue-screened error message where a kiosk-style application should have been. For testers and the companies that employ them, this type of embarrassment is simply inexcusable.

Time-to-market pressures are surely a factor for these displays, particularly if they're hawking a cinematic release or other time-sensitive product. And I suppose that very few kiosks would be considered mission-critical. People might even expect a public museum's electronic displays to fail half the time (which I've borne witness to).

But an ATM? That's right, an ATM. I once walked up to an airport money machine to find a blue screen where a Windows NT app was supposed to be. I was incredulous. I thought about writing down the error message and contacting the machine's owner, but I was soon distracted by my wife, who might have been saying "Why do you care? Let's go!" She doesn't understand my fascination.

We've all experienced broken (or empty) ATM machines. Usually they just fail to give out money. I can't recall ever hearing about a rogue ATM giving out free cash. To their credit, ATM developers and testers always get it right when it comes to capital preservation. On the other hand, they have the luxury of a no-brain decision: application downtime always wins over money loss.

Some errors are harder to find than others. If your app's host operating system insists on a dialog when virtual memory is running low, for instance, be sure you're testing for and trapping that. For some general exception faults, a blue screen might be unavoidable. Don't get caught with your pants down. Instead, force a screen that reads, "This space available. Call 1-800-YOUR-AD-HERE!"

CORRECTION
Keith Ellis is vice president of IAG Consulting. His company was misidentified in my Test & QA Report newsletter article on Feb. 12, titled "Bad Requirements Cause Failure. Really?"
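To make that last bit of advice concrete, here is a rough sketch of a last-resort trap for a Java-based kiosk application. It is illustrative only: the kiosks in the column were Windows apps, the class and message here are invented, and a .NET kiosk would use the platform's unhandled-exception hooks instead.

```java
import java.awt.BorderLayout;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingConstants;
import javax.swing.SwingUtilities;

public class KioskFallback {

    // Last-resort trap: any exception nobody caught swaps in a friendly
    // placeholder screen instead of the platform's error display.
    public static void install(JFrame kioskWindow) {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            // Log it somewhere a tester or operator will actually look.
            System.err.println("Untrapped error on " + thread.getName() + ": " + error);
            SwingUtilities.invokeLater(() -> {
                kioskWindow.getContentPane().removeAll();
                JLabel notice = new JLabel(
                        "This space available. Call 1-800-YOUR-AD-HERE!",
                        SwingConstants.CENTER);
                notice.setFont(notice.getFont().deriveFont(36f));
                kioskWindow.getContentPane().add(notice, BorderLayout.CENTER);
                kioskWindow.revalidate();
                kioskWindow.repaint();
            });
        });
    }
}
```

Installed at startup, the handler means an exception that escapes the application's own error handling replaces the display with a placeholder rather than the operating system's error screen.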



Contributors While working for Intel in the 1980s, HON WONG defined the product specifications for the pivotal 386SX and other successful microprocessors. He went on to assist 3Com in development of LAN Manager for OS/2 and later helped IBM with that operating system’s server architecture. During the 1990s, Hon founded several prosperous companies to capitalize on the growing market of so-called WinTel servers. Today, Hon serves as CEO of Symphoniq, the Web performance monitoring and optimization company he founded in 2000. Beginning on page 14, he describes SOA performance tuning techniques using a threephased approach.

DR. GLENN A. STOUT is quality and methods manager in the Internal Business Systems group at Hewitt Associates, an HR business process outsourcing consultancy. His duties include management of a quality assurance team and responsibility for the software development methodology of the group. Glenn currently serves as president of the Chicago Rational User Group. A frequent attendee at the Rational User Conferences, Glenn has presented at the conference nine times, generally in the quality and requirements tracks. Turn to page 20 for his fine tutorial on adapting your existing test cases to garner more meaningful and useful results.

Engaged in quality assurance practices for more than nine years, COLLEEN VOELSCHOW is currently a senior staff consultant at ProtoTest, a software QA consultancy. Colleen conducts QA assessments to evaluate processes and identify areas for improvement for ProtoTest clients, and is frequently contracted by those clients to oversee implementation of her recommendations. Colleen approached Software Test & Performance several months ago with a technique she developed to turn a company’s lists of defects into an inventory of tasks for process improvement. Now we bring those ideas to you, beginning on page 26.

SREENIVASA PISUPATI is assistant VP of testing at AppLabs Technologies, a global IT services company specializing in quality management, testing, and certification solutions. Sreenivasa has been in the computer industry for more than 18 years, including 14 years as a software tester and process consultant. He holds numerous testing certifications, including two from Mercury Interactive. Starting on page 31, learn from Sreenivasa’s experience as he explores how independent verification and validation can help reduce defect creation, increase defect detection, mitigate risk and give greater visibility to the financial, managerial and subject matter aspects of any project. TO CONTACT AN AUTHOR, please send e-mail to feedback@bzmedia.com.



Feedback REMEMBER H-1B! Regarding Edward J. Correia’s “Outsourcing Controls May Be Unnecessary” (Test & QA Report, February 19, 2008), while the numbers in this article are compelling, there’s a big piece of broader context missing from the analysis (on the part of SD Times as well as too many politicians): the number of H-1B visas awarded to foreign IT workers. Do the companies surveyed employ significant numbers of H-1B or green-card holders in IT? How do those employees’ or contractors’ rates compare to citizen FTEs? Are they planning to increase those numbers? It would be interesting to see another version of Table 4; I expect more satisfaction re: the effort and cost savings, and would be interested to see the numbers for quality of work, morale of U.S.based workforce and security concerns. Joe Niski Portland, Oregon

NOMINATIONS OPEN MAY 1 FOR THE 2008 TESTERS CHOICE AWARDS "The Testers Choice Awards recognize excellence in software test and performance tools. The awards encompass the full range of tools designed to improve software quality." Nominations open on May 1 and close on June 13. There is no limit on the number of products that may be nominated by a company, or on the number of categories in which a product may be nominated. There will be a processing fee for each nomination. All nominations must be received by June 13, 2008. VOTING STARTS JULY 1 WATCH YOUR E-MAIL BEGINNING JULY 1 FOR YOUR INVITATION TO VOTE. Online voting opens on July 1 and closes on July 30. Only qualified subscribers to Software Test & Performance may vote.


Winners will be announced at the Software Test & Performance Conference Fall 2008, Sept. 24-26, 2008, in Boston, MA.

AMMO FOR THE GOOD FIGHT Thanks to Edward J. Correia for “Testers Are Idiots” (Test & QA Report, March 4, 2008). Very representative of what I am currently fighting. I will use this as ammo. Linda Reiher Via e-mail

TWENTY YEARS OF FAILURE “Bad Requirements Cause Failure. Really?” (Test & QA Report, February 12, 2008). How true. I have seen over 20 years of this happening. It’s a shame that people who do not know anything about software development continue to make decisions that result in this situation. It seems that all the certification and development models have not achieved anything except extra work and increase the cost of doing business. I know I am generalizing, but a number of developers, even those who have the best credentials, fail to realize what needs to happen and provide the resources needed to accomplish the job. Jim Watson New Albany, Ohio FEEDBACK: Letters should include the writer’s name, city and state. Send your thoughts to feedback@bzmedia.com. Letters become the property of BZ Media and may be edited for space and style.


The awards will appear in the Nov. 2008 issue of Software Test & Performance. Questions? Contact editor Edward J. Correia at ecorreia@bzmedia.com.

AWARD CATEGORIES FROM 2007 (may change for 2008)

Functional Test Solution
Test/QA Management Solution
Load/Performance Test Solution
SOA/Web Services Test Solution
Security Test Solution
Test Automation Solution
Defect/Issue Management Solution
SCM/Build Management Solution
Static/Dynamic Code Analysis Solution
Data Test/Performance Solution
.NET Test/Performance Solution
Java Test/Performance Solution
Integrated Test/Performance Suite
Free Test/Performance Solution
Best Solution From a New Player
Commercial Test/Performance Solution Under $500/seat
Embedded/Mobile Test/Performance Solution

stpmag.com/testerschoice



Version
Sensible price. Superior automated testing.
Test .NET, Delphi, Java
NEW: Record Test Checkpoints
Automated Web, Windows, Desktop Test
NEW: Better, Faster Web Tests
Load Test: Easy, Synchronized, Fast, Distributed, Powerful, Client/Server, Vista
NEW: Test Your bit Apps
NEW: Test Your Web Services
Try TestComplete™ free for days and enter to win an Apple iPod Touch
www.testcomplete.com/stp
test. debug. deliver.


Out of the Box

ReplayDirect Goes Deep, Gets Code Replay Solutions, a five-year-old maker of test tools for the gaming industry, has released ReplayDirector for Java EE, a version of its flagship execution recorder that it claims dives deeper than mouse-clicks and keystrokes to capture the actual code being run, and can replay it line by line. The tool works by identifying “every potential source of non-determinism in an application” and recording it. During playback, recorded data is “fed back to the application, and asynchronous timings are reproduced.” This allows test engineers to “precisely and immediately reproduce an exact sequence of program instructions,” according to the company, including events brought about by network data, user input, interrupts, callbacks and “thousands of other data sources.” Users can pause execution right before a problem occurs and analyze the state of

ReplayDirector, a new execution recorder from gaming industry test-tools maker Replay Solutions, uses compiled binaries and standard debuggers to capture and replay actual code line by line.

the debugger. The company claims that Replay Director works with compiled binaries (yours or those of third parties), requires no changes to source code and causes only minimal performance delays. It works with most standard debuggers, including those in Eclipse, and permits

breakpoints, single-stepping and data inspection. ReplayDirector is free for Tomcat and starts at US$399 for JBoss, WebSphere and WebLogic. Tomcat and JBoss editions were scheduled to begin shipping by the end of March; the others by the end of June.
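For readers unfamiliar with execution recording, the sketch below shows the general record/replay idea for a single source of non-determinism, the system clock. It is a conceptual illustration only and does not reflect ReplayDirector's actual API; a real recorder intercepts thousands of such sources by instrumenting the compiled binaries rather than requiring code changes like this.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Conceptual sketch: record live values on the first run, then feed the
// same values back during replay so execution becomes repeatable.
public class ReplayableClock {
    public enum Mode { RECORD, REPLAY }

    private final Mode mode;
    private final Queue<Long> log;

    public static ReplayableClock recording() {
        return new ReplayableClock(Mode.RECORD, new ArrayDeque<>());
    }

    public static ReplayableClock replaying(Queue<Long> recordedLog) {
        return new ReplayableClock(Mode.REPLAY, recordedLog);
    }

    private ReplayableClock(Mode mode, Queue<Long> log) {
        this.mode = mode;
        this.log = log;
    }

    // The application calls this instead of System.currentTimeMillis().
    public long currentTimeMillis() {
        if (mode == Mode.RECORD) {
            long now = System.currentTimeMillis();
            log.add(now);           // capture for later replay
            return now;
        }
        Long recorded = log.poll(); // hand back the originally observed value
        if (recorded == null) {
            throw new IllegalStateException("replay log exhausted");
        }
        return recorded;
    }

    // Expose the log so a harness can persist it and reload it for replay.
    public Queue<Long> log() {
        return log;
    }
}
```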

Get a Total View With Workbench Manager TotalView Technologies has taken a step toward unifying commercial and open source tools for multi-core debugging. With the February release of Workbench Manager 1.1, the company gives teams a “dashboard-style” GUI for Linux, Mac OS X and Unix that lets testers “view, manage and launch” any of the applications in their tool chain, including the TotalView Debugger and MemoryScape tools. “We wanted to provide an easy and useful mechanism for integrating and managing… tools,” said Kelly Cunningham, vice president of engineering at TotalView. “By facilitating the integration of our debugging tools with commercial and open source tools, [teams] can now access and run the tools they actually use on a daily basis more efficiently and effectively.” The release of Workbench Manager, which is free to existing customers, APRIL 2008

comes just weeks after the introduction of the TotalView Multi-Core Debugging Framework, part of a suite it says is designed to “simplify the complexities of multi-core debugging.” The suite includes an enhanced version of TotalView Debugger for multilanguage and multi-processing protocol apps, and introduced MemoryScape, a new interactive memory profiling tool that it claims can identify and help resolve problems on running applications. MemoryScape, which is now at version 2.2, works with the TotalView Debugger to find memory leaks, heap allocation overwrites and other such problems that can be detected only during execution. The company identifies source code, memory, performance, data-centricity and active Web as the five areas essential to creating and debugging the mul-

ti-threaded, multi-process applications needed for emerging platforms. Now at version 8.4, TotalView Debugger gives Linux, Mac OS X and Unix developers and testers a “single view of the complete application, parallel process acquisition and control; advanced breakpoints and watchpoints,” and offers the ability to test fixes to code in real time, according to company documents. The latest version enhances breakpoint-setting capabilities, and lets testers set C++ breakpoints on all methods of a class or all functions that have the same name but different arguments. Also new is the ability to show class static variable data and to set rules for source code searches. Licensing restrictions also have been loosened, making the tool more accessible to smaller teams and individuals, the company said. Pricing was not disclosed. www.stpmag.com •

11


Network Emulation: The Good, the Bad and the Ugly It’s one thing to emulate a network. But emulating its traffic is another thing entirely. So says network emulation tools company iTrinegy, which in late February began shipping INE Companion, an appliance that it says represents not just the good things about a network, but also its bad and ugly components. “[Testers] recognize the value of using an emulator to provide realistic network conditions and to validate the test,” said iTrinegy product director Frank Puranik. “But we have also noted that there is an increasing need to re-create, in full, the live network conditions and create a background network load.” Addressing that need, INE Companion monitors actual network conditions as they’re being experienced by production applications and packages them into scenarios. Those network traffic scenarios can then be loaded into the iTrinegy appliance and “played back” during testing. “INE Companion is in essence ‘a companion’ for delivering the whole picture of how applications are performing and affecting each other, and transfers that information to the emulator for real-time testing,” he said. In addition, the tool also measures application response time, network general health, application and overall bandwidth utilization, can change packet profiles to “fool” network devices into thinking they are from different applications, and can decode packets for debugging and performance diagnostics, the company said.
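As a minimal illustration of what a network emulator does (this sketch is unrelated to iTrinegy's products, and the ports and delay are invented), a test harness can sit between a client and a real service and inject latency so the application under test experiences degraded network conditions:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Relay traffic to a real service while delaying each chunk of bytes,
// simulating a slow link for whatever client connects to listenPort.
public class LatencyProxy {
    public static void main(String[] args) throws Exception {
        int listenPort = 8080;            // client under test connects here
        String targetHost = "localhost";  // real service being exercised
        int targetPort = 9090;
        long addedLatencyMillis = 200;    // one-way delay to inject

        try (ServerSocket server = new ServerSocket(listenPort)) {
            while (true) {
                Socket client = server.accept();
                Socket upstream = new Socket(targetHost, targetPort);
                pump(client.getInputStream(), upstream.getOutputStream(), addedLatencyMillis);
                pump(upstream.getInputStream(), client.getOutputStream(), addedLatencyMillis);
            }
        }
    }

    // Copy bytes in a background thread, sleeping before each write.
    private static void pump(InputStream in, OutputStream out, long delayMillis) {
        Thread t = new Thread(() -> {
            byte[] buffer = new byte[8192];
            try {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    Thread.sleep(delayMillis);
                    out.write(buffer, 0, read);
                    out.flush();
                }
            } catch (Exception ignored) {
                // connection closed; a real emulator would clean up both sides
            }
        });
        t.setDaemon(true);
        t.start();
    }
}
```

Real emulators go much further, reproducing captured jitter, loss and background load rather than a fixed delay, which is the gap INE Companion aims to fill.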

Excel Software Makes Licensing A Drag (and Drop) Is your team spending time building license-enforcement components for its Mac OS X and Windows apps? If so, you might want to know about QuickLicense 2.2 from Excel Software. The tool can wrap an executable with a variety of licens-

12

• Software Test & Performance

es simply by dropping it onto an icon. A few seconds later, a license-protected application appears with the original app’s icon—without coding of any kind. The US$495 tool for Windows or Mac OS X ($795 for both platforms), supports time- or execution-limited trial licensing, machine- or name-activated licenses and software subscriptions. The Windows version supports Windows 98 through Vista; the Mac OS X version supports Universal Binaries for Intel and PPC processors. The licensing fees include royalty-free distribution of the QuickLicense runtime. According to the company, Quick License works in one of two ways: The runtime and encrypted ticket files, which hold the license information, can be deployed with the application. On startup, the application calls the runtime to validate the license. Alternatively, the runtime and ticket are embedded into the desktop application file. This flexibility permits the licensing of plugins, spreadsheets, multimedia files and the like.

Latest Chip-Testing Tools Are Super, Novas Says Novas Software, which makes debugging tools for ICs and SoCs, in February announced updates for its Verdi Automated Debug and Siloti Visibility Enhancement solutions. The company was scheduled to begin shipping the new tools, which both now support SystemVerilog verification methodologies, by the end of March. According to company documents, Verdi Automated Debug now brings “assertion dumping with on-demand evaluation of property, sequence and local variable data, function evaluation with computation of return values for annotation and tracing, and incremental behavior analysis at the module level,” which it claims improves performance and simplifies setup for automatic tracing using Verdi’s temporal flow view. New features in Siloti Visibility Enhancement include essential signal analysis (ESA) modes, which the com-

pany claims can simplify optimization of the essential signal list for debugging tasks at hand. “One-time ESA creates [database oriented] single ES lists for the whole design from which users can select signals to dump by scope/level at simulation runtime.” Siloti now supports command-line operation for ESA integration with a user’s own tools and for quickly starting “debug-ready mode” for Siloti data expansion and Verdi debugging. The tool also now can automatically determine the time window for data regeneration based on user activity. Both are available now.

TrueView Does End To-End SOA Testing TrueView for SOA gives testers browserbased visibility into Java EE and .NET applications from the client side through to the back-end server and everywhere in between. That’s the claim of Symphoniq, which makes performance-monitoring tools. It released TrueView for SOA in mid February. TrueView for SOA provides end-toend visibility with the use of tags to track transactions from start to finish across all tiers of an architecture, be it internal or external, homegrown or that of a third party. According to Symphoniq cofounder and CEO Hon Wong, TrueView for SOA allows organizations to optimize application performance “by monitoring real users and real transactions in real time,” allowing them to “detect and isolate the exact service causing the problem.” Capabilities include time-based measurement of the user experience regardless of the number of services they participate in, visibility into which services and machines have federated together to provide functionality and service response, detailed drill-down information including problem code, and support across heterogeneous (Java and .NET) applications and environments. Send product announcements to stpnews@bzmedia.com APRIL 2008


Show Report

‘Testers Are Idiots’ That’s right: Testers are idiots. The practice of testing offers no innovation. Testing is boring, manual and repetitive. It’s not a career. Testers aren’t as smart as developers. They’re nit-picky, pencilpushing quality/process geeks. They’re beside the point and are easily replaced. Testing is not a career; it’s a necessary evil between application users and the brilliance of developers. Believe it or not, some of these assertions came from an audience of testers at FutureTest, a conference I attended Feb. 26 and 27 in New York City. The politically (and in all other ways) incorrect answers were the result of a question by Cisco’s Jeff Feldstein: “What are developer’s perceptions of test engineering?” Feldstein, who manages a team of 40 software engineers across the U.S, India and Israel, presented a fine talk on how to attract, recruit and retain the most highly talented test engineers—whom he believes have the same skills as development engineers. “But everyone we recruit for testing positions only wants to develop code” was a common audience complaint. But the truth, Feldstein pointed out, is that testers often do a fair amount of coding as the development teams do. For example, testers often build their own test utilities, harnesses and scripts. “If I feel I need something, all I have to do is convince my boss of the reasons, and he says, ‘OK, go off and do it.’ And I can build it any way I like,” Feldstein said. And since test teams are smaller than their development counterparts, testers often are able to see and work on more of the main application being developed, if not the entire thing. Developers—particularly those on large teams—sometimes see only a small part of the project. Feldstein’s presentation, “Software Testing Is About Software Testers,” was a treasure trove of knowledge amassed from his 27-year career about the ways and means of retaining a good team of test engineers. For example, testers and developers should receive equal pay, have comparable career paths and equal say APRIL 2008

with other company organizations about product decisions. "Maybe not as far as the ship/don't ship decisions, but they should have a say in strategy and product decisions. When I was a tester in a small shop, everything [that went wrong] was my fault," Feldstein said. Meanwhile, the test group should not be the only group in charge of assuring

quality, he said, nor should it be treated as a service organization. Testers should be collocated with developers and be viewed by upper management as developer peers, but remain independent of the development group.

The Low-Hanging Fruit Notable at the conference was a panel discussion called “Testing in the Complete Application Life Cycle.” Moderating the discussion was BZ Media EVP and FutureTest conference chair Alan Zeichick. Following a discussion about the need to move beyond the practice of picking any two of “on time, on budget and high quality,” Zeichick asked, “What do you see test organizations doing to improve… what are the lowhanging fruit?” First to answer was technology analyst Theresa Lanowitz, who said that test organizations need to focus more on customer advocacy and less on becoming a police state. “It could be as simple as changing the department name from QA or testing department to the product validation or product verification depart-

ment,” she said. While such a change might seem superficial and simplistic, she continued, the perception of the department’s function over time will “slowly change to one of core business value.” Next to answer was Mark Sarbiewski, who handles design and implementation of Hewlett-Packard’s Quality Center, Performance Center and Application Security Center products. He suggested striving for more “effective testing without automation by application users.” Another tip is the use of metrics and financial incentives. “Put bonuses on the

From left, Cisco’s manager of software development Jeff Feldstein, Empirix GM Larry Timm, RBCS founder Rex Black and BZ Media EVP Alan Zeichick discuss next-generation tools for software testing and QA.

line tied to bugs in production versus bugs found before deployment," he recommended, adding, "You'll see an overnight change." The final word on this topic—along with some comic relief—came from CollabNet CTO Jack Repenning. His suggestion to "change the culture of the organization around you" drew laughs from the crowd, perhaps because such shifts are never easy. "QA is not a filter," he said, and suggested that test groups "evaluate your perception in the eyes of the company" and emphasize the value of testing. For tutelage on specifically how, refer to page 32 of the September 2007 issue of this magazine. If you missed FutureTest (www.futuretest.net) this year, take heart. This unique conference for test managers will take place again next year. Be sure to set aside February 24 and 25, 2009, on your calendar.



By Hon Wong

The modular applications within an SOA have been likened to home entertainment systems

Photograph by Andrzej Burak

perhaps because they too are a collection of services independent of each other. And as in both, it’s imperative that the separate systems work in harmony to keep business humming along. So how best to create this harmony? It can be done using an SOA technology framework, which enables the rapid implementation of business applications with reusable services or functions. SOA provides a standard methodology for finding and consuming these prefabricated services without regard to the services’ underlying technology, computing platform, algorithm or data schema. In other words, an SOA-based application can be developed by orchestrating services based on a defined business process or workflow. By enabling developers to reuse—instead of reinvent— software functions, development time and costs can be saved. Moreover, the functionality of the application can be driven directly from a high-level description of the business process, making it easier to modify the application in response to changing business needs. To ensure that the resultant SOA-based application can function in a production setting and meet service-level expectations, developers can’t simply start mashing together the services willy-nilly. Building a successful SOA-based application requires as much careful planning and process as designing and launching traditional monolithic applications under Web, client-server or mainframe computing architectures.
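A minimal sketch of that orchestration idea, with invented service names: the composite application expresses the business process as a workflow over existing services rather than re-implementing their logic.

```java
// Illustrative only; the services are black boxes that could live on any
// platform, behind any protocol, supplied in-house or by a third party.
interface InventoryService {
    boolean reserve(String sku, int quantity);
}

interface BillingService {
    String charge(String customerId, double amount);
}

public class OrderProcess {
    private final InventoryService inventory;
    private final BillingService billing;

    public OrderProcess(InventoryService inventory, BillingService billing) {
        this.inventory = inventory;
        this.billing = billing;
    }

    // The workflow below is the "business process"; changing the process
    // means re-wiring calls, not rewriting the underlying services.
    public String placeOrder(String customerId, String sku, int qty, double amount) {
        if (!inventory.reserve(sku, qty)) {
            return "REJECTED: out of stock";
        }
        String receipt = billing.charge(customerId, amount);
        return "CONFIRMED: " + receipt;
    }
}
```

Swapping in a different billing provider changes only the wiring, not the process, which is where the reuse and agility benefits come from.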

Hon Wong designed the 386SX processor. He is founder and CEO of Symphoniq, which makes Web-app monitoring tools.
The Process-Oriented Approach
Proven tools and technologies exist to facilitate the effective use of loosely coupled and interoperable services to implement composite applications. But ensuring the overall performance


of the final result is another matter entirely. Just as a chain is only as strong as its weakest link, the performance of an SOA application is limited to the service level achievable by the worst of its connected services. It’s virtually impossible for IT to characterize the performance of constituent SOA services or control the numerous moving parts that can affect application delivery and performance when services being used can be supplied by third-party vendors and can run on several different computing platforms. To effectively deliver complex composite applications based on SOA, IT must be processfocused, not technology-focused. To be process-focused, IT must concentrate on delivering repeatable, scalable and end-to-end functions such as services, support, security management and application management. IT can’t operate along technologyor platform-specific silos as in desktop manAPRIL 2008


agement, server administration, database management, network administration and Windows development practices. Taking a holistic view allows IT to better address the needs of consumers of IT services, and more importantly, allows such services to be delivered in a more cost-effective manner. Cost effectiveness is achieved by focusing on business needs or issues faced by users of applications, not server uptime or network bandwidth. IT spending priorities shouldn’t be centered on technology-oriented silos that form impenetrable towers of Babel. The goal of SOA is to effectively deliver business agility. This extends beyond the efficient implementation of business applications and includes the effective deployment and

APRIL 2008

production management of these applications. With so much riding on the success of SOA initiatives, funding for this crucial capability should be part of the corporation’s SOA—and not a discretionary spending item. SOA-based applications, especially those delivered via the Web, are complex by nature. A three-step approach to performance management—known as detect, isolate and optimize—can overcome SOA’s complexity so that its benefits aren’t neutralized by user dissatisfaction, lost business and ineffective IT. Detect. “You cannot manage what you cannot measure.” That adage holds true for SOA apps. The first step in SOA management is to find a quantitative way to determine whether the SOA application meets service-level requirement. In other words, “Is the right application response (data, page, action etc.) delivered to the right user in the right amount of time?” There are numerous QA techniques to ensure that the right application response is delivered, and most organizations have the necessary security to ensure that the right person is receiving the information. But ensuring that the information is delivered at the right time to the end user through the



JUICING THE SOA

FIG. 1: THE SOA ALBUM. The application life cycle runs from Requirements, Design and Build through Deploy and Operate to Optimize. In the development phase, discover and fix performance bottlenecks under load prior to rollout; in the production phase, detect and mitigate performance issues in real time.

complex, Web-based SOA infrastructure is another matter. Having the tools to nonintrusively monitor application performance experienced by real users is an absolute necessity. It’s the only way to accurately detect problems experienced by real users of SOA applications for servicelevel restoration and reporting, and it’s a key driver for making process or application response time improvements. The starting point of such monitoring is the end user’s browser, where the application truly “comes together.” It is at the browser that IT can take into account last-mile circumstances and identify whether an incident has occurred that will affect user satisfaction. Data collected by legacy tools that focus on monitoring a particular technology silo—like network routers, Apache Web servers, WebSphere application servers, .NET frameworks, etc.—can’t be extrapolated to determine what users of complex SOA applications are experiencing in the browser. Isolate. Once application performance as experienced by the end user is known, it has to be correlated with the performance profile of all the infrastructure and application components involved in the delivery of the SOAbased applications. Composite applications: • Are made up of services that are “black boxes” whose performance can’t be controlled or tuned by those orchestrating the application. • Run on physical or virtual infrastructure components that aren’t entirely within the control of IT operations.

16

• Software Test & Performance

• May have different parts of a transaction served by different data centers or servers, including third-party service providers. Therefore, it’s important that each

and dynamically trace it through the entire infrastructure, logging appropriate performance data at each tier. Such an end-to-end view of performance based on the real user’s experience of “consuming” real transactions offers the bird’s-eye view needed to pinpoint the incidents, errors, bugs or bottlenecks that impact end-user response time. Optimize. A holistic, browser-todatabase view of transaction performance provides actionable information so that ad hoc or trial-and-error approaches are no longer needed to identify and respond to performance problems. Without actionable information, IT incident response teams will likely spend more time debating the cause and attempting to re-create the problem than they will implementing a fix and restoring the business function. By analyzing the same correlated

FIG. 2: SOA ORCHESTRATION

Service Library

Business Process

transaction’s performance is reported and correlated across all infrastructure tiers, third-party data centers and application components. Performance correlation can be achieved by painstaking log file analysis and heuristics to match up IP addresses and request times across various tiers, but this methodology is error-prone and difficult if access to all of the logging information is available, and made impossible if the transaction touches a tier outside of the datacenter where log files are unattainable. Another, simpler, mechanism is to nonintrusively tag each transaction originating at the end-user browser

Orchestrate services from service library

transaction performance information over time, IT can also identify leading indicators of performance concerns so they can be monitored and proactively resolved before an incident impacts user satisfaction or business productivity. Furthermore, the information also helps to identify areas for performance improvement within the infrastructure, services and application. Let’s look at how this approach can be used to improve the chance of success in a SOA deployment.
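To make the tagging mechanism concrete, here is a minimal sketch using the standard Java Servlet API. The header name, log format and filter are assumptions for illustration, not Symphoniq's implementation: each tier reuses the same transaction ID and records its own timing, and correlating those records is what yields the browser-to-database view described above.

```java
import java.io.IOException;
import java.util.UUID;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TransactionTagFilter implements Filter {

    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) res;

        // Reuse the tag minted at an earlier tier, or mint one at the edge.
        String txId = request.getHeader("X-Transaction-Id");
        if (txId == null || txId.length() == 0) {
            txId = UUID.randomUUID().toString();
        }
        response.setHeader("X-Transaction-Id", txId);

        long start = System.nanoTime();
        try {
            chain.doFilter(req, res);
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            // One timing record per tier, keyed by the same tag; correlating
            // these records across tiers gives the end-to-end picture.
            System.out.println("tier=app tx=" + txId
                    + " uri=" + request.getRequestURI()
                    + " ms=" + elapsedMs);
        }
    }

    public void destroy() { }
}
```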

Example SOA Deployment With the growing complexity of Web applications, having a process for con-


JUICING THE SOA

tinued performance improvement and problem avoidance is critical. Yes, performance problems will occur and are sometimes unavoidable due to situations beyond the control of developers or operations personnel. The key is to “bake in” performance through a culture of cooperation where developers and operations work together so that performance problems can either be resolved proactively or detected and resolved quickly before they impact user satisfaction. Figure 1 is a high-level depiction of the application life cycle from development to deployment. SOA greatly shrinks the time and resources needed to accomplish the requirements-designbuild phases of application development by mapping business process to existing or third-party services without regard to underlying technology, algorithm or schema of the component services (Figure 2). Here, services are orchestrated based on business process. To ensure success, the detect-isolate-optimize approach depicted in Figure 2 should be used during the deployment phase and carry forward into operational management. It’s too risky to release an SOA application into production without this additional process step because of complexity, unknown performance issues of constituent services and lack of control over third-party infrastructure service providers. Also, testing times are often compressed due to timeto-market pressures. After the SOA application is built, it should be deployed onto a staging or production infrastructure (as in Figure 3). Using the detect-isolate-optimize approach during the deployment phase allows developers to trace all transactions initiated by load generator and beta testers from browser to database. This can show the development team where slowdowns and errors are occurring, and allows them to correlate problems and bottlenecks to

FIG. 3: SOA APP DEPLOYED ON STAGING/PRODUCTION INTERFACE

Load Generator

App

DB

Beta Users Web

App

DB

Third-Party Data Center

faulty or conflicting application and/or infrastructure components anywhere in the system, including services delivered by third parties. The information gathered from tracking pre-production transactions also provides a road map of “low-hanging fruit” for performance tuning by identifying and ranking heavily used and/or slow software components. Because developers don’t have insight into the architecture and code of constituent services (especially those of third parties), they need this level of actionable information to direct them to opportunities for performance improvement. No matter how hard developers try to guess what will happen when users get their hands on a new application, IT is usually surprised at what really happens when a complex SOA application meets the real world. The detect-isolate-optimize approach during the deployment phase allows the developers and operations teams to jointly establish a baseline performance measurement for the SOA application, and avoid surprises when the SOA application is rolled out to a broad range of real users in production. The process of ensuring SOA application performance can be extended

IT is usually surprised at what really happens when a complex SOA application meets the real world.

APRIL 2008

Web

from deployment to 24/7 production management. Ideally, the same browser-to-database monitoring and diagnosis tool used in deployment can be “left behind” for use in production, offering a common source of actionable information bridging the needs of development and operations teams. When end-user slowdowns or errors are detected in production, the operations team can isolate the cause of the problem using data collected by tagging and tracing the ill-performing transaction from end to end, and then call in the right specialist to tackle the right problem. Figure 4 shows a sample workflow diagram of the problem resolution process. The specific information provided about the cause of slowdowns keeps guesswork and triage time to a minimum to reduce the cost of downtime in terms of time, money and frustration. This ability for everyone to share and analyze common, actionable data eliminates “apples to oranges” comparisons of critical performance data by the development and operations teams, and results in a truly repeatable and scalable SOA adoption and deployment.

Practical Implementation An effective SOA management process requires automated monitoring tools that have the following characteristics: Monitoring instrumentation should be deployed non-intrusively, without impacting the end user or the internal design of the services. As most SOA applications are deployed using Web services and deployed on the Web, it’s difficult and costly to convince a large www.stpmag.com •

17


JUICING THE SOA

number of Web users to download agents or an applet to monitor performance at the browser. In addition, it’s impractical to attempt to build monitoring functions directly into the services, as they might be existing software modules or be developed by third-party service providers. The practical approach is to provide a mechanism to dynamically inject instrumentation—via the Web server or application delivery controller (ADC)—onto the Web page as the end user accesses the Web application. The tools should collect only actionable information without placing an undue load on the throughput or capacity of the infrastructure. The tools should be able to collect performance data from a variety of platforms. Since the services used might involve a mixture of technologies, developers shouldn’t have to pay attention to the underlying technology (Java vs. .NET) of the services. However, tools used to monitor performance and drill down to the method call and SQL query level can be (and often are) dependent on APIs or methods unique to J2EE or .NET. The tools should be intuitive so that the cause of problems can be diagnosed in just a few clicks. The tools also should support a flexible workflow that can meet the varied needs of

developers and operations management teams. Moreover, the data collected should be stored in a form that allows ad hoc reporting to support problem diagnosis, performance optimization, capacity planning and business reporting.

The Unintended Consequences Of Reusable Service Modules For complex SOA applications, developers are constantly called upon to deal with production problems,

FIG. 4: SOA MASTERING Detect Problem Based on End User Response Time

Assess Impact Prioritize Issues Outside

Outside or Inside?

Front or Back End?

Client or Network? Network

Client Identify Individual User

Inside

Identify Individual IP

Front End Which Page, Object, Server?

Back End Which Object and Server? Trace Call Stack

Solve the Problem

18

• Software Test & Performance

Method Call or SQL Query?

whether to patch a code-level problem that impacts performance or an infrastructure issue that requires a workaround. Because of their knowledge of the application, they are also expected to serve on triage teams attempting to re-create or diagnose potential or real performance problems. These activities can impact the development schedule and developer productivity. The automated and continuous monitoring and diagnosis of transactional problems from the user perspective in a production setting performs three important functions to mitigate the impact of the added complexity brought on by SOA adoption: • Provides a common, relevant set of actionable data such that developers and operations personnel can collaborate to quickly pinpoint the cause of performance incidents, whether inside or outside the firewall. The teams can work on inside problems to pinpoint the tier of the application infrastructure, server, method call, SQL query or combination thereof caused the performance problem. The benefit of this approach is to eliminate, in a majority of cases, the need to organize cross-discipline triage teams to debate, reproduce and diagnose problems. The bottomline effect is improved IT efficiency and quicker time to problem resolution. • Creates a continuous feedback APRIL 2008


JUICING THE SOA

loop whereby developers can gain insight into how the application’s performance is being impacted by actual usage, features and infrastructure issues. For example, if a feature turns out to be a major consumer of computational resources such that it impacts the service level of more critical features, then development might consider allocating resources to re-engineer this problematic feature. Having this information gives developers the foresight to proactively make modifications to the application or constituent service components so that performance issues aren’t just addressed in hindsight. • Assists in regression testing based on real user traffic after code-level and/or infrastructural changes. To facilitate the cooperation between the development and operations teams in matters of SOA application performance, there has to be a common platform for the sharing of performance information (metadata) that is relevant to both teams, and a defined process for acting on the information. In a way, this is similar to replacing the traditional “Chinese Wall” that separates development from operations with a glass wall. This metaphorical wall—actually a predefined and enforceable set of business policies— is important so that developers can’t arbitrarily modify released code or the underlying database or infrastructure running the code without following proper release, change and configuration management protocols. Instead of being opaque, the wall is transparent to allow informational visibility between the two functional groups. Development and operations teams use different tools. Data collected or generated by development tools is not useful to system management or DBA tools used by operations personnel and vice versa. This Tower of Babel situation, if not remedied through the use of a common tool and metadata, makes the implementation of the SOA impractical.

ROI Considerations Performance issues can easily eliminate the expected ROI from SOA initiatives. Slow performance can lead to excess user complaints as well as having a negative productivity impact on IT staff. For most organizations, the business APRIL 2008

impact of customers receiving sub-par application performance is significant, an impact not limited to e-commerce merchants. For example, while frustrated e-banking customers can’t click away to a competitive bank’s Web site to pay bills, performance issues will increase help-desk or customer support costs, and eventually damage brand value, driving customers

T

costs of SOA should be considered and minimized prior to embarking on a SOA initiative. Without adopting a real-user, endto-end approach, “baking in” performance during deployment and managing performance systematically in production, any expected application benefit can be wiped out by the added complexity of SOA.

HE SOA-TO-NOISE RATIO No list of “likely suspects” can be exhaustive enough to predict what can go wrong and impact performance in such a complex environment as the SOA. Some of the challenges listed here were identified by a survey of 333 U.S. IT decision-makers conducted by the global IT consultancy Ovum Summit. The survey, which found that 27 percent of large enterprises and 17 percent of medium-sized companies have deployed an SOA in some areas of their IT infrastructure, also showed that it’s difficult to: • Quantitatively determine if the service level exceeds end-user expectation or, at a minimum, meets a service-level agreement (SLA). • Quickly determine the existence and source of service delivery problems among the myriad of “moving parts” within the delivery mechanism including the client PC, Web cloud, data center, composite services, third-party provider service or infrastructure. However, satisfaction with the results is patchy at best, with almost one in five of the people surveyed indicating that adopting SOAs had created unexpected complexity. The apparent problem is that traditional IT management processes and tools aren’t always up to the task of monitoring and managing SOA applications, and that SOA deployments require as much support and investment in infrastructure management as they do in development and testing tools. Fortunately, the Ovum study also found a high correlation between a business’s level of satisfaction with SOA and its commitment to managing IT as a set of services in accordance with best practice approaches. In other words, the more your company puts into its SOA, the more successful and satisfied its users will be. Source: www.ovum.com/go/content/c,377,66329

to a competing bank. Lack of application performance will seriously hamper an organization’s ability to generate revenue via the Web, or bring about greater operating costs through the migration of customers away from Web-based self service. Even for internal or non-customer-facing applications, performance issues will lead to productivity loss among employees or partners. Beyond the cost to the business, performance challenges also impact the development cost of SOA applications. The cost to the organization could be in the millions of dollars if a good portion of the developers’ time is spent fixing production problems. All of these potential downside

By using the detect-isolate-optimize approach to monitor load-testing and beta transactions, developers can identify potential bottlenecks in services and infrastructure prior to releasing the SOA application to production. Once in production, the operations team can use the same approach to detect end-user performance issues to quickly pinpoint the cause of the problem, minimizing the time to resolution. Using a common monitoring approach for deployment and production management allows the development and operations teams to share common actionable information to improve SOA application service levels. ý www.stpmag.com •

19


Whip Your Tests Into Shape For Winning Results By Glenn Stout

For many reading this, it's not hard to imagine being at a big project status meeting. Sitting across the table from you is your boss.

Next to him is his boss, followed by the big boss. You are the QA manager, and all eyes are now on you to tell them how the application project is doing. In the recent testing period, your team completed 50 tests. Of those, 25 failed and 25 passed. You report that statistic to the group. And for a moment, no one says a word. Breaking the awkward silence is the big boss, who asks: “So 50 percent of the tests passed. What does that mean—that the project is only 50 percent complete?” You mumble something about the defect reports and offer to report back to him later. Certainly not the best outcome. What follows is a way to help you avoid such situations and be better prepared for meetings like this. You’ll learn a few simple test methods that would have allowed you to tell Mr. Big—after about the same amount of testing work based on the same requirements—that regression and error-trapping are meeting requirements, but that security and a few high-priority functions will need further development and testing. Glenn Stout is quality and methods manager at business process consultancy Hewitt Associates and president of the Chicago Rational User Group.


Using a few specific test methods to create and manage your tests, you’ll have an approach that will help you build smarter, more modular tests that deliver more meaningful and informative results.

Pluses and Minuses We’ll start with the two basic categories of testing: positive and negative. I think of these as two high-level test methods. Think of our original 50 tests, half of which passed. Now let’s use positive and negative test methods and direct the tester to create 100 tests instead of 50. There would be 50 positive tests and 50 negative tests. We would use the same requirements, resulting in the same number of steps. The objective of each test is now to give a “positive” or “negative” result. It’s probably safe to assume that 25 defects resulted from our original example. So we can also assume that we’ll find those same 25 defects again. This time, however, we found them after executing our negative tests. Now, the status report would look like this: • 100 tests total • 50 positive tests passed


• 25 negative tests passed
• 25 negative tests failed
While still not very detailed, this information does add more value. It allows us to give the big boss the assurance that the system works, but in some cases (where negative data is entered), there are some issues. Based on this new info alone, you're helping management make better decisions about the project. But it gets better.
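As a small illustration of how one requirement yields both a positive and a negative test (JUnit 5 and the Discount class are assumptions for the example, not something from the article):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;

// Hypothetical requirement: valid order totals get a 10 percent discount;
// invalid (negative) totals must be rejected.
class Discount {
    static double apply(double orderTotal) {
        if (orderTotal < 0) {
            throw new IllegalArgumentException("order total cannot be negative");
        }
        return orderTotal * 0.9;
    }
}

class DiscountCalculationTest {

    @Test
    @Tag("positive")   // positive basic functionality, positive data
    void validOrderTotalGetsTenPercentDiscount() {
        assertEquals(90.0, Discount.apply(100.0), 0.001);
    }

    @Test
    @Tag("negative")   // validation/edit behavior with bad data
    void negativeOrderTotalIsRejected() {
        assertThrows(IllegalArgumentException.class, () -> Discount.apply(-5.0));
    }
}
```

Tagging the tests this way lets the run report say not just how many tests failed, but which kind failed.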

Customized Testing Now, let’s extend the approach. We can add tests specifically for security, or perform a deliberate review of the “look and feel” of the application, in accordance with the company’s updated marketing look, for example. Or perhaps your company’s business cycle includes an event- or time-based situation, such as the closing of the books at the end of the month. The requirements that drive these tests are there, so it’s just a matter of repackaging the testing approach to give you an “up or down vote.” We can also change “positive” test methods to be “positive functionality” tests, which are a general method that tests basic application functionality in a positive way, and with positive data. Here’s more on that. Using the same requirements, and creating our tests based on “test methods,” we may come up with the following methods to test with: • Positive basic functionality • Security • Look and feel • Negative (or “validation/edit”) • Business cycle

At this point in our example, it doesn’t matter how many tests we have in each category. In fact, they may not even map directly to our original 50 tests. There may be more or less, depending on the method. So, let’s use percentages of this run for pass/fail (see Table 1). Now when you’re back in the big meeting with this new information, the big boss might want to focus on security tests, the business cycle or something else. The point is that now, you have something to discuss. The number of overall tests that passed or failed is less important than which tests passed or failed. By creating shorter, smarter tests based on simple test methods and outcomes, you get more information from those tests. As an added benefit of this modular approach, you can expect to create tests that are easier to: • Define • Estimate • Prioritize • Automate They also can help you: • Provide better guidance to testers • Identify requirements gaps • Perform requirements traceability • Categorize defects • Provide more meaningful metrics Additionally, since many test teams are now offshore, this prescriptive approach allows for more predictable results. In most cases, you’ll find that you’ll “mandate” a certain set of methods for all applications, and then add to it when the situation warrants it. This sets a

With a Few Techniques, You'll Have the Inside Track On Where Your Testing Stands and How Close The App Is to the Finish Line



WHIP THOSE TESTS

TABLE 1: PERCENTAGES
Test Method | Pass | Fail
Positive basic functionality | 100% | 0%
Security | 50% | 50%
Look and feel | 75% | 25%
Negative (or "validation/edit") | 80% | 20%
Business cycle | 0% | 100%

repeatable, predictable baseline of testing that you can count on.
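As an illustration only (none of this code appears in the article), a per-method pass/fail roll-up like Table 1 can be produced from raw results with a few lines of Python; the result tuples here are invented sample data.

from collections import defaultdict

# (test method, passed?) pairs as they might come out of a test run
results = [
    ("Positive basic functionality", True),
    ("Security", True),
    ("Security", False),
    ("Look and feel", True),
    ("Business cycle", False),
]

totals = defaultdict(lambda: {"pass": 0, "fail": 0})
for method, passed in results:
    totals[method]["pass" if passed else "fail"] += 1

print(f"{'Test Method':<30} {'Pass':>6} {'Fail':>6}")
for method, counts in totals.items():
    run = counts["pass"] + counts["fail"]
    print(f"{method:<30} {counts['pass'] / run:>6.0%} {counts['fail'] / run:>6.0%}")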

Out of Phase
By now you might be wondering which testing phase this approach applies to. The answer is all of them. If every testing shop had the same testing phases (most do not), it wouldn't be something we'd need to discuss. But since teams use different test phases, it warrants a mention. This approach works regardless of what you name your test phases. Whether your team uses unit, integration, system, user acceptance or whatever, it applies. This becomes clearer when we cover how to frame up the various methods in accordance with our plans.

Method
To get to the next level, you need to create a minimum of three artifacts. You probably have two of them already, and perhaps part of the third. They are the:
• Project test plan template
• Test case template
• Test methods reference document
The test plan template is a document that many test teams modify to plan their testing for each project. The overall approach for a project, scope, environmental needs, etc. are generally parts of this document. A new version is created for each project. The test case template is the template that the tester would use to create individual tests. Both of these templates probably exist in your organization in one form or another.
Finally, there's the test methods document, sometimes called the "best practices" document, which is absent from many organizations. In this context, a test method is a specific description of a deliberate approach to testing particular functionality. It specifically addresses certain risks. It isn't strictly based on the requirements, or how they're delivered. The IBM Rational Unified Process calls these "test techniques" and puts them in what is known as a "test strategy" document. While many of these so-called test techniques/test methods might already be common practices, the benefit of using this approach is that you can incorporate industry-standard methods with those methods specific to your own company. The framework in Table 2 borrows heavily from the IBM Rational Unified Process, as methods are described. All of your test methods would be included in this document, from the seemingly basic to any special tests that might be necessary for your homegrown application. The point is, the test methods are yours: You can create as many or as few of them as you need. Many organizations already do this one way or another. This process simply provides a different lens pointed at the same materials.


TABLE 2: RUP LIGHT
Method Descriptor: Details
High-level description: This is a three- or four-sentence summary of what the test method aims to accomplish, and how it would be done.
Test method objective: The objective of the test. For example, with the Security Test Method, the objective would be that the AUT security functions are working properly. What particular risks does it mitigate?
Technique: For each method, a special technique may need to be employed. For example, for a test method of database failover, there would be a very specific technique that would need to be employed. This is generally the step-by-step approach.
Success criteria: In most cases, this is characterized as simply "All X tests are run and pass," but, based on the method, could be very specifically stated.
Special considerations: For each method, there may be a special consideration. For example, if "Security Penetration Test" was the described method, a special consideration may be to hire a particular security consulting firm to perform it.
When to use: You may prescribe particular methods regardless of the functionality or phase. In other cases, advice on when a particular method is used would be located in this section.
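If you keep the test methods document under version control, each descriptor from Table 2 maps naturally onto a small record. The sketch below is one possible shape, not a prescribed format; the field names simply mirror the table, and the sample values are invented.

from dataclasses import dataclass

@dataclass
class TestMethod:
    name: str
    high_level_description: str
    objective: str            # what risk the method mitigates
    technique: str            # the step-by-step approach
    success_criteria: str
    special_considerations: str = ""
    when_to_use: str = "Prescribed for all functional areas unless noted."

security_method = TestMethod(
    name="Security",
    high_level_description="Verify application- and system-level access controls.",
    objective="Confirm the AUT's security functions are working properly.",
    technique="Exercise each user type against the functions and data it may access.",
    success_criteria="All security tests are run and pass.",
    special_considerations="Penetration testing may require an outside firm.",
)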


Where It Comes Together
For simplicity, we'll illustrate the example with three test phases; the number of phases in your process might be different:
• Development testing (unit/integration)
• QA testing (system)
• Acceptance testing (user acceptance)
Let's say there are three major pieces of functionality that need to be tested: Functional areas A, B and C. Now assume we have eight test methods to choose from, Method 1 through 8. Table 3 shows how the test plan shapes up; note that some methods are repeated as necessary from phase to phase. For example, if there was a "Positive Basic Functional Test" method, we might repeat that method in all phases. Some teams believe that nothing is adequately tested until it's tested twice, and testing "density" may be different from phase to phase. Such flexibility is available here.
If you were to extend this table for each phase and functional area, you could create additional columns (see Table 4) for the:
• Number of tests required for this particular method (could be multiple per functional area)
• Estimated time to create each test
• Estimated time to run each test
• Priority of test method
Using this approach gives testers specific direction on what they're expected to create. It also can help locate requirements gaps. If you expect that a particular method would be appropriate for a high-level review of a functional area and the requirements aren't present, you have identified a gap. During regression testing—when there is always limited time—this method allows you to pick functional areas and methods to prioritize. You also can help your acceptance testers focus on test methods by putting tests in this framework. It's easier to encourage those sometimes reluctant users to get on board with testing if you use terms they understand.


A RATIONAL APPROACH
What follows are two methods copied from the IBM Rational Unified Process, Rational Method Composer for Large Projects, version 7.2. They illustrate the framework RUP prescribes and include examples from two of the test methods that were suggested in the main article. Further, Rational adds a high-level definition to "outline one or more strategies that can be used with the technique to accurately observe the outcomes of the test." This is specific and changes based on the application under test, and allows for some project-by-project tailoring. Also, in deference to Rational's Functional Tester and Manual Tester testing tools, IBM prescribes tools that could be used for the particular test technique in a separate section.
Security and access control testing focuses on two key areas:
• Application-level security, including access to the data or business functions
• System-level security, including logging into or remotely accessing the system

Based on the security you want, application-level security ensures that actors are restricted to specific functions or use cases, or they're limited in the data available to them. For example, everyone may be permitted to enter data and create new accounts, but only managers can delete them. If there is security at the data level, testing ensures that "user type one" can see all customer information, including financial data; however, "user type two" only sees the demographic data for the same client. System-level security ensures that only those users granted access to the system are capable of accessing the applications, and only through the appropriate gateways.
Business cycle testing should emulate the tasks performed on the <Project Name> over time. A period should be identified, such as one year, and transactions and tasks that would occur during a year's period should be executed. This includes all daily, weekly and monthly cycles, and events that are date-sensitive, such as ticklers.

SECURITY AND ACCESS CONTROL TESTING
Technique objective: Exercise the target-of-test under the following conditions to observe and log target behavior:
• Application-level security: An actor can access only those functions or data for which their user type is provided permissions.
• System-level security: Only those actors with access to the system and applications are permitted to access them.
Technique: Application-level security: Identify and list each user type and the functions or data for which each type has permissions. Create tests for each user type and verify each permission by creating transactions specific to each user type. Modify user type and rerun tests for same users. In each case, verify those additional functions or data are correctly available or denied. System-level access: See Special considerations below.
Oracles: Outline one or more strategies that can be used by the technique to accurately observe the outcomes of the test. The oracle combines elements of both the method by which the observation can be made and the characteristics of specific outcome that indicate probable success or failure. Ideally, oracles will be self-verifying, allowing automated tests to make an initial assessment of test pass or failure; however, be careful to mitigate the risks inherent in automated results determination.
Required tools: The technique requires the following tools:
• Test script automation tool
• "Hacker" security breach and probing tools
• OS security administration tools
Success criteria: The technique supports the testing of the appropriate functions or data affected by security settings that can be tested for each known actor type.
Special considerations: Access to the system must be reviewed or discussed with the appropriate network or systems administrator. This testing may not be required, as it may be a function of network or systems administration.

BUSINESS CYCLE TESTING
Technique objective: Exercise target-of-test and background processes according to required business models and schedules to observe and log target behavior.
Technique: Testing will simulate several business cycles by performing the following:
• The tests used for target-of-test's function testing will be modified or enhanced to increase the number of times each function is executed to simulate several different users over a specified period.
• All time- or date-sensitive functions will be executed using valid and invalid dates or time periods.
• All functions that occur on a periodic schedule will be executed or launched at the appropriate time.
Testing will include using valid and invalid data to verify the following:
• The expected results occur when valid data is used.
• The appropriate error or warning messages are displayed when invalid data is used.
• Each business rule is properly applied.
Oracles: Outline one or more strategies that can be used by the technique to accurately observe the outcomes of the test. The oracle combines elements of both the method by which the observation can be made, and the characteristics of specific outcome that indicate probable success or failure. Ideally, oracles will be self-verifying, allowing automated tests to make an initial assessment of test pass or failure; however, be careful to mitigate the risks inherent in automated results determination.
Required tools: The technique requires the following tools:
• Test script automation tool
• Base configuration imager and restorer
• Backup and recovery tools
• Data generation tools
Success criteria: The technique supports the testing of all critical business cycles.
Special considerations: System dates and events may require special support tasks. A business model is required to identify appropriate test requirements and procedures.
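Here's a minimal sketch of the application-level security technique described above: drive every user type through every secured function and compare the outcome with a permission matrix. The client object, user types and action names are assumptions for illustration only; they are not part of RUP or the article.

# Expected permissions per user type (the test oracle).
PERMISSIONS = {
    "manager": {"create_account": True,  "delete_account": True},
    "clerk":   {"create_account": True,  "delete_account": False},
}

def attempt(client, user_type, action):
    """Hypothetical helper: returns True if the application allowed the action."""
    client.login_as(user_type)
    return client.perform(action)

def test_access_control(client):
    failures = []
    for user_type, actions in PERMISSIONS.items():
        for action, allowed in actions.items():
            if attempt(client, user_type, action) != allowed:
                failures.append((user_type, action, allowed))
    assert not failures, f"Access-control mismatches: {failures}"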




TABLE 3: NOT REDUNDANT; THOROUGH
Phase: Development testing
  Functional Area A: Method 1, Method 2, Method 3
  Functional Area B: Method 1, Method 2, Method 5
  Functional Area C: Method 1, Method 2, Method 4, Method 5
Phase: QA testing
  Functional Areas A, B and C: Method 4, Method 5, Method 6, Method 7, Method 8
Phase: Acceptance testing
  Functional Area A: Method 5, Method 7
  Functional Area B: Method 7
  Functional Area C: Method 8

Precious Time
When asked how long it will take to test something, my experience has taught me that it's helpful to counter with the question "How much time do I have?" Most teams aren't afforded the amount of time necessary to do an absolutely complete job of testing, and therefore choices need to be made. Breaking up the tests into methods allows QA team leads, project managers and the final customer to understand the risks they're taking on based on the amount of time available to complete testing.
So when you're asked at the next "go-live" meeting if all the testing is done, and you can't tell them, "Yes, 100 percent," here's the next best thing: Give them all the information they need to allow them to make the appropriate decision on what level of risk they're willing to accept. If they hear that 75 percent of the security tests failed during acceptance testing, they might allow more time. However, if all you're able to report is that x percent of tests passed and y percent failed, and hand them a defect report for filling in the blanks, I wouldn't expect a favorable reaction. Here's the bottom line: If you pass along enough details about your testing results, you also pass along responsibility for the decision to go live.

Up- and Downsides
Like anything else, this approach has drawbacks. Making tests more modular increases the number of tests that must be maintained. When using a requirements approach, each functional area may have two or three (or maybe even one) all-encompassing tests, covering everything. If we increase that to a dozen or so, this will simply increase the overhead of keeping up with them.
Another perceived drawback is that the process requires additional skills to create the test plan. This is true. However, the knowledge of test methods should be one of the many skills that a testing manager has, so to put them in this framework shouldn't be too much of a stretch. Also, encouraging your test teams and other interested parties to change to a new method can present challenges of its own. Test methods and techniques are part of the fabric of every test team, and adopting any new group practice is never an easy task. To compensate, use a slow, gradual rollout to allow the team to adjust.
In many cases, teams react to the given requirements at the time they create the test cases. This can lead to tests that vary based on the phase, team and individual tester. While such an approach may sometimes be adequate, the metrics that result can usually be improved. Despite advances in knowledge and technology, testing remains something of an art form. Using a consistent canvas and paintbrushes will help keep your team on an even keel while still allowing for creativity.
By using test methods based on the system's functional areas, you bring more information to the status table while improving on your ability to estimate and prioritize tests. This will lead to better testing, with more repeatable and predictable results. As time goes by and more members of the broader team see your new methods in action, it will nudge your requirements teams to "up" the quality of requirements, which will in turn make the flow into testing and development easier. Such side effects are just the beginning of the benefits you'll see when adopting this process. ý

TABLE 4: FOR EACH PHASE AND FUNCTION
Columns: Test Method; # of Tests for Method; Est. Hrs. to Create Each; Total Hrs. to Create; Est. Hrs. to Run Each; Total Hours; Automate?; Priority (H, M, L)
Rows: Method 1, Method 2 (cells left blank in the original template)
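Where Table 4 is a blank worksheet, the roll-up math behind it is simple. Here's a sketch of totaling create and run effort per method; the numbers are placeholders of my own, not the author's.

# One row per test method within a phase/functional area, mirroring Table 4.
plan_rows = [
    # (method, number of tests, hrs to create each, hrs to run each, priority)
    ("Method 1", 12, 1.5, 0.5, "H"),
    ("Method 2", 8, 2.0, 0.75, "M"),
]

grand_total = 0.0
for method, count, create_hrs, run_hrs, priority in plan_rows:
    total_create = count * create_hrs
    total_run = count * run_hrs
    total = total_create + total_run
    grand_total += total
    print(f"{method}: create {total_create:.1f} h, run {total_run:.1f} h, "
          f"total {total:.1f} h (priority {priority})")

print(f"Estimated effort for this phase/functional area: {grand_total:.1f} h")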




DEFECTUAL HEALING
How to Analyze Your Existing Bug Lists to Help Improve The Defect Tracking Process
By Colleen Voelschow
Colleen Voelschow is senior staff consultant at ProtoTest, a software QA consultancy.

When most of us in the testing industry think of our defect tracking database, we think of the quality—or lack thereof—of our products.

Defect tracking is, after all, an essential mechanism for recording and measuring the faults of any system under test. Without these records, the testing effort becomes a subjective evaluation of your personal interpretation of what quality means in any given instance. But what if your defect records could reveal more? What if I told you that with a different approach to analyzing the same product defect data, you could uncover gaps in the software development process and use this information to fix those errors before they manifest as failures in the system? Believe it or not, there is a way to do it. And it’s here in the following pages.


Anatomy of a Defect
Defects come in three main shapes: something missing, something wrong or something extra. When something's missing, it can be a requirement or feature that was simply not implemented. When something's wrong, it's something that was not implemented correctly or completely. This would include basic coding mistakes. And yes, something extra, such as an additional feature outside of the original project specifications, is still considered a defect. Extra features often occur due to creative developers attempting to fill in the gaps in requirements on their own. But if no requirements exist for a feature, chances are there will be no test cases, either, creating a hole in test coverage.
But it's hard (if not impossible) to produce software without human activity, and human beings are fallible. Errors caused by faults in the system can come from many different sources. Time pressure or rushed schedules can lead to simple mistakes. Lack of training or the introduction of new technology can create a steep learning curve that increases the likelihood of error. Complex code or infrastructure can make it difficult for developers to produce bug-free products, particularly when inheriting someone else's code. Poor communication or changing requirements can produce defects before coding even begins.
Before analyzing the data in your defects, you must first look at the defect attributes currently being captured and tracked. A tool must be in place that allows you to document defects, track their life cycle and provide metrics. Defect reports should provide the reader with all necessary information to identify, isolate and fix the problem reported. Each report should be limited to a single defect and be written in neutral language that avoids accusations. Attributes included in a defect report should include (but not be limited to):
• Reported by
• Reported date
• Project/category
• Platform/OS
• Status
• Title
• Steps to reproduce
• Expected results
• Severity
• Priority
To get collective buy-in on providing feedback in a defect review meeting, the evaluation must be a collaborative process. If one person is responsible for assigning the defect source, it may be difficult to remain objective. Involving all stakeholders in the evaluation process will help to eliminate any bias and foster shared ownership of the outcomes. Your stakeholders will be different based on whether the defects evaluated are found during the test phase or in production.



Stakeholders should include representatives from development, project management, business analysts, help desk and QA. For project-level reviews, it's best to involve the people who have actually written the requirements, schedule, code and test cases as participants in the review process. You may find that the help desk isn't required for projects under test, but its involvement is critical to the evaluation of production defects.
The goal of this review process is to identify the root cause of the reported defects. It's not the purpose of these reviews to prioritize, schedule, resolve or assign ownership of active defects. It will be most useful to schedule these reviews at the end of a project or release phase once the defects are resolved and closed. Separating these reviews from regular bug meetings will help you to focus your audience on the goal at hand. This will also allow the review team to draw on resolution information when assessing the defects source.
It's important to prepare your audience by publishing and distributing your defect source categories. When these categories aren't defined prior to the review meeting, the meetings can devolve into an open forum for assigning blame. By reviewing the defects as a group, you're making the process an objective one, eliminating what may seem like personally directed attacks. It can be useful to have a moderator who is external to the project lead these reviews.
Because newly entered defects typically go to development first, it's usually up to development to escalate if a given issue isn't the result of a development error. It's important to empower all possible defect owners to raise any issues regarding defect sources. Educate your stakeholders to understand what you're doing and why you're doing it, so they too will have a vested interest in evaluating and

understanding where a defect originated. It’ll be difficult to begin any regular review meetings if defects are being closed every time they aren’t the result of a coding error. This can be resolved by simply requesting that defect analysis notes be added to the defect report.
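For teams that want the attribute list above enforced rather than merely remembered, a defect record can be modeled explicitly. This is a generic sketch, not tied to any particular tracking tool; the field names follow the list in this article and the sample values are invented.

from dataclasses import dataclass
from datetime import date

@dataclass
class DefectReport:
    reported_by: str
    reported_date: date
    project_category: str
    platform_os: str
    status: str
    title: str
    steps_to_reproduce: list[str]
    expected_results: str
    severity: int            # e.g., 1 (critical) through 4 (cosmetic)
    priority: str            # e.g., "High", "Med", "Low"
    source: str = ""         # filled in later, during the review meeting
    resolution_notes: str = ""

bug = DefectReport(
    reported_by="tester1",
    reported_date=date(2008, 4, 1),
    project_category="Online store",
    platform_os="Windows XP/IE7",
    status="New",
    title="Order cancellation should return to custom home",
    steps_to_reproduce=["Add an item to the cart", "Start checkout", "Click Cancel"],
    expected_results="User is returned to the customized 'my home page'",
    severity=2,
    priority="Med",
)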

Close the Gaps

When evaluating the trends in the sources of your defects, you’re looking for faults in each phase of the software development process. Once you’ve established your defect source categories, it’s important to identify which gaps in the process they might stem from. These gaps will vary based on project or product, so it’s best to keep this context-based and open to discussion. By some estimates, more than half of all defects originate in the requirements phase. Defects stemming from missing, ambiguous or contradicting requirements all point to issues in the requirements phase. Enhancements that are approved for inclusion after requirement sign-off or after the product has released also should be evaluated as missing requirements (because they were missing from the original project scope). Defects that fall into such categories can likely be resolved through more clarification and collaboration before coding begins. It’s important not to point the finger at the person who wrote the requirements. Resolutions will require the entire project team. To determine if a defect source originated in the requirements phase, you’ll need the ability to locate the faults in the original requirements. This can be done manually, but the use of a Requirements Traceability Matrix will save time by allowing you to link a failed test case back to the requirement id. In the following defect, you can trace the failed test case back to the requirement id being tested.


DEFECT ID: 19872
STATUS: In Dev
SEVERITY: 2
PRIORITY: Med
TITLE: Order cancellation should return to custom home
DESCRIPTION: Users should be sent back to their customized 'my home page' when canceling out of an order. They are currently sent back to the shopping cart.

In this example, the requirement is that users be able to cancel an order at any step during the order process. The requirement fails to mention what should happen when the user does cancel an order. The source of this particular defect, therefore, would be a missing requirement.
If you find that many of your defects are indeed the result of gaps in the requirements phase, you should consider implementing more verification testing, which involves a tester looking for defects before the product is built. Verification testing can include document reviews, user interface walk-throughs, personal evaluations or any formal review process. The objective of these reviews is to find inconsistencies, ambiguities, contradictions and omissions before the coding effort begins.
When considering the most basic defect type, most will think of something that is broken. Defects that are created from coding errors, logic issues or architecture/integration faults indicate issues in the development phase. While these are often simple syntax mistakes, they sometimes include errors in validation such as a failure to filter input variables that would result in security vulnerabilities. These types of errors will be the core of the functionality testing effort. Determining that a defect is the result of a gap in the development phase will often require looking at the code directly, either by white-box testing or through investigation with a developer. In the following defect example, the tester was able to confirm the source of this defect by looking at the calculation in the code.

DEFECT ID: 19874
STATUS: In Test
SEVERITY: 2
PRIORITY: Med
TITLE: Free shipping calculation using incorrect purchase total
DESCRIPTION: Shipping cost should be waived when product purchase totals more than $30.00 without any applicable sales tax. Sales tax is currently being included in total towards free shipping.

Upon review, it was also determined that the requirement regarding this calculation was complete and accurate. Therefore, the defect was not the result of an issue in the requirements phase. This defect could be categorized as a coding error.
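Here's a minimal sketch of the traceability idea described above: a small mapping from requirement IDs to test cases lets you walk from a failed test (or the defect that cites it) back to the requirement under test. The IDs are invented for illustration.

# Requirements Traceability Matrix: requirement ID -> test case IDs
RTM = {
    "REQ-104": ["TC-310", "TC-311"],   # order cancellation behavior
    "REQ-118": ["TC-402"],             # free-shipping calculation
}

def requirement_for(test_case_id):
    """Trace a failed test case back to the requirement it verifies."""
    for req_id, test_cases in RTM.items():
        if test_case_id in test_cases:
            return req_id
    return None  # a gap: the test (or feature) has no written requirement

failed_test = "TC-311"
print(f"{failed_test} traces to requirement {requirement_for(failed_test)}")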

Verify and Validate
If your defect reviews point to a gap in the development process, your solution will require a combination of verification and validation. Verification testing should be used to hold reviews of design documents, architecture and code. Validation testing should be used to unit-test the code. Unit testing may be most effective when done by a peer. The goal of implementing these improvements is to uncover coding errors before they manifest as defects in the testing phase.
Gaps in the testing phase can result in incorrect or missing test cases, invalid test data or invalid defects. Attributing defects to testing-phase sources can be extremely difficult. This is because it's unlikely that you'll discover a defect for which there is no test case unless it's discovered by end users in beta testing or in production. Unfortunately, defects that stem from the testing phase will most likely be discovered in production. A common issue for many QA departments is keeping test environments up-to-date with production code and data. Evaluation of the defect below might determine that the issue is the result of the database in the test environment not containing the full data set, or not replicating production data.

DEFECT ID: 19875
STATUS: Open
SEVERITY: 2
PRIORITY: Med
TITLE: Product search result set is incomplete
DESCRIPTION: Product search results do not return all available product types that meet search criteria. Search criteria should return the result set in attached file.

This doesn't mean that this defect is entirely invalid, since it's a defect in the test environment. It will require research to determine what data should be returned, and confirmation that the reason is not an issue with the code itself. Researching defects such as this may require help from developers, database administrators and configuration managers (or whomever is responsible for code migrations).
If you discover that your defect data indicates a gap in the test process, you'll again need to implement both verification and validation techniques to close the gaps. Implementing reviews of all testware produced can eliminate missing or incorrect test cases, but the success of this is dependent on the involvement of project stakeholders. Testware can include test plans, test cases, test reports and defect reports. Using validation techniques such as qualification or smoke tests can reduce the number of defects entered regarding build and environmental issues. The purpose of implementing these improvements is to find holes in the testing process before testing begins and to minimize the number of invalid defects once testing has started.
To use your defect reports for process improvement, modifications or additional information may be required. It's important to include a field for defect sources that isn't a free-form text entry. The sources of a defect can include categories such as missing requirement, misinterpreted requirement, coding error or data defect. It's important that these categories are predefined to eliminate finger pointing. Each defect source type should be clearly defined in your defect management plan to eliminate any subjective classification. You may find it necessary to allow for more than one source type to be selected for any given defect.

FIG. 1: BEWARE OF METRIC MISUSE (defect sources charted: misinterpreted requirements, coding errors, invalid test data, integration faults, omitted requirements)

Keep It Detailed
It may be beneficial to your evaluation process to include resolution details in your defect reports to assess if the initial definition of the defect source was accurate. You may think that a defect was the result of a code error when in fact the requirement was never clearly defined. In the example below, the tester has reported a defect that appears to be an incorrect error message returned to the user.

DEFECT ID: 19871
STATUS: New
SEVERITY: 2
PRIORITY: Med
TITLE: Credit card validation is returning wrong error message
DESCRIPTION: When entering a credit card with an invalid number, an invalid expiration date message is received. See attached screen shot.

This defect was initially categorized as a coding error. But upon investigation by the developer, it was discovered that the error checking needed to be evaluated in a certain order. In this example, the credit card number should be validated before checking the credit card expiration date. Because the details of this validation process weren't defined in the requirements documentation, the defect source should be reclassified as a missing requirement. It's important that a defect's source be reevaluated based on the resolution.
The QA department should own the defect review process and provide



all stakeholders with the resulting reports that are intended to drive improvements of the entire software process. These reports should include the percentage of each defect source and the overall defect totals.
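The defect-source report described above reduces to a simple tally. Here's a sketch, with made-up data, of how the percentages behind a chart like Figure 1 might be computed.

from collections import Counter

# Defect source recorded for each closed defect during the review.
defect_sources = [
    "misinterpreted requirement", "coding error", "coding error",
    "omitted requirement", "invalid test data", "coding error",
    "integration fault", "misinterpreted requirement",
]

counts = Counter(defect_sources)
total = sum(counts.values())
print(f"Total defects reviewed: {total}")
for source, count in counts.most_common():
    print(f"  {source:<28} {count:>3}  ({count / total:.0%})")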

Avoid Metrics Misuse
Metrics can be misused. It's therefore important to take note of any special cases that may impact these numbers. In Figure 1, the metrics show that the highest percentage of defects were the result of "misinterpreted requirements." While that may indicate communication problems between the design and development phases, it also could mean something entirely different if a change in the development resource had been noted. In that case, this metric may tell you that you need to revisit how projects are handed off when resources are changed midstream.
The resulting recommendations based on the evaluation of your defects should be presented separately as defects that are found in test, versus defects found in production. For a project in test, this may be presented as part of the project's "post-mortem" meeting. For production defects, it may be useful to present findings on a monthly or quarterly basis to be reviewed for implementation suggestions.

Compare Reviews
For projects in which defect data can be evaluated in both test and production, it can prove valuable to compare the results of both reviews. For example, a project that had a large number of "coding errors" in the test phase may point to the need for improvements in unit testing. Once the project is in production, the majority of the defects may be "omitted requirements" that indicate the need for greater depth and detail in the requirements-gathering phase.
If you can find defects earlier in the process, when the least design, code, or testing is involved, fixes are far less expensive. When you consider that the total cost of quality includes the detection, removal and prevention of defects, it's no surprise that the cost of identifying defects increases the further along you are in the software development life cycle. The cost of defect detection typically includes the time and resources required for the testing activities. Removal of the defects includes fault isolation, analysis, repair, retest and migration activities. However, the cost of defect prevention can be minimized by using defect source analysis, which also often provides measurable results.
Start by evaluating the current practices you have in place for defect tracking. Determine whether your defect tool and workflow capture all of the necessary information. Then set the stage for your defect evaluation efforts by communicating your goals, empowering your stakeholders and establishing your evaluation criteria. Finally, close the gaps by showing upper management the results of this review process and describing the improvements that you've proven are required.
Using your product defect data to evaluate your process will not only help you identify the areas in need of improvement, but will provide you with the data needed to make the argument for change. The information that you already capture, monitor and store can be used to improve your entire software development process. ý


Trust, But Verify
The Use Of Independent Verification And Test Validation Helps Ensure Total Product Control
By Sreenivasa Pisupati
Sreenivasa Pisupati is assistant VP of testing at software quality consultancy AppLabs Technologies.

As thorough a job as your testers might do, there will always be bias in their work. Why? Because testers have as much at stake in the success of the project as anyone; right or wrong, failure often reflects directly on them. Such is the value of independent verification and test validation, a process by which procedures and outcomes are monitored. IV&V can reduce the creation of defects and increase their detection, mitigate risk and give greater visibility to the financial, managerial and subject-matter aspects of any project.
IV&V includes a series of technical, managerial and financial activities performed by resources independent of the project team to provide management with an assessment of the project's ongoing health. The practice is recognized as an important step in the software development process for improving quality and ensuring that the delivered product satisfies the business user's operational needs. The IV&V assessment and its associated recommendations can provide project managers and teams with actionable advice based on industry standards and best practices. By offering an unbiased audit and assessment of processes and deliverables, IV&V ensures that goals are met for cost, dates and quality.
Verification is the process of evaluating a system or component to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase. The verification process typically asks:
• Does the software function and perform as expected?
• Is the system being built correctly?
• Are applicable industry best practices and standards employed?
• Are there defects in the design or code that cause unexpected results or fail to cause expected results?
• Do those unexpected results involve the import or export of data, user interface or hardware performance?


Validation is the process of evaluating a system or component at or near the end of the development process to determine whether it satisfies specified requirements. Validation focuses on the process used to build/deliver the application and takes the form of an audit. Validation typically includes such questions as:
• Is the system or component internally complete, consistent and sufficiently correct to support the next phase?
• Does the system satisfy business and/or operational requirements?
• Is the correct system being built?
• Is the team following the defined process and executing to defined standards?
IV&V introduces objectivity, helps you maintain an unbiased technical viewpoint and supports objective engineering analysis. It promotes earlier detection of software/process errors to ensure that peak detection takes place during the creation phase. This dramatically reduces the effort and cost of removing detected errors, enhancing operational accuracy and reducing variability in the development process. IV&V should be viewed as an overlay process

that complements the project life cycle and is adaptive to the needs of unique applications.

Approaches to IV&V
There are three major analytical approaches to IV&V:
Static analysis. The direct analysis of the form and structure of a product without executing it. This includes reviews, inspections, audits and dataflow analysis.
Dynamic analysis. The execution or simulation of a developed product to detect errors by analyzing the product response to sets of input data where the range of output is known. This is also known as testing and prototyping.
Formal analysis. The use of rigorous mathematical techniques to analyze the algorithms of a solution, including formal proof of correctness and algorithmic analysis.
IV&V also has several key focus areas:
Requirements verification confirms that the software and interface requirement specifications are consistent with system requirements in a way that is unambiguous, complete, consistent (internal and inter-relational), testable (or verifiable) and traceable.
Design verification ensures that interfaces between hardware and software

are appropriate. It verifies that interface design documents are consistent. Software test plan and software test description verification looks for effective test coverage. It’s important to build and examine traceability tables and document software requirements design artifacts. Data structure and algorithm analysis are also needed here. Implementation verification ensures that design is accurately reflected in the implementation. It verifies that approved standards/practices are followed, such as for coding, documentation, naming, data dictionary terms, and completeness and correctness of algorithms. Application verification establishes adherence to software test plans and software test descriptions. A duplication of some tests by an independent party ensures correctness. Often, what surfaces are perceived but unproven weaknesses, previously found but unsolved problems and stress tests that crash the system. Process validation ensures that client standards and industry best practices are being employed to develop the product or execute the project.

The IV&V Model and Benefits
IV&V offers several benefits that make it a popular practice, including financial value, operational efficiency and increased quality.

FIG 1: THE IV&V MODEL (diagram elements: Project Team, QA Team, Business Requirements, Solution Architecture, Perfect Planning, High Level/Low Level Design, Traceability, Master Test Strategy, Test Planning, Strategy for individual applications, Test case design, Solution Mapping/Development/Customization, Application Integration, Incremental Integrated Testing, Integrated Solution, Release, Acceptance Tests & Certification, System Testing (Application-level Testing), Infrastructure Support, Tools Support, Build/Change Management)

IV&V FOR QA

Financial value. Many senior management teams believe that IV&V is cost prohibitive. Full-phase IV&V increases development costs 10 to 20 percent. This only translates to 3 to 4 percent of total life-cycle costs. However, the savings through earlier and more thorough detection can be 200 percent or more than the cost of the IV&V work itself. From a maintenance perspective, one of the more crucial support items is documentation. An accepted fact (supported by numerous studies) is that post-deployment maintenance accounts for approximately 60 to 80 percent of the application's true life-cycle cost. To facilitate effective post-deployment maintenance, the delivered documentation must reflect the actual application design, implementation and operational profile. The constant oversight of the IV&V group, along with its insistence that noted anomalies are addressed and resolved, results in higher-quality documentation and training materials.
Operational value. The operational benefits derived from employing IV&V are substantial and span the product life cycle, beginning with improved requirements specifications and ending with a complete, more maintainable application that meets user needs. In effect, IV&V can be viewed as an effective risk-mitigation strategy that significantly increases the probability of producing a quality application, on time and within budget.
Quality benefits. Although an IV&V firm can't control the quality of the primary vendor's personnel, methodologies, work habits or commitment, they can and should diligently review and analyze results, challenge and recommend processes and proactively protect the client's interests. A project of this scale will benefit significantly from even minor process changes that prevent just a small percentage of defects and/or detect those defects as close to the point of creation as possible. In software development, the expense and time to repair defects increases exponentially the later a defect is detected in the process. An ambiguous requirement that is detected during user acceptance testing

will cost up to 100 times more to repair than if it is caught prior to technical specifications. It’s the responsibility of the IV&V vendor to sample test for defects at each stage, to audit the primary vendor’s project management processes for deviation and to constantly challenge the comprehensiveness and accuracy of each deliverable and milestone achieved.
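A quick back-of-the-envelope check of the cost figures above, sketched under an assumption the article doesn't state explicitly: that development represents roughly 20 to 30 percent of total life-cycle cost (consistent with maintenance taking 60 to 80 percent).

dev_share_range = (0.20, 0.30)   # assumption: development is 20-30% of life-cycle cost
ivv_overhead = (0.10, 0.20)      # article: IV&V adds 10-20% to development cost

low = ivv_overhead[0] * dev_share_range[0]
high = ivv_overhead[1] * dev_share_range[1]
print(f"IV&V as a share of total life-cycle cost: {low:.0%} to {high:.0%}")
# prints 2% to 6%; the article's 3 to 4 percent figure sits inside this band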

Business benefits:
• Stable QA processes
• Multi-location rollouts and associated testing
• Consistent processes and frameworks
• Critical resources for testing
• Represents the voice of the customer
• Implements a test lab for automated and manual testing
• Implements a repository of reusable components
• Provides managed-services QA model for the client
• Driven by service-level agreements

Defect Prevention 101
Supplemental to the IV&V process is defect prevention. This process helps to reduce the causes and occurrences of defects and other types of errors before they happen or soon thereafter. The basic principles of defect prevention are programmer self-evaluation of errors, use of feedback and process improvement. Prevention practices are helpful for delivering high-quality software that contains fewer defects. Prevention also can impact the software process by backing up the testing process and reducing the cost of fixing errors with early detection.
Begin with logging. The first step in preventing defects is to know about those that do occur. Defects should be logged in a defect-tracking tool and accompanied by documentation about how they arose. An accurate description and steps to re-create the error are critical here, as they allow developers to reproduce the defect. It's also advisable to provide screenshots to help ensure the specified defect is actually being reproduced. Appropriate and consistent bug classification is helpful for spotting trends and gaining historical insight. Root cause analysis of the defects should be in practice with the release of each build to ensure that the same critical defects aren't present in the next build/version.
Defects also can be prevented by analyzing the lessons-learned or postmortem reports. Defects may have been identified on other projects or in earlier stages of the current project. Defect prevention activities also offer a mechanism for spreading lessons learned across the projects. It's also helpful to look at the organization-level database and take preventive action.
Prevention deliverables. Defects that are similar in nature can be identified and documented. Once defects are identified, verified and repaired, developers need to propose actions to prevent those types of defects from being introduced in the future.

Defect Prevention in Action
The process of defect prevention is as follows:
• Discuss in kick-off meeting
• Defect reporting
• Cause analysis
• Action proposals
• Action plan implementation and tracking
• Measure results
Kick-off meeting. During the initial project kick-off meeting, a defect prevention leader (DPL) should be nominated for the project. Some objectives of subsequent meetings include identifying the tasks involved in each stage/phase, scheduling and assigning tasks, and listing defects that may commonly be introduced during the phase. The initial kick-off meeting could be organized as a part of project initiation and phase-start review meeting. (Project managers can combine the phase-end



review and phase-start review into a single meeting). During this meeting, the central defect prevention database and action database are referenced for a list of defects that occurred in prior projects and a causal analysis, with proposed corrective and recommended preventive actions taken by the DPL. The preventive actions necessary for the current stage/phase are extracted from the database and documented in the test plan, and the DPL will be responsible for their implementation.
Defect reporting. In this process, the defects found during the current phase are identified, gathered and categorized, along with a significant amount of information about their creation collected by the DPL. A bug-tracking system should be used to collect and track this information.
Causal analysis. A causal analysis meeting can then take place among the project members and the DPL. This could be a part of phase-end review, but a separate meeting should be conducted if many defects are identified or their priority is critical. The time at which errors are analyzed is important because it prevents further errors from being produced and propagated. The DPL should select the defects for analysis using any statistical analysis method (C/E diagram, Pareto chart, etc.). That selected set of defects is further analyzed for root causes and sources. Various measurements are performed to identify the root causes of the defects, and the results are categorized. Examples of root cause categories include:
• Inadequate training
• Process deficiency
• Mistakes in manual procedures/documents (typographical or print errors, etc.)
• Not accounting for all details of the problem
• Breakdown of communication
During this meeting, a causal analysis report for defect prevention is created. Common root causes might relate to the

test cases. To prevent defects related to test cases, the following could be planned:
• Test cases should effectively cover all functions specified in the requirements document.
• Test cases should cover different scenarios along with the test data.
• Enhancement of test cases at the time of testing should be kept at a minimum.
• Test-related documents should be peer-reviewed.
• Testers should be given enough time for testing.
Action proposals. A detailed action list should be prepared for each category of the root cause by the DPL. A process representative group (PRG) will be nominated by the DPL from within the project/team members. PRG members of the respective project will consolidate and update the action plan. Sometimes the type of error that occurs most frequently may be the cheapest to correct. In this case, it would be appropriate to do a Pareto-based analysis on cost to see which error accounts for most of the correction cost. Further, during software engineering process group (SEPG) meetings, the actions taken for defect prevention will be reviewed from the phase-wise DP action proposals database, and will be prioritized based on factors such as cost, effort and staffing. Appropriate measures should be taken at the organizational level to reduce any recurring causes.
Action plan implementation and tracking. With the help of the PRG, the DPL initiates the implementation. The progress and results of the implementation of action proposals are verified and validated by periodic review, DPL audit and internal audit. Those results are consolidated and submitted to the SEPG for further review. Once approved by the SEPG, the preventive actions taken by one project can be included in the organization-specific quality process. All those results are recorded and documented within the project.
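A minimal sketch of the Pareto-based analysis mentioned above: weight each root-cause category by its repair cost and see which few categories account for most of the total. The cost numbers are invented for illustration.

# Estimated correction cost (hours) logged against each root-cause category.
repair_hours = {
    "Inadequate training": 40,
    "Process deficiency": 15,
    "Mistakes in manual procedures/documents": 8,
    "Not accounting for all details of the problem": 95,
    "Breakdown of communication": 30,
}

total = sum(repair_hours.values())
running = 0.0
print("Root cause                                      Hours  Cumulative")
for cause, hours in sorted(repair_hours.items(), key=lambda kv: kv[1], reverse=True):
    running += hours
    print(f"{cause:<46} {hours:>5}  {running / total:>9.0%}")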


DEFECT PREVENTION V.2
Once you've implemented the prevention practices outlined above, try these other options:
• Document and publish major defect categories
• Document the frequency and distribution of defects in the major defect categories
• Publicize innovations and other actions taken to address the major defect categories
• Update the lessons learned details in the project's historical database for future reference
• Publish a summary of the action proposals and actions taken

Measure results. Various measurements can be performed to determine the effectiveness of addressing the common causes of defects and other problems on the project's software process performance. For example, one useful measurement is that of the change in the project's software process performance. Also, data on effort expended to perform defect prevention and the number of defect prevention changes implemented can be captured in the phase-wise DP action proposals database and analyzed for further process improvement.

Better Products, Happier Teams
Independent verification and validation is a systematic approach for ensuring that a product is being built correctly, and that the correct application is being built. Because IV&V has the support of the customer and is developer-independent, it increases project effectiveness. True IV&V relies on technical, managerial and financial independence, and is performed in parallel with the development process. IV&V also extends beyond the software engineering process into the assessment, customization and implementation of application software.
Prevention practices are everyone's responsibility, and the discipline should be adopted by the entire organization, particularly those in development. Defect prevention activities also are a mechanism for communicating lessons learned among software projects and groups, and usually lead to better-quality products and happier, more effective teams. ý




Best Practices

Profiling: Leavin' On a Jet Plane
Maybe it's a stretch to declare application profiling a prominent protagonist of software's history. But this sub-discipline certainly has its own modest library of canonical literature. The only problem is that much of this literature is being rendered obsolete by the evolving compute model.
One paper that helped to cast the die on how to think about delving into code is "gprof: a Call Graph Execution Profiler," published in 1982 by Susan Graham and colleagues at U.C. Berkeley. Innovative in its day, gprof spits out a visual profile that can be used to compare and assess the costs of various routines associated with a given feature. Another must-read is "ATOM: A system for building customized program analysis tools," published in 1994 by Amitabh Srivastava and Alan Eustace, then at Digital Equipment Corporation. ATOM provides a means to inoculate a program with a wee bit of code so that it can essentially profile itself.
In 1994, the two papers, both of which are still readily Googleable, were named as two of the 20 most influential papers ever presented at ACM's Programming Language Design and Implementation (PLDI) conference. For those of you who pay attention to such things, we're in the midst of Pulitzer season. And while being honored by PLDI isn't exactly like nabbing one of those iconic gold medals handed out by Columbia University, it nonetheless is high praise, at least among those who care about coding matters such as call times, function frequencies and sampling profilers.
Application profiling very much reflects its gprof- and ATOM-inspired heritage. Which is to say that even today, it's mostly about digging into a specific program running on a specific PC or server to divine hot spots and performance bottlenecks. Aiding and abetting the developers engaged in this work is a fairly mature collection of application profiling tools, from Intel's VTune to Apple's Shark to NVidia's NVPerfKit.
I don't contest the fact that every program will always need at least a few deputized pest control specialists who can smoke out the inevitable slowness bugs—or slugs—that slime their way into every code base. But I do suspect that application profiling will come to mean something subtly different as more programs are accessed and consumed via the Web.

Cloudy Single-Node Computing
The December 24, 2007, issue of BusinessWeek has on its cover a rather messianic image of 27-year-old Google software engineer Christophe Bisciglia. In the images accompanying the glowing article by reporter Stephen Baker, the long-maned, hyper-thin Bisciglia is captured first with glib, then gloomy countenances, both of which are appropriate for the man who is proclaimed to be Google's master of cloud computing.
It's reasonable to draw one of two conclusions when the mainstream media (MSM) features a computing-related topic in several glossy pages loaded with occasionally breathless prose, such as Baker's rhapsodizing that Google's cloud initiative "hovers in the distance, large and hazy and still hard to piece together, but bristling with possibilities." One is that some especially convincing PR person sold some especially credulous reporter a bill of goods. The other, true in this case, is that the MSM is hopelessly late to a party that started years earlier. Baker describes the move toward


cloud computing as similar to "the evolution in electricity a century ago when farms and businesses shut down their own generators and bought power instead from efficient industrial utilities." This would have been a brilliant analogy if it had been used to describe another older and more prosaic phenomenon. For at least a decade, vendors ranging from application service providers to Web hosting firms have been finding a ready market for CPU cycles offered over the internet. And never mind that timesharing was all the rage before Bisciglia was even born.
My point is not to pick too much on Baker or BusinessWeek. (Though I do think that the magazine's snarky media critic Jon Fine is long overdue to turn his attention to the success of laser-focused trade publishing outfits like BZ Media, which is thriving in a generally dismal market for print media products.) Rather, I'm curious why the dominant find-the-hot-spot mantra about application profiling, as reflected in Wikipedia entries and vendor white papers, hasn't seemed to change all that much, even though the era of down-the-wire computing is so clearly upon us.
A hint of what application profiling might become is apparent in a rather obscure paper presented in 2002 at the Operating Systems Design and Implementation (OSDI) conference. Despite its underwhelming and obtuse academic title, "Resource Overbooking and Application Profiling in Shared Hosting Platforms," the paper nonetheless poses a rather tantalizing question: What if CPU and networking resources for Web-based applications were doled out the same way that airlines allocate their seats?
"Instead of trying to ensure that every ticketed passenger gets to board their chosen flight, we try to ensure that no plane takes off with an empty seat (which is achieved by overbooking seats)," writes University of Massachusetts computer science professor Prashant Shenoy and two collaborators.

According to Geoff Koch's own quirky QoS requirement, he is sometimes willing to be bumped from his flight (due to overbooking) to get that free round-trip voucher. Write to him at gkoch@stanfordalumni.org.


Best Practices setts computer science professor Prashant Shenoy and two collaborators.

You Must Read This
I can already hear the e-mail complaints from you in the curmudgeonly software crowd. You'll tell me that it's your job to write and test software. You'll tell me that if the nitwits who run your code can't allocate enough CPU horsepower and network bandwidth, that's their problem. You'll tell me that everyone knows that software vendors are responsible for communicating underlying platform requirements—operating system versions, CPU speeds, memory amounts and so on—and not much more than that.

But I think you may be wrong. One consequence of the fact that more and more programs carry the "Web-based" prefix is that users are stripped of most of their ability to tinker with hardware and software, and thus of much of their ability to improve their computing experience. And when these users start grinding their teeth at the intermittent performance of their favorite Web application, to whom will their ire be directed: the name of the software company emblazoned on the Web page with the "service temporarily unavailable" message, or the faceless operator of the anywhere-in-the-world server farm that's hosting the code?

I'll bet that in the future, application profiling is going to be more about accurately provisioning shared hardware and less about probing for balky hot spots in a single code base. The work of Shenoy and his peers is for now too wonky to be widely adopted. But whether or not you apply their complex formula for deriving quality-of-service resource guarantees, you can at least start thinking about the formula's variables. Do you know the hardware resource requirements associated with your code, both on average and during times of peak usage? And more important, do you know your code's tolerance for hardware resource overbooking? You should.
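By way of a back-of-the-envelope start (this is an illustrative sketch, not code from Shenoy's paper, and the utilization numbers are made up), here is one way to turn raw samples from a load test into the figures that matter: average demand, peak demand, and how much capacity must be reserved at a given overbooking tolerance.

```python
# Illustrative sketch only: summarize an application's resource profile from
# CPU-utilization samples gathered during a load test. The sample values below
# are hypothetical; substitute your own measurements.
import statistics

def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers (0 < pct <= 100)."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def resource_profile(samples, overbook_pct):
    """Average demand, peak demand, and the capacity needed if the host
    reserves only enough to cover the (100 - overbook_pct)th percentile,
    that is, if it tolerates falling short overbook_pct percent of the time."""
    return {
        "average": statistics.mean(samples),
        "peak": max(samples),
        "reserve": percentile(samples, 100.0 - overbook_pct),
    }

if __name__ == "__main__":
    # Hypothetical per-second CPU utilization (percent of one core) under load.
    samples = [12, 15, 11, 14, 60, 13, 12, 75, 14, 13, 12, 90, 15, 13, 12]
    for overbook in (0.0, 1.0, 5.0):
        p = resource_profile(samples, overbook)
        print("overbook %4.1f%%: avg %5.1f%%  peak %3.0f%%  reserve %3.0f%%"
              % (overbook, p["average"], p["peak"], p["reserve"]))
```

Run it and the pattern is plain: with zero tolerance you must reserve for the absolute peak, but accept even a small chance of coming up short and the reservation shrinks, which is exactly the headroom that makes overbooking pay.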

Shenoy and his collaborators found that by overbooking resources by as little as one percent, server cluster utilization increased by a factor of two. And a five percent overbooking yielded a 300 percent to 500 percent improvement, they write, "while still providing useful resource guarantees to applications."

With numbers like that, it's a fair bet that some of your precious code will inevitably be subject to overbooking. To ensure that the hardware guys aren't ham-handed about it, you might want to start thinking differently about application profiling. You can start by reading Shenoy's paper and those on application profiling for provisioning being churned out by smart researchers at HP Labs, IBM Research and Duke University, the hotbeds of thinking on this topic today, Shenoy says. It's work that will never garner Pulitzer-type acclaim, but it may help to create the future of application profiling.

According to Geoff Koch's own quirky QoS requirement, he is sometimes willing to be bumped from his flight (due to overbooking) to get that free round-trip voucher. Write to him at gkoch@stanfordalumni.org.

NOMINATIONS NOW OPEN FOR THE 2008 TESTERS CHOICE AWARDS

The Testers Choice Awards recognize excellence in software test and performance tools. The awards encompass the full range of tools designed to improve software quality.

Nominations open on May 1 and close on June 13. There is no limit on the number of products that may be nominated by a company, or on the number of categories into which a product may be nominated. There will be a processing fee for each nomination. All nominations must be received by June 13, 2008.

VOTING STARTS JULY 1
Watch your e-mail beginning July 1 for your invitation to vote. Online voting opens on July 1 and closes on July 30. Only qualified subscribers to Software Test & Performance may vote. Winners will be announced at the Software Test & Performance Conference Fall 2008, Sept. 24-26, 2008, in Boston, MA. The awards will appear in the Nov. 2008 issue of Software Test & Performance.

www.stpmag.com/testerschoice

Questions? Contact editor Edward J. Correia at ecorreia@bzmedia.com.

Index to Advertisers

Advertiser | URL | Page
Automated QA | www.testcomplete.com/stp | 10
Empirix | www.empirix.com/freedom | 2
Hewlett-Packard | hp.com/go/quality | 40
iTKO | www.itko.com/lisa | 8
JavaOne Conference | java.sun.com/javaone | 35
Pragmatic | www.softwareplanner.com | 30
Parasoft | www.parasoft.com/adp | 6
QA Infotech | www.qainfotech.com | 3
RTTS | www.reachsimplicity.com | 4
Software Test & Performance Conference | www.stpcon.com | 25
Test & QA Report | www.stpmag.com/tqa | 39
Testers Choice Awards | www.stpmag.com/testerschoice | 9, 37


Future Test

The Software Tester's Bill of Rights
By I.B. Phoolen

Software testers are people, too! Many of my best friends are software testers, and I can guarantee that they are people. In many countries, people have rights. Well, not everyone has rights. Airline travelers don't have any rights, as we all know. Celebrities don't have any rights. Neither do people who talk loudly on cell phones in restaurants or on the subway.

If the people who designed cell phones wanted to make friends, they'd program the phones to drop the call if the caller is being too noisy. Mobile phones have sensitive microphones. You don't have to shout!

What about the rest of us? We do have rights, and that goes double for software testers. You know, testers take it in the shorts most of the time. Good people, it's time we fight back with our very own Software Tester's Bill of Rights.

I know that you're asking yourself, "What a brilliant idea. But who would write this Bill of Rights for us?" Fear not, gentle software tester. I.B. Phoolen is more than happy to draft this important document on your behalf. And now, without further ado, I present: The Six-Point Software Tester's Bill of Rights.

1. The Right to Own Requirements
A tester's job is to ensure that software meets requirements. Where do those requirements come from? Some from the customer. Some from the architect. There's the problem.

Many of those requirements are obtuse, poorly written or plainly misguided. Those user stories—c'mon, folks. Don't you have any imagination? Those performance and reliability metrics—you've got to be kidding; that throughput will never fly on a real-world network.

Fortunately, we testers know better. We know what's a good requirement and what's totally lame. Let us fine-tune the specs. If we disagree with a feature request, let us revise it or delete it.

If the test team owns the specs, we can guarantee that our tests will show that the application meets those specs on time, on budget, blah blah blah.

2. The Right to Kill the Project
That's right. If the requirements are sufficiently moronic, or if we think the project is silly or unnecessary, we're going to axe it. I.B. calls that "improving ROI."

3. The Right to Choose Our Tools
Everyone talks about how developers are creative free spirits who should be able to choose their own tool chain. If some programmers want to use Visual Studio or the IBM Software Platform, that's groovy. If Bob wants to run JBuilder, that's fine. If Sally wants to run Eclipse, nobody objects. If Biff eschews IDEs to write the entire application with vi, lint, gcc and duct tape, more power to him.

Meanwhile, C-level executive bozos want to standardize the quality assurance suites to embrace new flash-in-the-pan paradigms like "test automation" and "test-driven development." They insist that testers use uniform tools and bug tracking applications, or—heaven help us—"ALM suites."



Bullfeathers! Testers are just as creative as developers, as you can tell by reviewing my recent expense reports. We demand a generous budget so we can choose our own tools. As far as I’m concerned, every tester has an unalienable right to adopt the defect management system of his or her own choice, even if it’s Excel. If the CIO and VP of IT don’t like it, well, that’s their problem, bunky.

4. The Right to Employ Agile Methods
Preferably, those agile methods would be demonstrated by a perky aerobics instructor wearing a torn sweatshirt and leggings like Jennifer Beals in Flashdance.

5. The Right to Determine Release Schedules
I've had it up to here with test cycles being compressed due to boneheaded requirements, flawed architectures and nitwit coders who wouldn't know an unchecked buffer if it bit them in the nose. Under this Bill of Rights, any tester—any tester—can push back the release schedule at any time, with or without cause, and there ain't nuthin' you can do about it.

6. The Right to Fix the Org Chart
In some organizations, development and test are peers. In other organizations, test reports to the development organization. Both of those models are flawed. The only reason that companies hire architects and developers is to create applications for the test team to test. Therefore, ipso facto, development is a subset of the test organization and should be treated as such. That means that all developers work for the test organization. And, of course, all testers will get paid more than developers, and get all the best parking spaces. Take that, coding prima donnas. Who's your daddy now?

Oh, while you're up, could you grab my cell phone? I need to call Jennifer Beals. Thanks.

Retired software engineer I.B. Phoolen lives in Southern California, where he regularly frolics. He rarely updates his blog at ibphoolen.blogspot.com. Write him at ibphoolen@gmail.com.



ALTERNATIVE THINKING ABOUT QUALITY MANAGEMENT SOFTWARE:

Make Foresight 20/20. Alternative thinking is “Pre.” Precaution. Preparation. Prevention. Predestined to send the competition home quivering. It’s proactively designing a way to ensure higher quality in your applications to help you reach your business goals. It’s understanding and locking down requirements ahead of time—because “Well, I guess we should’ve” just doesn’t cut it. It’s quality management software designed to remove the uncertainties and perils of deployments and upgrades, leaving you free to come up with the next big thing.

Technology for better business outcomes. hp.com/go/quality ©2007 Hewlett-Packard Development Company, L.P.

