
No. 4 / 2012 • www.todaysoftmag.com

TSM

TODAY SOFTWARE MAGAZINE

Impressions TechEd North America 2012
Function Point in practice
Microsoft SQL Server Optimizing performance
The analysis of Object Relational Mapping (ORM) mechanism with Hibernate
SEO QA latest Google algorithm updates
Flame - the cyber disembodied weapon
Restful Web Services using Jersey
Microsoft Project and the Agile Projects
Semantic Web A Short Introduction
Made in Cluj - Jumping Electron
Big Data - Apache Hadoop
Background tasks - Metro
Gogu


Impressions TechEd North America 2012 (Leonard Abu-Saa)
The analysis of Object Relational Mapping (ORM) mechanism with Hibernate examples (Anghel Contiu)
SEO QA latest Google algorithm updates (Radu Popescu)
Function Point in practice (Ionel Mihali)
Microsoft SQL Server Optimizing performance (Cosmin Cristea)
RESTful Web Services using Jersey (Tavi Bolog)
Semantic Web A Short Introduction (Dia Miron)
Flame - the cyber disembodied weapon (Andrei Avădănei)
Big Data - Apache Hadoop (Robert Enyedi)
Made in Cluj - Jumping Electron (finmouse)
Background tasks - Metro (Radu Vunvulea)
Microsoft Project and the Agile Projects (Florian Ivan)
Gogu (Simona Bonghez)



Editorial

Ovidiu Măţan, PMP

ovidiu.matan@todaysoftmag.com Founder & CEO of Today Software Magazine

The last few months have been full of events. IT Camp was a very well organized event, with local and international speakers who gave us the opportunity to catch up on the latest Microsoft trends. The importance Microsoft attaches to the community, and the support it gives, deserve to be noted. I would love to see the same kind of conferences held in Cluj by its competitors, Apple and Sun. The list of my impressions continues with Hack a Server, the launch event of a platform for testing vulnerabilities aimed at hackers and system administrators. Business Days Cluj-Napoca was another exceptional event that TSM attended. This time the target was the business environment, and the great interest in this subject was reflected in more than 800 participants. The last two events we attended were technology-focused: Saturday Ruby, a Saturday dedicated to learning and deepening Ruby, and Code Camp, centered on the latest Microsoft news and technologies. You can find more details and photos from these events in the dedicated article.

Starting with this issue, we are going to get closer to our readers by organizing a launch event for every new issue of TSM. Each sponsor of our magazine will host the event in turn. During these mini-conferences, readers will meet the authors of the latest issue’s articles. In this way we offer the opportunity to start a dialogue and build a community. Each participant will receive a free copy of the magazine.

This issue is notable for the large number of technologies it covers. You will find articles ranging from RESTful Web Services using Jersey to the semantic understanding of the Web. The old dilemma of how data should be persisted is examined in The analysis of Object-Relational Mapping (ORM) mechanism with Hibernate examples. But if we want greater control and choose SQL, Microsoft SQL Server - Optimizing performance shows how to optimize performance using production data.
The novelty of this issue is the start of a series of articles about computer viruses. The first article in the series deals with Flame, the famous virus which descends from Stuxnet. The Big Data series continues with an introduction to Apache Hadoop. Romanian initiatives are represented by a presentation of Jumping Electron, a game developed for the Android platform by Finmouse. Planning Agile projects is approached using the Microsoft Project tool. As usual, we conclude with Gogu, the absent-minded character who makes mistakes but learns from every one of them.

We are looking forward to meeting you at future TSM events!

Thanks,

Ovidiu Măţan

Founder & CEO of Today Software Magazine



Editorial Staff

Founder / Editor in chief: Ovidiu Mățan, ovidiu.matan@todaysoftmag.com
Editor (startups and interviews): Marius Mornea, marius.mornea@todaysoftmag.com
Graphic designer: Dan Hădărău, dan.hadarau@todaysoftmag.com
Marketing collaborator: Ioana Fane, ioana.fane@todaysoftmag.com
Translators: Cintia Damian, cintia.damian@todaysoftmag.com; Nora Mihalca, nora.mihalca@todaysoftmag.com
Reviewers: Romulus Pașca, romulus.pasca@todaysoftmag.com; Tavi Bolog, tavi.bolog@todaysoftmag.com

Made by

Today Software Solutions SRL
str. Plopilor, nr. 75/77
Cluj-Napoca, Cluj, Romania
contact@todaysoftmag.com
www.todaysoftmag.ro
www.todaysoftmag.com
www.facebook.com/todaysoftmag
twitter.com/todaysoftmag
ISSN 2285 – 3502
ISSN-L 2284 – 8207

Copyright Today Software Magazine
Any total or partial reproduction of these trademarks or logos, alone or integrated with other elements, without the express permission of the publisher is prohibited and engages the responsibility of the user as defined by the Intellectual Property Code.

Authors

Cosmin Cristea
cosmin.cristea@endava.com
CLD Head of Development, Endava

Robert Enyedi
robert.enyedi@betfair.com
Senior Software Developer @Betfair

Anghel Contiu
anghel.contiu@threepillarglobal.com
Senior software developer, Three Pillar Global

Andrei Kovacs
andrei@finmouse.com
Founder & CEO, Finmouse

Radu Popescu
rpopescu@smallfootprint.com
QA and Web designer, Small Footprint

Radu Vunvulea
Radu.Vunvulea@iquestgroup.com
Senior Software Engineer @iQuest

Florian Ivan, PMP
florian.ivan@rolf-consulting.com
Project MVP

Ionel Mihali
Ionel.Mihali@isdc.eu
QA Officer @ISDC

Leonard Abu-Saa
leonard.abu-saa@arobs.com
System Architect, Arobs

Simona Bonghez, Ph.D.
simona.bonghez@smartprojects.ro
Speaker, trainer and consultant in project management, partner at TSP (smartprojects.ro)

Tavi Bolog
tavi.bolog@nokia.com
Development Lead @Nokia

Andrei Avădănei
andrei@worldit.info
Founder and CEO of DefCamp, CEO of worldit.info

Marius Mornea
marius.mornea@todaysoftmag.com
Founder of the Mintaka Research platform

Ovidiu Măţan, PMP
ovidiu.matan@todaysoftmag.com
Founder and CEO of Today Software Magazine

Alina Dia Miron, Ph.D.
dia.miron@recognos.ro
Semantic Web Expert @Recognos Romania


events

Local events

Business days

Ovidiu Măţan, PMP

ovidiu.matan@todaysoftmag.com Founder & CEO of Today Software Magazine

This was the first time TSM took part in an event of this type as a media partner. Over the two days we could attend numerous roundtables, workshops, conferences and even a networking meeting. I must admit that we had never taken part in anything like this before. What is interesting about the format is that every participant must hand the others a business card and talk about their business, the fields in which they can give advice and the areas in which they need it. It all happens very quickly, and a networking session follows the presentations. You will be amazed how many people may be interested in your business, and how unexpected the related areas can be. A very interesting workshop we attended was the one about social media, where we witnessed the deconstruction of the myth that social media automatically brings large financial benefits. One million fans do not automatically bring a million dollars. It could still happen, but you need a competitive product, and the social networks should be used to talk to your fans rather than to sell the product directly.

Hack a server

There isn’t much to say about this platform, since everyone should already know about it from the interview in the third issue of TSM, as well as from local sites and the Transilvania TV station. According to our latest discussions with Marius Corici, the platform is successful and becoming increasingly popular. We will keep you posted on its further evolution.

Ruby

It was a nice start to the weekend. On Saturday at 10 o’clock I went to ISDC, where I attended an introduction to the Ruby language, followed by more advanced topics during the second half of the day. We were pleased to see the speakers’ enthusiasm and passion, and we definitely recommend further meetings of this group to all who wish to deepen their knowledge of the Ruby language or to learn something new.

Code camp

On July 21st we attended a summer IT event organized by Code Camp Cluj-Napoca. The event took place at the ISDC headquarters and gathered more than 45 IT specialists, in spite of the extremely nice and warm weather.


ITCamp

Marius Mornea

marius.mornea@todaysoftmag.com Software Engineer and Founder of Mintaka Research

Since our last issue already explored most of the quantitative and qualitative metrics, both of the event itself and of its hosts and guests, in this issue we intend to convey the ITCamp atmosphere and the impressions it leaves behind. First of all, we were impressed by the size: 280 participants seem a lot fewer on paper than in real life. Of all the local events with a similar profile that I have had the opportunity and pleasure to attend, ITCamp clearly sets itself apart in a category of its own in terms of the number of attendees, speakers and content. The dominant feelings, both at the opening and throughout the conference, were enthusiasm and anticipation. One creative analogy would be a theme park: many people, but also many laughs, many attractions, and the tumult generated by the mixture of people trying to reach the best rides and those who have just got off, split into the enthusiastic, the satisfied and the slightly shaken. I experienced the same feeling while standing there with the timetable in my hands, not knowing which presentations to choose to maximize the final reward. Sometimes you are tempted to go for the star speaker, other times you are curious why everybody is rushing to one of the halls, and sometimes the quieter hall with the unknown presenter gets your attention. Before going further into the ambiance of the presentations, I would like to dwell on a few logistical aspects. The first impression is one of professionalism (a dedicated sound crew, a cameraman, quick check-in accompanied by many smiles), followed by a few slip-ups to remind us that it is only the second edition and that the limited time for organizing such a big event takes its toll (a runaway HDMI cable, no coffee early in the morning and, most annoying of all, a very slow Internet connection that troubled at least some of the speakers).
The main reason I mention the slip-ups is the public’s attitude towards them: at first some were noticed, even talked about, but they were very quickly forgotten because, quite frankly, there was no time to waste on such small details when there were so many presentations to attend, people to talk to and impressions to share. Recalling Mihai’s recipe for success, shared in the last issue, “obviously the food and coffee are very important, but clearly content comes first”, it is worth mentioning that I can barely remember what I had for lunch, because lunch was full of debates, discussions, exchanges of impressions and attempts to steal a few answers from one speaker or another. Content comes before food, coffee or slip-ups, which is why the only real logistical issue was the poor Internet connection that interfered with content delivery. Since the mood was set by the content, it is time to take a look at the team of speakers. I insist on the word “team” because that is the image they project, from their willingness to help (one example being the two extra talks held by Lino and Paula to fill in for a missing or sick colleague) to their excellent chemistry: Tim joking with speakers in the audience, Tibi to be precise, to drown a misbehaving demo in laughter; or the administrator vs. developer game staged by Paula and Tibi. Not least is the easy and casual way in which all attendees, both the speakers up front and those sitting in the audience, completed each other’s ideas and concepts in the open community panel, managing to build and relay a common, coherent message. Due to limited space, there is no point in going into a detailed description of each talk; however, you are invited to watch them all on the event site and form your own opinion.
I want to close by stating that both the individual skills of the presenters, each of whom managed to compensate for any misbehaving slide or overrun of the allotted time with charisma and with the public on their side, and their team spirit and unity built a solid conference, full of quality content. I warmly recommend the following editions.


Impressions TechEd North America 2012

TechEd? A dream come true. It was the first event of this scale that I attended and I can say it was sensational. In technical terms, “it’s a must”. In my opinion, it is an event that every Dev / Architect / Project Manager / Sysadmin / QA / etc. would want to attend.

Leonard Abu-Saa

leonard.abu-saa@arobs.com System Architect Arobs

The event ran over 5 consecutive days, starting with the so-called Pre-Conference Seminars; the name is descriptive enough, so I will not go into details. Monday and Tuesday each had a keynote session, which I describe below.

Monday’s keynote

It covered the cloud platform: private and public cloud. The topics were System Center and Server 2012. I must say I was amazed both by the team’s achievements and by System Center and Server 2012. A demo that impressed me was the one on Server 2012 and Hyper-V. It highlighted access to resources, i.e. copying a 10 GB file. If copying used to take ... a few minutes, well, it now takes seconds! The speed is about 1 GB/sec! Awesome!

Tuesday’s keynote

On Tuesday Windows 8 was presented, or at least we “scratched the surface”, as Antoine Leblond said. The presentation convinced me. After the keynote I went into the lobby to find out the price of the Samsung tablet with Windows 8 and when it will be commercially available. It looks absolutely flawless and behaves the same. It works both with touch devices and with keyboard and mouse. In other words, it combines business with pleasure.

After the keynotes, participants had to choose (too many options, in my opinion) among a number of areas:

1. Architecture & Practices (AAP)
2. Database & Business Intelligence (DBI)
3. Developer Tools, Languages & Frameworks (DEV)
4. Exchange & Lync (EXL)
5. Management (MGT)
6. Office
7. Office 365 & SharePoint (OSP)
8. Security & Identity (SIA)
9. Virtualization (VIR)
10. Windows Azure (AZR)
11. Windows Client (WCL)
12. Windows Server (WSV)

In total there were hundreds of sessions in these areas. And if we think that we had 4 days for them ... well, I leave you to ponder. Between sessions there was a half-hour break, which went incredibly fast: the walk to the north or south hall at the opposite end, plus a juice on the way.

Event Location

Orange County Convention Center - a huge building. I do not want to exaggerate, but walking from the lobby of the south building to the north hall took about 5-7 minutes (depending on how rushed I was). Temperature: -10 °C. Obviously I exaggerate, but it was very cold inside.

Conclusions

If at first I was puzzled that the event ends on Thursday, well, that was soon explained. It was really very nice, but the amount of information was considerable. A busy daily schedule, 8:00 a.m. to 6:00 p.m., and the amount of information made fatigue felt. An extra day would have been too much. The final event was not without surprises. Participants had free entrance to Universal Studios, a very nice amusement park in Orlando, where they could enjoy the adrenaline-filled rides or attend a concert by the rock band “Nerds”. The curtain fell late Thursday night, after a great show. Good night, TechEd 2012! See you next year!


Local communities

The community section commits to keeping track of the relevant groups and communities from the local IT industry and to offering a calendar of upcoming events. We start with a short presentation of the main local initiatives, and we intend to grow this list until it contains all relevant communities, both local ones and national or international ones with a solid presence in Cluj. The order is determined by the number of members and the number of activities relative to each group's lifespan; in this way we strive for a ranking that reflects the involvement of both organizers and members.

Transylvania Java User Group
Java technologies community.
Website: http://www.transylvania-jug.org/
Started on: 15.05.2008 / Members: 489 / Events: 38

Romanian Testing Community
Community dedicated to QA.
Website: http://www.romaniatesting.ro
Started on: 10.05.2011 / Members: 497 / Events: 1

Cluj.rb
Ruby community.
Website: http://www.meetup.com/cluj-rb/
Started on: 25.08.2010 / Members: 109 / Events: 26

The Cluj Napoca Agile Software Meetup Group
Community dedicated to Agile development.
Website: http://www.agileworks.ro
Started on: 04.10.2010 / Members: 230 / Events: 13

Cluj Semantic WEB Meetup
Community dedicated to semantic technologies.
Website: http://www.meetup.com/Cluj-Semantic-WEB/
Started on: 08.05.2010 / Members: 117 / Events: 17

Romanian Association for Better Software
Community dedicated to IT professionals with extensive experience in any technology.
Website: http://www.rabs.ro
Started on: 10.02.2011 / Members: 159 / Events: 6

Google Technology User Group Cluj-Napoca
Community dedicated to Google technologies.
Website: http://cluj-napoca.gtug.ro/
Started on: 10.12.2011 / Members: 25 / Events: 7

Cluj Mobile Developers
Community dedicated to mobile technologies.
Website: http://www.meetup.com/Cluj-Mobile-Developers/
Started on: 08.05.2011 / Members: 39 / Events: 2

Calendar

August 30 – September 1
ICCP2012 (8th International Conference on Intelligent Computer Communication and Processing)
Contact: http://www.iccp.ro/iccp2012/

September 5
Microformats
Contact: http://www.meetup.com/Cluj-Semantic-WEB/

TSM recommendation

Open Coffee Club
Contact: https://www.facebook.com/opencoffeecluj
A club intended for entrepreneurs and for those who want to learn from the experiences of successful people. We recommend this group for its informal atmosphere, networking and quality guests.



programming

The analysis of Object-Relational Mapping (ORM) mechanism with Hibernate examples

Anghel Contiu

anghel.contiu@threepillarglobal.com Senior software developer, Three Pillar Global

Object/Relational Mapping (ORM) is a programming technique that gives programmers a means of accessing and manipulating application objects without having to care where these objects come from. The technique emerged from the need to bridge the paradigm difference between the object oriented model (supported by current high level programming languages) and the relational model (used by the most popular database management systems). This problem is also referred to as the object-relational impedance mismatch.

Object oriented programming languages represent data as an interconnected graph of objects, while relational database management systems use a tabular representation. Associating the class attributes of an object oriented programming language with the columns of a database table takes considerable effort, and the purpose of ORM is to reduce this effort by creating a natural relationship between the two models, one characterized by transparency, reliability and longevity. The IT industry has not yet arrived at a final solution to the paradigm mismatch that everybody approves of, but the general feeling is that ORM frameworks represent a step forward.

The automatic persistence of objects into database tables using an ORM framework consists of mapping the objects to tables, with the association between them described by metadata. A complete ORM framework should have the following features:
• an API for CRUD (create, read, update, delete) operations;
• a query language that addresses the persistent classes and their attributes;
• means for facilitating the definition of metadata for the mappings of objects to tables;
• a consistent approach to transactions, caching and class associations;
• optimization techniques with respect to the nature of the application.
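The idea that the association between objects and tables is described by metadata can be illustrated with a minimal sketch. The Table and Column annotations below are hypothetical stand-ins defined inside the example (they are not the JPA or Hibernate annotations), and the generated statement only shows the principle; a real ORM framework does considerably more.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MappingSketch {

    // Hypothetical metadata annotations, defined only for this illustration.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Table { String name(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Column { String name(); }

    // A persistent class: the metadata says which table and columns it maps to.
    @Table(name = "CUSTOMER")
    public static class Customer {
        @Column(name = "ID") long id;
        @Column(name = "FULL_NAME") String name;
        String notPersisted; // no metadata, so it is ignored by the mapping
    }

    // Builds a parameterized INSERT statement from the class metadata.
    public static String insertSql(Class<?> type) {
        Table table = type.getAnnotation(Table.class);
        if (table == null) {
            throw new IllegalArgumentException(type + " has no table metadata");
        }
        List<String> columns = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column != null) {
                columns.add(column.name());
            }
        }
        // Reflection does not guarantee field order, so sort for a stable result.
        Collections.sort(columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table.name()
                + " (" + String.join(", ", columns) + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // Prints: INSERT INTO CUSTOMER (FULL_NAME, ID) VALUES (?, ?)
        System.out.println(insertSql(Customer.class));
    }
}
```

Changing a column name here means changing one annotation, whereas hand-written SQL would have to be updated in every statement that mentions the column.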

The purpose of this article is to discuss the challenges and the benefits of using an ORM framework by stating arguments in favor and against its use. The Java language and the Hibernate framework are used for exemplification purposes as Hibernate is one of the most popular ORM frameworks among the developers nowadays. Figure 1 describes the components used by the ORM mechanism.

ORM Challenges

Concurrency and transactions

One of the issues that ORMs must confront is access to shared data in a highly concurrent environment. This problem tends to get complex when it is not carefully considered. The solution should be simple, it should obey the project specifications and it must enforce data integrity. Some data manipulation techniques that require careful consideration of resource sharing are discussed below; they should be familiar to developers working in a concurrent environment.

Figure 1 – The components involved in the ORM mechanism. The entities are persisted to the corresponding tables using the ORM framework

Optimistic concurrency

The resource locking technique emerged as a means of ensuring that data is not altered between the time it is read and the time it is used. Optimistic locking is based on the assumption that several transactions can run simultaneously without any of them altering the data used by the others, while pessimistic locking implies locking some of the resources for a longer period of time. Figure 2 describes the fundamental idea behind optimistic locking.

Optimistic control suits applications that have scalability and concurrency control as non functional requirements, because it works well for scenarios with many reads and comparatively few conflicting writes. For an implementation based on the JDBC API only (without any ORM framework), the developer is forced to take responsibility for verifying data integrity. One could implement an object versioning algorithm, but this is feasible for trivial scenarios only. In complex scenarios the developer has to deal with large object graphs that are hard to maintain. There are some popular patterns that address this kind of problem, but they are difficult to implement from scratch:
• session per conversation
• session per request
• session per request with detached objects

ORM frameworks are able to simplify this task. As an example, Hibernate offers automatic version checking, extended sessions and detached objects for handling this problem.

Automatic versioning. Hibernate uses a version number, specified as an integer or a timestamp, in order to detect conflicting object modifications and to prevent the loss of data. One of the object’s properties is marked as a version property and its value is incremented each time the object is modified (when it gets “dirty”).

Figure 2: Optimistic locking
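The version check behind optimistic locking can be sketched in plain Java. This is only an illustration under stated assumptions: the Account, AccountStore and StaleObjectException names are made up for the example, an in-memory map stands in for the database table, and the code is not how Hibernate implements versioning internally.

```java
import java.util.HashMap;
import java.util.Map;

public class OptimisticLockSketch {

    // An immutable snapshot of a row, including its version counter.
    public static final class Account {
        public final long id;
        public final int balance;
        public final int version;

        public Account(long id, int balance, int version) {
            this.id = id;
            this.balance = balance;
            this.version = version;
        }
    }

    // Signals that the row changed after the caller read it.
    public static class StaleObjectException extends RuntimeException {
        public StaleObjectException(String message) {
            super(message);
        }
    }

    // Stand-in for a database table keyed by id.
    public static class AccountStore {
        private final Map<Long, Account> rows = new HashMap<>();

        public synchronized void insert(Account account) {
            rows.put(account.id, account);
        }

        public synchronized Account read(long id) {
            return rows.get(id);
        }

        // Similar in spirit to:
        // UPDATE account SET balance = ?, version = version + 1
        //  WHERE id = ? AND version = ?
        public synchronized Account update(Account readCopy, int newBalance) {
            Account current = rows.get(readCopy.id);
            if (current == null || current.version != readCopy.version) {
                throw new StaleObjectException(
                        "Account " + readCopy.id + " was modified concurrently");
            }
            Account next = new Account(readCopy.id, newBalance, current.version + 1);
            rows.put(next.id, next);
            return next;
        }
    }

    public static void main(String[] args) {
        AccountStore store = new AccountStore();
        store.insert(new Account(1L, 100, 0));

        Account readByA = store.read(1L); // transaction A sees version 0
        Account readByB = store.read(1L); // transaction B sees version 0

        store.update(readByA, 150);       // A commits first, version becomes 1
        try {
            store.update(readByB, 90);    // B still holds version 0 and is rejected
        } catch (StaleObjectException e) {
            System.out.println("Conflict detected: " + e.getMessage());
        }
    }
}
```

No lock is held between the read and the write; the conflict is only detected at update time, which is why the approach scales well when conflicts are rare.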

Extended sessions. Hibernate checks the objects’ versions at the time the objects’ states are synchronized, and an exception is thrown if a concurrent modification is observed.

Detached objects. These are objects that are not attached to a session; when they get reattached to one, their version is checked against the database version at synchronization (flush) time. An exception is thrown if a concurrent modification is noticed.

Performance

Queries

One of the most discussed topics when it comes to ORM performance is the efficiency of the automatically generated SQL queries. It is often said that manually written SQL queries are much more efficient.

Considering a general application performance analysis that is not focused on queries only, one can notice that the critical points are not necessarily the generated queries. Moreover, even if a few generated queries are slower than expected, their number is limited and they do not have a major impact on overall application performance. Where the generated queries are too slow, the developer can intervene by changing the query parameters or writing his own SQL query, although he should pay attention to the maintenance issues this may cause later. The logical separation of the code responsible for retrieving data from the database provides an important advantage when it comes to maintenance: when changes must be made, their number is limited because the code is not duplicated.

As the application develops, one can avoid premature query optimization. Using an ORM framework leads to faster code writing and better readability, while code duplication is easier to avoid. As the number of queries increases, benchmark tools can be used to check their speed, to observe the critical points and to address potential performance issues at different points in time. Other important techniques that help boost performance in database communication are caching (see chapter 3) and database indexing. The expectation is to reach acceptable performance, with the benefits of using an ORM framework outweighing its drawbacks.

Entity mapping

The ORM’s performance and the developer’s productivity also depend on the efficiency of the entity mapping. A wrong mapping could lead to:
• the generation of slow queries
• memory overloading caused by loading unnecessary objects
• complex mapping scenarios that lower the developer’s productivity

In order to use an ORM framework efficiently, the developer needs to know a few things about the framework’s internal structure so that he is able to avoid the risks listed above. A common high risk scenario is characterized by:
• high coupling between the entities, with attributes that are collections of persistent entities;
• relationships of type 1 to n, m to n, n to 1 between the entities;
• cascading on more than one level;
• lack of optimizations when it comes to data fetching.

When dealing with this kind of scenario, chances are that a slight change in one of the mappings could have an unexpected impact on a different area of the application. Tests help in detecting this, but the challenge is to find a solution that obeys all mapping constraints, which is not easy. This is why the developer must focus on keeping things simple while mapping the entities. Simple mappings and simple relations between the entities are a huge advantage when it comes to maintenance, because of the lower number of constraints.

In memory data caching

Caching is one of the ways to improve application performance, and its main objective is to store data in memory. Trips to the database are avoided this way, as those particular pieces of data were already queried and are present in memory.

In memory caching using the JDBC API

The JDBC API (Java Database Connectivity API) offers a universal way of accessing data in Java through its two packages: java.sql and javax.sql. The JDBC API provides few solutions for caching, one of them being the javax.sql.rowset.CachedRowSet interface, which is a container that holds rows from the database in memory and also a JavaBeans component that allows scrolling, updating and serialization. The default implementation allows the retrieval of data from a result set, but this can be extended to obtain data from other tabular sources as well. While working with such an object, a connection to the database is only necessary when the data needs to be synchronized.

Based on the JDBC API implementation, a number of different caching solutions were developed, and they provide more options for data caching:
• The Oracle extension for JDBC – provides an interface and implementation for statement cache
• The PostgreSQL implementation for JDBC – provides a wrapper for statement cache over the default implementation offered by JDBC.

Additional solutions exist, offered by third parties, which can be integrated into a JDBC based application. Such examples are:
• Commons JCS
• Ehcache

In memory caching using an ORM framework

In the world of ORM, both the database connection and the JDBC API are transparent to the developer. The persistent objects are manipulated at the ORM level. Hibernate provides a first-level cache that is associated with a session and is used by default, while the second-level cache is optional and is associated with the session factory (Figure 3). The first-level cache is used for reducing the number of queries within a session; the cached data can only be used by the session that queried it.

Figure 3 – Levels one and two of the Hibernate caching

The second-level cache is used at session factory level and offers the possibility of storing data from multiple sessions, meaning that any session object is allowed to access the cached data. Hibernate supports 4 open source implementations of the second-level cache:
• Ehcache
• OSCache
• JBoss TreeCache
• SwarmCache

By providing a default first-level cache, Hibernate already brings an important improvement in database access performance over the plain JDBC API. The second-level cache, usually used for optimizing applications that already perform reasonably well, increases the application’s performance even further and also offers the developer the possibility to choose the most appropriate of the available caching implementations.

The extensive use of reflection

One of the topics said to cause a performance drop for ORMs is the extensive use of reflection. According to the Oracle documentation, some optimizations done by the virtual machine cannot be performed because reflection involves dynamically resolved types. This is why operations done through reflection are slower, and the recommendation is to avoid performing them repeatedly, especially in performance-sensitive applications.

The question is how much reflection operations slow down the application and whether the compromise between speed and flexibility is worthwhile. Considering the reflection operations and their execution times on one side, and the other common performance problems of database communication (such as database architecture, network latency and slow queries) on the other side, the reflection operations are not decisive for performance problems. One could consider optimizations for an application that already performs well if tests show that reflection operations account for an important percentage of the time needed for operating on the database, but chances are this will not happen.

Reflection operations tend to become more efficient, as they are supported, among other factors, by the development of bytecode manipulation tools like Javassist or ASM. This direction is highlighted by events like the adoption of the Javassist library by the Hibernate framework and the use of the ASM library by tools like Oracle TopLink, Apache OpenEJB, AspectJ and many others. These tools are useful both for optimizing reflection operations and for enhancing aspect oriented programming.
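To get a rough feeling for the cost discussed above, the following sketch reads the same property once through a direct call and once through java.lang.reflect. The Person class and the iteration count are arbitrary choices made for the example. This is a naive timing loop, not a reliable benchmark; a dedicated harness such as JMH would be needed for trustworthy numbers, and the results vary with the JVM and hardware.

```java
import java.lang.reflect.Method;

public class ReflectionCostSketch {

    // A trivial persistent-style class with one getter.
    public static class Person {
        private final int age;

        public Person(int age) {
            this.age = age;
        }

        public int getAge() {
            return age;
        }
    }

    // Reads the property through a normal, statically resolved call.
    public static long sumDirect(Person person, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += person.getAge();
        }
        return sum;
    }

    // Reads the same property through reflection, as a mapping layer might.
    public static long sumReflective(Person person, int iterations) {
        try {
            // The Method object is looked up once and reused on every iteration.
            Method getter = Person.class.getMethod("getAge");
            long sum = 0;
            for (int i = 0; i < iterations; i++) {
                sum += (Integer) getter.invoke(person);
            }
            return sum;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective access failed", e);
        }
    }

    public static void main(String[] args) {
        Person person = new Person(30);
        int iterations = 1_000_000;

        // Warm-up runs so that the JIT compiler has a chance to kick in.
        sumDirect(person, iterations);
        sumReflective(person, iterations);

        long start = System.nanoTime();
        long direct = sumDirect(person, iterations);
        long middle = System.nanoTime();
        long reflective = sumReflective(person, iterations);
        long end = System.nanoTime();

        System.out.println("direct:     " + direct + " in " + (middle - start) / 1_000 + " us");
        System.out.println("reflective: " + reflective + " in " + (end - middle) / 1_000 + " us");
    }
}
```

Whatever difference shows up here should be compared with the time of a single database round trip, which is the comparison the paragraph above argues for.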

Maintenance

Maintenance is an important topic when choosing whether or not to use an ORM framework. An efficient SQL query adds performance, but one should also notice the compromise with respect to code maintenance. Developers are usually familiar with SQL and relational databases, so using them directly is always an option. Hand-written SQL, however, results in many lines of code for CRUD operations, which take a lot of time to write. Code maintenance gets difficult when a model class is modified, as the change has an important impact on the SQL code. The JDBC API and SQL offer a command-based approach to manipulating data inside the database, so the tables and columns involved must be specified several times (for insert, select, update), leading to a lack of flexibility, more code and more implementation time. Writing the SQL code manually leads to a dependency between the table structure and the code. Any modification on one of these sides will impact the other, so in the end maintenance is difficult and the developer is forced to address these problems in inelegant ways.

The integration of ORM frameworks with the object-oriented programming model has the advantage of encouraging a design where the data access code has its own place, often being written inside its own component. This implies less duplicated code and better reusability, while the developer can interact with objects and use inheritance, polymorphism, design patterns and other best practices that are suitable for object-oriented languages. It is true that for small applications, where the business domain is trivial, having queries that are closer to the database might be more efficient, meaning that the developer could use the JDBC API directly, get the job done and forget about the effort of integrating the ORM framework.

Conclusions

The decision to use an ORM framework, and the choice of one, must be carefully considered with respect to the project's specifications. Here are some of the advantages and disadvantages of using an ORM.

The learning curve for an ORM framework is one of the first problems a programmer will deal with, followed by the difficulties of mapping the entities efficiently. The mappings must be done manually, taking into consideration the way the relationships between the persistent classes will impact the table structures. The learning curve will pay off (at least theoretically) in the long run, especially when medium and large projects are developed. As soon as the ORM mechanism is understood, the development time gets shorter while the developer's productivity increases.

Experience with an object-oriented language is not enough for a developer to use an ORM framework efficiently. ORM frameworks have the potential of adding a lot of value, but the developer must know about the relational model, SQL and the details of the chosen framework in order to deal with sensitive problems like performance or concurrency. The final purpose is to increase productivity and performance while dealing with persistent data.

Bibliography

1. Michael Keith, Randy Stafford, "Exposing ORM Cache" (http://queue.acm.org/detail.cfm?id=1394141), 1 May 2008
2. K. L. Nitin, Ananya S., Mahalakshmi K., and S. Sangeetha, "iBatis, Hibernate and JPA: Which is right for you?" (http://www.javaworld.com/javaworld/jw-07-2008/jw-07-orm-comparison.html?page=7), JavaWorld.com, 15 July 2008
3. Mahmoud Parsian, "JDBC Metadata, MySQL, and Oracle Recipes: A Problem-Solution Approach", 13 March 2006
4. Christian Bauer, Gavin King, "Java Persistence with Hibernate", 24 November 2006
5. Hibernate Community Documentation, "Hibernate – Relational Persistence for Idiomatic Java" (http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/index.html)
6. Oracle Java SE Documentation, http://docs.oracle.com/javase/6/docs



programming

SEO QA latest Google algorithm updates

In recent years, Google has made a number of changes to its search algorithms. In 2012 there have been at least two updates every month, and this has had a devastating effect on a lot of websites. The controversial Penguin and Panda updates delisted even websites with good and useful content. Today's SEO depends not only on content optimization and link building but also on other factors that have recently become relevant, like user experience or social media presence. We always need to stay informed to keep up with Google's updates. Let's now look at the most important algorithm updates of the last year and a half and at how they changed Internet search.

Radu Popescu
rpopescu@smallfootprint.com
QA and Web designer, Small Footprint

Firstly, we have to understand why we refer to Google when we talk about SEO, without mentioning Yahoo or Bing. Between January 2011 and now, Google has had a market share of over 90% (over 98% in Romania), and it has not dropped below 88% in the last 6 years, so we have to focus our efforts on Google.

Panda update and Artificial Intelligence

Since the beginning of 2011, with the Panda update, which is based on an advanced machine learning algorithm, Google has started to classify websites based on more advanced and hard-to-quantify criteria like design and user experience, the percentage of content above the fold, trust factor and social media buzz. The algorithm is based on the well-known "human raters", real people paid by Google to review websites on the mentioned criteria. This algorithm downgraded even old websites with good and useful content, which ranked well on SERPs but didn't offer a good user experience. The Panda update also negatively impacts websites that have only a few pages which don't follow these guidelines. In this case, a solution is to block all the low quality pages using meta robots.

The old saying "Content is King" doesn't have the same value now. There is a lot of good and unique content online, but this is not enough to rank well. A good website is one which offers the user a nice experience and is present on social media or has good inbound links. My recommendations to keep up with this update are the following:
• All the pages that have little content or excessive ads should be blocked from indexing with meta robots;
• Sharing options lead to social media mentions, which can improve the ranking on SERPs. The +1 button is very useful;
• Keep the bounce rate between 35-65% through good internal linking;
• Improving the user experience, navigability and load speed can influence the site's position in Google results.

Freshness factor

The Freshness update, introduced in November 2011, has had a very big impact on search results, affecting about 35% of searches. At this time, the best way to have fresh content is with blogs. With company blogs it often seems almost impossible to add new content regularly, so the blog gets stale. One remedy is for employees to write weekly articles about new technologies they are using or new trends in their domain.

Of course, we don't always need fresh content. For example, if we are searching for a tutorial on the CSS "float" property, we don't need the newest article, but the best one. Google has learned to analyze a search query and decide whether it needs fresh results or not. The engineers also took into consideration the fact that some people will try to update a few words on their websites so that they look "fresh" all the time. Google looks at the date when a page was first indexed, and other small content updates are not processed as fresh content. Here are a few methods that you can use to create new content regularly:
• A company or product blog helps in creating new content regularly;
• Press releases added to a section of the website also generate fresh content;
• When a page is shared on different social media platforms in a short period of time, Google sees it as new and interesting content.

Penguin Update and spam

The Penguin Update's main purpose was to downgrade and delist websites which don't follow Google's policy regarding black hat SEO. This update removes from search results all websites that use duplicate content, link farms, cloaking or keyword stuffing. In the same manner, websites that use white hat SEO are penalized if the optimization is excessive. Basically, anything that doesn't look natural is affected. Penguin had a huge impact on the affiliate niche, because most of the site owners used the same descriptions and images as the manufacturer websites. Also, a lot of ecommerce websites which didn't use original content in their product presentations were penalized and had to come up with ways to create original content. The solution was to add user generated content to their pages: reviews, comments on the product and ratings.

Figure 1. WPMU.org traffic evolution before and after Penguin update (from webpronews.com)

One of the most famous cases of a Penguin update downgrade is that of WPMU.org, one of the largest websites offering news and resources for the WordPress platform. After the Penguin update, the site had a drop in organic traffic of about 80%, which led to a massive drop in earnings. The main issue in this case was the low quality inbound links, which came from two major sources. WPMU.org also runs a blogging service, Edublogs.org, something similar to Blogger but for the educational niche only. On every page of every blog there was a link to WPMU.org in the footer; in total over 500 thousand artificial links, which were detected by Google.

Besides this, every WordPress theme that you can download from WPMU.org had a link to the template page. This led to another 21 thousand links coming from independent domains (blogs where people used the themes), which apparently were spamming or hosting low quality content.

The new Penguin algorithms considered all these links to be low quality, and WPMU.org was penalized. After removing over 30% of all inbound links and doing a small internal SEO cleanup, the problem was solved.

What to expect next?

Through all the mentioned changes, Google is trying to take Internet search to a new level. Quality is the keyword of the new updates, and by quality we are referring to both content and design. Most likely, social networks will play a very important role in SEO, and Google+ will be at the center of all these changes. We know for sure that in order to have good results in SEO, we need to stay informed all the time and update our strategy and actions each time new updates appear. Taking into consideration that every action on the Internet can affect our SERP ranking, the tendency shows that the SEO consultant job will become a web strategist one.



management


FPA in practice

After publishing an article on the subject in a previous issue, I accepted the invitation to explore the practice of FPA in detail. The previous article aimed to explain the FPA method broadly: its applicability, how it can be used and the basic terminology, along with a brief example. In this article, I'll focus more on how to apply the method.

Ionel Mihali

Ionel.Mihali@isdc.eu
QA Officer

I decided that, for a better understanding, I will take an application that is already built (so as not to work from an estimate based on functional requirements), to which we can apply the detailed measurement method.

For this article I wanted to count an application with more or less standard functionality, so that we can focus on explaining the method rather than the functionality, and I needed an application without confidentiality constraints, given the need to also analyze and count the database. Therefore, I downloaded from the website http://www.comersus.com/ an application that works on the shopping cart principle. The following chapters go through the main steps in executing the count. A number of abbreviations will be used:
• FTR – File type referenced
• RET – Record element type
• DET – Data element type
• EI – External Input
• EO – External Output
• EQ – External Inquiry
• ILF – Internal Logical File

These terms will not be explained in this article; if you have problems understanding them, see the article "Function Point estimation method" published in the previous issue of the magazine: http://www.todaysoftmag.com/article/en/3/Function_Point_Analysis_57.

Collecting documentation. As mentioned above, the count is executed using the application itself, and the following chapters include several images for each functionality.

Determining the type of count. We will use the detailed count.

Determining the boundary. In our case the application does not communicate with other applications, so we don't have to ask ourselves whether we have data coming from or sent to other systems. For reasons of time and space I decided to count only a part of the system functionality.

Identify the data functions

In Fig. 1 we have a series of tables that are grouped into a set of data functions:

ILF_Products: It is an internal logical file because it is inside the application boundary and the maintenance of the tables is done by the counted application. This file contains the tables "categories", "products", "relatedproducts", "cartrows", "cartrowsoptions". We grouped them this way because there are dependencies between them; for example, a product belongs to a category.

ILF_Messages, ILF_Settings: They are two data files with one table each.

Key-key entities: These are tables which don't store data with user value. Usually these tables are only used for setting relations, and in the FPA terminology they are called "key-key entities", so they are excluded from the count.

Figure 1. Identifying the data functions

Identifying the user functions

The application has two main parts: an administration part and a navigation / product ordering part. Below, a number of features were chosen and illustrated with screenshots. We'll take each picture, explain the functionality implemented and explain how to count it.

As the name suggests, Figure 2 defines the system settings. This feature contains an EI, which is the functionality to save settings in the table "settings". The number of elements (DET) is 11.

Figure 2. Defining the settings

In Figure 3, the user receives a message as a result of the save action. This is an EQ with a single DET returned from the "screenmessages" table.

Figure 3. User message

Figure 4 shows the implementation of the update and delete functionalities for a product category in the table "categories". We have two EI with two DET each.

Figure 4. Category maintenance

Figure 5 shows the implementation of the functionality of adding a product to the table "products". We have an EI with 18 DET.

Figure 5. Add products

Figure 6 shows the implementation of the functionality of listing the products that exist in the table "products". We have an EO with 5 elements: SKU, description, number of visits, price, image.

Figure 6. Products list

Determining the complexity and calculating the number of function points

Having the functions and elements determined above, we now have to determine the complexity of each function and the number of points associated. These are listed in the table below (Fig 7). There is an exception to the rule, saying that in addition to the existing user functions there also has to be a count of one EQ, one EO and one EI for the FPA tables. The complexity is determined based on the number of DET and RET for data functions and the number of FTR and DET for user functions. The meaning of these abbreviations is explained in detail in the article from the previous issue.

The example in this article is for informational purposes only. Because this method of counting has a margin of error, like any method, the literature does not recommend applying it to systems that may have fewer than 100 function points.
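The lookup from DET and FTR/RET counts to a complexity level and a number of points can be sketched in code. The article does not print the matrices, so the band limits and weights below are an assumption taken from the commonly published IFPUG tables, and the FTR value of 1 used for every user function in main is a guess, not something stated in the text. The total it prints covers only the user functions from Figures 2 to 6 and is not meant to reproduce Figure 7.

```java
final class FunctionPointSketch {

    enum Complexity { LOW, AVERAGE, HIGH }

    // Band limits and weights per function type, ASSUMED from the standard IFPUG tables.
    enum FunctionType {
        // refs <=, refs <=, DET <=, DET <=, weight low / average / high
        EI(1, 2, 4, 15, 3, 4, 6),
        EO(1, 3, 5, 19, 4, 5, 7),
        EQ(1, 3, 5, 19, 3, 4, 6),
        ILF(1, 5, 19, 50, 7, 10, 15);

        final int refLow;   // upper limit of the first FTR (or RET) band
        final int refMid;   // upper limit of the second FTR (or RET) band
        final int detLow;   // upper limit of the first DET band
        final int detMid;   // upper limit of the second DET band
        final int[] weights;

        FunctionType(int refLow, int refMid, int detLow, int detMid,
                     int low, int average, int high) {
            this.refLow = refLow;
            this.refMid = refMid;
            this.detLow = detLow;
            this.detMid = detMid;
            this.weights = new int[] {low, average, high};
        }
    }

    // Rows: FTR/RET band, columns: DET band. The same grid shape is used for all types.
    private static final Complexity[][] GRID = {
        {Complexity.LOW, Complexity.LOW, Complexity.AVERAGE},
        {Complexity.LOW, Complexity.AVERAGE, Complexity.HIGH},
        {Complexity.AVERAGE, Complexity.HIGH, Complexity.HIGH}
    };

    private static int band(int value, int low, int mid) {
        return value <= low ? 0 : (value <= mid ? 1 : 2);
    }

    // refs is the FTR count for EI/EO/EQ and the RET count for ILF.
    static Complexity complexity(FunctionType type, int refs, int dets) {
        return GRID[band(refs, type.refLow, type.refMid)][band(dets, type.detLow, type.detMid)];
    }

    static int points(FunctionType type, int refs, int dets) {
        return type.weights[complexity(type, refs, dets).ordinal()];
    }

    public static void main(String[] args) {
        int assumedFtr = 1; // guess: each user function touches a single file
        int total = points(FunctionType.EI, assumedFtr, 11)      // save settings (Figure 2)
                  + points(FunctionType.EQ, assumedFtr, 1)       // user message (Figure 3)
                  + 2 * points(FunctionType.EI, assumedFtr, 2)   // update and delete category (Figure 4)
                  + points(FunctionType.EI, assumedFtr, 18)      // add product (Figure 5)
                  + points(FunctionType.EO, assumedFtr, 5);      // products list (Figure 6)
        System.out.println("Function points for the user functions shown: " + total);
    }
}
```

Keeping the limits and weights in one table per function type makes it easy to swap in the values of whichever counting manual is actually being followed.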

Conclusion

The measurement above is aimed at determining the productivity of an already built system (hours / function point) that can be used to estimate similar applications to be developed in the future.
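The way such a productivity figure feeds a later estimate is plain arithmetic, sketched below. All numbers in main are invented for the illustration; the article does not give the hours spent on the counted application.

```java
final class ProductivitySketch {

    // Productivity of a finished project, expressed in hours per function point.
    static double hoursPerFunctionPoint(double actualHours, int functionPoints) {
        if (functionPoints <= 0) {
            throw new IllegalArgumentException("functionPoints must be positive");
        }
        return actualHours / functionPoints;
    }

    // Effort estimate for a similar application, given its counted size.
    static double estimateHours(double hoursPerFunctionPoint, int plannedFunctionPoints) {
        return hoursPerFunctionPoint * plannedFunctionPoints;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 1200 hours spent on a system counted at 150 function points.
        double productivity = hoursPerFunctionPoint(1200.0, 150);
        // Hypothetical new application counted at 200 function points.
        double estimate = estimateHours(productivity, 200);
        System.out.println(productivity + " hours / function point");
        System.out.println(estimate + " hours estimated");
    }
}
```

The estimate is only as good as the similarity between the two applications (team, technology, type of functionality), which is why the text speaks of similar applications.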

Figure 7. Total number of function points



programming

Microsoft SQL Server Optimizing performance

How many times have you encountered the issue of improving the performance of your SQL Server queries? Did you know how to tackle the problem? Myself, I encountered the issue enough times before finally understanding and putting into practice a SQL Server functionality that solves the challenge, at least in part.

Cosmin Cristea
cosmin.cristea@endava.com
CLD Head of Development, Endava

We all know that in the development stage major performance issues are rare, and that happens for different reasons: low data volume, a perfect development environment, little variation in scenarios and user stories, etc.

The solution is based on the statistics built by SQL Server. You can use them to find the stress applied to the server, but the application needs to run through a cycle that is as close as possible to real user scenarios (not development ones). The best scenarios are those where the application runs in a production environment, or in one as close as possible to a real one.

What are statistics? Wikipedia states: "Statistics is the study of the collection, organization, analysis, and interpretation of data". Well, I didn't like statistics in college either.

Specifically, on a database there's an option, set at creation, that updates the statistics for each table. They're used to optimize SQL queries and to create the execution plans. The information stored regards table usage, and is enabled by activating the "autocreate" option on the DB.

How do you use the information? Start from a system view:

    sys.dm_db_missing_index_group_stats

which exposes the missing-index statistics. The information is updated at each query execution and is deleted when the server restarts. So be careful about what dataset you're working on: it holds only data gathered since the server started. Two more views are needed:

    sys.dm_db_missing_index_details and sys.dm_db_missing_index_groups.

A query that gives pertinent values looks like this:

    select top 100
        priority = s.user_seeks * s.avg_total_user_cost * s.avg_user_impact,
        s.user_seeks,
        s.avg_total_user_cost,
        s.avg_user_impact,
        d.statement,
        d.equality_columns,
        d.inequality_columns,
        d.included_columns
    from sys.dm_db_missing_index_group_stats s
    join sys.dm_db_missing_index_groups g
        on s.group_handle = g.index_group_handle
    join sys.dm_db_missing_index_details d
        on g.index_handle = d.index_handle
    order by priority desc

A real-life result is shown in Figure 1:


Figure 1. Anonymized columns and tables

As you can see, the priority column is computed; it shows the total user impact on performance and how a possible index affects the table / queries. Let's analyze a few results:
• The first result suggests execution plans that do equality comparisons on the StatusID column from Table1. Since there are 30k queries with a strong impact on the user, it's worth considering adding an index on that column.
• The third result suggests an index on the same column, but for queries with inequality conditions (<, >, !=). Another reason to add an index on StatusID.
• The last result has some interesting characteristics. The impact is 98%. But if you look in more detail, the suggestion is a composite index on three columns, one of which is probably a primary key. Don't jump into a decision here, at least not until you make a more detailed analysis of the source of the data.

This is where the real work starts. You have to analyse each record separately. After a few dozen you get the hang of it, and find the ones that need more attention and the ones that don't affect the performance too much. Knowing the business rules is also useful, so the 80/20 rule can be applied.

Be careful about the number of indexes you use on a table, since there are disadvantages. Also, the statistics were not built for this purpose, but they can be used as input to the process. Here are some of the caveats:
• They are not intended for fine-tuning an indexing configuration.
• They can't gather statistics for more than 500 groups of missing indexes.
• They don't specify the order of the columns in an index.
• For inequality conditions the results are less precise.
• Included columns are not reported consistently. Further analysis is required.

Conclusion

As a simple conclusion, this is useful information to start the process. But don't start indexing everything; there is still work to do, since some analysis is required. It's your decision whether to use the data or not. For example, an index on a column with low data variation is useless, but at least you know where to start from.
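The priority value used to sort the suggestions is just the product of three columns returned by the system views. The sketch below recomputes and ranks it on the application side. The Suggestion class and the two sample rows are invented for the illustration; they are not the rows from Figure 1.

```java
import java.util.ArrayList;
import java.util.List;

final class MissingIndexPrioritySketch {

    // One row of the missing-index query, reduced to the columns used for ranking.
    static final class Suggestion {
        final String columns;
        final long userSeeks;           // user_seeks
        final double avgTotalUserCost;  // avg_total_user_cost
        final double avgUserImpact;     // avg_user_impact, a percentage

        Suggestion(String columns, long userSeeks, double avgTotalUserCost, double avgUserImpact) {
            this.columns = columns;
            this.userSeeks = userSeeks;
            this.avgTotalUserCost = avgTotalUserCost;
            this.avgUserImpact = avgUserImpact;
        }

        // Same formula as the computed "priority" column in the query.
        double priority() {
            return userSeeks * avgTotalUserCost * avgUserImpact;
        }
    }

    // Returns a copy of the rows sorted by priority, highest first.
    static List<Suggestion> rank(List<Suggestion> rows) {
        List<Suggestion> sorted = new ArrayList<>(rows);
        sorted.sort((a, b) -> Double.compare(b.priority(), a.priority()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Suggestion> rows = new ArrayList<>();
        // Hypothetical values, chosen only to show the effect of the formula.
        rows.add(new Suggestion("ColA, ColB, ColC", 100, 10.0, 98.0));
        rows.add(new Suggestion("StatusID", 30000, 2.5, 40.0));
        for (Suggestion s : rank(rows)) {
            System.out.println(s.columns + " -> " + s.priority());
        }
    }
}
```

With these sample values the frequently used suggestion outranks the one with the 98% impact, which illustrates why a high impact percentage alone does not justify an index.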



programming

RESTful Web Services using Jersey

Tavi Bolog
tavi.bolog@nokia.com
Development lead @Nokia

RESTful Web Services are services based on HTTP methods and the REST concept. There are usually four HTTP methods used to define RESTful services:
• POST: submit data to create or modify a resource. Repeated executions can have distinct effects (the operation is not idempotent).
• PUT: create or replace a resource at a given URI. Repeated executions have the same effect as a single execution (idempotent operation).
• GET: retrieve a resource without modifying it. The operation shouldn't be used for creating resources.
• DELETE: delete a resource. Repeated executions have the same effect as a single execution (idempotent operation).
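The difference between idempotent and non-idempotent operations can be shown without any HTTP at all. The sketch below is an in-memory stand-in for a "books" collection, written with plain Java collections; the class and method names are made up for the illustration and have nothing to do with the Jersey API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IdempotencySketch {

    private final Map<Integer, String> books = new LinkedHashMap<>();
    private int nextId = 1;

    // POST-like: the store picks a new id on every call, so repeating the call adds another entry.
    int post(String title) {
        int id = nextId++;
        books.put(id, title);
        return id;
    }

    // PUT-like: the caller names the resource, so repeating the call leaves the same state.
    void put(int id, String title) {
        books.put(id, title);
    }

    // GET-like: reads without changing anything.
    String get(int id) {
        return books.get(id);
    }

    // DELETE-like: after the first call the resource is gone; further calls change nothing.
    boolean delete(int id) {
        return books.remove(id) != null;
    }

    int size() {
        return books.size();
    }

    public static void main(String[] args) {
        IdempotencySketch store = new IdempotencySketch();
        store.post("REST in practice");
        store.post("REST in practice");      // same request twice: two entries
        store.put(10, "Jersey basics");
        store.put(10, "Jersey basics");      // same request twice: still one entry
        System.out.println("entries: " + store.size());
        store.delete(10);
        store.delete(10);                    // second delete has no further effect
        System.out.println("entries after delete: " + store.size());
    }
}
```

A client can therefore safely retry a PUT or a DELETE after a timeout, while retrying a POST may create a duplicate.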

The REST concept was first introduced in 2000 by Roy Fielding. Some important features of REST services are listed below:
• It's a simple architectural style based on Web and HTTP standards (gaining ground against models such as SOAP).
• It's a client-server architecture based on using resources, such as books, products, cars etc.
• Compared to SOAP, REST is not very rigid in terms of data types, it's easier to read, using nouns and verbs, and it is less verbose.
• Error handling is done according to the HTTP protocol.
• It's a scalable architecture due to the separation of responsibilities between client and server. For instance, the responsibility of a client is to maintain the user state, while the server has no such responsibility, only that of data management. Moreover, the server should not store a user's data between requests (stateless).
• The clients can use caching techniques to increase the performance and scalability of a solution.
• The server can have multiple layers, as long as changing them does not affect the clients in any way. This approach can lead to a separation of responsibilities across layers, such as load balancing, security, data storage and resource caching.
• The resources can have various representations (JSON, XML, user-defined etc). Which one is used is determined according to the HTTP protocol (for example, through HTTP headers).

A RESTful service typically defines base URIs (such as http://myserver.com/myresources), the MIME types that it creates / uses (JSON, XML, user-defined etc) and the related operations (shown above).

Java and RESTful Web Services

In Java, support for RESTful services is called the Java API for RESTful Web Services (JAX-RS) and is defined by JSR 311. JAX-RS relies heavily on annotations, as can be seen below. Jersey is an open source, production-quality implementation of JAX-RS that I have used in my projects. The example below was implemented using Jersey. It's worth mentioning that JAXB is used for the JSON and XML support of JAX-RS. Other implementations are Apache CXF, Restlet and JBoss RESTEasy.

A client-server implementation using Jersey

Let's try to build a small application using Jersey. The application will provide an interface for managing a "books" resource. I will also present a way of using the created web service through the client support provided by Jersey. The server side will expose a series of useful methods for managing the "books" resource: reset, get, list, delete and add. The client side will be created as a JUnit test and will expose a series of test methods that validate the functionality of the Web service.

Project Creation

The project was implemented using SpringSource Tool Suite (including the VMware vFabric Web Server), OpenJDK 7 and Maven 3 on Ubuntu 12. Jersey needs four libraries (plus their dependencies):
• jersey-server: contains Jersey's server-side implementation
• jersey-json: contains the JAXB support for Jersey
• jersey-servlet: contains support for the servlet framework
• jersey-client: contains Jersey's client-side implementation

Using jersey-server together with jersey-client is not mandatory under the definition of RESTful Web Services (this approach was only used to showcase more of Jersey's features).

Maven and pom.xml

Maven offers a built-in method to resolve library dependencies, so the usage of this tool is highly recommended. For further details, see: https://github.com/tavibolog/TodaySoftMag/blob/master/pom.xml.

Web.xml

Every application deployed in a servlet container needs a web.xml configuration file. It specifies the servlet that initializes Jersey and the RESTful services, along with the Java package holding the services, given as the value of the com.sun.jersey.config.property.packages parameter:

    <servlet>
      <servlet-name>Jersey REST Service</servlet-name>
      <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
      <init-param>
        <param-name>com.sun.jersey.config.property.packages</param-name>
        <param-value>com.todaysoftmag.examples.rest</param-value>
      </init-param>
      <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
      <servlet-name>Jersey REST Service</servlet-name>
      <url-pattern>/*</url-pattern>
    </servlet-mapping>

Servlet mapping specifies the URL pattern used to call Jersey's ServletContainer. For a request URL of the form:

    http://<server>:<port>/<AppName>/<JerseyDomain>/<WebService>/<method>

the mapping refers to <JerseyDomain>. If we use "/*" as the URL pattern, the request will be as follows:

    http://<server>:<port>/<AppName>/<WebService>/<method>

If we use "/home/*" as the URL pattern, the request will be as follows:

    http://<server>:<port>/<AppName>/home/<WebService>/<method>

How to define a domain class

The domain class defines the structure of the objects used by the Web services. As a rule, domain classes are POJOs. The @XmlRootElement and @XmlElement annotations are needed to enable JAXB's automatic mapping between XML or JSON and POJOs, in both directions. This support is offered by JAX-RS. If the domain class (Book in our case) defines a constructor, then the default constructor must be defined as well, since JAXB uses it for mapping.

    @XmlRootElement
    public class Book {
        @XmlElement(name = "id")
        String id;
        @XmlElement(name = "name")
        String name;
        @XmlElement(name = "author")
        String author;
        …
    }

The complete code of the domain class can be found here: https://github.com/tavibolog/TodaySoftMag/blob/master/src/main/java/com/todaysoftmag/examples/rest/Book.java.

Service implementation

Annotations are widely used by JAX-RS (as mentioned above). Most of them can be found in the javax.ws.rs.* and javax.ws.rs.core.* packages, which come with the jersey-core dependency. Let's discuss the most important ones.

To define a service, you only need to define a class in the Java package named in web.xml by com.sun.jersey.config.property.packages. In our case, this is com.todaysoftmag.examples.rest.

    package com.todaysoftmag.examples.rest;
    // imports are omitted
    @Path("/books")
    public class BookService {
        …
    }

As you can see, the class definition is preceded by the @Path annotation. This defines the path of the Web service. For the above example, the URL will be:

    http://<server>:<port>/<AppName>/books/<method>

The @Path annotation is also used for the methods exposed by the Web service:

    @PUT
    @Path("/add")
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    public Book add(Book book) {
        …
    }

In this case, the annotation value is used as part of the Web service URL for that method. In our case, the URL for adding a "book" will be:

    http://<server>:<port>/<AppName>/books/add
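How the servlet URL pattern, the class-level @Path and the method-level @Path add up to the final address can be sketched with plain string handling. This is not how Jersey resolves URIs internally; it only mirrors the composition described in the text. The helper names are made up, and the host, port and application name are borrowed from the article's client example purely as sample values.

```java
final class ServiceUrlSketch {

    // Removes leading and trailing slashes from a path fragment.
    static String trimSlashes(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) == '/') start++;
        while (end > start && s.charAt(end - 1) == '/') end--;
        return s.substring(start, end);
    }

    // Turns a servlet url-pattern such as "/*" or "/home/*" into its fixed prefix ("" or "home").
    static String prefixOf(String urlPattern) {
        String withoutWildcard = urlPattern.endsWith("/*")
                ? urlPattern.substring(0, urlPattern.length() - 2)
                : urlPattern;
        return trimSlashes(withoutWildcard);
    }

    // Composes server, application name, servlet prefix, class path and method path.
    static String serviceUrl(String server, int port, String appName,
                             String urlPattern, String classPath, String methodPath) {
        StringBuilder url = new StringBuilder("http://").append(server).append(':').append(port);
        String[] parts = {appName, prefixOf(urlPattern), classPath, methodPath};
        for (String part : parts) {
            String segment = trimSlashes(part);
            if (!segment.isEmpty()) {
                url.append('/').append(segment);
            }
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Sample values; only the "/books" and "/add" paths come from the service shown above.
        System.out.println(serviceUrl("localhost", 7000, "TodaySoftMag", "/*", "/books", "/add"));
        System.out.println(serviceUrl("localhost", 7000, "TodaySoftMag", "/home/*", "/books", "/add"));
    }
}
```

Changing only the url-pattern in web.xml therefore moves every service under a new prefix without touching any annotation.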

The add method is preceded by a series of annotations:
• @PUT (similar to @GET, @POST, @DELETE): a request method designator; the method will process HTTP PUT requests (or, for the similar annotations, HTTP GET, HTTP POST and HTTP DELETE requests)
• @Consumes: used to specify the MIME types of representations a resource can accept. In this case, the "add" method will accept "application/json". A method may accept several MIME types, separated by "," when they are listed in the annotation
• @Produces: used to specify the MIME types of representations a resource can produce. In this case, the "add" method will produce "application/json". A method may produce several MIME types, separated by "," when they are listed in the annotation.

There are cases when we want to send parameters as part of the request (URL, headers or cookies). JAX-RS defines a series of annotations that can be used for this purpose. The most common are:


@PathParam: injects a value extracted from the request URI into a parameter of the service method. URI path parameters are extracted from the request URI, and the parameter names correspond to the URI path template variable names specified in the @Path annotation.

    @DELETE
    @Path("/delete/{id}")
    public Book delete(@PathParam("id") String id) {
        …
    }

@QueryParam: used to inject request query parameters into the parameters of the service method. In the example below, if the request URL is http://<server>:<port>/<AppName>/books/list?s=abc, the "list" method of the service will receive "abc" as the value of the search String parameter. If the parameter "s" is not specified in the request, the method will receive "" as the value of the search String parameter, through the use of the @DefaultValue annotation.

    @GET
    @Path("/list")
    @Produces({MediaType.APPLICATION_JSON, MediaType.TEXT_XML})
    public Book[] list(@QueryParam("s") @DefaultValue("") String search) {
        …
    }

@HeaderParam: used to inject values from the HTTP headers of a request into the parameters of a service method. If the header is not sent, the parameter takes the default value of its Java type (for example null for String, 0 for int or false for boolean).

@CookieParam: used to inject HTTP cookies into the parameters of a service method. If the cookie is not sent, the parameter takes the default value of its Java type, as above. An example using both annotations:

    @POST
    @Path("/reset")
    public Response reset(@HeaderParam("clearOnly") boolean clearOnly,
                          @CookieParam("credentials") String credentials) {
        …
    }

The complete code of the "Books" service is here: https://github.com/tavibolog/TodaySoftMag/blob/master/src/main/java/com/todaysoftmag/examples/rest/BookService.java

Client implementation

There are different ways to implement a REST client: using HTML pages, Apache HTTP Client, Jersey Client etc. This article covers the Jersey Client. A Jersey client is configured through the com.sun.jersey.api.client.config.ClientConfig interface. The client configuration covers the properties, features etc. that the Jersey client will use. Jersey offers a default implementation of the interface, called DefaultClientConfig. Initializing the Jersey client configuration is very simple:

    ClientConfig config = new DefaultClientConfig();

The next step is to create the Jersey client (the com.sun.jersey.api.client.Client class). It is important to note that this operation is very expensive and requires considerable resources. Reusing a Jersey client is therefore recommended.

    Client client = Client.create(config);

Once we have the Jersey client, we have to create our WebResource object (com.sun.jersey.api.client.WebResource). This allows us to send requests to the Web service methods, while providing processing capabilities for the responses.

    WebResource resource = client.resource(UriBuilder.fromUri("http://localhost:7000/TodaySoftMag").build());

Once we have the WebResource object, we can start using it to send requests. In the example below you will recognize the Builder pattern used to create the request. Consider the following request:

    Book[] books = resource.path("/books").path("/list").accept(MediaType.APPLICATION_JSON).get(Book[].class);

Let's explain it:
• the "path" method is used to add a path to the URI of the Web resource: if the Web resource is http://localhost:7000/TodaySoftMag, adding a new path will turn it into http://localhost:7000/TodaySoftMag/books. Further calls of the path method keep extending the URI of the Web resource.
• the "accept" method specifies the media type accepted as a response by the WebResource. In our case it is "application/json". The acceptable media types should be a subset of the media types declared by the Web service through the @Produces annotation.
• the "get" method executes an HTTP GET request and tries to map the result to an array of Book instances.

You can find more examples in the unit test of the project: https://github.com/tavibolog/TodaySoftMag/blob/master/src/test/java/com/todaysoftmag/examples/rest/BookServiceTest.java. The examples use the WebResource methods described above, plus the following:
• the "post" method, used to execute an HTTP POST;
• the "header" method, used to add an HTTP header to the request;
• the "cookie" method, used to add a cookie to the request;
• the "queryParam" method, used to add a query parameter to the request;
• the "put" method, used to execute an HTTP PUT;
• the "type" method, used to set the content type of the request;
• the "delete" method, used to execute an HTTP DELETE.

In conclusion, I believe that Jersey, as a framework for developing and testing RESTful services, is easy to learn in depth and provides developers with an extensive set of functionalities for application development.

Bibliography

http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
http://en.wikipedia.org/wiki/REST
http://jcp.org/aboutJava/communityprocess/final/jsr311/index.html
http://jersey.java.net/


programming

TODAY SOFTWARE MAGAZINE

Semantic Web - A Short Introduction

Alina Dia Miron, Ph. D.
dia.miron@recognos.ro
Semantic Web Expert @Recognos Romania

The Semantic Web is an extension of the present Web that allows the formal description of the resources existing on the Internet (Web pages, text and multimedia files, databases, services and so on). Among its advantages, the most important are the prompt and precise identification of the resources relevant to the user and the automatic processing of resources by intelligent agents. The concept of the Semantic Web was introduced by Tim Berners-Lee, the inventor of the World Wide Web, about 15 years ago. The need for the Semantic Web can be easily explained with an example.

Let's assume that a user wants to find information about a software agent, an entity from Artificial Intelligence that observes and acts in a defined world with the purpose of meeting an objective. When the user searches for the phrase "software agent" in one of the known search engines (Google, Bing, Ask, Yahoo etc.), the engine returns the documents in which the searched words have been identified. The disadvantage of the present search engines is that they operate on string matching, meaning that they identify the words specified in the query in a corpus of target documents. As a consequence, they return the documents that contain at least one of the two words from our example. Yet the subject of those documents might not be of interest to the user. That is why the user has to skim over and manually sort the results in order to extract only the documents he is interested in. A semantic search engine would instead consider the phrase as a whole, that is, the concept of software agent and not the corresponding string, and it would return only the results that are relevant to the user.

The result of such a semantic query would be a description of the software agent concept in terms of its properties, and possibly a list of examples. This argument can be developed further by imagining a semantic search engine that could answer questions such as: "Which are the universities in this area whose curriculum includes courses on software agents?"

The Semantic Web Architecture: Semantic Web Layer Cake

Figure 1. Semantic Web Layer Cake

The Semantic Web may be seen as a suite of technologies, organized by Tim Berners-Lee in a pyramid known as the Layer Cake, illustrated in Figure 1. At the pyramid base stands the Unicode standard,


which is used for representing and manipulating text in various languages, as well as the standard for building URIs, which is used for identifying the resources published on the Web (multimedia files, web pages, blogs etc.).

At present, the de facto standard for describing and transferring data on the Web is XML, which has a series of limitations. One of them is the lack of a formal semantics for XML Schemas, which hinders communication between applications and services that do not share the same schema or the same interpretation of a schema. The RDF language was born from the intention of adding a certain level of semantics to XML. It describes resources using triples of the form <Resource, Property, Value>, arranged in graph-like structures. RDF Schema is used to organize the resources and the properties in a hierarchy of types. The OWL language extends RDF(S) with a series of more advanced constructors that allow more expressive descriptions than those possible with RDF(S) triples. OWL also allows defining constraints on properties, such as cardinality constraints, value restrictions or predefined property characteristics (e.g. transitivity, symmetry etc.).

Note that the RDF(S) and OWL languages are based on the theory of description logics, which guarantees an unambiguous semantic interpretation of the statements made in those languages. Moreover, using the inference engines specific to description logics, new knowledge can be derived automatically from RDF(S) or OWL descriptions. The knowledge expressed in RDF(S) and OWL can be queried with the standard language SPARQL, whose syntax is very similar to SQL.

The other layers in Figure 1 are not completely standardized yet. They aim at defining standard rule languages (e.g. RIF, SWRL, SPIN), which are extremely useful for defining relationships that cannot be described with the constructs available in OWL, and at defining systems for measuring the level of trust that can be associated with an RDF triple (e.g. by checking whether the premises of a statement come from trusted sources, what kind of inference system was used etc.). The Encryption layer is also extremely important for checking the source of RDF(S) and/or OWL statements, and it is based on digital signatures.
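The triple model and the automatic derivation of new knowledge described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class TripleSketch, its methods and the resource names are hypothetical, the "store" is an in-memory set, and only two RDFS-like rules are applied (transitivity of subClassOf and propagation of type along subClassOf). A real RDF(S) or OWL reasoner supports far more.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Minimal in-memory sketch of <Resource, Property, Value> triples (hypothetical, not a real RDF library). */
class TripleSketch {

    /** Builds one triple as an immutable three-element list, so equals/hashCode come for free. */
    static List<String> triple(String resource, String property, String value) {
        return List.of(resource, property, value);
    }

    /**
     * Derives new triples until nothing changes, using two rules:
     *  - (A subClassOf B) and (B subClassOf C)  =>  (A subClassOf C)
     *  - (x type B)       and (B subClassOf C)  =>  (x type C)
     */
    static Set<List<String>> infer(Set<List<String>> asserted) {
        Set<List<String>> all = new HashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            Set<List<String>> derived = new HashSet<>();
            for (List<String> a : all) {
                for (List<String> b : all) {
                    boolean chain = b.get(1).equals("subClassOf") && a.get(2).equals(b.get(0));
                    boolean ruleApplies = a.get(1).equals("type") || a.get(1).equals("subClassOf");
                    if (chain && ruleApplies) {
                        derived.add(triple(a.get(0), a.get(1), b.get(2)));
                    }
                }
            }
            changed = all.addAll(derived); // true only if at least one new triple was added
        }
        return all;
    }

    public static void main(String[] args) {
        // Illustrative statements: an IT company is a company, and Recognos is an IT company.
        Set<List<String>> graph = Set.of(
                triple("ITCompany", "subClassOf", "Company"),
                triple("Recognos", "type", "ITCompany"));
        // The statement "Recognos is a Company" was never asserted, yet it can be derived.
        System.out.println(infer(graph).contains(triple("Recognos", "type", "Company"))); // prints true
    }
}
```

The derived triple is exactly the kind of "new knowledge" an inference engine adds on top of the asserted statements; a query over the inferred set then also finds Recognos when asked for all companies.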

Data representation and interrogation in the Semantic Web – A few examples

Traditionally, data is represented hierarchically in an XML structure or interlinked in a database. The Semantic Web introduces a new way of organizing data: RDF (Resource Description Framework) graphs. RDF is a simple model for describing the resources on the Web (multimedia files, blogs, web pages, web services, databases etc.) and the metadata associated with them, such as title, author or publication date. RDF describes a resource through its properties. An example of a simple RDF(S) graph is shown in Figure 2.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns="http://www.example.fake/comp#"
         xml:base="http://www.example.fake/comp#">
  <rdfs:Class rdf:ID="Company" />
  <rdfs:Class rdf:ID="ITCompany">
    <rdfs:subClassOf rdf:resource="#Company"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="providesITSolutions">
    <rdfs:range rdf:resource="#ITCompany"/>
    <rdfs:domain rdf:resource="#Company"/>
  </rdf:Property>
  <FinancialInstitution rdf:about="#FisherInvestment" />
  <rdf:Description rdf:ID="Recognos">
    <providesITSolutions rdf:resource="#FisherInvestment"/>
  </rdf:Description>
</rdf:RDF>

Figure 2. Example of a simple RDF(S) graph (RDF/XML form; the graph view separates the TBox classes and properties from the ABox individuals Recognos and FisherInvestment).

When a description in RDF(S) is not expressive enough to model more complex domains, there is an ontology-based knowledge representation language called OWL (Web Ontology Language). Details about ontologies and the OWL language can be found in the Semantic Web section (http://www.w3.org/standards/semanticweb/) of the official web page of W3C, the body responsible for specifying and standardizing these languages.

A useful framework for developing semantic applications is AllegroGraph, which offers storage and management of RDF(S) triples in a structure called a triplestore. Data in AllegroGraph is queried with SPARQL, a standard language defined by W3C, either through the graphical interface AllegroGraph WebView or programmatically through an API offered by Franz (http://www.franz.com/agraph/downloads/). SPARQL queries are similar to SQL queries, and the results are RDF(S) sub-graphs of the target graphs. Usually, the information is presented in a table.

A simple example is the query "find all artists born in Austria during the 19th century", which looks like this in SPARQL:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT * WHERE {
  ?person a <http://dbpedia.org/ontology/Artist> ;
          <http://dbpedia.org/ontology/birthPlace> ?birthPlace .
  ?birthPlace <http://dbpedia.org/ontology/country> ?country .
  ?country rdfs:label "Austria"@en .
  ?person <http://dbpedia.org/property/dateOfBirth> ?dob .
  FILTER (?dob > "1800-01-01"^^xsd:date && ?dob < "1899-12-31"^^xsd:date)
}

This query can be run against the SPARQL endpoint offered by DBpedia (http://dbpedia.org/snorql/), and it displays a list of artists with name, country and date of birth.

The present state of implementation

Figure 3. Semantic Web adoption cycle (October 2009)

During the last 10 years, the idea of the Semantic Web has spread considerably, and so has the adoption of the technology by the big players on the IT market. The W3C consortium believes that the semantic technologies are mature and that they are going to be adopted on a larger scale, at least by the early adopters (see Figure 3). Market analysts are also extremely optimistic about the evolution of semantic technologies and their future adoption. For example, a study made in 2012 on a set of 12 billion web pages indexed by Yahoo! Search shows that the use of RDF microformats for page annotation grew by 510% between March 2009 and October 2010, from 0.6% of the analyzed pages to 3.6% (430 million pages out of 12 billion).

Figure 4. Linked Data Cloud diagram

One of the main reasons for annotating and publishing data and resources with semantic technologies is the advantage of integrating them with the Linked Data Cloud. The Linked Data Cloud is a collection of data that has already been annotated and published on the Web. It can be exploited in order to increase the value and the visibility of one's own data. In other words, Linked Data is intended to break down the barriers between scattered and isolated data silos. Once data is described using semantic languages (RDF(S) or OWL), it can be exploited in a uniform manner, it can be interconnected and it is available to anybody. Figure 4 shows various data sets published and interconnected inside the Linked Data graph. In the center of the diagram, as main hubs, there are DBpedia (the semantic database built from Wikipedia), Geonames (an ontology describing geographic locations), Freebase etc. A lot of this data is public and it can be very easily exploited in order to build new applications and services.

Semantic Web Meetup

Alongside its research and development activity, the Semantic Web Team focuses on creating and maintaining an active local community of professionals interested in semantic technologies. Through the meetup group in Cluj (http://www.meetup.com/Cluj-Semantic-WEB/), which has almost 110 members, and the meetup group in Târgu-Mureş (http://www.meetup.com/Targu-Mures-Semantic-Web/), which has 70 members and with which we work closely, we are trying to keep up the level of interest in and awareness of the latest semantic technologies on the market. To promote them, we organize seminars to which we invite well-known researchers whenever possible; Rada Mihalcea, Constantin Orasan and Elena Simperl are among them. The meetup groups are open to everyone interested in the Semantic Web, and the discussion topics are decided together with the community. Therefore, if you want to find out more about semantic technologies, you are warmly invited to the next events!



technologies

Flame - the cyber disembodied weapon

Andrei Avădănei
andrei@worldit.info
Founder & CEO DefCamp
CEO worldit.info

Cybernetic war: the term travels round the Internet and reaches all corners of the modern world. The press reaction is understandable, as IT generates so many happy faces as well as scared ones, and the infosec domain is by far a fascinating one, full of surprises. Lately, the discovery of Stuxnet and Duqu, two of the most dangerous targeted malware applications ever developed, but also the decentralized hacktivist attacks of Anonymous against government services, have often brought the threat of cybernetic war to the fore. Many countries are dealing with this problem and taking preventive measures.

From Stuxnet to Flame

Although Stuxnet stands as a template, a pioneer of cyber war, the last few months have brought a new discovery, an application much more complex than Stuxnet ever was. Kaspersky Lab, the Maher Center of the Iranian National Computer Emergency Response Team (CERT) and the CrySyS Lab of the Budapest University of Technology and Economics announced on May 28, 2012 the discovery of the modular malware Flame, also known as Flamer or sKyWIper (Skywiper). The CrySyS Lab stated that sKyWIper (Flame) was by far the most sophisticated malware they had met so far and arguably the most complex ever discovered.

Preliminary research shows that this malware has existed "in the wild" (without being detected) for more than 2 years, starting from March 2010. Due to its extreme complexity and its targeted nature (it only attacks certain targets), it was not identified by any antivirus software.

Flame can spread to other systems over a local network (LAN) or via USB stick. The program, identified by the Kaspersky Lab products under the name Worm.Win32.Flame, was created for cyber espionage. It can steal confidential data, take screenshots, gather data about infected computers, stored files, network traffic and contacts, record audio conversations if the computer has a microphone attached (including a webcam microphone), and log keyboard activity. The data is sent to one of the C&C (Command and Control) servers spread worldwide, after which the malware waits for new instructions. According to estimates from May 2012, Flame had initially infected approximately 1,000 machines, with victims including governmental organizations, educational institutions and private individuals. At that time the majority of the infections were in Iran, Israel, Sudan, Syria, Lebanon, Saudi Arabia and Egypt, with a "huge majority of targets" within Iran. Flame supports a "kill" command which wipes all traces of the malware from the computer. The initial infections of Flame stopped operating after its public exposure: the "kill" command was sent across the entire network, and the malware was erased from the victims' computers.

The people from Kaspersky declare on the company's official blog that this computer worm is different because it is not meant to steal money, and it is also very different from the attacks of the so-called "hacktivists". This leaves one possibility: that it was started by a nation that sponsored the development and spreading of Flame. This is supported by the fact that the attack was very complex and very well targeted (especially at Middle Eastern countries). Although its functions differ from those of the previously known cyber weapons Stuxnet and Duqu, the geography of the targets, the exploitation of specific vulnerabilities and the fact that only certain computers are targeted place Flame in the category of cyber super-weapons.

"For a few years, the risk of starting a cybernetic war has been the hottest topic of information security", said Eugene Kaspersky, co-founder and CEO of Kaspersky Lab. "Stuxnet and Duqu were part of a family of viruses created in the same place, which raised concern in the whole world. Flame marks the beginning of a new era in cyber-warfare and terrorism, and it's very important to understand that such cyber weapons can be easily used against any state. But this situation is different from that of a traditional war, because the most developed states are in fact the most vulnerable ones in case of a cyber war."

The link between Flame and Stuxnet

Everyone felt that the way in which Flame and Stuxnet work is very similar, but BitDefender succeeded in proving it. One of the most obvious similarities is the string decryption algorithm. It can be seen in components such as atmpsvcn.ocx, s7otbxdx.dll or mssecmgr.ocx. For example, the algorithm in atmpsvcn.ocx and s7otbxdx.dll looks like this:

for (i = 0; i < strlen(s); i++) {
    sum = (0xB + s[i]) * (0x11 + s[i]);
    s[i] -= (sum >> 0x18) ^ (sum >> 0x10) ^ (sum >> 0x8) ^ sum;
}

In the main Flame component (mssecmgr.ocx) the algorithm looks like this:

for (i = 0; i < strlen(s); i++) {
    sum = (0xB + s[i]) * (0x17 + s[i]);
    s[i] -= (sum >> 0x18) ^ (sum >> 0x10) ^ (sum >> 0x8) ^ sum;
}

The two listings differ only in one constant (0x11 versus 0x17), and the executable file atmpsvcn.ocx is common to both. If we go into the details of the structure of the library, more can be found. You can read a detailed article on this issue, written by BitDefender representatives, at http://labs.bitdefender.com/2012/06/stuxnets-oldest-component-solves-the-flamer-puzzle/

How does the Flame virus spread?

The main module of the Flame platform is a DLL file called MSSECMGR.OCX. When it is activated, it is registered in the Windows registry and is loaded automatically at the next restart. MSSECMGR then starts downloading additional modules such as Beetlejuice, Microbe, Euphoria, Limbo etc., and the list does not end here. Among these is the Gadget module, used for spreading Flame. During the analysis, the people from Kaspersky were surprised to find that computers running Microsoft Windows 7 with all current updates had been infected by Flame. How was this possible? Well, it isn't, unless there is a 0-day Windows vulnerability. Moreover, the Gadget module is crucial for spreading the malware through large networks, with the help of a second module, called Munch. Gadget and Munch launch a man-in-the-middle attack against the other machines in the network: when a computer tries to connect to the official Microsoft Windows Update website, the connection is redirected through an infected computer, which sends a fake Windows update. The fake update is called Desktop Gadget Platform and claims to be used for displaying gadgets on the desktop. Using a forged Microsoft certificate, Flame is then installed on the system. For further technical details, visit http://www.securelist.com/en/blog/208193558/Gadget_in_the_middle_Flame_malware_spreading_vector_identified

More details on Flame Command and Control servers

Since the geography of Flame's targets is similar to that of Duqu, Kaspersky compared their command and control servers. The number of domains used by Duqu's servers is unknown, but for Flame there are more than 80. When a computer is infected with Flame, it connects by default to five C&C servers, and the number can grow up to 10. Every domain was registered with a false identity through GoDaddy. The addresses were either false or belonged to hotels, and the oldest record dates from 2008. Among the many false names are: Adrien Leroy, Arthur Vangen, George Wirtz, Gerard Caraty, Ivan Blix, Jerard Ree, Karel Schmid, Maria Weber, Mark Ploder, Mike Bassett, Paolo Calzaretta, Robert Fabous, Robert Wagmann, Romel Bottem, Ronald Dinter, Soma Mukhopadhyay, Stephane Borrail, Traian Lucescu, Werner Goetz or Will Ripmann. Most of the C&C servers were running Windows 7 or Windows XP.

The studies made by Kaspersky show that Flame has an auto-upgrade mechanism. For example, version 2.212 became 2.242 in two cases. This suggests the presence of at least one C&C server with a different control level. The files uploaded to the C&C server are encrypted with a XOR operation combined with a substitution. In addition, many blocks are compressed using zlib and PPMd. All data is sent in packets of 8192 bytes, which confirms once again that the targets are Middle Eastern countries with a weak and unstable Internet connection. For further details, go to http://www.securelist.com/en/blog/208193540/The_Roof_Is_on_fire_Tackling_Flames_C_C_Servers.

Why has Flame been undetected for at least five years?

The Bitdefender analysts researched a less known module of the Flame malware, advnetcfg.ocx, which monitors antivirus software but is also a debugging component. The module was identified as the component that collects data for Flame in order to improve the quality and functionality of the application. Whenever Flame detects a window that displays information about the malware's files or contains keywords like "injected" or "File mssecmgr.exe looks suspicious", the debugging module takes a screenshot, which is then sent to the C&C server and analyzed by a team of programmers working on Flame. The issue was analyzed thoroughly on the Bitdefender Labs blog: http://labs.bitdefender.com/2012/07/flamer-used-qa-module-to-thwart-antivirus/.

The show goes on...

Following the Flame virus, another "political" one was discovered by IT firms in the Middle East. The Madi (or Mahdi) virus, used for targeted cyber espionage, was recorded in Iran, Israel and Afghanistan on July 17, 2012. Over 800 computers belonging to government agencies, financial institutions and infrastructure companies were infected with the Mahdi virus, according to Kaspersky Lab and Seculert experts. The Madi or Mahdi virus, whose name means "Messiah", is used to gather information, emails, passwords and transferred files.

The Madi Trojan allows the cyber criminals to steal secret information from computers running Microsoft Windows, monitor e-mail and instant messaging programs, record keyboard activity and take screenshots. The analysis confirms that many gigabytes of information were stolen.

Among the spied applications and web pages are the Gmail, Hotmail, Yahoo! Mail, ICQ, Skype, Google+ and Facebook accounts belonging to the victims. Integrated ERP/CRM systems, business contracts and financial administration systems were also monitored. The new malware threat was easily detected and has a less complex code than the Flame virus, considered to be the most sophisticated in the history of information technology. For further details, go to: http://www.securelist.com/en/blog/208193677/The_Madi_Campaign_part_I.

Conclusions

Flame is known for its complexity and its various available options, the whole code of the malware application exceeding 20 MB. It is by far one of the most nonconformist tools used in the cybernetic war. Romania, together with other states from this geographical area, has been considered as having the intelligence needed to build such an application. Actually, everything that cybernetic war means is so far based on luck and assumptions; nothing is concrete and nothing can be proved. There are players worldwide who know very well what they are doing, far better than we expected, and for them words like anonymity or infiltration into (non)governmental institutions make sense and are a matter of time and resources rather than technology. If Stuxnet, Duqu and Flame have demonstrated in the past three years how much is at stake in the online environment, we have to take into consideration the other mutations that already exist or are being built at this very moment. This could be the beginning of World War III, only this time it is a cold, technologized war that starts with the destruction of the economy and the control of information, and no one knows where and how it stops. The nice part is that all states of the modern world will face each other on equal terms, and we cannot make assumptions about who might win, as the states hide a lot of surprises when it comes to information security. Speed is another feature of this war: we can expect either a long and hidden war, or a very fast and tough one with long-term effects. Nowadays, there is nothing better for us to do than to find our place on the chessboard and improve our position. Stay safe, stay tuned!

Bibliography

http://en.wikipedia.org/wiki/Flame_(malware)
http://labs.bitdefender.com/2012/06/stuxnets-oldest-component-solves-the-flamer-puzzle/
http://labs.bitdefender.com/2012/06/flame-the-story-of-leaked-data-carried-by-human-vector/
http://labs.bitdefender.com/2012/07/flamer-used-qa-module-to-thwart-antivirus/
http://www.securelist.com/en/blog/208193522/The_Flame_Questions_and_Answers
http://www.kaspersky.ro/blog/flame_%E2%80%93_cel_mai_complex_%C5%9Fi_interesant_malware_pe_care_lam_v%C4%83zut_p%C3%A2n%C4%83_acum
http://economie.hotnews.ro/stiri-it-12374275-flame-vierme-informatic-complexitate-fara-precedent-ataca-tinte-din-orientul-mijlociu-spun-cei-kaspersky-lab.htm



programming


Big Data - Apache Hadoop

Robert Enyedi
robert.enyedi@betfair.com
Senior Software Developer @Betfair

After the introduction to the "big data" world in the second issue and the article on NoSQL databases in the third issue, we now introduce another important member of the family: Apache Hadoop. Apache Hadoop is a framework that facilitates the processing of large (and very large) data sets, running on an array of machines and using a simple programming model (the map/reduce paradigm). It is designed to scale from a few machines (even one) to several thousand, each of them contributing processing power and storage. Apache Hadoop does not rely on hardware for high availability; instead, the library is designed to detect errors at the application level.

It is an open source project under the Apache Foundation, with a global community of committers, the most significant contributor to its development being Yahoo!. Apache Hadoop is used by Yahoo! for its search engine, and Facebook claims it has the biggest Hadoop cluster (30 petabytes of data), used among other things for Facebook Messaging. Amazon offers a MapReduce platform as part of its cloud proposition (AWS), named Amazon Elastic MapReduce. Numerous other companies, in IT and beyond, use Apache Hadoop (Twitter, IBM, HP, Fox, American Airlines, Foursquare, LinkedIn, Chevron etc.) for solving different types of problems (online travel, e-commerce, fraud detection and prevention, image processing, health care etc.).

History

Hadoop was created by Doug Cutting, who named it after his son's plush toy. It was designed to provide a distributed system for the Nutch search engine in the years 2004-2006, and it was inspired by the papers on GFS (Google File System) and MapReduce made public by Google around that time. In 2006 Yahoo! hired a dedicated team (including Doug) to contribute to the project, which helped spin off Hadoop as an independent project.

Architecture

Apache Hadoop is developed in Java and it has two main components:
• HDFS (Hadoop Distributed File System)
• MapReduce


HDFS

HDFS is a distributed file system offering high-throughput access to application data. HDFS has a master/slave architecture. An HDFS cluster (usually) consists of a single Namenode, a master server that manages the filesystem namespace and controls the access of clients to the files. Additionally there are a number of Datanode servers, usually one for each machine in the cluster, which manage the storage of the machine they run on. HDFS exposes a filesystem namespace and allows client data to be stored in files. Internally, a file is broken up into one or more blocks (the block size is a configurable cluster setting, in most cases between 16 and 128 MB), and these blocks are stored on the Datanodes. The Namenode is responsible for executing filesystem operations such as opening, deleting and renaming files and directories. It is also responsible for determining and holding the mapping of data blocks to Datanodes. The Datanodes are responsible for serving the read/write requests coming from the clients of the filesystem. They also execute create, delete and replication operations on blocks, as instructed by the Namenode.
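To make the idea of breaking a file into blocks more tangible, here is a minimal sketch in plain Java. It does not use the HDFS API; the class BlockSplitSketch and the method blockLengths are hypothetical names chosen for illustration, and the 64 MB block size is just one value from the range mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of how a file could be cut into fixed-size blocks (illustration only, not the HDFS API). */
class BlockSplitSketch {

    /** Returns the length in bytes of every block needed to hold a file of the given size. */
    static List<Long> blockLengths(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        List<Long> lengths = new ArrayList<>();
        // Every block is full except possibly the last one, which holds the remainder.
        for (long offset = 0; offset < fileSize; offset += blockSize) {
            lengths.add(Math.min(blockSize, fileSize - offset));
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 200 MB file with a 64 MB block size needs three full blocks plus one 8 MB block.
        System.out.println(blockLengths(200 * mb, 64 * mb).size()); // prints 4
    }
}
```

In a real cluster each of these blocks would additionally be assigned to Datanodes by the Namenode; that placement step is deliberately left out of the sketch.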

Big Data - Apache Hadoop

HDFS implements a permission model for file and directory access similar to POSIX.

MapReduce

MapReduce is a framework for writing applications that process large data sets in parallel, in a reliable and fault-tolerant way.

A MapReduce job divides the input dataset into independent blocks of data, which are processed by the map tasks in parallel. The framework collects and sorts the output of the map tasks, which is then fed as input to the reduce tasks. Normally, both the input and the output are stored in HDFS. The framework takes care of scheduling and monitoring the tasks and of re-executing the failed ones.

Usually the node which stores data (Datanode) is also a compute node, so the MapReduce framework and HDFS run on the same set of nodes (machines). This configuration enables the framework to schedule tasks on the nodes where the data is already present, thus reducing network traffic within the cluster.

MapReduce consists of a single master process, the JobTracker, and a TaskTracker process for each node in the cluster. The JobTracker is responsible for scheduling the tasks which run on the TaskTrackers. It also keeps track of the MapReduce tasks running on the different TaskTrackers; if any of these tasks fails, it schedules the task for execution on a different TaskTracker. In simple terms, the JobTracker needs to make sure that a query running on a large dataset is executed successfully and that the results are presented to the client in a reliable way.

Each TaskTracker executes the map and reduce tasks assigned to it by the JobTracker. It also constantly sends heartbeat messages to the JobTracker, so the JobTracker keeps track of the load on each TaskTracker and can schedule new tasks on it or, in case of error, re-schedule the failed task on another TaskTracker.

To define a MapReduce job, an application needs to specify at least: the location of the input in HDFS, the location in HDFS where the output will be stored, a map function and a reduce function. These, along with other parameters of the job, form the job's configuration.

Once the job and its configuration are created, the user submits them to the JobTracker, which is responsible for scheduling the job, distributing the execution of its tasks on TaskTracker nodes, monitoring the job and presenting its status to the user.

The input is passed to the map function as key-value pairs; the map function in turn produces key-value pairs (possibly of different types). Once the map tasks have finished, their key-value results are collected, sorted and then passed as input to the reduce function, which also produces key-value pairs as results.

Limitations

The current implementation of the MapReduce framework is starting to show its age. Looking at the trends in the size and processing power of Hadoop clusters, the JobTracker needs an overhaul to address problems related to scalability, memory consumption, threading model and performance.

The requirements for the MapReduce framework, to solve the above-mentioned limitations, are:


• reliability
• availability
• scalability (clusters of ~10000 machines); the current implementation scales up to approx. 4000 machines
• evolution
• predictable latency
• optimal resource management
• support for alternative paradigms to MapReduce
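To make the map, sort and reduce phases described earlier more concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and uses no Hadoop API; the function names (`map_words`, `shuffle_and_sort`, `reduce_counts`, `run_job`) and the word-count task are assumptions chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_words(offset, line):
    """Map: receive one (key, value) pair and emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Collect all map outputs, group the values by key and sort by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_counts(word, counts):
    """Reduce: collapse all values of one key into a single (key, value) pair."""
    return (word, sum(counts))


def run_job(lines):
    # The input reaches the map function as key-value pairs;
    # here the key is simply the index of the line (hypothetical choice).
    mapped = chain.from_iterable(
        map_words(index, line) for index, line in enumerate(lines)
    )
    return [reduce_counts(word, counts) for word, counts in shuffle_and_sort(mapped)]


if __name__ == "__main__":
    print(run_job(["big data big clusters", "data nodes"]))
```

In a real cluster the map calls would run in parallel on the nodes holding the data blocks, and the grouping step would move data over the network; the sketch only shows the shape of the data flow.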

MapReduce 2.0 (YARN or MRv2)

The next generation of MapReduce has been designed to satisfy the above-mentioned requirements. The main idea behind the re-architecting was to split the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate components. The new ResourceManager handles the allocation of compute resources to applications, and a per-application ApplicationMaster handles the coordination and scheduling of the application. An application is either a single MapReduce job or a DAG (directed acyclic graph) of jobs. The ResourceManager and the NodeManager slaves running on each machine of the cluster form the compute grid. The ApplicationMaster is a framework library whose role is to negotiate resources from the ResourceManager and to work with the NodeManager(s) to execute and monitor tasks.

The ResourceManager has two main components:
• Scheduler (S)
• ApplicationsManager (ASM)

The Scheduler is responsible for allocating resources to the various running applications, subject to capacity constraints, queue constraints, etc. It is a pure scheduler, meaning that it does not monitor or track the status of the applications. It also does not guarantee the restart of tasks that failed due to hardware or application-level errors. The Scheduler performs its scheduling function based on the applications' resource needs, expressed through the Resource Container concept, which encapsulates characteristics such as memory, processing power, disk usage, network, etc.

The Scheduler allows for a pluggable policy, responsible for distributing cluster resources between the various queues, applications, etc. The built-in Scheduler uses FIFO. The existing MapReduce schedulers, such as the CapacityScheduler and the FairScheduler, are examples of such plug-ins. The CapacityScheduler operates on hierarchical queues to offer a more predictable allocation of cluster resources; it has been developed by Yahoo!. The FairScheduler has been developed by Facebook with the intent to offer fast execution times for small jobs and QoS (quality of service) for production jobs.

The ApplicationsManager is responsible for accepting submitted jobs, negotiating the first container for executing the application-specific ApplicationMaster, and restarting the ApplicationMaster in case of errors.

The NodeManager is the per-machine process responsible for launching application containers, monitoring the resources used by applications and reporting to the Scheduler.

The per-application ApplicationMaster is responsible for negotiating resource containers from the Scheduler, tracking the execution status and monitoring progress.

Figure 2. MRv2 architecture

MRv2 is part of a major Hadoop release (2.x) which, besides MRv2, also includes HDFS Federation. HDFS Federation resolves another potential limitation of the framework, the singular nature of the Namenode. For horizontal scalability of the name service, federation uses multiple independent Namenodes, each of them having its own filesystem namespace. The Namenodes are independent and don't require any coordination between them. The Datanodes are used as common storage for the data blocks in the cluster. Each Datanode is registered with all the Namenodes in the cluster and accepts commands from every one of them. It also periodically sends heartbeats and usage reports to the Namenodes.
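The idea of a pure FIFO scheduler that only hands out resource containers, as described above for the ResourceManager's Scheduler, can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not YARN code: the class names (`Container`, `Request`, `FifoScheduler`) are hypothetical, and only memory and cores are modelled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Container:
    """Resources asked for by an application (only two dimensions modelled)."""
    memory_mb: int
    cores: int


@dataclass
class Request:
    app: str
    container: Container


class FifoScheduler:
    """Pure scheduler: allocates resources, does not monitor or restart tasks."""

    def __init__(self, memory_mb, cores):
        self.free_memory_mb = memory_mb
        self.free_cores = cores
        self.queue = deque()

    def submit(self, request):
        self.queue.append(request)

    def allocate(self):
        """Grant requests in arrival order; stop at the first one that does not fit."""
        granted = []
        while self.queue:
            wanted = self.queue[0].container
            if wanted.memory_mb > self.free_memory_mb or wanted.cores > self.free_cores:
                break  # strict FIFO: later requests wait behind the head of the queue
            request = self.queue.popleft()
            self.free_memory_mb -= wanted.memory_mb
            self.free_cores -= wanted.cores
            granted.append(request.app)
        return granted

    def release(self, container):
        """Return a finished container's resources to the pool."""
        self.free_memory_mb += container.memory_mb
        self.free_cores += container.cores
```

A policy such as capacity or fair scheduling would replace the body of `allocate`, which is the part the text describes as pluggable.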

Tools

Industry adoption of Apache Hadoop brought with it the development of a veritable ecosystem of tools and frameworks to be used on top of, or adjacent to, Hadoop. We list below some of the most important ones.
• Scribe - a server for aggregating log streams, which can be integrated with HDFS (thus the log files can serve as input for MapReduce jobs).
• Sqoop - a tool for exporting/importing data from/to HDFS to/from relational databases.
• Hive - a data warehouse infrastructure that provides data summarization and ad hoc querying of data stored in HDFS.
• HBase - a scalable, distributed database that supports structured data storage for large tables, using HDFS as underlying storage.
• Pig - a high-level data-flow language and execution framework for parallel computation. Its main characteristic is that Pig programs are, by nature, parallelizable at runtime. The Pig compiler produces MapReduce jobs.
• ZooKeeper - a high-performance coordination service for distributed applications.
• Oozie - a tool used for workflow management and MapReduce job coordination.
• Cascading - a software abstraction layer for Hadoop. It is used to execute workflows in a Hadoop cluster, hiding the underlying complexity of MapReduce jobs.
• Mahout - a library containing machine learning and data mining algorithms on top of MapReduce.
• Chukwa - a tool for monitoring distributed applications, built on top of HDFS and MapReduce.

Conclusions

This article stands as an introduction to Apache Hadoop and some of the technologies around it. For those seeking more details and concrete implementations, there are various sources available on the internet, starting with http://hadoop.apache.org. There are also several commercial solutions based on Apache Hadoop, one of the most well-known being the one offered by Cloudera (www.cloudera.com - here one can find various presentations and trainings on the subject).



programming

Made in Cluj Jumping Electron

Andrei Kovacs
andrei@finmouse.com
Founder & CEO Finmouse

Jumping Electron is a game developed with the Unity 3D game engine that runs on Android and iOS smartphones and their respective tablets. The first 2 chapters and 40 levels are a race on a track inside a Radio and a Jukebox. The action of the next 40 levels will take place inside a Phone Switchboard and a TV. Unity 3D has a free version, but a license must be purchased for Android and iPhone. For a team with one developer, that is OK; however, for a bigger team, Unity PRO is recommended for source control, merging changes etc.

Unity 3D also allows exports to other platforms (PC, Flash, etc.), so we are not ruling them out in the future. The main idea of the game is to use the accelerometer to control the left-right movement of the electron on the road, and taps on the screen to jump over obstacles.

The game is free, but those who want to can buy Volts, which can be used to purchase various upgrades, from special electrons to extra energy, no ads, level boosters etc. Volts can also be collected by playing the game, clicking a Video Ad or replaying a level.

General presentation of the architecture

Unity is used to import the 3D models, while the animations are done in Blender or other modeling programs.

Unity has an IDE that allows the developer to work on the objects and also to visualize the end result. Various scripts can be attached to objects, and these are activated according to specific criteria. For example, if one attaches a 'collider' to an object, a certain script is executed when another object interacts with it. In our case, an explosion animation is triggered when the electron hits a wall.

We have also integrated various plugins for:
• Google Play In-App purchases
• Mobclix, displaying ads between levels
• OpenFeint, mobile gaming social network
• Playtomic, analytics
• Facebook and Twitter
• Custom In-House Ads (pressing the buttons of the TV from the main menu will launch an ad)

We used a web service for our In-House Ads – we access a PHP script on the server, which returns an image and the corresponding URL. The advantage of PHP is that it is supported by all web hosting services, which means we don't have to move our servers.



The movement part was implemented using the Unity physics engine – the electron jumps according to the laws of physics, and its speed is constant except when the user collects a speed booster. Jumps over ramps obey Newton's laws. (photo: Speed booster)

Development challenges

Unity 3D is a very good platform for game development. The most difficult part was integrating it with other systems, such as advertising services.

The challenge with modeling was optimizing the number of polygons without making the objects look low-poly. The trailer animation was probably the most difficult part for the 3D modeler.

During programming we noticed small differences in physics between devices – when the user interacted with a ramp, the electron would sometimes jump higher on the Samsung Galaxy S than on the HTC Desire, so small adjustments were required.

Best practices and advice for game developers on devices

From our experience, it's better to start with a simpler game. We started work on a more complex 3D game but unfortunately had to put it 'on hold' because we found ourselves 'way over our heads'. We are planning to finish it in the future.

We recommend Unity 3D for 3D games, although it can be used for 2D games as well. It's better to use a single engine if you are a small team.

We used UnityScript, which is similar to JavaScript, as the programming language for the graphics engine, but we plan on switching to C# in the near future, as it is a more evolved and easier to use language. Also, the Unity plugins are developed in C#, and they allow extending the application with different options, such as integration with other applications.

When it comes to complex textures, we learnt that fewer texture files help with performance. In terms of sound loops, you can't use mp3s for infinite loops (15 seconds of continuous music). We tried OGG Vorbis but Unity had issues with it so, in the end, we imported WAVs directly into Unity and compressed them.

From our experience we learned that displaying ads during levels lowers the FPS (Frames per Second). For instance, on the HTC Desire the frame rate drops from 17 FPS to 3-4 FPS and it's impossible to play the game. On the Samsung Galaxy S we have 35-40 FPS and on Tegra 2 devices over 70 FPS with good graphics.

When using external libraries, it's better to put them in a 'sandbox' and pay attention to all exceptions, because they often carry bugs. Even so, there is no guarantee.

It would be preferable to create all the plugins in-house, but that is a great additional effort and it's not worth it unless you have dozens of live games.

Future plans

We plan on releasing 40 additional levels for Jumping Electron on Android and launching the game on iOS as well. We also plan on developing 2 more games with a similar theme but new elements of gameplay and design. We have a long list of game ideas. In terms of other apps, we have entered Alpha testing with a social mobile app which will be launched in Cluj. All we can say about it for now is that the app will be a 'Cocktail of Interests'.

A few words about Finmouse

Finmouse was created in 2010 and its aim is to develop apps and games for the mobile industry. It started with outsourcing projects and our very own business/productivity app (Call Reminder Notes – allows the user to attach a reminder to a contact and see it during an incoming or outgoing call from that contact).

We have worked on most of the mobile platforms but recently we have been focusing on Android, iOS and Unity 3D for games.

Our company is focused on internal products and we are planning to launch 2 more games this year and a 'mobile social network' this summer.

Game and app development involves a great deal of work and is way more difficult in each and every aspect, but the satisfaction of a live product and the positive feedback from users and reviewers make it all worthwhile. 'Challenging and fun' is the best way to describe it.



programming

Background tasks Metro

Radu Vunvulea
Radu.Vunvulea@iquestgroup.com
Senior Software Engineer @iQuest

The new operating system released by Microsoft brings quite a few changes. One of them is background tasks for Metro applications. Before talking about background tasks in Windows 8, we need to understand why they were introduced. Windows 8 appeared out of the need for the operating system to run on multiple types of devices. Besides those we are used to (desktops and laptops), more and more people have started using tablets. Even though tablets are becoming more powerful in terms of CPU and memory, the expectations we have from a tablet are different. Battery life is extremely important for a tablet. Because of this, the new operating system from Microsoft slightly changes the lifecycle of an application.

Metro applications are totally different from what we were used to on a Windows operating system. The concept of Windows Services also disappears when we talk about a Windows 8 Metro application. These applications are designed to preserve as much as possible the resources of a device (processor, memory, internet connection and battery). Therefore only the foreground application is running. The remaining applications are still open, but because they are not in the foreground they enter a special state called “suspended”. In this state all threads are frozen; the application exists in memory but does not execute code. An application in this state can return to the foreground without any problem.

Figure 1. Application lifetime
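The lifetime described above can be read as a small state machine. The following Python sketch models it purely for illustration; it is not a Windows API, and the state and event names (`AppState`, `"launch"`, `"suspend"`, `"resume"`, `"terminate"`) are assumptions derived from the description rather than official identifiers.

```python
from enum import Enum


class AppState(Enum):
    NOT_RUNNING = "not running"
    RUNNING = "running"      # the foreground application
    SUSPENDED = "suspended"  # in memory, all threads frozen


# Allowed transitions, following the description in the text.
TRANSITIONS = {
    (AppState.NOT_RUNNING, "launch"): AppState.RUNNING,
    (AppState.RUNNING, "suspend"): AppState.SUSPENDED,
    (AppState.SUSPENDED, "resume"): AppState.RUNNING,
    # The operating system may stop a suspended application to free resources.
    (AppState.SUSPENDED, "terminate"): AppState.NOT_RUNNING,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    key = (state, event)
    if key not in TRANSITIONS:
        raise ValueError(f"{event!r} is not allowed while {state.value}")
    return TRANSITIONS[key]


def can_execute_code(state):
    """Only a running (foreground) application executes code."""
    return state is AppState.RUNNING
```

The point of the model is that a suspended application has only two ways out: back to the foreground, or stopped by the operating system.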


The operating system can decide to stop an application that is in the “suspended” state and release all the resources it holds. This happens when the system doesn't have enough resources available for the user.

Some applications need to run code in the background. To support this, Windows 8 provides background tasks. These are not exactly new; they are already used on Windows Phone 7, and the basic concept of background tasks in Windows 8 is quite similar to the one on Windows Phone. A background task is a task that runs in the background and does not require our application to be started. Once it starts, its life is totally controlled by the operating system. Depending on certain conditions, the operating system may decide that background tasks should not run for a period of time. For example, if the battery is low and the operating system wants to conserve resources, it is not going to run background tasks.

There are several types of background tasks that the operating system provides. In principle we can group them into two main types: user-defined and default. The default ones are background tasks already defined by Windows 8.

Figure 2. Processes running in background

They are used by the operating system to perform various tasks. Developers can use them too, and this is recommended, because it makes no sense to reinvent the wheel. Below is the list of background tasks that are defined by the operating system:
• Background audio
• Background upload
• Background download
• Sharing
• Device sync
• Live tiles
• Scheduled notifications

All these background tasks are already defined and implemented by Windows 8. For example, to download a file, all you have to do is specify the location to download from and the disk location where the file should be copied. The operating system handles the download process automatically, and if the connection is lost or the device reboots, Windows 8 will continue copying the file without our help.

    BackgroundDownloader downloader = new BackgroundDownloader();
    DownloadOperation download = downloader.CreateDownload(mySource, myDestinationFile);
    await download.StartAsync().AsTask();

In the example above we created a background task that can download a file from the Internet. For each task of this kind we can monitor the progress and/or create a cancellation token that we can use to cancel the download.

    CancellationTokenSource cancellationToken = new CancellationTokenSource();
    await download.AttachAsync().AsTask(
        cancellationToken.Token,
        new Progress<DownloadOperation>(DownloadProgress));

Unfortunately, the tasks defined by default in Windows 8 will not help us in every scenario. For these cases, Windows 8 allows us to define custom background tasks, which the operating system calls automatically depending on certain events.

There are many types of triggers available to us, from triggers that fire when we have an internet connection or when a new session is created, to triggers that fire at a defined time interval. If you want to use TimeTrigger you should take into consideration that the minimum time interval is 15 minutes. What is important to note here is that a background task can have only one trigger. However, a Metro application can have as many background tasks as needed. In some cases, we have tasks that run in the background and have various dependencies. For example, we have a background task that sends the user's current location to the server at a certain time. To send data to the server we need an internet connection. To support these cases, a background task can have zero or more conditions defined. It will run only when all conditions are met. If a condition is not satisfied when the trigger fires and becomes satisfied later, the background task is run automatically by the operating system at that moment. Below is the list of available conditions:
• InternetAvailable
• InternetNotAvailable
• SessionConnected
• SessionDisconnected
• UserNotPresent
• UserPresent

The first step a Metro application should take is to register the trigger and the class which is called when the trigger fires. The operating system will then call our class automatically. Below you can find the registration flow and the call of a background task.

The code we want to run in a background task must be put into a sealed class that implements the IBackgroundTask interface. This interface comes with a Run method, which is called each time the background task is invoked.

    public sealed class MyBackgroundTask : IBackgroundTask
    {
        private int globalcount;

        // Called by the operating system each time the task is triggered.
        public void Run(IBackgroundTaskInstance taskInstance)
        {
            Log.WriteInfo("My Background task was called.");
        }
    }

In the case of a Metro application written in C#, we need to create a separate project in which to put this class. This is required because the assembly must be loaded by the operating system when the trigger fires. Once we have created a background task, it must be registered through the BackgroundTaskBuilder class.

    BackgroundTaskBuilder taskBuilder = new BackgroundTaskBuilder();
    taskBuilder.Name = "MyBackgroundTask";
    taskBuilder.TaskEntryPoint = "MyNamespace.MyBackgroundTask";
    // Fire every 30 minutes; the second argument (oneShot) is false so the trigger repeats.
    IBackgroundTrigger timeTrigger = new TimeTrigger(30, false);
    taskBuilder.SetTrigger(timeTrigger);
    // Run only when an internet connection is available.
    IBackgroundCondition condition = new SystemCondition(SystemConditionType.InternetAvailable);
    taskBuilder.AddCondition(condition);
    IBackgroundTaskRegistration myTask = taskBuilder.Register();

In the example above we registered a background task that is called every 30 minutes if there is an internet connection. Through the myTask instance we can subscribe to the progress-changed and completed events. In the background task, these events can be raised through the IBackgroundTaskInstance instance that we receive as a parameter of the Run method.

    myTask.Progress += new BackgroundTaskProgressEventHandler(TaskProgress);
    myTask.Completed += new BackgroundTaskCompletedEventHandler(TaskCompleted);
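The interplay between a single trigger and zero or more conditions, as described above, can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions, not the Windows API: `SystemState`, `BackgroundTask`, `on_trigger` and `try_run` are hypothetical names, and only four of the listed conditions are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class SystemState:
    internet_available: bool = False
    user_present: bool = False


# Condition names taken from the list in the text; the checks are illustrative.
CONDITIONS = {
    "InternetAvailable": lambda state: state.internet_available,
    "InternetNotAvailable": lambda state: not state.internet_available,
    "UserPresent": lambda state: state.user_present,
    "UserNotPresent": lambda state: not state.user_present,
}


@dataclass
class BackgroundTask:
    name: str
    trigger: str                                     # exactly one trigger per task
    conditions: list = field(default_factory=list)   # zero or more conditions
    pending: bool = False                            # trigger fired, conditions not yet met

    def on_trigger(self, fired, state):
        """Remember that the trigger fired, then run if the conditions allow it."""
        if fired == self.trigger:
            self.pending = True
        return self.try_run(state)

    def try_run(self, state):
        """Run only when the trigger has fired and ALL conditions are satisfied."""
        if self.pending and all(CONDITIONS[name](state) for name in self.conditions):
            self.pending = False
            return True
        return False
```

In this model a task whose condition is not met when the trigger fires stays pending and runs as soon as the condition becomes true, which mirrors the behaviour described in the text.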

A task must be registered only once, but every time our application is stopped and started again we need to subscribe again to the Progress and Completed events. To be able to do this, the application can iterate through all the background tasks it has registered and, for each of them, subscribe to the Progress and Completed events.

Each background task registered by our application has to be declared in the manifest file. If it is not declared there, it won't run at all. For each background task declared in the manifest we need to specify what type of task it supports and its entry point.

Depending on the trigger, some background tasks may require the application to be on the lock screen. Applications that are on the lock screen are privileged and have access to more resources than a normal application. Therefore the number of applications that can be on this screen is limited.

The resources to which a background task has access are limited. In some ways, background tasks have been called the Windows Services of Metro style apps. Compared to Windows Services, however, background tasks have access to limited resources. These limitations have emerged from the desire to conserve the resources of a device; here we refer not only to the battery but also to other resources, such as the internet connection. The biggest limitation is the CPU. Each background task that belongs to an application on the lock screen is entitled to 2 CPU seconds every 15 minutes. For applications that are not on the lock screen, each background task is entitled to 1 CPU second every 2 hours.

Figure 3. Processing limits for a background task

Figure 4. WiFi usage constraints

If you have asynchronous calls to external resources (e.g. a call to a service on a server) you shouldn't worry. Before such calls you need to get an instance of a BackgroundTaskDeferral object.

    public async void Run(IBackgroundTaskInstance taskInstance)
    {
        // Keep the task alive while asynchronous work is pending.
        BackgroundTaskDeferral deferral = taskInstance.GetDeferral();
        var result = await SomeActionAsync();
        // Signal that all asynchronous work has finished.
        deferral.Complete();
    }

The time spent waiting for the service to send the response is not counted. Similar limitations apply to network access and to the maximum traffic allowed in a given interval. But the biggest constraint remains the CPU.

Step by step, the deferral pattern looks like this:

    public async void Run(IBackgroundTaskInstance taskInstance)
    {
        //
        // Create the deferral by requesting it from the task instance.
        //
        BackgroundTaskDeferral deferral = taskInstance.GetDeferral();

        //
        // Call asynchronous method(s) using the await keyword.
        //
        var result = await ExampleMethodAsync();

        //
        // Once the asynchronous method(s) are done, close the deferral.
        //
        deferral.Complete();
    }

Figure 3. Registration and background task call

Background tasks can help a lot, but they should be used only when we really need them. Their limitations are designed to save as many resources as possible.
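The CPU quotas quoted earlier (2 CPU seconds every 15 minutes for lock screen applications, 1 CPU second every 2 hours otherwise) can be pictured with a small budget tracker. The Python sketch below is only a mental model: how Windows actually accounts for CPU time is not described in the article, so the fixed accounting windows, the `CpuQuota` class and its `charge` method are assumptions.

```python
# (budget in CPU seconds, window length in seconds), keyed by "is on the lock screen"
QUOTAS = {
    True: (2.0, 15 * 60),
    False: (1.0, 2 * 60 * 60),
}


class CpuQuota:
    """Tracks CPU seconds used by one background task within fixed windows."""

    def __init__(self, on_lock_screen):
        self.budget, self.window = QUOTAS[on_lock_screen]
        self.window_start = 0.0
        self.used = 0.0

    def charge(self, now, cpu_seconds):
        """Account for CPU work at time `now`; return False if it exceeds the budget.

        Time spent waiting on asynchronous calls is not CPU time, so callers
        pass only the seconds actually spent computing.
        """
        elapsed = now - self.window_start
        if elapsed >= self.window:
            # Move to the window that contains `now` and reset the usage.
            self.window_start = now - elapsed % self.window
            self.used = 0.0
        if self.used + cpu_seconds > self.budget:
            return False
        self.used += cpu_seconds
        return True
```

Under this model a lock screen task that has already used 1.5 CPU seconds cannot spend another full second until the next 15-minute window begins.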


management


Microsoft Project and the Agile Projects

Florian Ivan, PMP
florian.ivan@rolf-consulting.com
Project MVP

All those who have interacted with Microsoft Project, no matter how little, agree unanimously that it is a very powerful scheduling tool. In other words, if we define and detail the tasks, it is very good at building a plan (the correct term is schedule), better than anything else. But, in order to define the tasks clearly, we need to know exactly what we expect from the project: what it has to deliver, under what conditions, which standards it must comply with etc. It sounds easy, doesn't it? In practice, we all know how complicated (impossible?) it is to define the purpose of the project and its deliverables from the very beginning. In this case, how could Microsoft Project help me if I (or my client) don't know what I want from it?

This is the main reason behind the Agile methodologies or movements: in many domains, especially in software development, it is very difficult to know precisely from the start what you have to deliver. All we have to do is think about a website whose development lasts several months, and it is already obvious that planning it thoroughly from the start is impossible. Briefly, Agile (and of course its derivatives Scrum, XP, FDD etc.) teaches us that change is welcome, that it is alright to start the project without knowing too much about what is going to be delivered and, most importantly, that the product or the project deliverable is defined along the way. It looks like magic but it is actually just a change of paradigm, which introduces the term of iterative planning. This concept is graphically explained in the image below.

Things look easier now, don't they? We plan a little, as much as we know, we execute what we have planned, and then we start all over again. And we continue like this until we finish what we have to do or until the client runs out of ideas. Agile is indeed a convenient solution for software companies, helping them put things right inside projects that otherwise seem confusing. It helps with scope management, since the scope is defined incrementally, function by function. It is useful for planning, which is also made up of pieces, but it is useful especially for implementing what has been planned.

Therefore, Agile gives us a frame to define the project scope, to plan, to track the progress and then to start all over again.

In order to get an even better idea of Agile, we can contrast it with a different methodology. The opposite of Agile is called waterfall. In a waterfall methodology we know from the very beginning what has to be delivered (the project scope), the deadlines we have to comply with (a very precise schedule) and, of course, the budget limits we must fit in. There are many projects in many areas where the waterfall approach is extremely suitable, especially where this exactitude is needed from the start and the sponsor of the project or the background conditions enforce such rigor. Since things are different in software development, Agile steps in.

Now that we have defined the processes, it is necessary to get to the next level, to standardize and automate them. This is where Microsoft Project enters the stage.

As stated above, this software is used and appreciated for its scheduling capabilities. Of course it can be used for other things, such as resource management or reporting, but its main utility is to manage and automate the complexity of the tasks in a project. There will be no details here about portfolios or collaborative work, which are characteristic of Project Server, but only descriptions of the functionality of Project Professional. What does a Project Manager do when he opens Microsoft Project? Experience shows that 90% of them start by writing a list of tasks that they put in a sequence to look nice in a Gantt chart. In order to do this, we need to know the list of tasks, which derives from a previously known project scope. As argued above, this does not happen in software development. That is why we use Agile approaches. And yet, how can we use Microsoft Project in this scenario?

Iterative planning has always been part of project management, no matter how agile the methodology used. Therefore, in any project we can have planning, execution and planning again. Using a software application specialized in project management doesn't change the uncertainty and doesn't impose a rigor which is not characteristic of our projects. The image below shows this cycle of planning and iterative execution in Microsoft Project. As can be seen in the image, after having planned as much as possible, we move on to execution, then we come back to planning. This planning cycle is what PMI calls rolling wave planning.

Imposing rigorous planning when we don't have enough data brings nothing but an element of risk, caused by unrealistic estimations. Still, no clear planning-execution cycle results from the image above. As the Gantt chart shows, replanning looks like going back in time, which is not possible! So, in order to reflect reality as precisely as possible, we can create tasks per iteration, as shown below.

In practice, tasks are defined in Microsoft Project for every iteration, both for planning and for execution. In the example given above, as many iterations as the project needs can be added. This is done by inserting a set of tasks for planning and execution and linking the new iteration to the previously defined one.
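The idea of adding iterations one after another, each made of a planning task and an execution task linked to the previous iteration, can be sketched outside Microsoft Project as well. The Python example below is an illustration only and does not use any Microsoft Project API; the `Task` class, the `add_iteration` helper, the durations and the start date are hypothetical, and durations are counted in calendar days without weekends or holidays.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Task:
    name: str
    start: date
    finish: date


def add_iteration(schedule, number, planning_days, execution_days, project_start):
    """Append a planning task and an execution task for one iteration.

    Each task starts the day after its predecessor finishes (finish-to-start
    link), so a new iteration is bound to the previously defined one.
    """
    start = schedule[-1].finish + timedelta(days=1) if schedule else project_start
    for label, days in (("planning", planning_days), ("execution", execution_days)):
        finish = start + timedelta(days=days - 1)
        schedule.append(Task(f"Iteration {number} {label}", start, finish))
        start = finish + timedelta(days=1)
    return schedule


if __name__ == "__main__":
    plan = []
    add_iteration(plan, 1, planning_days=2, execution_days=10, project_start=date(2012, 9, 3))
    add_iteration(plan, 2, planning_days=2, execution_days=10, project_start=date(2012, 9, 3))
    for task in plan:
        print(task.name, task.start, task.finish)
```

Because every iteration is appended after the last finish date, replanning never appears to go back in time, which is the property the per-iteration tasks are meant to restore.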


Following this approach, an Agile project can be planned using a tool which was traditionally associated with waterfall projects. Furthermore, Microsoft Project allows resource assignment and task progress monitoring. It is extremely important not to give up, and to continue monitoring the project after the plan (or schedule) has been defined. During execution, other unknown elements may appear which could lead to replanning or to modifying existing tasks. It is enough to think about modifying a task in a project with hundreds of similar tasks to see the importance of Microsoft Project in planning and monitoring projects. If this hasn't convinced you yet, you should know that it integrates natively with the Visual Studio suite, so that development teams find their assigned tasks directly in the development environment.


others

Gogu

Gogu set the alarm, turned off the lights and stepped out of the building. He nodded to the doorman, who was smoking idly in front of the door as if he owned the dozens of offices standing in the darkness. Gogu choked back a sigh: “It’s not like someone made me stay until this hour, like the doorman.” Actually, it was pride that had made him stay, “but that doesn’t count”. Some other time he would have smiled at his own remark, but now he was too bitter: he could not believe that a “child” had been sent to give directions to him, who had been with the company for seven years, “well, six and a half”, he corrected himself. But it was not about that. It was about the fact that he, Gogu, not only knew everything that moved in the company, but was also the author of the procedures now being reviewed by the “child”, “procedures that would not have existed if I hadn’t worn myself out an entire summer three years ago.” And he had been congratulated by the company’s GM himself, the very man now responsible for bringing in the “child” with his optimization ideas. “And Misu is on vacation right now...” The remark started in his mind the movie of Misu’s departure: first the image of Misu hesitating before leaving and then, immediately, the image of Gogu saying confidently: “Go in peace, Misu, I can handle it, there’s no problem!” “Yeah, what a man does with his own hands...”

It was dark outside and there were only a few passers-by, but he lived nearby, a stone’s throw away. He had stayed overtime to recheck the file describing the business processes and to print the latest version, “because yes,” he went back to his monologue, “we have made optimizations, almost everywhere, hence the number of versions. But experienced employees have contributed, people who know exactly how things work around here. Uff...” He noticed the strange look of a passer-by and realized that his interior monologue had turned into a kind of unintelligible gibberish… “I’m not far from going crazy.”

He entered the staircase and the smell of roast hit him; it banished any thought about processes, procedures or the “child’s” optimization ideas. “Baked chicken,” he said to himself... “oh, and she makes it so good!” A smile of contentment spread across his face. He loved his wife very much and he loved just as much the meals that she cooked for him.

- Look at how he grins at me… where were you until this late hour? It’s like you married the company, not me. The company should be the one to feed you if you stay there until late!

But her eyes were smiling and Gogu felt her love in every word. While getting ready for dinner, he told his wife about the whole mess with the optimization and honestly reported the “child’s” age – he had just turned thirty – but his gloom had already passed; he was only thinking about food. When the tray with the baked chicken took its royal place on the table, it fully captured the attention of Gogu, who could not stop praising it.

- But it has no rump?! Why did you cut it? You usually keep it there...
- I didn’t think too much, I cut it because I knew that is how it’s done. That’s how my mother does it.
- But why?
- Gogu, are you hungry or not? Why does it matter, I cut it and that’s it.
- I am hungry and I am going to eat, but I want to understand – said Gogu while reaching for the phone. - How come we don’t know why we do what we do? he added laughing, while searching for the number of his mother-in-law.

He knew it was the time of her favorite series and he wouldn’t have missed for the world the chance “to take her out of the atmosphere”, as his mother-in-law put it. Plus, there was no chance they would talk too long, unless - God forbid! – he caught the advertising break. The answer took him by surprise:

- You don’t cut the rump, where did that come from?!
- But mother, you always cut it when we baked chicken at home! the wife jumped in, under the pressure of the injustice that was being done to her.
- Oh, but back then we had that little tray where the entire chicken didn’t fit. That’s why I used to cut the rump! Can I get back to the series now? It’s not like you would consider calling during the advertising...

The discussion with the mother-in-law was short – obviously, since she was watching the series – and it entertained him, “hi-hi, I like annoying her a little bit”, but it also made him think. The constraints had changed, but the habit remained… “What a thing... if you are deeply involved, you don’t realize that some things have changed and you continue to work as you used to. On the other hand, a “child” from outside can easily see when old habits are no longer necessary… As I saw the rump…”

Simona Bonghez, Ph.D.

simona.bonghez@confucius.ro
Speaker, trainer and project management consultant
Owner of Confucius Consulting



...even mammoths can be agile…

Launch of issue #5 of Today Software Magazine
21 September 2012

What is the connection between mammoths and Agile approaches? We invite you to find out on 21 September 2012, at City Plaza in Cluj Napoca (Str. Sindicatelor 9-13).

Confucius Consulting, in partnership with Today Software Magazine, is hosting an event that will clarify this dilemma and also shed light on project management approaches in the software industry.

On this occasion, issue 5 of Today Software Magazine will also be launched, and some of the magazine's established authors will answer participants' questions about adaptability and flexibility.

The invited speakers will talk about Agile methodologies in project management, their advantages compared with traditional approaches, the difficulties of implementing them in organizations, and the skills needed to manage them efficiently.

Participation is free, but only on the basis of registration, as the number of seats is limited. For more details and to register, please write to us at training@confucius.ro or lansare@todaysoftmag.com.

We look forward to finding out together whether ... mammoths too can be AGILE...

08:30 - 09:00 Participant registration
09:00 - 09:15 Event opening
09:15 - 10:30 Invited speakers
10:30 - 10:45 Coffee break
10:45 - 11:30 Adrian Lupei: How to pay off your technical debt
11:30 - 12:15 Invited speaker
12:15 - 13:00 Lunch break
13:00 - 14:00 TSM speakers
14:00 - 14:15 Raffle prizes awarded




