

Contents

Admin
Volatility: The Open Source Framework for Memory Forensics
Network Performance Monitoring and Tuning in Linux

Developers
Using Python for Data Mining
A Few Fundamentals of Ruby Programming
Choosing the Right Open Source Programming Language
Create a Web Database in App Inventor 2
Faster File Search with Python
Let’s Get Acquainted with PHP
Julia: A Language that Walks Like Python, Runs Like C
Start Programming on Raspberry Pi with Python
Build a Website Using Bootstrap and the Express.js Framework

For U & Me
Programming in R
Audacity: Yet Another Tool for Speech Signal Analysis
Have Some Fun Converting Plain Text to Handwritten Text
The Pros and Cons of Open Source Programming Languages
Open Source Solutions that Accelerate Adoption of Cognitive Automation

OpenGurus
Python Programming for Digital Forensics and Security Analysis

Columns
CodeSport

Regular Features
New Products
FOSSBytes

4 | September 2016 | OpeN SOUrCe FOr YOU | www.OpenSourceForU.com

Editorial Calendar
Tips & Tricks

Editor
Rahul Chopra

Editorial, Subscriptions & Advertising
Delhi (HQ): D-87/1, Okhla Industrial Area, Phase I, New Delhi 110020; Ph: (011) 26810602, 26810603; Fax: 26817563; E-mail: info@efy.in

Missing Issues
E-mail: support@efy.in

Back Issues
Kits ‘n’ Spares, New Delhi 110020; Ph: (011) 26371661, 26371662; E-mail: info@kitsnspares.com

Newsstand Distribution
Ph: 011-40596600; E-mail: efycirc@efy.in

Advertisements
Mumbai: Ph: (022) 24950047, 24928520; E-mail: efymum@efy.in

An Introduction to the Go Programming Language

Bengaluru: Ph: (080) 25260394, 25260023; E-mail: efyblr@efy.in
Pune: Ph: 08800295610/09870682995; E-mail: efypune@efy.in
Gujarat: Ph: (079) 61344948; E-mail: efyahd@efy.in

CaseStudy

“We are transforming the classroom culture in many ways”
Anant Agarwal, CEO, edX

“Docker is a boon to developers”
Neependra Khare, Docker captain

DVD of the Month

September 2016

Enjoy a cool Linux distro on your computer.

Recommended System Requirements: P4, 1GB RAM, DVD-ROM Drive

In case this DVD does not work properly, write to us at support@efy.in for a free replacement.

Note: Any objectionable material, if found on the disc, is unintended, and should be attributed to the complex nature of Internet data.

Government Leverages Open Source to Build DigiLocker for Indian Citizens

CD Team e-mail: cdteam@efy.in


• PCLinuxOS FullMonty KDE64 2016 Desktop (Live)
• PCLinuxOS LXQT 2016 Desktop (Community)

China: Power Pioneer Group Inc.; Ph: (86 755) 83729797, (86) 13923802595; E-mail: powerpioneer@efy.in
Japan: Tandem Inc.; Ph: 81-3-3541-4166; E-mail: tandem@efy.in
Singapore: Publicitas Singapore Pte Ltd; Ph: +65-6836 2272; E-mail: publicitas@efy.in
Taiwan: J.K. Media; Ph: 886-2-87726780 ext. 10; E-mail: jkmedia@efy.in
United States: E & Tech Media; Ph: +1 860 536 6677; E-mail: veroniquelamarque@gmail.com

Printed, published and owned by Ramesh Chopra. Printed at Tara Art Printers Pvt Ltd, A-46,47, Sec-5, Noida, on 28th of the previous month, and published from D-87/1, Okhla Industrial Area, Phase I, New Delhi 110020. Copyright © 2016. All articles in this issue, except for interviews, verbatim quotes, or unless otherwise explicitly mentioned, will be released under the Creative Commons Attribution-NonCommercial 3.0 Unported License a month after the date of publication. Refer to http://creativecommons.org/licenses/by-nc/3.0/ for a copy of the licence. Although every effort is made to ensure accuracy, no responsibility whatsoever is taken for any loss due to publishing errors. Articles that cannot be used are returned to the authors if accompanied by a self-addressed and sufficiently stamped envelope. But no responsibility is taken for any loss or delay in returning the material. Disputes, if any, will be settled in a New Delhi court only.

Subscription Rates

Year     Newsstand Price (₹)    You Pay (₹)    Overseas
Five     7200                   4320           -
Three    4320                   3030           -
One      1440                   1150           US$ 120

Kindly add ₹ 50/- for outside Delhi cheques. Please send payments only in favour of EFY Enterprises Pvt Ltd. Non-receipt of copies may be reported to support@efy.in; do mention your subscription number.


New Products

Digisol’s 8-port fast Ethernet switch

Digisol, the networking and surveillance brand of Smartlink Network Systems, has launched its 8-port fast Ethernet power-over-Ethernet (PoE) unmanaged switch, which is designed to enhance network performance in a compact form factor. The DG-FS1008PH-A switch offers 8 x 10/100Mbps fast Ethernet ports with four PoE ports, which are IEEE802.3af compliant and can supply PoE power to PoE devices. The existing Ethernet cables can be used to power up IEEE802.3af compliant network devices, which eliminates the need for an external power source and power cabling. The device offers users the flexibility to connect either PoE or non-PoE devices. The switch uses ‘store and forward’ packet switching technology, which ensures reliable data transfer. It also supports automatic MDI/MDI-X detection, which eliminates the need for crossover cables or dedicated uplink ports. The feature-rich Digisol DG-FS1008PH-A is available at retail stores.

Price: ₹ 4,499

Address: Smartlink Network Systems Ltd, Plot No. 5, Kurla-Bandra Complex Road, Santa Cruz (E), Mumbai 400098; Ph: 91-22-30616666

High resolution headphones from Audio Technica

Japanese audio equipment manufacturer, Audio Technica, has recently launched its high definition headphones in India. These come with 45mm high-definition audio drivers, with a frequency range of 5Hz to 40,000Hz. The headphones have a detachable 1.2 metre cable, which has an inline microphone and lets users handle incoming calls and music playback easily on compatible smartphones. The headphones offer maximum input power of 1,500mW, sensitivity of 100dB/mW and 35 ohms impedance. These lightweight headphones are easy to carry around. They are designed with flexible, soft, swivel foam earpads, and a headband that gives a comfortable and secure fit on the ears for listening to many hours of music. Available in black, white, navy and brown, the headphones can be purchased via Amazon.in.

Price: ₹ 10,999

Address: Amazon India, Brigade Gateway, 8th Floor, 26/1, Dr Rajkumar Road, Malleshwaram West, Bengaluru, Karnataka – 560055; Ph: 1800-3000-9009

All-in-one portable speaker from Manzana

Computer peripherals manufacturer, Manzana, has launched its latest all-in-one Bluetooth speaker in India called the Manzana Drumbazz. Handy and equipped with a shoulder strap for easy portability, this speaker is perfect for outdoor activities and social outings. It comes with a 10.16cm (4 inch) sub-woofer for a powerful bass output. The Manzana Drumbazz can play music via USB flash drives and microSD cards. It is also equipped with inbuilt FM radio and can be paired with any Bluetooth device. Backed with a 2,600mAh battery, it supposedly delivers up to seven hours of playback on a single charge. The device can be connected to an aux input and also supports Bluetooth connectivity. The Manzana Drumbazz is black in colour and is available via Amazon and Flipkart.

Price: ₹ 4,199

Address: Manzana, G-35, Apollo Industrial Estate, Mahakali Caves Road, Near MIDL, Andheri (E), Mumbai 400093; Ph: 022-67250575/76


Pocket-friendly smartphones from Xiaomi

Chinese smartphone manufacturer, Xiaomi, has launched its budget smartphones with the latest features in India. The phones come in two variants: Xiaomi Redmi 3s and Xiaomi Redmi 3s Prime. The Redmi 3s comes with 2GB RAM and 16GB inbuilt storage, while the Redmi 3s Prime comes with 3GB RAM and 32GB inbuilt storage (expandable up to 128GB via microSD card), along with a fingerprint sensor. Both the variants feature a 12.7cm (5 inch) HD (720x1280 pixels) IPS display, run MIUI 7.5 based on Android 6.0 Marshmallow, and have the new octa-core Qualcomm Snapdragon 430 processor (with four cores clocked at 1.1GHz and the other four at 1.4GHz) with an Adreno 505 GPU. According to company sources, these are to be the first smartphones featuring the new Snapdragon 430 chipset in India. The devices sport a 13 megapixel rear camera with PDAF (phase detection auto-focus), an f/2.0 aperture, HDR mode, 1080p video recording and LED flash, along with a 5 megapixel front camera with 1080p video recording. The connectivity options of the handsets include 4G, Wi-Fi 802.11 b/g/n, GPRS/EDGE, Bluetooth, GPS/A-GPS, Glonass and micro USB. Backed with a 4000mAh battery, the Xiaomi Redmi 3s and Redmi 3s Prime are available via Mi.com and Flipkart.

Price: ₹ 6,999 for Redmi 3s and ₹ 8,999 for Redmi 3s Prime

Address: Xiaomi India Pvt Ltd, 8th Floor, Tower-1, Umiya Business Bay, Marathahalli – Sarjapur Outer Ring Road, Bengaluru, Karnataka 560103; Email: service.in@xiaomi.com; Website: www.mi.com

Sony launches power banks with pass through charging

Japanese tech giant, Sony, has launched two power banks in India with capacities of 15,000mAh and 20,000mAh. Both models come with fast-charging output and can charge multiple devices at a time. With lithium-ion polymer technology, the devices support pass through charging, which means that the device connected to the power bank can be charged even while the power bank itself is being charged. According to company sources, the devices retain 90 per cent of their capacity even after 1000 charge cycles. Manufactured using Sony’s Hybrid Gel technology, the 15,000mAh variant comes in silver and black, and the 20,000mAh variant is available only in black. The power banks are available via Flipkart and retail stores.

Price: ₹ 5,100 for the 15,000mAh variant and ₹ 7,500 for the 20,000mAh variant

Address: Sony India, No-A-31, Mohan Co-operative Industrial Estate, Mathura Road, New Delhi – 110044; Ph: 011-66006600

Sony launches smartphone for selfie lovers

After the success of the Xperia series, Japanese tech giant Sony has launched a smartphone targeting selfie lovers. The Xperia XA Ultra sports a 15.2cm (6 inch) full HD 1080p display and runs Android 6.0 Marshmallow. Backed with a 2,700mAh battery, the device comes with a 2.0GHz 64-bit octa-core processor. It has 3GB RAM and 16GB of internal memory that can be expanded up to 200GB via microSD card. A much hyped feature of the smartphone is its 16 megapixel front camera, with an ‘optical image stabiliser’ and a smart selfie flash, along with a 21 megapixel auto focus rear camera that comes with quick launch capabilities, enabling users to capture images in split seconds. The connectivity options of the device include Wi-Fi, Bluetooth 4.1, NFC and LTE 4G/3G. Available in white, graphite black and lime gold colours, the Sony Xperia XA Ultra can be purchased via online and retail stores.

Price: ₹ 29,900

Address: Sony India, No-A-31, Mohan Co-operative Industrial Estate, Mathura Road, New Delhi – 110044; Ph: 011-66006600

The prices, features and specifications are based on information provided to us, or as available on various websites and portals. OSFY cannot vouch for their accuracy.

Compiled by: Aashima Sharma



FOSSBytes

Compiled by: Jagmeet Singh

LibreOffice 5.2 launched with tons of new features

The Document Foundation has released LibreOffice 5.2. This newest version has several new features to compete against proprietary solutions including Microsoft Office. Among them, document classification comes first. This feature is designed to classify your documents as per the TSCP standard. The suite automatically stores the classification level in the XML code of the document.

Apart from the classification, the updated LibreOffice comes bundled with some financial features. There is a forecast function to predict future financial details through the Calc spreadsheet program. The suite now also supports multiple signature descriptions alongside the ability to import and export signatures from OOXML files.

The Document Foundation has additionally provided some improved interoperability features that include better import filters within Writer as well as support for Word for DOS legacy documents. To enhance security over cloud storage, LibreOffice 5.2 has two-factor authentication for Google Drive. The office package also sports program modules with new drawing tools such as filled curves, polygons and freeform lines.

LibreOffice is already one of the leading open source office suite alternatives to Microsoft Office. The Document Foundation estimates that it has been downloaded over 140 million times since its launch in 2011. Nearly 80 people contribute each month to improve this community-driven project.

The latest version of LibreOffice comes in new packaging formats to deliver an upgraded experience. You can leverage its features through Snap on your Linux-based desktop. Going forward, open source enabler The Document Foundation is set to launch LibreOffice Online. This new Web-based offering will take on Google Docs by letting users access its features directly from their servers.

“LibreOffice is growing fast, thanks to distinctive advantages such as the standard document format, which is recognised by a growing number of governments as the best solution for interoperability,” said Thorsten Behrens, one of the directors of The Document Foundation. The new release will get upgraded to LibreOffice 5.3 in February next year. Meanwhile, you can experience its features by installing the free package on your Windows, Mac OS or GNU/Linux system.

Black Duck launches Centre for Open Source Research and Innovation

Black Duck has announced the launch of its Centre for Open Source Research and Innovation with an aim to increase the use of open source software for application development. The new centre will be based at Black Duck’s Massachusetts headquarters, but the two new Black Duck research groups in Canada and Europe are likely to play major roles in the application release and development cycle.

“Open source is the way today’s applications are developed, and we expect the worldwide adoption will continue to accelerate because of the compelling economic and productivity benefits open source provides. Over the next decade, more cutting-edge research, innovation, information and education, particularly related to open source security, are needed to ensure the open source ecosystem remains vibrant. We will be a leader in that effort,” said Lou Shipley, chief executive officer, Black Duck.

Europe-based Black Duck Security Research analyses security issues and attack patterns in open source software to offer actionable information on vulnerabilities. It also suggests steps to reduce risk and strategies for using the community-backed solutions effectively. The second group, based in Vancouver, Canada, conducts applied research in data mining, machine learning, natural language processing, big data management and software engineering.

“Both groups will be sources of valuable research and reports throughout the year. Their work will help us innovate and improve our open source security and management solutions, and a great deal of what they do will also be shared for the benefit of the open source community,” Shipley said.

Black Duck aims to issue periodic Open Source Security Audit (OSSA) reports analysing results of applications audited by the company. Additionally, the company is set to enhance its open source repository and database, KnowledgeBase, and expand its online community, Open Hub.

Parrot Security OS 3.1 released with Linux kernel 4.6 and PHP 7 support

To deliver an advanced experience to security professionals and ethical hackers around the world, Parrot Developers has released Parrot Security OS 3.1. The new Debian-based operating system works on Linux kernel 4.6 and comes preloaded with PHP 7 support.

As a successor to Parrot Security OS 3.0 that was released a month back, Parrot Security OS 3.1 comes with the new Linux kernel. The group behind the Linux distribution has also added Qt Creator 4.0.2 and Qt framework 5.6.1 to help white hat hackers. Likewise, there is support for GCC (GNU Compiler Collection) 4.8.5, 4.9.3, 5.4.0 and 6.1.1 as well as Clang 3.6-33 and 3.8-2.

To improve the user experience, the new Parrot Security OS version has switched from MySQL to the MariaDB database engine. There are some stability improvements and bug fixes to improve the overall platform for security assessments and penetration testing.

Apart from the major system-wide changes, the updated Parrot Security OS has some tweaks on the interface front. There is an updated Parrot core and menu in addition to some new tools. Driver support on the platform has also been improved to let hackers use their new hardware with the operating system. Besides, the platform comes with new zuluCrypt and AnonSurf modules and includes the Tor Browser launcher.

You can download Parrot Security OS 3.1 on your system directly from the official Parrot Developers website.

Asttecs expands global footprint with new offices in Africa

Asttecs has announced that it has expanded its footprint in Africa by launching local offices in Ghana and Botswana. The enterprise telecom company also has plans to extend its reach to countries like Rwanda, Kenya, Mozambique, Zimbabwe and South Africa. Bengaluru-headquartered Asttecs aims to deliver open source solutions in some of the main African markets to cater to organisations across multiple verticals in the continent.

“We are focusing significantly on developing our partner ecosystem for increased market penetration across the globe. As part of the global expansion programme, Asttecs is strengthening its presence in recognition of the huge potential of the African market,” said Binish V.J., head – international business, Asttecs, in a statement. “Our business strategy in Africa focuses on understanding the specific needs of the customers and delivering cutting-edge solutions, especially in the enterprise communications segments,” he added.

Asttecs considers that extensive development of communications infrastructure and changing business and ICT trends have created new opportunities in African countries. Therefore, the company has found ways to expand its presence and offer a broad range of cost-effective, flexible and versatile communication solutions. Through the power of open source and IP, Asttecs develops enterprise solutions. It has Asterisk diallers and various other open source communication solutions that result in efficient operations with lower TCO, yielding quick returns on investment.

Google, Intel and Mirantis sketch out future of OpenStack

OpenStack distributor Mirantis has partnered with Google and Intel to take the cloud platform to the next level. The new partnership is aimed at packaging OpenStack into Docker containers by leveraging container cluster manager Kubernetes.

Container support is not new for OpenStack. However, the latest collaboration is designed to give users precise control over the placement of services used for the OpenStack control plane. The new development will also provide a ‘self-healing’ experience on the OpenStack control plane and open new avenues for Kubernetes-powered cloud solutions such as Google Cloud.

“With the emergence of Docker as the standard container image format and Kubernetes as the standard for container orchestration, we are finally seeing continuity in how people approach operations of distributed applications. Combining Kubernetes and Fuel will open OpenStack up to a new delivery model that allows faster consumption of updates, helping customers get to outcomes faster,” said Boris Renski, Mirantis CEO, in a statement.

Google believes its tie-up with Mirantis and Intel will bridge the gap between legacy infrastructure software and the next generation of application development. “Many enterprises will benefit from using containers and sophisticated cluster management as the foundation for resilient, highly scalable infrastructure,” said Craig McLuckie, Google’s senior product manager.

The trio is set to utilise the OpenStack Fuel deployment tool to empower Kubernetes. The tool will reduce the burden of running Docker containers on top-end Linux hardware. Additionally, it will bring Kubernetes into the limelight for various OpenStack developments. The results of the new announcement are expected to be ready in 2017.

Microsoft’s Chakra JavaScript engine debuts on Linux

Microsoft has announced that it is bringing its open source ChakraCore to Linux. The core of the Chakra JavaScript engine is already powering the company’s Edge Web browser and Universal Windows Platform system. At the Node.js-centric NodeSummit, Microsoft showcased the first experimental implementation of the ChakraCore interpreter and runtime on x64 Linux and OS X 10.9. The Redmond company also previewed an experimental Node.js version with the ChakraCore engine on x64 Linux.

“Bringing ChakraCore to Linux and OS X is all about giving developers the ability to build cross-platform applications with the engine. The JavaScript Runtime (JSRT) APIs to host ChakraCore were originally designed for Windows, so they inevitably had a few Windows dependencies – for example, Win32 usage of UTF16-LE encoding for strings, where other platforms might use UTF8-encoded strings,” Microsoft developers wrote in a blog post.

To deliver cross-platform support on ChakraCore, Microsoft refactored and redesigned some of the major JSRT APIs. Additionally, the JavaScript engine has backwards compatibility with the previous set of JSRT APIs on Windows.

Apart from bringing the same JavaScript experience to Linux that was previously available only on Windows 10, Microsoft has bigger plans. “It has been a little over a year since we started working on Node-ChakraCore, with the intention to grow the reach of the Node.js ecosystem. One of the fundamental goals of this project from the beginning has been to ensure that the existing ecosystem continues to work, in an open and cross-platform way, exactly like Node.js,” sources in the company added.

Microsoft is updating its ChakraCore roadmap to enhance its cross-platform support. Moreover, the software giant is working to bring a ‘fully-capable’ ChakraCore JIT compiler and the currently Windows-exclusive concurrent and partial GC to other platforms. These new efforts are likely to improve the performance of Node.js and other applications hosting the ChakraCore engine.

Developers have started using ChakraCore for their projects on GitHub. These developments help Microsoft fix and advance the engine for multiple computing platforms.
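The UTF16-LE versus UTF8 difference mentioned in Microsoft's blog post can be seen in a few lines of Python. This is only an illustrative sketch: the helper name `utf16le_to_utf8` is hypothetical and is not part of the JSRT APIs. It merely shows why a host written against one string encoding needs a conversion step on platforms that prefer the other.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Re-encode a UTF-16-LE byte string as UTF-8 (hypothetical helper)."""
    return raw.decode("utf-16-le").encode("utf-8")


text = "ChakraCore"
as_utf16 = text.encode("utf-16-le")  # two bytes per ASCII character
as_utf8 = text.encode("utf-8")       # one byte per ASCII character

# The same string has different byte lengths in the two encodings.
print(len(as_utf16), len(as_utf8))

# After conversion, the bytes match what a UTF-8 platform would expect.
print(utf16le_to_utf8(as_utf16) == as_utf8)
```

The same text occupies a different number of bytes in each encoding, so an API that passes raw string buffers across the host boundary has to state which encoding it expects.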

Google tweaks Linux kernel in Android to enhance security

Google has enabled new protection features on the Linux kernel within Android to increase the security of the open source platform. The features are classified into two different categories: memory protection and attack surface reduction.

“Android relies on the Linux kernel for enforcement of its security model. To better protect the kernel, we’ve enabled a number of mechanisms within Android,” Android security team member Jeff Vander Stoep wrote in a blog post.

Among the new changes, there is memory protection for user space processes through address space separation. This tweak enables the Linux kernel to maintain its integrity despite some vulnerabilities within the unrelated portions of the system’s memory. Google has also provided a feature to segment kernel memory into logical sections and set access permissions on each section. In addition, the kernel’s direct access to user space memory has been restricted. “This can make a number of attacks more difficult because attackers have significantly less control over kernel memory that is executable,” Vander Stoep said.

Google has also reduced the attack surface to expose fewer entry points to the kernel. For this, the operating system now restricts access to the kernel’s perf system. Vander Stoep revealed that Android Nougat will block access to perf by default. However, there will be an option to use the perf system via developer settings.

The kernel additionally restricts app access to ioctl commands. These commands previously helped attackers gain backdoor access to Android. Although developers do not use most of the ioctl commands, some third-party apps access them. Thus, Android Nougat will carry a precise whitelist of socket ioctl commands. The new operating system will also come with seccomp to enable an additional sandboxing mechanism.

“Due to these efforts and others, we expect the security of the kernel to continue improving,” Vander Stoep concluded.

Facebook releases open source tool to ease React developments

Facebook brought out React as its open source JavaScript library to let developers build user interfaces the way the company does. But as development using the library became tough for many individual developers and enterprises due to ‘an overwhelming explosion of tools’, the social networking giant has now released the Create React App. Having evolved from a hackathon project, Create React App provides an official way to create single-page React apps.

The tool is designed to extend the presence of React without abandoning the company’s open source philosophy, which became apparent three years back with the release of the JavaScript library in May 2013. “Create React App is a new officially supported way to create single-page React applications. It offers a modern build set-up with no configuration,” Facebook’s Dan Abramov explained in a blog post.

Facebook has designed Create React App in such a way that it leverages both the Webpack asset-bundling tool and the Babel JavaScript compiler. Also, there is the ESLint code-linting tool to list out all the lint warnings right from the console. The build settings in the Create React App tool are pre-configured, and there is just a single dependency for developing multiple React apps. Moreover, Facebook has provided all the necessary instructions on GitHub to let developers easily build their apps.

“This is an experiment, and only time will tell if it becomes a popular way of creating and building React apps, or fades into obscurity,” Abramov stated. Although Facebook is looking forward to influencing some developers through its Create React App, Abramov’s statement hints at a cloudy future. The company still needs to make developers aware of the benefits of its JavaScript library. Also, basic features such as code testing are yet to arrive on the open source tool.

That being said, Facebook’s Create React App would be a productive tool for all those developers who want to ease their development efforts on the framework that is already available on multiple platforms including Android, iOS and Windows 10. The new tool will also result in the development of many new Web apps in the future.

Google develops an open source API to ease password management on Android

Google has announced the development of a new open source API project that will simplify password management on Android devices. This new move emerges from a partnership between the search giant and password manager Dashlane. Called Open Yolo (You only login once), the new open API is specifically designed for third-party Android developers to let them enable their apps with the ability to access passwords directly from password managers like Dashlane.

Although Dashlane is the founding firm behind the new project, other password managers such as 1Password, LastPass and mSecure will also get a chance to leverage the advancements from the open API. “To stay one step ahead of the market demand, Google and Dashlane are helping create a seamless, universally-acceptable Android app authentication solution to increase your online security,” Dashlane community manager, Malaika Nicholas, said in a statement.

Google and Dashlane are set to make Open Yolo available on other mobile platforms too. This would broadly change the way you use passwords today and, ultimately, transform security across mobile platforms. “In the future, we see this open API going beyond just Android devices, and becoming universally-implemented by apps and password managers across every platform and operating system,” Nicholas added.

Open Yolo is not the only tool by Google to upgrade password management on its Android operating system. The search giant added the Smart Lock feature to Android Lollipop last year, which keeps devices unlocked when these are close to the body of their users.

The release of Open Yolo appears to be a competitive move by Google to take on Apple. The Cupertino company has been providing its native password management system within iOS for the last several months. Users can store a large number of their passwords and grab them when required with just a single tap on the screen. However, unlike Apple’s attempt to store passwords under one roof, Open Yolo by Google is an open source project. This will help the search giant attract more interest and support from the community.

Ubuntu Touch starts supporting fingerprint scanners

Canonical has released Ubuntu Touch OTA-12 as its latest open source mobile platform. The new Ubuntu Touch version has native support for a fingerprint scanner to let users add an advanced security layer on top of their interface.

As biometric security solutions are the trend, Canonical addressed that demand and added the support for a fingerprint scanner. Though the support is available intrinsically within the new Ubuntu Touch version, the Meizu Pro 5 Ubuntu Edition is the only device so far to offer the new experience. Also, you are allowed to store any five of your fingerprints on the smartphone. This could be to restrict access to the device to just one user.

Apart from the support for fingerprint authentication, the new Ubuntu Touch version has convergence with an all-new libertine scope for converged devices, animated mouse cursors, MPRIS support for playlists and on-screen keyboard support for X apps. The operating system also lets you maximise windows both horizontally and vertically.

Canonical has bundled updates of its core apps with the latest Ubuntu Touch. There is the message forwarding option within the default messaging app. Additionally, there is the coloured emoji feature for the Ubuntu keyboard and wireless display support for Ubuntu tablets such as the Aquaris M10.

Ubuntu Touch OTA-12 has a new Oxide video player and an updated Web browser. The browser has got touch selection improvements and zoom support; it optimises tab view loading time and offers consistent page headers.

In addition to the new features, Ubuntu Touch OTA-12 comes with certain bug fixes. It fixes the screen dimming issue that was reported on some devices. Besides, there are WebRTC fixes for selecting cameras, connectivity fixes and some stability improvements.

You can check for the availability of the newest version of Ubuntu Touch on your smartphone or tablet. It comes as an over-the-air update for all the recent smartphone and tablet hardware.

Linux kernel 4.8 to natively support Microsoft Surface 3

While Linus Torvalds is busy developing Linux kernel 4.8, it has now been found that the next version of the open source operating system will natively support Microsoft’s Surface 3. The new support emerges from a touchscreen controller driver that will enable the last year’s Surface tablet to run Linux flawlessly. Dmitry Torokhov, Linux kernel input subsystem maintainer and one of the active contributors to Linux 4.8, in a recent email to Torvalds suggested a driver integration for the touchscreen controller found in the Surface 3. Moreover, a changelog available in the same email hints at a tweak to support the Surface Pen stylus that enhances the functionality of the Windows-running tablet. Support for the new drivers is apparently a part of the Linux kernel 4.8 first release candidate, which is already available for testing purposes. “This [Linux kernel 4.8 first release candidate] seems to be building up to be one of the bigger releases lately, but let’s see how it all ends up. The merge window has been fairly normal, although the patch itself looks somewhat unusual: over 20 per cent of the patch is documentation updates, due to conversion of the DRM and media documentation from DocBook to the Sphinx doc format,” Linux creator Torvalds wrote in a community email. Linux kernel 4.8 could bring the Surface 3 back to life. Interestingly, the new development will give Linux users an option to use Microsoft’s in-house hardware for its open source developments. 14 | September 2016 | OpeN SOUrCe FOr YOU | www.OpenSourceForU.com


Apart from Surface 3 support, the new Linux kernel version will come with some upgrades for new GPU models while offering various updates to support advanced architecture.

Attic Labs develops open source Noms database

Attic Labs has announced that it has developed the open source Noms database to deliver an experience similar to projects like Git and Camlistore. The San Francisco-based software company has raised US$ 1.8 million in a Series A round to maintain its decentralised database over the long run. Noms can be replicated and edited concurrently on multiple computers. This feature makes it a unique offering in the world of database systems. Additionally, the database stores structured data instead of text files, and is designed to scale to massive data records. “We think of a forkable, synchronisable database as an important primitive in our increasingly data-centric world,” wrote Aaron Boodman, co-founder of Attic Labs, in a blog post. “It can be used to collaborate on large-scale structured data across organisations. It can be used as the basis for any decentralised application. And it can make a very good archive for most sorts of data,” he added. Boodman previously developed the popular Firefox extension Greasemonkey and was a technical lead for Google Chrome. Apart from Boodman’s development skills, Attic Labs has some other top-level management members who worked on open source developments like Chrome OS before releasing Noms. Some of the primary features that could persuade you to leverage Noms for your next data-related operation include its easy aggregation and transformation of information, synchronisation of large datasets, and forking and merging of workflows from an existing solution. Greylock Partners, which previously backed Docker, led the funding round. Investors, including Harrison Metal, Naval Ravikant, Linus Upson and Othman Laraki, also helped in the development of the database.

Government to launch India’s own open source collaboration platform

Months after rolling out a policy to support open source software development, the Indian government is now all set to launch its own collaboration platform for hosting open source projects. The move is apparently aimed at encouraging software developers and various government bodies to start sharing code from their major projects, under one roof. The Department of Electronics and Information Technology (DeitY) released a policy related to the adoption of open source software in April 2015. Called ‘Collaborative Application Development by Opening the Source Code of Government Applications’, the policy aims to provide a comprehensive framework for archiving government source code in repositories. The framework

Linux 4.7 now out with enhanced security and advanced graphics support

A couple of months after bringing out the mobile-focused build, Finnish-American software engineer Linus Torvalds has now released Linux 4.7 as his newest Linux kernel. The new version enhances the security of Linux systems through a new module and supports some modern GPU models. “After a slight delay due to my travels, I’m back, and 4.7 is out,” Torvalds wrote in a note. “Despite it being two weeks since rc7, the final patch wasn’t all that big, and much of it is trivial one- and few-liners. There’s a couple of network drivers that got a bit more loving,” he added. Linux kernel 4.7 has a LoadPin security module that helps the platform load all the necessary modules from the same file system. It is ported directly from Google’s Chrome OS and aims to limit the medium from which kernel modules and firmware load on a system. Apart from the new security module, the latest Linux kernel has the CONFIG_TRIM_UNUSED_KSYMS option to remove exported kernel symbols that are unused on the system. This will reduce the size of the generated kernel binary and improve the overall user experience. Also, the reduction of exported kernel symbols would be a helpful move from the security perspective. This kernel supports the newly launched Radeon RX 480 GPUs through the AMDGPU driver. Besides, the video driver has been upgraded with some performance improvements. Continuing the trend of supporting more mobile devices in addition to desktops and notebooks, Linux 4.7 supports some new ARM platforms. There is native support for devices like the Google Pixel C and Amazon Kindle Fire. Torvalds has additionally provided a PMC driver for various Intel Core chipsets and support to generate virtual USB device controllers in USB/IP. You can download Linux 4.7 on your system directly from the official kernel website.



is primarily designed for open software repositories to enable reuse, sharing and remixing of new and existing code. “While the policy is in place, it needs to be supported by appropriate technology infrastructure to create and grow a thriving open source community around Indian e-governance,” a source told Open Source For You, with regard to the launch of the open source platform.

Similar to some of the popular repositories such as GitHub and SourceForge, the new offering will enable not just some government bodies but also a large number of software developers and corporates to develop and publish their code at a single place. The team behind the ongoing development is planning to divide the repository into two different segments— while one of its parts will be exclusive to government departments, another one will be open to the public. Taking a cue from the policy designed by the Indian government, the US government also recently announced its Federal Source Code policy. But the US agencies will not get a distinct location like their counterparts in India to archive source code. This will make the Indian platform unique and a role model for some developed regions including the US, where the government is already embracing open source technologies. Moreover, the new initiative by the Narendra Modi-led government would improve the efficiency of existing code that different departments are already using for various public programmes. Software developers and academic institutions could also help contribute to application testing and platform enhancements to deliver effective solutions.

US launches a policy to promote open source across federal agencies

In a move to enhance digitisation among its entities, the US government has launched its first-ever open source policy. The new policy mandates that all the federal agencies in the country have to release a portion of their custom code to the public. Called the Federal Source Code policy, the initiative is aimed at helping government agencies reduce their expenses on various software development projects and improve access to

their custom application code. This emerged as a result of the plan that was revealed by the White House earlier this year. “We’re releasing the Federal Source Code policy to support improved access to custom-developed Federal source code. The policy, which incorporates feedback received during the public comment period, requires new custom-developed source code developed specifically by or for the Federal Government to be made available for sharing and reuse across all Federal agencies,” US Chief Information Officer, Tony Scott, wrote in a blog post. In addition to having a pool of source code to enable its sharing and reusability by government departments, the policy comes with a pilot programme that makes it necessary for the agencies to release at least 20 per cent of the new custom-developed code to the public. The US government is not specifying which portion of the code is to be made available in the public domain. However, federal agencies are recommending “transparency, participation and collaboration” in their projects. “Agencies should calculate the percentage of source code released using a consistent measure — such as real or estimated lines of code, number of self-contained modules, or

cost — that meet the intended objectives of this requirement,” the policy reads, describing the pilot programme. This is not the first time that the Barack Obama-led government has opted for the open source way. The code of flagship public offerings such as We The People, Vets.gov and Data.gov is already available on GitHub. “By opening more of our code to the brightest minds inside and outside of government, we can enable them to work together to ensure that the code is reliable and effective in furthering our national objectives,” Scott added. The US government is building Code.gov to support its policy. The new website will operate as an official place for all the open source code released by federal agencies in the future.
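The pilot programme's 20 per cent requirement comes down to simple arithmetic once an agency has picked a measure. The following is a minimal sketch that assumes lines of code as that measure; the function names and the line counts are hypothetical and are not taken from the policy.

```python
def released_share(released_lines: int, total_lines: int) -> float:
    """Return the percentage of new custom-developed code that was released."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * released_lines / total_lines


def meets_pilot_target(released_lines: int, total_lines: int,
                       target_percent: float = 20.0) -> bool:
    """Check the share against the 20 per cent pilot target described above."""
    return released_share(released_lines, total_lines) >= target_percent


# Hypothetical agency: 50,000 new lines written, 12,000 of them published.
share = released_share(12_000, 50_000)
print(f"Released share: {share:.1f} per cent")
print("Meets pilot target:", meets_pilot_target(12_000, 50_000))
```

The same calculation works for the other measures the policy names (modules or cost), as long as the numerator and denominator use the same unit.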


For more news, visit www.opensourceforu.com


CodeSport

Sandya Mannarswamy

In this month’s column, we discuss a few computer science interview questions.

This month, let’s discuss a set of computer science interview questions, focusing on natural language processing, machine learning as well as on traditional programming problems. Please note that it is difficult to discuss the complete solution to the questions due to the large number of questions we cover in this column. Instead, I encourage you to send me your solutions to the questions posed, directly, so that I can provide feedback on the same.

1. You are given the problem of disambiguating entities in a given set of documents. For example, for the word ‘Apple’, you will need to disambiguate between the use of the word ‘Apple’ as a corporate entity and its use as a word that describes a fruit. A more complicated example would be the following:

S1: The White House announced its intention to support the newly formed government in Vanaru.
S2: The New Year Party will be held at the White House, and will be hosted by Michelle Obama.

In sentence S1, the White House is an entity that actually represents the American government. In sentence S2, the White House is a location. Can you come up with an algorithm which can disambiguate such mentions of this entity to their appropriate types, based on the context?

2. In information retrieval, you are given a set of documents and a query. You are asked to retrieve the set of documents that is most relevant to the query. Many of the common information retrieval techniques are built around the measure of TF-IDF, where TF stands for the term frequency, i.e., the number of times a query keyword appears in a document. For example, given the query ‘Tendulkar the Indian Cricketer’, I am asked to find all documents relevant to this query. It is obvious that one can compute the term frequency for the query terms, and return a ranked list of documents in which the query terms appear most frequently. If this suffices to return the most relevant documents, why do we need the measure of the inverse document frequency (IDF)? Can you illustrate with an example why IDF is needed in information retrieval?

3. Parsing is one of the major components of a natural language processing pipeline. Depending on the nature of the task, different types of parsing of the input text may be required. Can you explain the differentiation between shallow parsing, dependency parsing and constituency parsing? Given the sentence, “John went to New York yesterday by the midnight flight,” can you provide the output for each of these parses? On a related note, when would you use POS tagging instead of any parsing?

4. You have been asked to build an application that analyses the conversations in call centres. These conversations typically happen between the agent and the customer. An automatic speech recognition system has been used to convert the voice conversations into text. The first stage of your application needs to analyse the voice transcripts, and correct any mistakes in the words that would have been caused by the automatic speech recognition system. Can you explain how you would build the transcript correction system?

5. An easier and related problem to the question


above is analysing chat text in call centres. Chat communication happens between the agent and the customer in call centres, and you are given the chat text. Since this is the direct typed text from the agent and customers, there can be frequent spelling mistakes, non-standard annotations/greetings, etc. The first stage of your application needs to analyse the chat text, find the misspelt words and replace them with the corrected version. Your first thought is to use a standard dictionary to identify words that are not in the dictionary, and then find the words that are the best replacement. Can you come up with an algorithm which finds their best replacement? How would you handle misspelt words that are domain specific?

6. You are given a corpus of documents, and you are asked to find the important phrases in the corpus. The important phrases can be either unigrams, bigrams or trigrams. Can you come up with a simple algorithm that can do this? If you want to compare the performance of your algorithm with a standard one, you can use gensim.models.phrases. For more details, you can refer to https://radimrehurek.com/gensim/models/phrases.html and check how your algorithm is performing compared to that.

7. Naïve Bayes and logistic regression are two popular algorithms used in machine learning. When will you prefer to use Naïve Bayes instead of logistic regression, and vice versa?

8. Topic models are very popular in natural language processing to identify the topics associated with a set of documents, and the identified topics can be used to classify/cluster the documents. Can you explain the basic idea behind the topic model? Can you differentiate between topic modelling and latent semantic indexing?

9. Continuing with topic models, general topic model approaches output a set of unnamed topics, with a set of keywords associated with each of these unnamed topics. Then a domain expert needs to manually annotate these topics with their topic names. Is it possible to avoid this manual labelling of the topics generated by the topic modelling techniques?

10. One of the major issues in supervised learning is overfitting of the model to training data. What causes overfitting? How can you avoid it?

11. Distributional semantics is a popular concept in natural language processing and, of late, word embeddings that are based on the concept of distributional semantics are widely used in NLP. Word2vec and GloVe are two popular techniques for generating word vectors or word embeddings. Can you explain either of these two techniques?

12. You are given a corpus and you have created the word embeddings for your corpus using the popular Word2Vec


technique. Now you need to represent each sentence in your corpus by means of a vector. You need to compose the word vectors of the words present in your sentence to generate a sentence vector representation. What are the possible compositional models you can apply to generate the sentence vector?

13. You are given the problem of identifying different types of relations in a text. You have come up with two different classifiers that are performing equally well, and you are unable to decide, based on your currently available data, which classifier to select. How would you decide which of the two classifiers to use?

14. What is ensemble learning? When would you use an ensemble of classifiers as opposed to a single classifier? Among the different ensemble learning methods, can you explain bagging and boosting?

15. You are asked to build a system that can do ‘parts of speech tagging’ for a given sentence. While parts of speech (POS) tagging assigns independent labels for each of the words in the sentence, it uses the sequence of words it has seen in the context to decide on the current label. For example, consider the sentence: “Robert saw the saw on the table and took it outside to the shed.” The first occurrence of ‘saw’ in the above sentence is a verb, whereas the second occurrence of the word ‘saw’ is a noun. By looking at the sequence of words occurring around the specific word, a POS tagging system can distinguish between the tags it should assign for each of the two occurrences of the word ‘saw’. Hence, POS tagging is an example of sequence labelling. Two of the important algorithms for sequence labelling are Hidden Markov Models and Conditional Random Fields. Can you explain each of them?

16. Consider the two sentences:
S1: The bank will remain closed on Saturday.
S2: He sat on the bank and watched the sunset.

The word ‘bank’ has two meanings: one as a commercial institution and another as the sand bank by the side of the river. Depending on the frame of reference, the appropriate context has to be assigned to the word ‘bank’. This is the problem of word sense disambiguation. Can you come up with an algorithm for word sense disambiguation using the supervised technique (provided that you are given sufficient labelled data)? How would your system handle words whose multiple senses are not present in the training data? What are the trade-offs in using a supervised approach vs an unsupervised approach towards word sense disambiguation?
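The TF versus TF-IDF question posed earlier in this list is easy to explore with a small experiment. The following is a minimal sketch and not a model answer: the three toy documents are invented for illustration, the tokeniser is a plain whitespace split, and IDF is taken as log(N / document frequency), which is only one of several common variants.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only.
docs = {
    "d1": "the the the the weather in the city",
    "d2": "tendulkar the indian cricketer scored a century",
    "d3": "the indian team won the match",
}
query = "tendulkar the indian cricketer"


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokeniser)."""
    return text.lower().split()


def tf_scores(query, docs):
    """Score each document by the raw count of query terms it contains."""
    scores = {}
    for name, text in docs.items():
        counts = Counter(tokenize(text))
        scores[name] = sum(counts[term] for term in tokenize(query))
    return scores


def tfidf_scores(query, docs):
    """Score each document by term frequency weighted by log(N / df)."""
    n_docs = len(docs)
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))  # count each term once per document
    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        scores[name] = sum(
            counts[term] * math.log(n_docs / doc_freq[term])
            for term in tokenize(query)
            if doc_freq[term] > 0
        )
    return scores


tf = tf_scores(query, docs)
tfidf = tfidf_scores(query, docs)
print("Top document by TF:    ", max(tf, key=tf.get))
print("Top document by TF-IDF:", max(tfidf, key=tfidf.get))
```

With raw term frequency, the document that merely repeats ‘the’ comes out on top. Because ‘the’ occurs in every document, its IDF weight is zero, so the TF-IDF ranking is driven by the rarer terms instead.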

Continued to page 25...


For U & Me CaseStudy

Government Leverages Open Source to Build DigiLocker for Indian Citizens

DigiLocker is an initiative by the government to offer Indian citizens a free platform to store and access important documents. The platform uses several open source technologies to deliver a mass solution and contributes back to the ever-growing community.

As the world is moving towards the concept of digital government and e-governance, Prime Minister Narendra Modi has sketched out his vision for the Digital India model. This envisages transforming India into a digitally empowered society. DigiLocker is perhaps the government’s largest project based on open source technologies. It has already attracted over two million Indians who have uploaded their documents on the dedicated cloud storage space that is available for free. “DigiLocker is targeted at paperless governance. It is a platform to issue and verify certificates and documents digitally, and thus eliminate the use of physical documents,” says Debabrata Nayak, additional director, National e-Governance Division (NeGD).

A small team behind a big project

Nayak is leading an in-house development team of just around 14 people who have developed DigiLocker. The team includes some government personnel as well as individuals from private firms. “Our technical team is a mix of people with a background in PHP, Java, .NET and Python, and we prefer those with previous experience in open source technologies. We have individuals from various domains such as the government, banking, insurance, services and products,” Nayak says.

The team collectively contributed to a robust solution to ultimately enable paperless governance. Unlike a traditional cloud storage solution like Dropbox or Google Drive, DigiLocker comes in two separate parts. One part is designed to store links to documents that are issued to you by the government agencies that have signed up with DigiLocker, while the other can be used to upload any legacy or old documents that you wish to. There is also 1GB of space to store documents on the cloud. Apart from various multi-skilled engineers, the team at NeGD has Amit Ranjan, who co-founded SlideShare, which was recently acquired by Microsoft-owned LinkedIn. Ranjan brings the spirit of the startup to DigiLocker. Nayak feels that finding the right talent for open source development is quite easy nowadays but searching for open source contributors is still difficult. “The culture of actively contributing to open source projects has yet to become mainstream in India among the IT community,” he explains.

Multiple components under one roof

DigiLocker sports three major components – a repository, an access gateway and cloud-based dedicated personal storage. The repository is used to archive e-documents, whereas the gateway provides a secure mechanism for users to access documents from various online repositories, and the cloud



storage gives space to store documents on the Web. Building a solution for the masses is not an easy task for any team. And the people at NeGD, too, faced many challenges while designing DigiLocker. “Although the open source projects provide you ready solutions, you have to own them completely to be able to scale them for millions of users,” says Ranjan, while describing the prime challenge in deploying open source solutions for DigiLocker. To solve the problem of scale, the NeGD team divided the entire project into certain phases. “The phases provide concrete goals regarding the number of users and files. Additionally, the infrastructure is constantly monitored, adjusted and fine-tuned with every major release. Scaling an application is an ongoing process, and needs constant changes in architecture and the solutions used. At the same time, we realise that every primary issuer on DigiLocker brings unique challenges, and we learn new lessons now and then,” Ranjan states.

Security measures to deliver a safe and secure platform

DigiLocker comes with Aadhaar integration to offer citizens a secure solution to store e-documents online. Also, there is the eSign option to let users self-attest their documents. The NeGD team uses some other security measures to make DigiLocker a safe and secure platform for the public. “We are taking all the precautionary measures to ensure data is protected and uncompromised,” Ranjan asserts. The platform follows the OWASP (Open Web Application Security Project) security standards and guidelines. Additionally, there is a 256-bit SSL encryption layer on the server that is hosted in an ISO 27001 security-certified data centre. Data is regularly backed up with proper redundancy, and a one-time password (OTP) is generated at the time of each sign-up to authenticate users. Security audits have also been conducted by a recognised audit agency to ensure safety and security. “We follow standard software development practices of uniform coding standards, guidelines and reviews. Every product release is reviewed and tested internally for security vulnerabilities before it is deployed,” says Ranjan.
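To make the OTP step concrete, here is a minimal sketch of how a sign-up OTP can be generated and checked in Python. It is an illustration only and not DigiLocker's implementation: the six-digit length and the function names are assumptions, and a real system would also add expiry, rate limiting and delivery over SMS or email.

```python
import secrets


def generate_otp(digits: int = 6) -> str:
    """Return a random numeric OTP drawn from a cryptographically secure source."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def verify_otp(expected: str, submitted: str) -> bool:
    """Compare in constant time so the check does not leak timing information."""
    return secrets.compare_digest(expected, submitted)


otp = generate_otp()
print("OTP sent to user:", otp)
print("Correct code accepted:", verify_otp(otp, otp))
```

The `secrets` module is used here instead of `random` because OTPs are security tokens and should not come from a predictable generator.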

Open source technologies power this solution for the masses

DigiLocker is based on open source platforms including PHP, Python and Node.js. On the server front, there are Nginx and Apache, while MongoDB is used to enable the gateway access and MariaDB is deployed for user account related metadata. “Open source technology gives you the freedom to try, test and scale your solution, one step at a time. One has so much choice of open source products and frameworks to choose from nowadays,” says Amit Jain, product manager. The entire platform that brings the cloud storage solution to millions of Indian citizens is based on ownCloud Server. In fact, DigiLocker is supposedly the largest installation of ownCloud Server.

Figure: The way DigiLocker works (labels: issuer, issues documents digitally; DigiLocker; citizens; requestor, accesses documents online)

“With over two million users, DigiLocker is the largest installation of ownCloud Server worldwide. This demonstration of the capability of community software has created excitement in the ownCloud community. We are in constant touch with the ownCloud team and provide inputs for the community from time to time. We are also working on making DigiLocker source code available to the community under open source license,” says Amit Savant, technical product manager, NeGD. Instead of proprietary solutions, the team led by Nayak opted for open source technologies to easily scale DigiLocker. “It would have been a mammoth task to deploy a proprietary solution of the scale of DigiLocker, which is meant for a billion people, if it succeeds,” Nayak says. As the number of users grows, the NeGD team aims to scale up DigiLocker using more open source solutions. “Scaling up the application will be a constant endeavour with the growing number of issuers and requestors in the DigiLocker ecosystem. We are always experimenting with newer technologies that will help us make DigiLocker better,” Savant says. The platform already has frameworks like Nginx and Memcached that offer high scalability. However, the team at NeGD’s New Delhi headquarters is planning to expand its existing coverage by deploying more community-based technologies. Savant says there are plans to use only MongoDB for the gateway engine of the platform. “We have already developed an academic repository for DigiLocker, using MongoDB, and have developed the necessary skills within the team. We now think we are ready to take these skills to the next level,” he adds. In addition to the open source NoSQL database for the gateway, the team is set to implement an OpenStack-based cloud for the dynamic computing environment. Deployment of GlusterFS—apart from RabbitMQ for messaging and GearMan for backend job processing—is in the pipeline. “Open source technologies are better suited for government operations. 
These technologies can be customised and scaled in-house to suit your requirements. You are not locked in with one software vendor for the life of a project,” Nayak concludes.

By: Jagmeet Singh
The author is an assistant editor at EFY.



For U & Me Interview

“We are transforming the classroom culture in many ways”

Today, MOOCs or massive open online courses are transforming the education environment. One of the pioneers in the field, the MIT and Harvard University-backed edX, has been very successful with its Open edX that offers a spectrum of courseware to all. Jagmeet Singh of OSFY talked to Anant Agarwal, CEO of edX and MIT professor, for the details about the online platform. Edited excerpts:

Anant Agarwal, CEO, edX



Q What does edX have for the Indian market?

Ours is a non-profit organisation and a learning website, where anybody can come and learn for free. We have around 900,000 students from India, and in total more than eight million students from all around the world. The share of the Indian market is about 11 per cent, and India is second only to the US, which accounts for 30 per cent of our students. Therefore, India is a gigantic market for us. The reason India is so exciting for us is that the courses we have are quite useful and in demand worldwide. For instance, as computer science is very popular in India, you can take a fantastic course based on computer science simply using our online platform. You can choose introductory programming from Harvard or MIT, right from our website. Likewise, you can take some advanced courses related to frameworks like Ruby on Rails from Berkeley, or basic courses to learn about computers such as an introduction to computer science from IIT, Bombay.

Apart from computer science, we have courses for data science, which is another hot area. Whether it is data analytics, business analytics or the analysis of data, we have many job-oriented data science programmes on edX. These are from Columbia, Microsoft, and so on. We also have business and language courses (such as learning to write well and business English), which are very popular on our platform. We have also launched a course on TOEFL preparation. This new programme comes directly from ETS, which is the organisation that offers the TOEFL exam, and lets students prepare for TOEFL for free. Many students are taking these courses and getting jobs all across the globe. It is hard to find these courses locally in India but, with a platform like edX, anyone with a computer and an Internet connection can join them and nurture their career.

Q What are the main challenges in maintaining an online course platform?

We are non-profit, and we make career-oriented courses available for free. Offering a free course is quite challenging, because it costs a fair amount of money to maintain and run a course. We have to pay the hosting fee to a hosting provider like Amazon. Also, each course has a faculty member and teaching assistant who answers questions on the Q&A forum. Therefore, our biggest challenge is to generate sufficient revenue to sustain and make the courses available to the public.

Q Does using open source solutions help edX solve the challenge of generating revenue?

We don’t take money from venture capitalists because if you do that, you need to deliver a return on the investment in a short span of time. It is hard to get a massive return on investment when you are giving things away for free. At edX, we not only offer courses for free, but have also made the online platform available for free. This includes the platform software as well as some analytic tools and mobile apps that are all available for free to the world. To maintain the integrity of the platform and offer free services, we do several things. We use the open source community to improve many of our existing features. A large number of people from the open source community are providing support, contributing back to the software base and helping us quickly improve edX. So that is a huge win for us. Apart from utilising the power of the open source community, we launched a new programme last year to translate courses into credit. Called MicroMasters, the initiative allows you to get an online credential from Massachusetts Institute of Technology (MIT). If you are accepted to the on-campus master’s programme, the credential converts directly into one semester’s worth of credit. Now that MOOCs translate into credit from top universities like MIT, we have to build features into the platform to maintain the integrity and prevent instances of cheating. So we’ve integrated many mechanisms to restrict cheating practices. Numerous open source ecosystem partners are also building some advanced systems that are helping us solve some big operational challenges.

Q Why was there a need to launch Open edX, when you already had edX in the space of MOOCs?

EdX uses platform software, and we could have just left edX as it was for the public. However, many people suggested that rather than developing software from scratch, we could share our software. So we introduced Open edX. This lets anybody take our code and launch their own website for education.



Jordan, Saudi Arabia and Israel have already deployed Open edX and launched their educational courses online. In India, premier institutions like IIT, Bombay, have used Open edX and designed a site particularly for a teacher training programme. Many government agencies in India and across the globe are also thinking of deploying Open edX.

We would love to see the government deploy Open edX and launch a national platform to spread the next generation of education in India. We think that by making edX open source, we can maximise the good that we are doing in the world instead of hiding the software and keeping it just for ourselves. If we were a for-profit company, then not giving away the software would be a good thing to do. But this is the best way for a non-profit organisation to reach a massive audience—by offering an open source solution.

Q What are all the open source technologies behind Open edX?

Open edX is based on some foundational systems such as Apache, Python and Django, among others. Also, there are tools for e-commerce and for assessments. We are also offering APIs to let third-party developers integrate our sources into their projects. There is a huge business ecosystem around edX, but anything that we develop, we make open source. The original technology behind edX was based on Python and Django. We also used Nginx as the Web server and deployed on Amazon Web Services. Similarly, the present platform uses open source technologies like JavaScript and HTML5 on the design front, and Ruby on Rails for the discussion forum. All in all, we can say edX is powered by Open edX, which is available for free to the world.

Q How easy is it to develop a solution like edX?

We started edX in 2012. It is very challenging to build a non-profit organisation and hard to raise money if you are offering anything free to the public. Building a non-profit model is not easy because there is no standard model for investment. But we were grateful to MIT and Harvard. We could not launch as a non-profit without support from both these institutions. Today, we have the support of more than one hundred institutional partners, including some of the top universities in the world.

If you build software that is only used by you, then you can focus just on that software. But as we made it open source at edX, we have a team that is focused on Open edX and is maintaining the Open edX community to generate traction from various engineering resources. This certainly complements our own efforts at edX, because sometimes people want some support and help for particular features that are also useful for edX.

Q Do you think MOOC providers like edX will transform the traditional classroom culture in the coming future?

We are transforming the classroom culture in many ways. I call this process of transformation of traditional classrooms and universities ‘unbundling’. If you look at a traditional university today, it is one bundle. You need to start in the first year, go through various sessions and then finish after some time, say three or four years. We are partnering with various worldwide institutions and creating more hybrid solutions that are unbundled. So we are trying to unbundle the three Cs: the clock (or the time), curriculum and credentials. Through MicroMasters, we are unbundling the clock and offering students education in a shorter span of time than any traditional university. We are also unbundling the credentials using solutions like MicroMasters, and enabling students to complete a master’s programme by spending half the time and effort as well as at two or three per cent of the total normal cost. Besides, we are unbundling the curriculum by offering online content to students.

Most universities are now moving in the direction of blended education, which combines both online and in-person teaching. Presently, nine out of ten undergraduate students are using edX as part of their campus curriculum at MIT. In the next 20 to 30 years, all education is likely to become blended.

Q What are your views on the education system in India? Also, how can we improve the present system using advancements like edX?

The education system is not perfect anywhere in the world. There are many challenges in the education system worldwide; India has its own challenges. Indian professors and teachers are quite dedicated. Likewise, institutions like the IITs and IIMs are in line with some leading universities overseas. However, there are challenges in the style of teaching. In India, students need to memorise their course material. We need to find ways of changing how to teach in order to help students understand their course instead of just memorising the material. Further, once you go beyond some of the top universities in India, the quality of education you get is very questionable. Most Tier-II and Tier-III universities do not have enough faculty to teach different subjects. Universities in India should allow students to take some online courses, especially in the areas they do not have their own faculty. Local education institutions should also let their students take a full course through an online platform like edX, but appear for examinations under a local teaching assistant. We are already seeing some success in using edX to improve the Indian education system. Teachers and professors can take courses about using blended learning on edX to learn how they can integrate these learning methods in their teaching practices. This will uplift the education system.

The Tatas have already announced that they are set to launch a platform using Open edX for high-school students. The same platform can be used for skills, college education and teacher training.

Q How do Indian developers benefit from a platform like edX?

Many companies all over the world are building on Open edX and providing services to corporations and governments, for a certain fee. Entrepreneurs and software developers in India can utilise the same platform and start offering some new learning models. There is a vast ecosystem and many companies worldwide are utilising our software to garner profits.

Q Last of all, how can edX help the jobless skilled workers in India?

I would encourage all the jobless skilled workers to go and take courses on edX. We have courses to nurture many in-demand skills. These are designed by institutions like IIT Bombay, MIT or Berkeley. In fact, just this summer, we’ve launched some of the hottest job-related courses to reduce unemployment. You don’t need to spend hours sitting in a classroom to learn something new and interesting. Instead, all you need is an active Internet connection to enhance your skills using edX.

Q Countries like Jordan, China and Israel are already using Open edX to launch their national education platform. What about your plans for India?

We are in talks with certain sections of the Indian government and would be delighted to work for millions of Indians. We would love to see the government deploy Open edX and launch a national platform to spread the next generation of education in India.

Continued from page 18...

17. Distributional semantics is based on the idea that semantic similarities can be characterised based on the distributional properties of the words in a large corpus. This is popularly expressed by the statement, “A word is known by the company it keeps.” If two words have similar distributions or similar contexts, then they have similar meanings. Can you give a couple of examples of the applications of distributional semantics in natural language processing?

18. Machine translation is the automatic translation of text from a source language to a target language. Machine translation can be driven either by rule-based techniques or by statistical techniques. Can you explain the advantages and disadvantages of both rule-based machine translation and statistical machine translation?

19. What is transfer learning? Can you illustrate it with an example?

20. Many of the standard NLP systems have been designed for reasonable document sizes, be it automatic key phrase recognition, topic modelling or summarisation. Consider that you have been asked to design an NLP pipeline to analyse tweet texts. Tweets are typically very short texts and are characterised by the extensive use of non-standard abbreviations. What are the major NLP challenges in analysing short text?

If you have any favourite programming questions/software topics that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming!

By: Sandya Mannarswamy The author is an expert in systems software and is currently working as a research scientist at Xerox India Research Centre. Her interests include compilers, programming languages, file systems and natural language processing. If you are preparing for systems software interviews, you may find it useful to visit Sandya’s LinkedIn group, Computer Science Interview Training India, at http://www.linkedin.com/groups?home=&gid=2339182



Admin

Insight

Volatility: The Open Source Framework for Memory Forensics

Computer attacks are a constant concern for admins and users of computers. Some attacks are stealthy enough not to leave any traces on the hard disk of the computer. To detect such attacks, we need to make a forensic analysis of the memory dump of the computer. This analysis is termed memory forensics. Volatility is an open source framework that can help us with memory forensics.

According to Wikipedia, “Memory analysis is the science of using a memory image to get information about running programs, the operating system, and the overall state of a computer.” Volatile memory contains valuable information about the runtime state of the system (the network, file system and registry). In this article, we are going to investigate the digital artifacts of volatile memory using Volatility.

Introducing Volatility

Volatility is an open source framework used for memory forensics and digital investigations. The framework inspects and extracts the memory artifacts of both 32-bit and 64-bit systems. The framework has support for all flavours of Linux, Windows, MacOS and Android. It can analyse raw memory dumps, crash dumps, virtual machine snapshots, VMware dumps (.vmem), Microsoft crash dumps, hibernation files, VirtualBox dumps, and many others. The framework is intended to investigate the system’s state independently and consists of over 35 plugins for analysis. Volatility comes pre-installed in several Linux distributions such as REMnux and Kali Linux. It is freely available on GitHub (https://github.com/volatilityfoundation/volatility).

Memory format support

Volatility supports a variety of sample file formats and has the ability to convert between these formats. These are:
• Raw/padded physical memory
• Firewire (IEEE 1394)
• Expert Witness (EWF)
• 32-bit and 64-bit Windows Crash Dump
• 32-bit and 64-bit Windows Hibernation File
• 32-bit and 64-bit Mach-O files
• VirtualBox ELF64 Core Dumps
• VMware Saved State (.vmss) and Snapshot (.vmsn)
• HPAK format (FastDump/FDpro)
• QEMU memory dumps
• LiME (Linux Memory Extractor) format
• Mach-O file format

Operating system support
• Support for all 32-bit and 64-bit Windows systems
• Support for 32-bit and 64-bit Linux kernels ≤ 4.2


Installation

Python 2.7 is a prerequisite for installing Volatility. To install on a Linux system, you can download and extract the archive from https://github.com/volatilityfoundation/volatility. Then run the following command:

sudo python setup.py install

Or, run the following command:

apt-get install volatility

Installation of Volatility in Windows is manual and needs additional packages. These are listed below.
• Distorm 3: A powerful disassembler library for x86/AMD64
• Yara: A malware identification and classification tool
• Pycrypto: The Python cryptography toolkit
• Pillow: The Python imaging library
• Openpyxl: The Python library to read/write Excel
• Ujson: An ultra-fast JSON parsing library
• Pytz: This is for time zone conversion

Installation using an executable

A standalone executable can be downloaded from http://www.volatilityfoundation.org/#!24/c12wa (available in both the standalone version and the Python Win32 module version).

Memory inspection

In order to analyse the memory dumps, the profile should be defined prior to execution. You can start trying the software using the memory samples available for testing purposes at https://github.com/volatilityfoundation/volatility/wiki/Memory-Samples. The syntax is:

python vol.py -h

The above command shows you the available options in the Volatility framework.

Figure 1: Screenshot of the Volatility framework

In general, everything in the OS traverses RAM. Tracing the RAM fingerprints can be helpful to detect some advanced malware. In this article, we explore some plugins, which are available in the Volatility framework. Let me take a memory sample of the malware ‘stuxnet’ downloaded from the above link. Unzip the Stuxnet.zip file and extract Stuxnet.vmem, which is a virtual memory file. Given below are a few fingerprints that have been extracted from the sample. Volatility helps us to identify the OS’ profile information, which gives the meta information about the memory file. Volatility can inspect the live memory image of any operating system. The framework can give the status of an active process, a hidden process, unlinked processes, DLLs loaded at runtime, socket information, external connection information, etc. Some of the plugins were tested, and the results are given below.

1. Imageinfo: This identifies the profile of the image. The syntax is:

python vol.py imageinfo -f Stuxnet.vmem

Figure 2: Screenshot of the Imageinfo plugin which gives the suggested profiles

2. Pslist: This lists the running processes. The syntax is:

python vol.py pslist -f Stuxnet.vmem

Figure 3: Screenshot of the pslist plugin which lists the running processes

3. Psscan: This plugin scans for hidden/inactive/unlinked processes and is used for malware analysis. The syntax is:

python vol.py psscan -f Stuxnet.vmem

Figure 4: Screenshot of the psscan plugin which scans for hidden processes

4. Pstree: This plugin displays the running processes in tree form. The syntax is:

python vol.py pstree -f Stuxnet.vmem

Figure 5: Screenshot of the pstree plugin

5. DLL list: This displays the DLLs used by all the processes. Here, Figure 6 shows the process with ID 660 and its associated DLLs. The syntax is:

python vol.py dlllist -f Stuxnet.vmem

Figure 6: Screenshot of the dlllist plugin which lists the DLLs loaded by a process

6. Sockets: This plugin is used to find the sockets that were listening at the time of the memory dump. The syntax is:

python vol.py sockets -f Stuxnet.vmem

Figure 7: Screenshot of the sockets plugin which shows the active TCP connections

7. Timeliner: This creates a timeline from various artefacts in memory. The syntax is:

python vol.py timeliner -f Stuxnet.vmem

8. Wintree: This prints the Z-Order Desktop Windows tree. The syntax is:

python vol.py wintree -f Stuxnet.vmem

Figure 8: Screenshot of the wintree plugin

Present day malware is stealthier and remains hidden during dynamic behaviour analysis. In order to detect such malware and its behaviour, run time memory inspection can be carried out. Malware, including rootkits, traverses the RAM; hence the Volatility framework helps us to inspect the live memory of any operating system. This can help us to possibly detect some advanced malware, which is very persistent in its behaviour.
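The plugin runs in this article all share one command shape, so they can be generated in a loop instead of being typed one by one. The following is only a minimal sketch: it assumes vol.py is in the current directory and that Stuxnet.vmem is the extracted sample, and the helper name build_commands is hypothetical, not part of Volatility.

```python
from typing import List

# Plugin names taken from the walk-through in this article.
PLUGINS = ["imageinfo", "pslist", "psscan", "pstree",
           "dlllist", "sockets", "timeliner", "wintree"]


def build_commands(image: str, plugins: List[str],
                   vol_script: str = "vol.py") -> List[List[str]]:
    """Return one argv list per plugin: python vol.py <plugin> -f <image>."""
    return [["python", vol_script, plugin, "-f", image] for plugin in plugins]


if __name__ == "__main__":
    for argv in build_commands("Stuxnet.vmem", PLUGINS):
        # Printed rather than executed, so the sketch runs without Volatility.
        # To run a command for real, pass argv to subprocess.run(argv, check=True).
        print(" ".join(argv))
```

Keeping each command as a list of arguments, rather than one string, means a file name with spaces would not need extra quoting if the commands were later executed.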

References
[1] http://www.volatilityfoundation.org/#!24/c12wa
[2] https://github.com/sibichakkaravarthy/Malware-Analysis
[3] https://github.com/volatilityfoundation/volatility/wiki/Memory-Samples
[4] https://github.com/volatilityfoundation/volatility
[5] http://resources.infosecinstitute.com/memory-forensics-and-analysis-using-volatility/

By: Sibi Chakkaravarthy Sethuraman The author holds an M. Tech degree in computer science and engineering, and is working on a Ph.D at Anna University. Currently, he is with the department of electronics engineering, MIT, Chennai. His research interests include network security, wireless security, wireless sensor networks, cloud security, etc. He can be reached at sb(DOT)sibi(AT)gmail(DOT)com.



How To Admin

Network Performance Monitoring and Tuning in Linux

There’s nothing more annoying than a slow network connection. If you are a victim of poorly performing network devices, this article may help you to at least alleviate, if not solve, your problem.

The improvement in the performance of networking devices like switches, routers and connectors such as Ethernet cables has been huge in recent years. But our microprocessors lag behind when it comes to processing the data received and transferring it through them. Your organisation can spend money to buy high quality Ethernet cables or other networking devices to send terabytes or petabytes of data but may not experience performance improvements because, ultimately, the data has to be dealt with by the operating system kernel. That is why, to get the most out of your fast Ethernet cable, as well as high bandwidth support and a high data transmission rate, it is important to tune your operating system. This article covers network monitoring and tuning methods (mostly parameters and settings available in the OS and the network interface card, or NIC) that can be tweaked to improve the performance of your overall network.

How your operating system deals with data

Data destined for a particular system is first received by the NIC and is stored in the reception (RX) ring buffer present in the NIC, which also has a TX ring buffer (for data transmission). Once the packet is accessible to the kernel, the device driver raises a softirq (software interrupt), which makes the DMA (direct memory access) of the system send that packet to the Linux kernel. The packet data in the Linux kernel gets stored in the sk_buff data structure, which holds a packet of up to the MTU (maximum transmission unit) in size. When all the packets are filled in the kernel buffer, they get sent to the upper processing layer: IP, TCP or UDP. The data then gets copied to the process that is to receive it.

Note: Initially, a hard interrupt is raised by the device driver to send data to the kernel, but since this is an expensive task, it is replaced by a software interrupt. This is handled by NAPI (new API), which makes the processing of incoming packets more efficient by putting the device driver in polling mode.

Tools to monitor and diagnose your system’s network

ip: This tool shows/manipulates routing, devices, policy routing and tunnels (as mentioned in the man pages). It is used to set up and control the network.
ip link or ip addr: This gives detailed information of all the interfaces. Other options and usage can be seen in man ip.
netstat: This utility shows all the network connections happening in your system. It has lots of statistics to show, and can be useful to find bottlenecks or problems.
ethtool: To tune your network, ethtool can be your best friend. It is used to display and change the settings of your network interface card.
ethtool eth0: This shows the hardware setting of the interface eth0.
ss: This is another helpful tool to see network statistics and works the same as netstat.
ss -l: This shows all the network ports currently opened.

Figure 1: Data receiving process (blocks shown: NIC and device driver, kernel protocol stack, data receiving process)
Figure 2: ip addr output for loopback and eth0 interface
Figure 3: netstat -ta shows all the TCP connections happening in the system
Figure 4: The entire Net adapter details shown by ethtool eth0

Benchmark before tuning

To get maximum or even improved network performance, our goal is to increase the throughput (data transfer rate) and reduce the latency of our network’s receiving and sending capabilities. But tuning without measuring the effect is useless as well as dangerous. So, it is always advisable to benchmark every change you make, because any change that doesn’t result in an improvement is pointless and may degrade performance. There are a few benchmarking tools available for networks, of which the following two can be the best choices.

Figure 5: ss output

Netperf: This is a perfect tool to measure the different aspects of a network’s performance. Its primary focus is on data transfer using either TCP or UDP. It requires a server and a client to run tests. The server should be run by a daemon called netserver, through which the client (testing system) gets connected. By running both server and client on localhost, I got the output shown in Figure 6. Remember that netperf runs on port 12865 by default and shows results for an IPv4 connection. Netperf supports a lot of different types of tests but for simplicity, you can use the default scripts located at /usr/share/doc/netperf/examples. It is recommended that you read the netperf manual for clarity on how to use it.

Iperf: This tool performs network throughput tests and has to be your main tool for testing. It also needs a server and client, just like netperf. To run iperf as the server, use $iperf -s -p port, where ‘port’ can be any free port. On the client side, to connect to the server, use $iperf -c server_ip -p server_port. The result (Figure 8) was obtained with my localhost as both server and client, so the output is not representative of a real network.
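The advice to benchmark every change can be made mechanical: take a few throughput readings before and after a tweak, and keep the tweak only if the gain is clearly above run-to-run noise. The sketch below is only an illustration of that idea. The readings are made-up numbers, the 2 per cent threshold is an arbitrary assumption, and the function names are hypothetical; in practice the readings would come from repeated netperf or iperf runs.

```python
from statistics import mean
from typing import Sequence


def relative_change(baseline: Sequence[float], tuned: Sequence[float]) -> float:
    """Fractional change in mean throughput (0.10 means a 10% gain)."""
    base = mean(baseline)
    return (mean(tuned) - base) / base


def keep_change(baseline: Sequence[float], tuned: Sequence[float],
                min_gain: float = 0.02) -> bool:
    """Keep a tweak only if the mean gain exceeds the noise threshold."""
    return relative_change(baseline, tuned) > min_gain


if __name__ == "__main__":
    before = [940.0, 935.0, 945.0]  # made-up Mbit/s readings before a tweak
    after = [985.0, 990.0, 980.0]   # made-up readings after the tweak
    print(f"change: {relative_change(before, after):+.1%}")
    print("keep" if keep_change(before, after) else "revert")
```

Comparing means over several runs, instead of single readings, reduces the chance of keeping a setting because of one lucky measurement.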

Start the tuning now

Before proceeding to tune your system, you must know that every system differs, whether it is in terms of the CPU, memory, architecture or other hardware configuration. So a tunable parameter may or may not enhance the performance of your system as there are no generic tweaks for all systems. Another important point to remember is that you must develop a good understanding of the entire system. The only part of a system that should ever be tuned is the bottleneck. Improving the performance of any component other than the bottleneck will result in no visible performance gains to the end user. But the focus of this article is on the general, potential bottlenecks you might come across; even on healthy systems, these tweaks may give improved performance.

Figure 6: netperf running as a server and client on the same system
Figure 7: iperf running as server
Figure 8: iperf running as a client

One obvious situation in which to tune your network is when you start to drop received packets or see transmit/receive errors. This can happen when your packet holding structures in the NIC, device driver, kernel or network layer are not able to handle all the received packets or are not able to process these fast enough. The bottleneck in this case can be related to any of the following:
1) NIC
2) Soft interrupt issued by a device driver
3) Kernel buffer
4) The network layer (IP, TCP or UDP)

It’s better to find the exact bottleneck from among these options and then tune only that area. Or you can apply the methods given below, one by one, and then find which changes improve performance. To find out the packet drop/error details, you can use the ip utility. More specific results can be viewed by using ethtool, as follows:

$ ethtool -S eth0

Figure 9: ip result for a healthy system with no packet drops

Tuning the network adapter (NIC)

Jumbo frames: A single frame can carry a maximum of 1500 bytes by default as shown by the ip addr command. Here, for eth0, which is the default network interface for the Ethernet, the MTU (maximum transmission unit) value is set to 1500. This value is defined for a single slot. Ethernet connections of 10Gbps or more may need to increase the value to 9000. These large frames are called Jumbo frames. To get Ethernet to use Jumbo frames, you can use ifconfig, as follows:

$ ifconfig eth0 mtu 9000

Note: Setting the value to 9000 bytes doesn’t make all frame sizes that large, but this value will be only used depending on your Ethernet requirements.

Interrupt Coalescence (IC)

After the network adapter receives the packets, the device driver issues a single hard interrupt followed by soft interrupts (handled by NAPI). Interrupt Coalescence is the number of packets the network adapter receives before issuing a hard interrupt. Setting the value so low that interrupts are raised very often can lead to lots of overhead and, hence, decrease performance, but having a high value may lead to packet loss. By default, your settings are in adaptive IC mode, which auto balances the value according to network traffic. But since the new kernels use NAPI, which makes fast interrupts much less costly (performance wise), you can disable this feature. To check whether the adaptive mode is on or not, run the following command:

$ ethtool -c eth0

To disable it, use the following command:

$ ethtool -C eth0 adaptive-rx off



Caution: If your system generates high traffic, as in the case of a hosting Web server, you must not disable it.

Figure 10: MTU value shown by ip addr command
Figure 11: TCP read and send buffer size

Pause frames

A pause frame is the duration of the pause, in milliseconds, which the adapter issues to the network switch to stop sending packets if the ring buffer gets full. If the pause frame is enabled, then loss of packets will be minimal. To view the pause frame setting, use the following command:

$ ethtool -a eth0

To enable pause frames for both RX (receiving) and TX (transferring), run the command given below:

$ ethtool -A eth0 rx on tx on

There are lots of other things you can tune using ethtool to increase performance through the network adapter, like speed and duplex. But these are beyond the scope of this article. You can check them in the ethtool manual.

Software interrupts

SoftIRQ timeout: If the software interrupt doesn’t process packets for a long time, it may cause the NIC buffer to overflow and, hence, can cause packet loss. netdev_budget limits how much work softirq does in one run: it is the maximum number of packets processed in a single polling cycle. To see the current value, run:

$ sysctl net.core.netdev_budget

The default value should be 300 and may have to be increased to 600 if you have a high network load or a 10Gbps (and above) system.

# sysctl -w net.core.netdev_budget=600

IRQ balance: Interrupt requests are handled by different CPUs in your system in a round robin manner. Due to regular context switching for interrupts, the request handling timeout may increase. The irqbalance utility is used to balance these interrupts across the CPUs and has to be turned off for better performance.

$ service irqbalance stop

If your system doesn’t deal with high data receiving and transmission (as in the case of a home user), this configuration will be helpful to increase performance. But for industrial use, it has to be tweaked for better performance, as stopping it may cause high load on a single CPU, particularly for services that cause fast interrupts like ssh. You can also disable irqbalance for particular CPUs. I recommend that you do that by editing /etc/sysconfig/irqbalance. To see how to balance specific CPUs, visit http://honglus.blogspot.in/2011/03/tune-interrupt-and-process-cpu-affinity.html.

Kernel buffer

The socket buffer size is:

net.core.wmem_default
net.core.rmem_default
net.core.rmem_max
net.core.wmem_max

These parameters show the default and maximum write (sending) and read (receiving) buffer sizes allocated to any type of connection. The default value set is always a little low since the allocated space is taken from the RAM. Increasing this may improve the performance for systems running servers like NFS. Increasing the defaults to 256KB and the maximums to 4MB usually works well; otherwise, you have to benchmark these values to find the ideal value for your system’s configuration.

$ sysctl -w net.core.wmem_default=262144
$ sysctl -w net.core.wmem_max=4194304
$ sysctl -w net.core.rmem_default=262144
$ sysctl -w net.core.rmem_max=4194304

Every system has different values and increasing the default value may improve its performance but you have to benchmark for every value change.
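Before changing any of these values, it helps to record what the system currently uses. On Linux, each sysctl name corresponds to a file under /proc/sys with the dots replaced by directory separators, so the values can be read without calling the sysctl binary. The following is a minimal sketch under that assumption. The helper names are hypothetical, the targets are the 256KB/4MB figures from the text, and the simple dot-splitting would not handle names that contain dots inside an interface name.

```python
import os
from typing import Optional


def sysctl_path(name: str, root: str = "/proc/sys") -> str:
    """Map a dotted name such as net.core.rmem_max to <root>/net/core/rmem_max."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[int]:
    """Return a single-valued sysctl as an int, or None if it cannot be read."""
    try:
        with open(sysctl_path(name, root)) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # Suggested values from the text: 256KB defaults, 4MB maximums.
    targets = {
        "net.core.wmem_default": 262144,
        "net.core.wmem_max": 4194304,
        "net.core.rmem_default": 262144,
        "net.core.rmem_max": 4194304,
    }
    for name, target in targets.items():
        current = read_sysctl(name)
        if current is None:
            print(f"{name}: not readable on this system")
        elif current < target:
            print(f"{name}: {current} is below the suggested {target}")
        else:
            print(f"{name}: {current} already meets the suggested {target}")
```

The script only reads values, so it can be run as an ordinary user; writing new values still needs sysctl -w with root privileges.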

Maximum queue size

Before the data is processed by the TCP/UDP layer, your system puts it in the kernel queue. The net.core.netdev_max_backlog value specifies the maximum number of packets to put in the queue before delivery to the upper layer. The default value is not enough for a high network load, so simply increasing this value cures the performance drain due to the kernel. To see the current value, run:

$ sysctl net.core.netdev_max_backlog

The default value is 1000 and increasing it to 3000 will be enough to stop packets from being dropped in a 10Gbps (or more) network.
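One way to judge whether the backlog queue or the softirq budget is actually a bottleneck is to look at /proc/net/softnet_stat, which has one row of hexadecimal counters per CPU. The sketch below assumes the commonly documented meaning of the first three columns: packets processed, packets dropped because the backlog queue was full, and the number of times softirq ran out of budget (time_squeeze). The sample text is made up so that the code also runs on systems without this file, and the function name is hypothetical.

```python
from typing import Dict, List

# Made-up sample in the same layout as /proc/net/softnet_stat.
SAMPLE = """\
0000a1b2 00000000 00000003 00000000 00000000
00000f00 0000000a 00000000 00000000 00000000
"""


def parse_softnet_stat(text: str) -> List[Dict[str, int]]:
    """Parse the first three hex columns of each row (one row per CPU)."""
    rows = []
    for row_number, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        processed, dropped, squeezed = (int(f, 16) for f in fields[:3])
        rows.append({"row": row_number, "processed": processed,
                     "dropped": dropped, "time_squeeze": squeezed})
    return rows


if __name__ == "__main__":
    try:
        with open("/proc/net/softnet_stat") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample
    for row in parse_softnet_stat(text):
        print(row)
```

A growing dropped count points at net.core.netdev_max_backlog, while a growing time_squeeze count points at net.core.netdev_budget; if both stay at zero, these two settings are unlikely to be the bottleneck.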



How To Admin TCP/UDP processing

TCP timestamp

The TCP buffer size is:

TCP timestamp is a TCP feature that puts a timestamp header in every packet to calculate the precise round trip time. This feature causes a little overhead and can be disabled, if it is not necessary to increase CPU performance to process packets.

net.ipv4.tcp_rmem net.ipv4.tcp_wmem

$ sysctl -w net.ipv4.tcp_timestamps=0

These values are an array of three integers specifying the minimum, average and maximum size of the TCP read and send buffers, respectively.

Note: Values are in pages. To see the page size, use the following command:

$ getconf PAGE_SIZE

For recent kernels (after 2.6), there is an auto-tuning feature, which dynamically adjusts the TCP buffer size until the maximum value is reached. This feature is turned on by default and I recommend that it be left turned on. You can check it by running the following command:

$ cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf

Then, to turn it on in case it is off, use the command given below:

$ sysctl -w net.ipv4.tcp_moderate_rcvbuf=1

This setting allocates space up to the maximum value, so you may need to increase the maximum buffer size if you find that the kernel buffer is your bottleneck. The average value need not be changed, but the maximum value has to be set higher than the BDP (bandwidth delay product) for maximum throughput. BDP = Bandwidth (B/sec) * RTT (seconds), where RTT (round trip time) can be calculated by pinging any other system and taking the average time in seconds. You can change the values with sysctl by using the following commands:

$ sysctl -w net.ipv4.tcp_rmem="65535 131072 4194304"
$ sysctl -w net.ipv4.tcp_wmem="65535 131072 194304"

Maximum pending connections

An application can specify the maximum number of pending requests to put in a queue before processing one connection. When this value reaches the maximum, further connections start to drop. For applications like a Web server, which handle lots of connections, this value has to be high for them to work properly. To see the maximum connection backlog, run the following command:

$ sysctl net.core.somaxconn

The default value is 128 and it can be increased to a higher value:

$ sysctl -w net.core.somaxconn=2048

TCP SACK

TCP Selective Acknowledgement (SACK) is a feature that allows the receiver to acknowledge non-contiguous blocks of segments, as compared to traditional TCP, which acknowledges contiguous segments only. This feature can cause a little CPU overhead; hence, disabling it may increase network throughput.

$ sysctl -w net.ipv4.tcp_sack=0

TCP FIN timeout

In a TCP connection, both sides have to close the connection independently. Linux TCP sends a FIN packet to close the connection and waits for the FINACK for the time defined in net.ipv4.tcp_fin_timeout. The default value (60) is quite high, and can be decreased to 20 or 30 to let TCP close the connection and free resources for another connection.

$ sysctl -w net.ipv4.tcp_fin_timeout=20

UDP buffer size

UDP generally doesn’t need tweaks to improve performance, but you can increase the UDP buffer size if UDP packet losses are happening.

$ sysctl net.core.rmem_max

Miscellaneous tweaks

IP ports: net.ipv4.ip_local_port_range shows the range of ports available for a new connection. If no port is free, the connection gets cancelled. Widening this range helps to prevent the problem. To check the port range, use the command given below:

$ sysctl net.ipv4.ip_local_port_range

You can increase the range by using the following command:

$ sysctl -w net.ipv4.ip_local_port_range='20000 60000'
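The bandwidth delay product used in the TCP buffer discussion can be worked out with a few lines of Python. This is only a sketch: the function name bdp_bytes and the sample link speed and round trip time are illustrative assumptions, not measurements from this article.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Return the bandwidth delay product in bytes.

    BDP = bandwidth (bytes/sec) * RTT (seconds). The link speed is given
    in bits per second, so it is divided by 8 first.
    """
    return bandwidth_bits_per_sec / 8 * rtt_seconds


# Hypothetical example: a 1 Gbit/s link with a 20 ms average ping time.
example_bdp = bdp_bytes(1000000000, 0.020)
print("BDP: %d bytes" % example_bdp)
```

If the result is larger than the maximum (third) value in net.ipv4.tcp_rmem or net.ipv4.tcp_wmem, that maximum is a candidate for raising.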

Tuned

Tuned is a very useful tool to auto-configure your system according to different profiles. Besides doing manual tuning, tuned makes all the required tweaks for you dynamically. You can install it by using the following command:

$ apt-get install tuned

To start the daemon, use the command given below:

$ tuned -d

To check the available profiles, you can see man tuned-profiles. You can create your own profile, but for the best network performance, tuned already has some interesting profile options like network-throughput and network-latency. To set a tuned profile, use the following command:

$ tuned -p network-throughput

Making changes permanent

The changes made by sysctl get lost on every reboot; so to make them permanent, you need to save these changes to /etc/sysctl.conf. For example:

$ echo 'net.ipv4.tcp_moderate_rcvbuf=1' >>/etc/sysctl.conf

There are many more tuning parameters available in Linux, which you may want to change for better network performance. But it is recommended that you understand these features well before trying them out. Blind tuning is only useful if you benchmark continuously and properly to check that the changes are leading to positive results. Interesting places to find other available parameters are net.core and net.ipv4.

References
[1] For data processing by the kernel: http://www.coderplay.org/networkingdev/Linux-Kernel-Networking-Packet-Receiving.html
[2] For network performance tuning: https://access.redhat.com/sites/default/files/attachments/20150325_network_performance_tuning.pdf and https://wiki.chipp.ch/twiki/pub/CmsTier3/NodeTypeFileServerHPDL380G7/ols2009-pages-169-1842.pdf
[3] For overall system performance tuning (has some good theory on how to correct benchmarks): http://www.atlanticlinux.ie/training/performance-tuning/slides-with-notes.pdf

By: Shubham Dubey The author is a B. Tech student at the LNM Institute of Information Technology. He is a Linux enthusiast and works in the field of cloud computing, virtualisation and cyber security. You can contact him at sdubey504@gmail.com. LinkedIn: https://in.linkedin.com/in/shubham0d

Please share your feedback/ thoughts/ views via email at osfyedit@efy.in


Let’s Get Acquainted with PHP

PHP is a server side scripting language that is embedded in HTML. It is used to manage dynamic content, databases, session tracking, and even build entire e-commerce sites. This article gives an introduction for newbies and wannabe PHP users.

PHP (Hypertext Preprocessor) is one of the most widely used open source scripting languages. It is particularly suited for Web development and can be embedded in HTML. Its syntax borrows from the C, Java and Perl languages. Rasmus Lerdorf created the first version of PHP in 1994. The language manages dynamic content and databases, and can even be used to build a complete e-commerce site. PHP supports various well-known databases like MySQL, Oracle, PostgreSQL and MS SQL Server. PHP is very fast in execution, particularly when compiled as an Apache module on the UNIX side. Also, once the MySQL server is running, it executes even very complex queries with big result sets in record-setting time. PHP also supports a large number of major protocols like IMAP and LDAP.

General uses of PHP

• It performs system functions. For example, it can create, open, read, write and close files on a system.
• It can handle forms: it can gather data from files, save data to a file and send data via email.
• The user can add, delete and modify elements within the database through PHP.
• It can access cookie variables and set cookies.
• PHP can restrict users to accessing only certain pages of your website.
• PHP can encrypt data.

Characteristics of PHP

The most important characteristics of PHP are:
• Simplicity
• Efficiency
• Security
• Flexibility
• Familiarity

How PHP can be used

Let’s begin with a simple PHP script. A greeting script is the customary first example, so let’s write one that prints ‘Project 1!’. PHP is embedded in HTML, which means that the HTML contains PHP statements such as:

<html>
<head>
<title>My First PHP Page</title>
</head>
<body>
<?php echo "Project 1!"; ?>
</body>
</html>

Now, the result will be:

Project 1!

Figure 1: PHP in industry (popular PHP sites grouped as social sites, blogs and CMS, e-commerce and others; the names shown include Facebook, Digg, Yahoo, Wikipedia, Wordpress, Joomla, PrestaShop, OpenCart and Magento)

What are PHP namespaces?

A namespace is defined with the namespace keyword followed by a name. It must be declared before any other code, with the exception of the rarely used declare statement. Once a namespace has been defined, its scope applies to the whole file. For example:

<?php
namespace Mynamespace;
//....

It may also be defined by enclosing its contents in braces. No code is allowed outside the braces apart from the previously mentioned declare statement. For example:

<?php
namespace Mynamespace {
//....
}

The two methods discussed above cannot be combined, so the user should stick to one of them within a file. Now let’s get to know something about nested namespaces as well. These are defined by adding a backslash to separate each level. For example:

<?php
namespace Myassignment\blog\admin;

Nested namespaces can be easily defined in this way. Likewise, there are many other forms of namespaces, like multiple namespaces in one file, global namespaces, and a lot more.

Global namespaces

Global namespaces are great for simplifying storage organisation in environments that have numerous file systems. They give a consolidated view into multiple network file systems, common Internet file systems, network-attached storage systems or file servers located in different physical locations. They are mainly helpful in distributed implementations with unstructured data. They also support environments that are growing rapidly, so that data can be easily accessed without the need to know where it physically resides. Users should keep in mind that without a global namespace, multiple file systems would need to be managed individually.

Figure 2: Areas of suitability (Web application development, customised scalable applications, application re-engineering and enhancement, Zend Framework development, Web application maintenance, open source PHP implementation, MySQL data modelling)

Sub-namespaces

Namespaces can follow a definite hierarchy, like the directories in the file system on your desktop. Sub-namespaces are extremely helpful for maintaining the structure of a project. For example, if your project needs database access, then you may want to put all the database-related code, like a database exception and connection manager, in a sub-namespace called database. Also, to maintain flexibility, it is prudent to store sub-namespaces in sub-directories. This will allow you to structure your project and will make it much easier to use autoloaders that follow the PSR-0 standard.

Calling code from a namespace

If you need to instantiate a new object, call a function or use a constant from a different namespace, you use the backslash notation. Names can be resolved in the three different ways listed here.

1. Unqualified name: This is simply the name of a class, function or constant, without any reference to a namespace. If you are new to namespacing, this is the basic category you will work from. For example:

<?php
namespace My_assignment;
class Myclass {
    static function staticMethod() {
        echo 'Hi, Everyone!';
    }
}
Myclass::staticMethod();

Figure 3: Scope of PHP

2. Qualified name: This is how we access the sub-namespace hierarchy, using the backslash notation. For example:

<?php
namespace My_assignment;
require 'my_assignment/database/connection.php';
$connection = new database\connection();

3. Fully qualified name: Unqualified and qualified names are both resolved relative to the namespace you are presently in. These two categories can only be used to access definitions at that level or to go deeper into the namespace hierarchy. But, if you need to access a function, a class or a constant that resides at a higher level in the hierarchy, then you need to use the fully qualified name, which is the complete path. Your call needs to start with a backslash. With this, PHP knows that the call should be resolved from the global space instead of being resolved relatively. For example:

<?php
namespace My_assignment\database;
require 'My_assignment/fileaccess/input.php';
$input = new \My_assignment\Fileaccess\Input();

Dynamic calls

PHP is a dynamic programming language, so this functionality can be used for calling namespaced code. It is essentially the same as instantiating variable classes or including variable files. Also, never forget to escape the backslash when storing a namespace name in a double-quoted string.

Namespace keyword

The namespace keyword is not only used to define a namespace; it can also be used to explicitly refer to the current namespace.

Namespace constant

Just as the self keyword cannot tell you the name of the current class, the namespace keyword cannot tell you the name of the current namespace. This is why we use the __NAMESPACE__ constant.

Aliasing/importing

In PHP, namespacing supports importing, which is also referred to as aliasing. It can be done on classes, interfaces and namespaces (and, since PHP 5.6, on functions and constants too). Importing, which is the most useful feature of namespacing, allows the easy use of external code packages such as libraries without the worry of conflicting names. Importing is done with the use keyword.

PHP is a good choice for the following reasons.
1. Quick loading time: PHP sites generally load quickly, partly because PHP runs in its own memory space.
2. Low cost software: Since PHP is an open source language, most of the tools connected with the code don’t need to be paid for.
3. Low cost hosting: PHP can easily be executed on a Linux server, which is readily available from hosting providers without any additional cost.
4. Flexibility of the database: Database connectivity can be easily done in PHP.
These are some of the common reasons why PHP is such a popular programming language.

By: Meghraj Singh Beniwal The author has a B.Tech in electronics and communication, is a freelance writer and an Android app developer. He currently works as an automation engineer at Infosys, Pune. He can be contacted at meghrajsingh01@rediffmail.com or meghrajwithandroid@gmail.com.

Julia: A Language that Walks Like Python, Runs Like C

Julia is an emerging star in the world of programming languages. It offers the coveted combination of high performance and productivity. What Julia offers in terms of execution speed may almost seem unbelievable, especially to newcomers to this language. This article provides an introduction to Julia, which is as simple as Python to learn, and on par with C and Fortran in terms of execution speed.

As a programmer you may question the very need for yet another programming language. Such a question is very fair with the available ecosystem of a few hundred languages. However, each programming language is designed to achieve a specific goal. So it would be right to start this article by stating the specific objective of designing the Julia language. There is a basic conundrum in programming: execution speed vs development ease. If a language leans towards programmers, then there are likely to be issues with the execution speed. On the other hand, if you want to generate optimised code, then the development process will probably be harder. For example, a system completely designed in the C programming language will generate optimal assembly code. However, development of such a system would take longer or the process might not be very friendly for a major section of programmers.

Julia addresses these issues, as it combines the best of both worlds. It generates optimal machine code that runs on par with C programs, and at the same time the development process is equivalent to friendlier languages such as Python. The popular phrase in the Julia community is: “Walk Like Python; Run Like C.”

Julia is an open source, high level programming language. It is a language for scientific, technical computing. You may be aware that scientific, technical computing problems are resource heavy. Julia is designed in a way that it allows developers to build solutions for scientific and technical computing problems swiftly, without compromising on performance. It comes loaded with an extensive library for mathematical functions and has a state-of-the-art compiler, which generates optimised code with very high levels of numerical precision. Julia has the inherent capability to execute code in the parallel mode.

The development of Julia has been spearheaded by Jeff Bezanson, Stefan Karpinski, Viral Shah and Alan Edelman. The initial version was released in 2012. The Julia community is on the rise, which is very evident from the official GitHub page of Julia (https://github.com/JuliaLang/julia). The current version of Julia is 0.4.6, as of June 19, 2016.

Figure 1: Julia welcome screen

Julia installation

Julia can be installed in all major operating systems such as Linux, Mac OS X and MS Windows. The latest version of Julia may be downloaded from http://julialang.org/downloads/, where 32- and 64-bit versions are provided for all platforms. You need to select the version matching your operating system and configuration. Installation is straightforward. If you are using Windows, download the Julia set-up file, extract and execute. If you are a Mac OS X user, the corresponding ‘dmg’ file may be downloaded and executed. For Linux users, the distribution specific instructions are provided in http://julialang.org/downloads/platform.html. In this article, the Ubuntu installation is provided as a sample. Execute the following commands for Julia installation in Ubuntu:

sudo add-apt-repository ppa:staticfloat/juliareleases
sudo add-apt-repository ppa:staticfloat/julia-deps
sudo apt-get update
sudo apt-get install julia

On successful installation, type ‘julia’ at the terminal to start Julia. The welcome screen of Julia is shown in Figure 1.

Figure 2: Julia execution modes

Figure 3: The features of Julia (free and open source, multiple dispatch, dynamic type system, parallelism and distributed computation, coroutines, attractive REPL, cross-language calling, superior performance, package manager, Lisp-like macros)

Julia execution modes

Julia can be executed in three different ways as illustrated in Figure 2.
• Command line execution: To work in this mode, just download Julia as mentioned above and start working at the terminal by typing julia.
• Using the Integrated Development Environment Juno: To work in this mode, first the Julia command line needs to be downloaded. If Atom is not installed in your system, then get it from https://atom.io/. After the successful installation of Atom, navigate to the Settings::Install panel (press Ctrl+, to open Settings). Enter uber-juno and install the same.
• Using JuliaBox.com: No installation is required for this mode. You can visit this URL and start interacting with Julia.

The features of Julia

The significant features of Julia are illustrated in Figure 3. This language has the ability to define function behaviour across many combinations of parameter types. This feature is known as Multiple Dispatch. Availability of an in-built package manager, attractive REPL, the ability to execute code in a parallel and distributed environment, and optimal code generation are the other noteworthy features of Julia.

Figure 4: Julia shell mode

Julia REPL

Julia provides an attractive REPL (Read-Evaluate-Print Loop), which facilitates the execution of Julia expressions. The expressions are evaluated immediately and the results are printed.

julia> 5 + 3
8

If you enter a ‘?’ in the REPL, it will switch to the HELP mode. In this mode, you may enter keywords and search for information on them. In the HELP mode, the prompt changes to yellow.

help?> peakflops()
.. peakflops(n; parallel=false)

‘peakflops’ computes the peak flop rate of the computer by using double precision :func:`Base.LinAlg.BLAS.gemm!`. By default, if no arguments are specified, it multiplies a matrix of size ‘n x n’, where ‘n = 2000’. If the underlying BLAS is using multiple threads, higher flop rates are realised. The number of BLAS threads can be set with blas_set_num_threads(n). If the keyword argument ‘parallel’ is set to ‘true’, ‘peakflops’ is run in parallel on all the worker processors. The flop rate of the entire parallel computer is returned. When running in parallel, only one BLAS thread is used. The argument ‘n’ still refers to the size of the problem that is solved on each processor.

The output above provides helpful information regarding peakflops(). To enter the shell mode, press ‘;’. From the shell mode, you can execute shell commands, as shown in Figure 4. In the shell mode, the prompt changes to red. The REPL provides many more functions. A few popular functions and their descriptions are provided in Table 1.

Table 1: Handy REPL functions

REPL function / macro: versioninfo()
Description: Provides the information regarding the installed Julia version as well as platform-specific information:

julia> versioninfo()
Julia Version 0.4.6
Commit 2e358ce (2016-06-19 17:16 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: liblapack.so.3
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

REPL function / macro: edit(file_name)
Description: Edits the file with name file_name.

REPL function / macro: @edit fn_name()
Description: Edits the definition of built-in function fn_name(). Julia allows you to edit the definition of built-in functions as well.

REPL function / macro: @time
Description: Provides timing information for the command executed. For example:

julia> @time randn(10^4)
0.000131 seconds (7 allocations: 78.359 KB)

REPL function / macro: @code_native
Description: Provides native code generated for the execution of the function. For example:

julia> @code_native 45+34
.text
Filename: int.jl
Source line: 8
pushq %rbp
movq %rsp, %rbp
Source line: 8
addq %rsi, %rdi
movq %rdi, %rax
popq %rbp
ret

The power of Julia

An interesting introduction to Julia by one of its co-creators, Viral Shah, is available at https://www.youtube.com/watch?v=6fioFiKMXFk. The illustration provided in this video enables us to understand and appreciate the power of Julia. As stated earlier, Julia code runs very fast. Let’s demonstrate that with a simple but powerful example:

function test(n::Array)
    sum = 0.0
    for i = 1:length(n)
        sum += n[i]
    end
    println(sum)
end

num = randn(10^7)
@time test(num)

The code above has a simple function to navigate through an array and find the sum of elements. The real power of Julia lies in the scalability. The code has a statement to generate 10 to the power of seven (10 million) random numbers (with normal distribution).

num = randn(10^7)

The 10 million element array is supplied as an input to the function. Julia is capable of looping through the 10 million element array and finding the sum in 0.016 seconds.

1332.9215017279294
0.016124 seconds

The code was modified to loop through 10 to the power of 8 (100 million) elements, and Julia was able to give the result in 0.120297 seconds. The output is:

8247.19158291053
0.120297 seconds (2.03 k allocations: 102.361 KB)

Note that the successive execution of code is comparatively faster than the first time execution. This is due to the JIT (Just in Time) compilation adopted by Julia. Let’s take a look at another example that involves the multiplication of two matrices of size 1000x1000 and its timing values.

a = randn(1000,1000)
b = randn(1000,1000)
@time a*b

The output is:

0.041690 seconds (9 allocations: 7.630 MB, 5.00% gc time)

Julia focuses on numeric processing. It has an extensive hierarchy of types for handling numbers (as shown in Figure 5).

Figure 5: Julia numeric data types (the Number hierarchy: Complex and Real, with Real covering the Irrational, Rational, Integer and AbstractFloat types)

Type neutral function definition

Julia facilitates the defining of generic functions. Let’s consider the following example.

Define a generic function:

julia> f(x) = 4x^3+3x^2+8
f (generic function with 1 method)

Call f(x) with an integer value:

julia> f(5)
583

Call f(x) with a float value:

julia> f(25.67)
69645.719752

Note that we didn’t make any modification to the function definition when calling with a different data type. This generic function definition is a very important feature of Julia.

Cross-language calls

Julia has the option to call functions from languages such as Python, C, Fortran, etc. This facilitates seamless integration between components written in different languages. For example, to call Python functions from Julia, type:

using PyCall
@pyimport math
math.sin(math.pi / 4) - sin(pi / 4) # returns 0.0

Figure 6: Package ecosystem (number of tagged packages per Julia release, June 2014 to June 2016)

Figure 7: Julia plotting demo with PyPlot (a sinusoidally modulated sinusoid)

Continued on page 49...


Using Python for Data Mining

This article presents a few examples on the use of the Python programming language in the field of data mining. The first section is mainly dedicated to the use of GNU Emacs and the other sections to two widely used techniques: hierarchical cluster analysis and principal component analysis.

This article is introductory because some topics such as varimax, oblimin, etc, are not included here and will be discussed in the future. The complete code is too long for a printed article, but is freely available at https://github.com/astonfe/iris. The toolbox used in this article is based on WinPython 3.4.4.2 and GNU Emacs 24.5 on Windows. My Emacs configuration for the Python language is very simple. The following lines are added to the dot emacs file:

(setq python-indent-guess-indent-offset nil)
(org-babel-do-load-languages
 'org-babel-load-languages
 '((python . t)))
(add-to-list 'exec-path "C:\\WinPython-32bit-3.4.4.2\\python-3.4.4")
(global-set-key (kbd "<f8>") (kbd "C-u C-c C-c"))
(setenv "PYTHONIOENCODING" "utf-8")
(setenv "LANG" "en_US.UTF-8")

The first line is useful to avoid the warning message: ‘Can’t guess python-indent-offset, using defaults: 4’ from Emacs. The next three lines are to use Python in the org-mode, and the last four lines are to use Emacs as an IDE. In the following org file, text, code, figures and a table are present at the same time. This is not very different from a Jupyter Notebook. Each code section can be evaluated with C-c C-c. The export of the whole file as HTML (C-c C-e h h) produces the output shown in Figure 1.

#+title: PAH method GCMS
#+options: toc:nil
#+options: num:nil
#+options: html-postamble:nil

<...some-text-here...>

#+begin_src python :var filename="method.png" :results file :exports results
<...some-python-code-here...>
#+end_src

#+results:
[[file:method.png]]

#+begin_src python :var filename="chromatogram.png" :results file :exports results
<...some-python-code-here...>
#+end_src

#+results:
[[file:chromatogram.png]]

#+attr_html: :frame border :border 1 :class center
<...a-table-here...>

To do this, it is necessary that Python is recognised by the system. You can do this by going to (Windows 7) Start → Control panel → System → Advanced system settings → Environment variables → User variables for <yourusername> → Create, if not present, or modify the variable path → Add C:\WinPython-32bit-3.4.4.2\python-3.4.4;

Figure 1: Python and org-mode

Another method is to use Emacs as an IDE. A Python file can simply be evaluated by pressing the F8 function key (see the above mentioned kbd "<f8>" option). Figure 2 shows an Emacs session with three buffers opened. On the left side is the Python code, on the right side on the top is a dired buffer as file manager and on the right side bottom is the Python console with a tabular output. This is not very different from the Spyder IDE (which is included in the WinPython distribution) shown in Figure 3, with the same three buffers opened.

Figure 2: Python and Emacs

Figure 3: Spyder IDE

Figure 4: Agglomerative hierarchical clustering (Iris flower data set; dendrogram with column centring, euclidean metric and ward linkage; y-axis: distance)
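A quick way to confirm that Python is recognised by the system is to ask Python itself where the executable is found. The sketch below is only an illustration: the helper name find_python is made up for this example, and the executable name to look for is an assumption that may differ on your machine.

```python
import shutil
import sys


def find_python(executable="python"):
    """Return the full path of `executable` if it is on PATH, else None."""
    return shutil.which(executable)


location = find_python()
if location is None:
    print("python was not found on PATH; check the environment variables")
else:
    print("python found at:", location)

# The interpreter running this script, for comparison.
print("current interpreter:", sys.executable)
```

If the first message reports that nothing was found, the path variable described above most likely still needs the WinPython directory added to it.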

Hierarchical cluster analysis

This example is about agglomerative hierarchical clustering. The data table is the famous iris flower data set and is taken from http://en.wikipedia.org/wiki/Iris_flower_data_set. It has 150 rows and five columns: sepal length, sepal width, petal length, petal width, species’ name (iris setosa from row 1 to 50, iris versicolor from row 51 to 100, iris virginica from row 101 to 150). The code is short, as shown below:

import scipy.cluster.hierarchy as hca
import xlrd
from pylab import *

rcParams["font.family"]=["DejaVu Sans"]
rcParams["font.size"]=10
# read the worksheet into an array, cell by cell
w=xlrd.open_workbook("iris.xls").sheet_by_name("Sheet1")
data=array([[w.cell_value(r,c) for c in range(w.ncols)] for r in range(w.nrows)])
# column centring
dataS=data-mean(data,axis=0)
o=range(1,w.nrows+1)
y=hca.linkage(dataS,metric="euclidean",method="ward")
hca.dendrogram(y,labels=o,color_threshold=10,truncate_mode="lastp",p=25)
xticks(rotation=90)
tick_params(top=False,right=False,direction="out")
ylabel("Distance")
figtext(0.5,0.95,"Iris flower data set",ha="center",fontsize=12)
figtext(0.5,0.91,"Dendrogram (center, euclidean, ward)",ha="center",fontsize=10)
savefig("figure.png",format="png",dpi=300)

First, the table is read as an array with a nested loop; then, column centring is performed. There are various other scaling techniques; column centring is just one of them. The ‘S’ in dataS is for ‘scaled’. In this example, the euclidean metric and the ward linkage method are chosen. Many other metrics are available, for example, canberra, cityblock, mahalanobis, etc. There are also many other linkage methods, for example, average, complete, single, etc. Finally, the dendrogram is plotted as shown in Figure 4. In this example, the clusters are coloured by cutting the dendrogram at a distance equal to 10, using the option color_threshold. To enhance its readability, the dendrogram has been condensed a bit using the option truncate_mode.

Figure 5: Scree plot (variance explained (%) vs component number)

Figure 6: Scores - PC 1 vs PC 2

Figure 7: Loadings - PC 1 vs PC 2

Figure 8: Scores with clipart
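Cutting the dendrogram at a given distance can also be done numerically, which gives one cluster label per row. The following is a minimal sketch, assuming SciPy’s fcluster function is an acceptable way to make the cut; the synthetic three-group data set is a stand-in invented for this example, since iris.xls is not reproduced here.

```python
import numpy as np
import scipy.cluster.hierarchy as hca

# Synthetic stand-in for the iris table (an assumption of this sketch):
# three tight groups of 20 points each, far apart from one another.
rng = np.random.RandomState(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.1, size=(20, 2)) for c in centres])

# Column centring, as done for dataS in the article.
dataS = data - data.mean(axis=0)

# Same metric and linkage method as in the article.
y = hca.linkage(dataS, metric="euclidean", method="ward")

# Cut the tree at distance 10 to obtain one flat cluster label per row.
labels = hca.fcluster(y, t=10, criterion="distance")
print("number of clusters:", len(set(labels)))
```

With groups this well separated, the cut should recover the three groups; on real data the threshold has to be chosen by looking at the dendrogram.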

Principal component analysis

This second example is about three different techniques: matrix algebra, singular value decomposition (SVD) and modular toolkit for data processing (MDP). In the first, the covariance matrix is calculated on the scaled data. Then, Eigenvalues and Eigenvectors are calculated from the covariance matrix. Lastly, the Eigenvalues and Eigenvectors are sorted. The scores are calculated by a dot product of the scaled data and the Eigenvectors. The percentage of variance explained and its running total are also calculated.

covmat=cov(dataS,rowvar=False)
eigval,eigvec=linalg.eig(covmat)
idx=eigval.argsort()[::-1]
eigval=eigval[idx]
eigvec=eigvec[:,idx]
scores=dot(dataS,eigvec)
percentage=[0]*w.ncols
runtot=[0]*w.ncols
for i in range(0,w.ncols):
    percentage[i]=eigval[i]/sum(eigval)*100
runtot=cumsum(percentage)

Figure 9: Biplot (PC 1 vs PC 2)

Figure 10: Scores 3D (PC 1, PC 2, PC 3)

Figure 11: Loadings 3D (PC 1, PC 2, PC 3)

Figure 12: Scores SVD (PC 1 vs PC 2; Setosa, Versicolor, Virginica)
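The matrix algebra route and the SVD route should lead to the same variances, which can be checked with a short sketch. The random data below is an invented stand-in for the iris table, and np.linalg.eigh is used here (instead of linalg.eig) because the covariance matrix is symmetric; both are choices made for this illustration rather than part of the original code.

```python
import numpy as np

# Random stand-in for the scaled iris table (an assumption of this sketch).
rng = np.random.RandomState(0)
data = rng.normal(size=(50, 4))
dataS = data - data.mean(axis=0)

# Route 1: Eigenvalues of the covariance matrix, sorted in descending order.
covmat = np.cov(dataS, rowvar=False)
eigval, eigvec = np.linalg.eigh(covmat)
idx = eigval.argsort()[::-1]
eigval = eigval[idx]

# Route 2: singular values of the centred data, converted to variances.
u, s, vt = np.linalg.svd(dataS, full_matrices=False)
eigval_svd = s ** 2 / (dataS.shape[0] - 1)

# Percentage of variance explained and its running total.
percentage = eigval / eigval.sum() * 100
runtot = np.cumsum(percentage)

print("same variances from both routes:", np.allclose(eigval, eigval_svd))
```

The division by the number of rows minus one matches the default normalisation of the covariance function, which is why the two sets of values agree.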

The number of components (N), the variance explained by each component (VAR), its percentage (PCT) and the percentage running total (SUM) can be presented as a table. This table can be drawn using the package prettytable. The results are formatted with a fixed number of decimal places and then each column is added to the table.

    from prettytable import *
    o=range(1,w.nrows+1)
    v=range(1,w.ncols+1)
    e=["%.2f" % i for i in eigval]
    p=["%.4f" % i for i in percentage]
    r=["%.2f" % i for i in runtot]
    pt=PrettyTable()
    pt.add_column("N",v)
    pt.add_column("VAR",e)
    pt.add_column("PCT",p)
    pt.add_column("SUM",r)
    pt.align="r"
    print(pt)

The result is a well-formatted table:

    +---+------+---------+--------+
    | N |  VAR |     PCT |    SUM |
    +---+------+---------+--------+
    | 1 | 4.23 | 92.4619 |  92.46 |
    | 2 | 0.24 |  5.3066 |  97.77 |
    | 3 | 0.08 |  1.7103 |  99.48 |
    | 4 | 0.02 |  0.5212 | 100.00 |
    +---+------+---------+--------+

The scree plot is drawn as a simple bar plot (Figure 5), while the scores (Figure 6) and the loadings (Figure 7) are drawn with plot. For the scores, the colours are chosen according to the different iris species, because in this example the data are already categorised. The scores plot with clipart, shown in Figure 8, is a bit more complex. The original clipart is taken from http://www.worldartsme.com/images/irisflower-clipart-1.jpg, and then processed via ImageMagick. Each clipart is read with imread, zoomed with OffsetImage and then placed on the plot at the scores coordinates with AnnotationBbox, according to the following code:

    from matplotlib.image import imread
    from matplotlib.offsetbox import AnnotationBbox,OffsetImage
    i1=imread("iris1.png")
    i2=imread("iris2.png")
    i3=imread("iris3.png")
    o=range(1,w.nrows+1)
    ax=subplot(111)
    for i,j,o in zip(s1,s2,o):
        if o<51:
            ib=OffsetImage(i1,zoom=0.75)
        elif o>50 and o<101:
            ib=OffsetImage(i2,zoom=0.75)
        elif o>100:
            ib=OffsetImage(i3,zoom=0.75)
        ab=AnnotationBbox(ib,[i,j],xybox=None,xycoords="data",frameon=False,boxcoords=None)
        ax.add_artist(ab)

Figure 13: Scores MDP (Iris flower data set scores; axes: PC 1, PC 2; legend: Setosa, Versicolor, Virginica)

The scores and loadings plots can be overlaid to obtain a particular plot called the biplot. The example presented here is based on a scaling of the scores, as in the following code:

    xS=(1/(max(s1)-min(s1)))*1.15
    yS=(1/(max(s2)-min(s2)))*1.15

Then the loadings are plotted with arrow over the scores, and the result is shown in Figure 9. This solution is based on the one proposed at http://sukhbinder.wordpress.com/2015/08/05/biplot-with-python; it is probably not the best way, but it works. The 3D plots (Figures 10 and 11) do not present any particular problems, and can be done according to the following code:

    from mpl_toolkits.mplot3d import Axes3D
    ax=Axes3D(figure(0),azim=-70,elev=20)
    ax.scatter(s1,s2,s3,marker="")
    for i,j,h,o in zip(s1,s2,s3,o):
        if o<51:
            k="r"
        elif o>50 and o<101:
            k="g"
        elif o>100:
            k="b"
        ax.text(i,j,h,"%.0f"%o,color=k,ha="center",va="center",fontsize=8)

Using the singular value decomposition (SVD) is very easy: just call pcasvd on the scaled data. The result is shown in Figure 12.

    from statsmodels.sandbox.tools.tools_pca import pcasvd
    xreduced,scores,evals,evecs=pcasvd(dataS)

The Modular toolkit for Data Processing (MDP) package (see References 4 and 5) is not included in WinPython, so it’s necessary to download the source MDP-3.5.tar.gz from https://pypi.python.org/pypi/MDP. Then open the WinPython control panel and go to the install/upgrade packages tab. Drag the source file and drop it there. Click on ‘Install packages’. Last, test the installation with the following commands:

    import mdp
    mdp.test()

This is a bit time consuming; another test is the following:

    import bimdp
    bimdp.test()


In the following example, the scores are calculated using the singular value decomposition, so Figures 12 and 13 are identical to each other but rotated compared with Figure 6. This has been explained at http://stackoverflow.com/questions/17998228, where, to quote user Hong Ooi: "The signs of the Eigenvectors are arbitrary. You can flip them without changing the meaning of the result, only their direction matters."

    import mdp
    pca=mdp.nodes.PCANode(svd=True)
    scores=pca.execute(array(dataS))

The MDP package is more complex than described here. Many other things can be done with it. It's also well documented; for example, the tutorial is more than 200 pages long. The examples presented here are also typical applications for another very widely used free and open source software, R. An interesting comparison of Python and R for data analysis was published some time ago (Reference 7). I can’t make a choice, because I like them both. Currently, I use Python almost exclusively, but in the past, R was my preferred language. It’s useful to develop the same script for both and then compare the results.

By: Stefano Rizzi
The author works in the areas of analytical chemistry and textile chemistry. He has been a Linux user since 1998.

References
[1] http://en.wikipedia.org/wiki/Iris_flower_data_set, last visited on 26/07/2016.
[2] http://www.worldartsme.com/images/iris-flowerclipart-1.jpg, last visited on 26/07/2016.
[3] http://sukhbinder.wordpress.com/2015/08/05/biplot-with-python, last visited on 26/07/2016.
[4] Zito, Wilbert, Wiskott, Berkes, Modular toolkit for Data Processing (MDP): a Python data processing framework, Frontiers in Neuroinformatics, 2009, 2, 8.
[5] http://mdp-toolkit.sourceforge.net, last visited on 26/07/2016.
[6] http://stackoverflow.com/questions/17998228, last visited on 26/07/2016.
[7] http://blog.datacamp.com/wp-content/uploads/2015/05/R-vs-Python-216-2.png, last visited on 26/07/2016.

Continued from page 43...

@pyimport is used to import the sub-modules of Python.

Julia packages

Another interesting feature of Julia is its extensibility, similar to what we have in languages like Python. As of July 3, 2016, the official repository has 1035 packages. The evolution of the package ecosystem (http://pkg.julialang.org/pulse.html) is illustrated in Figure 6. Primarily, packages enable us to extend the functionality of the core Julia engine. The areas covered by the packages are really diverse in nature. There are packages available for mathematical operations, machine learning, GUI, etc. To add a package, enter the following command at the Julia prompt (the second line shows PyPlot as a concrete example):

    Pkg.add("package name")
    Pkg.add("PyPlot")

To use the package, the using keyword is employed:

    using PyPlot

Plotting with Julia

Julia facilitates plotting as well. A sample code is shown below:

    using PyPlot
    x = linspace(0,2*pi,1000);
    y = sin(3*x + 4*cos(2*x));
    plot(x, y, color="red", linewidth=2.0, linestyle="--")
    title("A sinusoidally modulated sinusoid")

The output plot of the above code is shown in Figure 7. (The successful execution of this code requires Python and Matplotlib to be available on the system.) Julia is certainly a promising programming language, and its use is on the rise among scholars and data scientists. In the years to come Julia will evolve further, so try and get familiar with its ecosystem. The Julia home page has plenty of resources in the form of PDFs, video lectures, live notebooks, books, blogs, etc, which make the process of getting insights into Julia development simple, efficient and enjoyable. Computing with Julia can really be awesome. Give it a try.

References
[1] http://julialang.org/downloads/platform.html
[2] http://julialang.org/learning/
[3] http://bogumilkaminski.pl/files/julia_express.pdf
[4] https://upload.wikimedia.org/wikipedia/commons/2/2e/Julia.pdf

By: Dr K.S. Kuppusamy
The author is assistant professor of computer science at Pondicherry Central University. He has more than 10 years of teaching and research experience in academia and industry. His research interests include accessible computing, Web information retrieval and mobile computing. He can be reached via mail at kskuppu@gmail.com.



An Introduction to the Go Programming Language

Here’s how Google describes Go in its blog post: “Go attempts to combine the development speed of working in a dynamic language like Python with the performance and safety of a compiled language like C or C++. In our experiments with Go, to date, typical builds feel instantaneous; even large binaries compile in just a few seconds.” Let’s take a quick look at this powerful language.

Go is an open source concurrent programming language published by Google. It is similar to the C programming language, but it has several features that make it unique.
• Compilation of a project is so fast that Go feels like an interpreted language rather than a compiled one. Programs are also easily edited or run on-the-fly.
• It has garbage collection built in, so developers do not have to perform memory management as Go takes care of most of this work.
• It has built-in concurrency, so it can support concurrent work that requires both communication and synchronisation.
• Go is a robust and statically-typed language, so several possible bugs can be detected during compilation, and resolved.
• Go has a fully working Web server as part of its standard library.
• There is no need to mess around with build configurations or make files, as Go’s built-in build system is very simple.
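The built-in concurrency mentioned in the list above can be sketched as follows. This is only a minimal illustration, not code from the article: the function name sumSquares is hypothetical, goroutines do the concurrent work, a channel carries the communication, and sync.WaitGroup handles the synchronisation.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares (a hypothetical name) squares each input in its own
// goroutine and adds up the results received over a channel.
func sumSquares(nums []int) int {
	results := make(chan int, len(nums)) // buffered, so senders never block
	var wg sync.WaitGroup

	for _, n := range nums {
		wg.Add(1)
		go func(v int) { // each goroutine gets its own copy of the value
			defer wg.Done()
			results <- v * v // communication: send the result on the channel
		}(n)
	}

	wg.Wait()      // synchronisation: wait for every goroutine to finish
	close(results) // no more values will be sent

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```

The order in which the goroutines finish is not fixed, but the sum does not depend on that order, so the result is the same on every run.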

Development environment set-up

To set up the local development environment, we need the following prerequisites on our computer:
• Text editor: This is used to write and save our programs in the Go language. Examples of a few editors are Windows Notepad, Vim or Vi. The files containing program source code are called source files. They must have the extension .go.
• Go compiler: The human-readable source code needs to be converted into a language that the CPU understands. The compiler converts the source code into machine language, which the CPU can run as per the program instructions.
    • Download the Go compiler from https://golang.org/dl/
    • The GOPATH environment variable specifies the location of the current Go workspace. It is the only environment variable we need to set for Go code. The workspace can be located anywhere one wants. In my case I have used $HOME/dev.

    $ mkdir $HOME/dev
    $ export GOPATH=$HOME/dev

To test if everything works correctly, type the following command in a terminal:

    go version

The result will display the Go version number.

Hello World: A sample Go program

A Go program basically consists of the following components:
• Package declaration
• Package import
• Functions and variables
• Statements and expressions
• Comments
In our sample program, we will try to use all the components. Since we have already created our workspace $HOME/dev, where we are going to create all our programs, let’s add the following code to any text editor:

    package main // package declaration (an executable Go program must declare package main)
    import "fmt" // importing the standard package for formatted input and output
    // first sample program - a comment starts with two slashes and is ignored by the Go compiler
    func main() {
        fmt.Println("Hello World!!") // calling the function of package fmt that prints a line
    }

Let’s save it as HelloWorld.go in the workspace directory. Enter the following commands in a terminal:

    cd $GOPATH            # change the current directory to our workspace directory
    go run HelloWorld.go  # run the code stored in the file HelloWorld.go

The result will be displayed in the terminal as ‘Hello World!!’. The go run command reads the files (separated by spaces), compiles them, converts them into an executable saved in a temporary directory and then runs the program.

Data types, variables and constants

Data types are used to describe what type of data will be stored and how much space will be needed to store that data. Go comes with several built-in data types (Figure 1).

Figure 1: Data types in Go
• String types: A set of characters with a definite length used to represent text. Its value is a sequence of bytes.
• Boolean types: Two predefined constants represent Boolean values: true and false.
• Numeric types: There are two ways to represent numbers.
    • Integer types: Numbers without a decimal component, e.g., ..., -2, -1, 0, 1, ...
    • Floating point values: Numbers that contain a decimal component (real numbers), e.g., 1.234, 123.4, 0.00001234, 12340000

Variables are used to give a name to a storage area. The name of a variable can contain letters, digits, and the underscore character, but it must start with either a letter or an underscore. Go is a case-sensitive language; therefore, upper case and lower case letters are distinct. A variable definition is used to tell the compiler where and how much space to create for the variable. This definition is composed of a data type and a list of one or more variables of that type. The syntax is:

    var var_list optional_data_type

Here, optional_data_type can be any Go data type, e.g., byte, int, float32, complex64, bool or any user-defined type. And var_list may consist of one or more variable names separated by commas. Some valid declarations are shown here:

    var i, j, k int
    var a, c byte
    var b, sal float32
    var d = 42

Variables have two types of declarations.
1) Static type declaration: The type is written in the declaration, so the compiler knows the type and name of the variable at compile time.
2) Dynamic type declaration: The type is left out, and the compiler infers the type of the variable from the value assigned to it (as with d above).
Constants are basically variables whose values cannot be changed during the lifetime of the program. We can create constants in the same way we declare variables, but instead of var, we should use the const keyword.

    const x string = "Hello World"

Flow control

The decision-making structure: When program execution depends on one or more conditions, we should use decision-making structures. These enable us to evaluate the condition at the point of execution, and execute one statement or set of statements if the condition is true and another if it is false. The basic structure is shown in Figure 2. Here is a sample program to print odd and even numbers between 1 and 50.

    package main
    import "fmt"
    func main() {
        for i := 1; i <= 50; i++ {
            if i % 2 == 0 {
                fmt.Println(i, "even")
            } else {
                fmt.Println(i, "odd")
            }
        }
    }

Figure 2: The if structure

Loops

When one needs to run the same code multiple times, loops come into play. These enable the program to execute the statements several times. The basic structure for loops is shown in Figure 3. Here is the sample program to print numbers from 1 to 50.

    package main
    import "fmt"
    func main() {
        i := 1
        for i <= 50 {
            fmt.Println(i)
            i = i + 1
        }
    }
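The decision-making and loop structures described above can be combined in small helper functions. The following is only a sketch for illustration: the function names parity, sign and sumTo are hypothetical and do not come from the article. Besides if/else, it shows a switch statement without a tag and the three-part form of the for loop.

```go
package main

import "fmt"

// parity uses if/else to label a number as even or odd.
func parity(n int) string {
	if n%2 == 0 {
		return "even"
	}
	return "odd"
}

// sign uses a switch without a tag; each case is a boolean condition
// and the first case that is true is executed.
func sign(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	default:
		return "positive"
	}
}

// sumTo adds the integers from 1 to n with a three-part for loop.
func sumTo(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
	}
	return total
}

func main() {
	for _, n := range []int{-3, 0, 4} {
		fmt.Println(n, parity(n), sign(n))
	}
	fmt.Println(sumTo(50)) // prints 1275
}
```

Go has no separate while keyword; the condition-only form of for used in the article's loop example plays that role.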

Figure 3: Loop structure

Arrays and structures

Arrays: An array is a fixed-size sequential set of data items of the same type. To declare an array of some data type, use the following syntax:

    var array_name [SIZE]data_type // declaration of an array
    var y [5]int // y is an array of 5 integer data items

Initialisation of an array can be done using a single statement as follows:

    var salary = [5]float32{10000, 20000, 3400, 70000, 5000}

Here, the number of elements can’t be more than the size of the array. If you write [...] in place of the size, the compiler counts the elements in the initialiser and creates an array just big enough to hold them. Here is the sample code that describes the use of an array in the Go language:

    package main
    import "fmt"
    func main() {
        var n [5]int /* n is an array of 5 integers */
        var i, j int
        for i = 0; i < 5; i++ {
            n[i] = 10 * i // set value of an element at location i to 10*i
        }
        for j = 0; j < 5; j++ {
            fmt.Printf("Value at Element[%d] = %d\n", j, n[j])
        }
    }

    //Output
    Value at Element[0] = 0
    Value at Element[1] = 10
    Value at Element[2] = 20
    Value at Element[3] = 30
    Value at Element[4] = 40

Structures: A structure is a user-defined data type, which enables us to group a set of data items of different types. Structure definition and declaration syntax is as follows:

    type struct_name struct {
        member member_type
        member member_type
        ...
    } // structure definition
    var_name := struct_name{value1, value2, ..., valuen} // declaration of a structure variable

The member access operator (.) is used to access a member. This operator is coded between the structure variable name and the structure member that we want to access.

    package main
    import "fmt"
    type Employee struct {
        ID         int
        Name       string
        Salary     int
        Department string
    } // Employee structure definition
    func main() {
        /* Declaration of emp1 of type Employee */
        var emp1 Employee
        /* emp1 specification */
        emp1.ID = 822141
        emp1.Name = "Palak"
        emp1.Salary = 50000
        emp1.Department = "Oracle"
        fmt.Println(emp1.ID, emp1.Name, emp1.Salary, emp1.Department)
    }
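Arrays and structures are often used together. As a rough sketch, the example below stores Employee values in a fixed-size array and walks over it with range. The function name totalSalary and the sample employee data are made up for illustration; only the Employee structure itself follows the article.

```go
package main

import "fmt"

// Employee mirrors the structure used in the article.
type Employee struct {
	ID         int
	Name       string
	Salary     int
	Department string
}

// totalSalary (a hypothetical helper) adds up the Salary member of
// every element in a fixed-size array of three employees.
func totalSalary(staff [3]Employee) int {
	total := 0
	for _, e := range staff { // range yields the index and a copy of each element
		total += e.Salary // the member access operator (.) reads one field
	}
	return total
}

func main() {
	// Illustrative data only.
	staff := [3]Employee{
		{ID: 1, Name: "Asha", Salary: 50000, Department: "Oracle"},
		{ID: 2, Name: "Ravi", Salary: 42000, Department: "Java"},
		{ID: 3, Name: "Meera", Salary: 61000, Department: "Go"},
	}
	fmt.Println(staff[0].Name, len(staff), totalSalary(staff)) // prints Asha 3 153000
}
```

Because the size is part of an array's type, totalSalary accepts only arrays of exactly three employees; a slice ([]Employee) would be the usual choice when the length is not known in advance.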

Interfaces

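Interfaces are simply a set of method declarations. To declare an interface, use the following syntax:

    type interface_name interface {
        method_name1() [return_type]
        method_name2() [return_type]
        ...
    }

Struct types implement an interface by defining the methods that are declared in it:

    package main
    import (
        "fmt"
        "math"
    )
    type Shape interface {
        area() float64
    }
    /* define a circle */
    type Circle struct {
        x, y, radius float64
    }
    /* define a rectangle */
    type Rectangle struct {
        width, height float64
    }
    /* define a method for circle for implementation of Shape.area() */
    func (cir Circle) area() float64 {
        return math.Pi * cir.radius * cir.radius
    }
    /* define a method for rectangle for implementation of Shape.area() */
    func (rec Rectangle) area() float64 {
        return rec.width * rec.height
    }
    func main() {
        /* both types can be used wherever a Shape is expected */
        shapes := []Shape{Circle{0, 0, 5}, Rectangle{10, 5}}
        for _, s := range shapes {
            fmt.Println(s.area())
        }
    }

Functions

A function is a collection of statements that perform a specific task. Every Go program must have at least one function, which is main(). A function is defined as follows:

    func fun_name( [param list] ) [return_types] {
        // body of the function
    }

func: The func keyword is required to declare it as a function.
Function name: The name of the function.
Parameters: Used to store the values that are passed to the function. A function may or may not have parameters.
Return type: The list of data types of the values the function returns.
Function body: The function body contains a collection of statements that define the behaviour of the function.

Here is an example. This function finds the maximum of two numbers. The max function takes two parameters and, after processing, returns the larger number as the result (Figure 4).

Figure 4: Max function (input: Max(10,20); output: 20)

    package main
    import "fmt"
    func main() {
        /* local variable definition */
        var num1 int = 10
        var num2 int = 20
        var result int

        /* calling a function to get max value */
        result = max(num1, num2)

        fmt.Printf("Maximum value is : %d\n", result)
    }
    /* function returning the max between two numbers */
    func max(num1, num2 int) int {
        /* local variable declaration */
        var result int

        if num1 > num2 {
            result = num1
        } else {
            result = num2
        }
        return result
    }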
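The function syntax above lists return_types in the plural because a Go function can return more than one value. A minimal sketch of this, using a hypothetical function named minMax that is not part of the article, could look like this:

```go
package main

import "fmt"

// minMax (a hypothetical name) returns two values: the smaller and
// the larger of a and b, in that order.
func minMax(a, b int) (int, int) {
	if a > b {
		return b, a
	}
	return a, b
}

func main() {
	// Both results are received in a single assignment.
	lo, hi := minMax(20, 10)
	fmt.Println(lo, hi) // prints 10 20
}
```

A caller that needs only one of the results can discard the other with the blank identifier, for example `_, hi := minMax(20, 10)`.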

Pointers

A pointer stores the address of a memory location. Before using pointers in a program, we have to declare them. The syntax for pointer declaration is as follows:

    var pointer_name *pointer_data_type

Here, pointer_data_type is the pointer’s base type. It can be any valid Go data type. The asterisk is used to declare a variable as a pointer. Valid pointer declarations are:

    var p *int     /* pointer to an integer */
    var f *float32 /* pointer to a float */

The value held by every pointer is of the same kind, as a pointer only stores a memory address. The difference between pointers of different data types is the data type of the variable the pointer points to.

The ‘*’and ‘&’ operators

‘*’ is used to dereference a pointer variable. By dereferencing a pointer, we can access the value the pointer points to. So with ‘*x = 0’, where x is a pointer of type *int, we are storing 0 in the memory location x refers to. If we instead write ‘x = 0’, it will generate a compiler error because x is not a variable of type int, but a pointer of type *int. ‘&’ is used to find the address of a variable. If x is a variable of type int, ‘&x’ returns a pointer to an int. To let other code modify the original variable, we should pass ‘&x’. Here is an example that describes the use of the ‘*’ and ‘&’ operators:

    package main
    import "fmt"
    func main() {
        var test int = 10 /* actual variable declaration */
        var address *int  /* pointer variable declaration */

        address = &test /* store address of test in pointer variable */
        fmt.Printf("Address of test variable: %x\n", &test)

        /* address stored in pointer variable */
        fmt.Printf("Address stored in address variable: %x\n", address)

        /* access the value using the pointer */
        fmt.Printf("Value of *address variable: %d\n", *address)
    }

    //OUTPUT (the addresses will differ from run to run)
    Address of test variable: 10428004
    Address stored in address variable: 10428004
    Value of *address variable: 10

The Go command line argument

Command line arguments are a way to pass parameters to a program when it is executed. os.Args provides access to the command line arguments. Note that the first value is the path to the program, followed by all the parameters passed to the program. os.Args[1:] holds only the parameters that are passed to the program. We can get individual parameter values by using normal indexing.

    package main
    import "os"
    import "fmt"
    func main() {
        paramsWithProg := os.Args
        paramsWithoutProg := os.Args[1:]
        param := os.Args[2]
        fmt.Println(paramsWithProg)
        fmt.Println(paramsWithoutProg)
        fmt.Println(param)
    }

    //OUTPUT
    $ go build test-cmd-args.go
    $ ./test-cmd-args w x y z
    [./test-cmd-args w x y z]
    [w x y z]
    x

Go commands

The syntax is go command [arguments].

    Command    Usage
    build      Compiles packages and dependencies
    clean      Removes object files
    doc        Shows documentation for packages or symbols
    env        Prints Go environment information
    fix        Runs Go tool fixes on packages
    fmt        Runs gofmt on package sources
    generate   Generates Go files by processing the source
    get        Downloads and installs packages and dependencies
    install    Compiles and installs packages and dependencies
    list       Lists packages
    run        Compiles and runs Go programs
    test       Tests packages
    tool       Runs the specified Go tool
    version    Prints the Go version
    vet        Runs the Go tool ‘vet’ on packages

You can use ‘go help [command]’ for more information on any command.
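One practical note on the command line argument example earlier in this article: it reads os.Args[2] directly, which makes the program panic with an index out of range error when it is started with fewer than two parameters. A hedged sketch of a safer lookup is shown below; the helper name argAt and the fallback text are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// argAt (a hypothetical helper) returns the element of args at
// position i, or the fallback string when that position does not exist.
func argAt(args []string, i int, fallback string) string {
	if i < 0 || i >= len(args) {
		return fallback
	}
	return args[i]
}

func main() {
	// os.Args[0] is the program path, so index 2 is the second parameter.
	fmt.Println(argAt(os.Args, 2, "(no second parameter)"))
}
```

Checking len(os.Args) before indexing is the essential step; wrapping it in a helper simply keeps main short.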


By: Palak Shah The author is a senior software engineer. She loves to explore new technologies and learn innovative concepts, and is also fond of philosophy. She can be reached at palak311@gmail.com.


A Few Fundamentals of Ruby Programming

Ruby is an open source, general-purpose programming language that has been influenced by Perl, Smalltalk, Eiffel, Ada and Lisp. It has an elegant syntax, and is focused on simplicity and productivity. Ruby is easy to learn and works on the principle of ‘writing more with less’.

Programming languages and spoken languages are quite similar; both fall into one or more categories. A few programming language categories you might have heard of include imperative, object-oriented, functional or logic-based. Ruby is a powerful and dynamic open source, object-oriented language created by a developer known as Matz, and runs on almost all platforms such as Linux, Windows, MacOS, etc. Every programmer’s first computer program is usually the ‘Hello World’ program, as shown below.

C++

    #include <iostream>
    using namespace std;
    int main() {
        cout << "Hello World" << endl;
        return 0;
    }

Ruby

    puts "Hello World"

Are you, now, more interested in Ruby? Ruby is one of the easiest languages to learn as it focuses on the productivity of program development. It is a server-side scripting language and has features that are similar to those of Smalltalk, Perl and Python.

Ruby and Ruby on Rails

Ruby is a programming language while Ruby on Rails, or simply Rails, is a software library that extends the Ruby programming language. Rails is a software framework dependent on the Ruby programming language for creating Web applications. Web applications can also be written in Ruby, but writing a fully functional Web application from scratch in Ruby is a daunting task. Rails is a collection of pre-written code that makes writing Web applications easier and helps make them simpler to maintain. Still confused? Think of how a pizza is made. You simply spread the tomato sauce on the pizza base, top it with the veggies and spread the grated cheese. But where did the pizza base come from? It’s easier to get it from the grocery store instead of baking your own using flour and water. In this case, the Ruby programming language is the flour and water. By learning Ruby, you are a step closer to Rails and can create Web applications like Twitter one day.

Application domains

Text processing: Ruby can be embedded into HTML, and has a clean and easy syntax that allows a new developer to learn it very quickly and easily.
CGI programming: Ruby can be used to write Common Gateway Interface (CGI) scripts. It can easily be connected to databases like DB2, MySQL, Oracle and Sybase.
Network programming: Network programming can be fun with Ruby’s well-designed socket classes.
GUI programming: Ruby supports many GUI tools such as Tcl/Tk, GTK and OpenGL.
XML programming: Ruby has a rich set of built-in functions, which can be used directly in XML programming.
Prototyping: Ruby is often used to make prototypes that sometimes become production systems by replacing the bottlenecks with extensions written in C.

Programming fundamentals

1. Instructions and interpreters

Ruby is an ‘interpreted’ programming language, which means it can’t run on your processor directly, but has to be fed into a middleman called the ‘virtual machine’ or VM. The VM takes in Ruby code on one side, and speaks natively to the operating system and the processor on the other. The benefit of this approach is that you can write Ruby code once and, typically, execute it on many different operating systems and hardware platforms. A Ruby program can’t run on its own; you need to load the VM. There are two ways of executing Ruby with the VM: through IRB and through the command line.

Running Ruby from the command line: This is the durable way of writing Ruby code, because you save your instructions into a file. This file can then be backed up, transferred, added to source control, etc. We might create a file named my_program.rb like this:

    class Sample
      def hello
        puts "Hello World"
      end
    end
    s=Sample.new
    s.hello

Then we could run the program in the terminal like this:

    $ ruby my_program.rb
    Hello World

When you run ruby my_program.rb you’re actually loading the Ruby virtual machine, which in turn loads your my_program.rb.

Running Ruby from IRB: IRB stands for ‘interactive Ruby’ and is a tool you can use to interactively execute Ruby expressions read from the standard input. The irb command from your shell will start the interpreter.

2. Variables

Programming is all about creating abstractions, and in order to create an abstraction we must be able to assign names to things. Variables are a way of creating a name for a piece of data. In some languages you need to specify what type of data (like a number, word, etc) can go in a certain variable. Ruby, however, has a flexible type system where any variable can hold any kind of data.

Creating and assigning a variable: In some languages you need to ‘declare’ a variable before you assign a value to it. Ruby variables are automatically created when you assign a value to them. Let’s try an example:

    C:\Users\Administrator>irb
    irb(main):001:0> a = 5
    => 5
    irb(main):002:0> a
    => 5

The line a = 5 creates the variable named ‘a’ and stores the value ‘5’ into it.

Right side first: In English, we read left to right, so it’s natural to read code left to right. But when evaluating an assignment using the single equals (=), Ruby actually evaluates the right side first. Take the following example:

    C:\Users\Administrator>irb
    irb(main):001:0> b = 10 + 5
    => 15

The 10 + 5 is evaluated first, and the result is given the name ‘b’.

Flexible typing: Ruby’s variables can hold any kind of data, and can even change the type of data they hold. For instance:

    C:\Users\Administrator>irb
    irb(main):001:0> c = 20
    => 20
    irb(main):002:0> c = "hello"
    => "hello"

The first assignment gave the name ‘c’ to the number 20. The second assignment changed ‘c’ to the value ‘hello’.

3. Strings

In the real world, strings tie things up. Programming strings have nothing to do with real-world strings. Programming strings are used to store collections of letters and numbers. That could be a single letter like ‘a’, a word like ‘hi’, or a sentence like ‘Hello my friends’.

Writing a string: A Ruby string is defined as a quote sign (") followed by zero or more letters, numbers or symbols, which are followed by a closing quote sign ("). The shortest possible string is called the empty string: "". It’s not uncommon for a single string to contain paragraphs or even pages of text.

Common string methods: Let’s experiment with strings and some common methods in IRB. The .length method tells you how many characters (including spaces) are in the string:

    C:\Users\Administrator>irb
    irb(main):001:0> greeting = "hello everyone!"
    => "hello everyone!"
    irb(main):002:0> greeting.length
    => 15

Often, you’ll have a string that you want to break into parts. For instance, imagine you have a sentence stored in a string and want to break it into words:

    C:\Users\Administrator>irb
    irb(main):001:0> sentence = "this is my sample sentence"
    => "this is my sample sentence"
    irb(main):002:0> sentence.split
    => ["this", "is", "my", "sample", "sentence"]

The .split method gives you back an array, which we’ll learn about later in this article. It cuts the string wherever it encounters a space (" ") character.

Combining strings and variables: Very often, we want to combine the value of a variable with a string. For instance, let’s start with this example string: "Good morning, Frank!" When we put that into IRB, it just spits back the same string. If we were writing a proper program, we’d want it to greet users with their name rather than ‘Frank’. What we need to do is combine a variable with the string. There are two ways of doing this.

String concatenation: The simplistic approach is called ‘string concatenation’, which is joining strings together with the plus sign:

    C:\Users\Administrator>irb
    irb(main):001:0> name = "Frank"
    irb(main):002:0> puts "Good morning, " + name + "!"

In the first line, we set up a variable to hold the name. In the second line, we print the string "Good morning, " combined with the value of the variable name and the string "!".

String interpolation: The second approach is to use string interpolation, where we stick data into the middle of a string. String interpolation only works on a double-quoted string. Within the string we use the interpolation marker #{}. Inside those brackets, we can put any variable or Ruby code, which will be evaluated, converted to a string, and output in that spot of the outer string. Our previous example could be rewritten like this:

    C:\Users\Administrator>irb
    irb(main):001:0> name = "Frank"
    irb(main):002:0> puts "Good morning, #{name}!"

If you compare the outputs, you’ll see that they give the same results. The interpolation style tends to have fewer characters to type, and fewer open/close quotes and plus signs to forget.

4. Numbers

There are two basic kinds of numbers: integers (whole numbers) and floats (have a decimal point). Integers are much easier for both you and the computer to work with. You can use normal math operations with integers including +, -, /, and *. Integers have a bunch of methods to help you do math-related things, which you can see by calling 5.methods. Repeating instructions: A common pattern in other languages is the ‘for’ loop, used to repeat an instruction a set number of times. For example, in JavaScript you might write: for(var i = 0; i < 5; i++) { console.log(“Hello, World”); }

‘For’ loops are common, but they’re not very readable. Because Ruby’s integers are objects, they have methods. One of these is the .times method, which repeats an instruction a set number of times. You can rewrite the above loop in Ruby, as follows:

5.times do
  puts "Hello, World!"
end

In this example, we’re using both the .times method and what’s called a block.
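To make the two kinds of numbers and the .times method a little more concrete, here is a minimal sketch. The greetings helper is a hypothetical name invented for this illustration; it also relies on the fact that .times passes a counter, starting at 0, to its block.

```ruby
# Hypothetical helper: builds one greeting per repetition with Integer#times.
# The block counter i starts at 0, so we add 1 for human-friendly numbering.
def greetings(count)
  lines = []
  count.times do |i|
    lines << "Hello, World! (#{i + 1} of #{count})"
  end
  lines
end

puts 7 / 2      # integer division drops the remainder: 3
puts 7 % 2      # the remainder itself: 1
puts 7 / 2.0    # one float operand makes the result a float: 3.5
puts greetings(3)
```

Note that dividing two integers gives an integer back; if you need the fractional part, make at least one of the operands a float.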

5. Blocks

Blocks are a powerful concept used frequently in Ruby. Think of them as a way of bundling up a set of instructions for use elsewhere.



Starting and ending blocks: You just saw a block used with the .times method on an integer. The block starts with the keyword ‘do’ and ends with the keyword ‘end’. The do/end style is always acceptable.

Bracket blocks: When a block contains just a single instruction, though, we often use the alternate markers ‘{’ and ‘}’ to begin and end the block:

5.times { puts "Hello, World!" }
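The idea of ‘bundling up a set of instructions for use elsewhere’ can be sketched with a method of our own that accepts a block. The repeat_greeting helper below is hypothetical, written only for this illustration; it uses Ruby’s yield keyword to run whatever block the caller supplies.

```ruby
# Hypothetical helper: runs the caller's block `count` times, passing in
# the counter, and collects whatever the block returns each time.
def repeat_greeting(count)
  results = []
  count.times { |i| results << yield(i) }
  results
end

# do/end style, usual when the block spans several lines
long_form = repeat_greeting(2) do |i|
  number = i + 1
  "Hello, World! (#{number})"
end

# brace style, usual for a single instruction
short_form = repeat_greeting(2) { |i| "Hello, World! (#{i + 1})" }

puts long_form
puts short_form == long_form   # both styles build the same list
```

The method decides when and how often the instructions run; the block decides what they are.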

6. Arrays

Usually, when we’re writing a program it’s because we need to deal with a collection of data. There are lots of cool things to do with an array. Here are a few examples.

.sort: The sort method will return a new array where the elements are sorted. If the elements are strings, they’ll come back in alphabetical order. If they’re numbers, they’ll come back in ascending value order. Try these:

C:\Users\Administrator>irb
irb(main):001:0> one = ["this", "is", "an", "array"]
irb(main):002:0> one.sort
irb(main):003:0> one

You can rearrange the order of the elements using the sort method. You can iterate through each element using the each method. You can mash them together into one string using the join method. You can find the position of a specific element by using the index method. You can ask an array whether an element is present with the include? method. We use arrays whenever we need a list in which the elements are in a specific order.
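The array methods listed above can be tried together in a short sketch. The numbered helper is a hypothetical name used only here to show each; everything else is a standard Array method.

```ruby
# Hypothetical helper: visits every element in order with each
# and builds a numbered line for it.
def numbered(list)
  lines = []
  list.each { |item| lines << "#{lines.length + 1}. #{item}" }
  lines
end

one = ["this", "is", "an", "array"]

puts one.sort.inspect        # a sorted copy: ["an", "array", "is", "this"]
puts one.inspect             # the original order is untouched
puts one.join(" ")           # this is an array
puts one.index("an")         # 2 (positions are counted from 0)
puts one.include?("array")   # true
puts numbered(one)
```

Because sort returns a new array, the last two lines of the IRB session above show different results: the sorted copy first, then the original.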

7. Hashes

A hash is a collection of data in which each element is addressed by a name rather than by its position. The data is organised into ‘key/value pairs’. Hashes have a more complicated syntax that takes some getting used to:

C:\Users\Administrator>irb
irb(main):001:0> produce = {"apples" => 3, "kiwi" => 1}
irb(main):002:0> puts "There are #{produce['apples']} apples in the fridge."

Simplified hash syntax: We commonly use symbols as the keys of a hash. When all the keys are symbols, there is a shorthand syntax that can be used:

C:\Users\Administrator>irb
irb(main):001:0> produce = {apples: 3, kiwi: 1}
irb(main):002:0> puts "There are #{produce[:apples]} apples in the fridge."

Notice that the keys end with a colon rather than beginning with one, even though these are symbols.
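Here is a small sketch that puts the two hash syntaxes side by side. The fridge_report helper and the variable names are hypothetical, chosen only for this illustration; it also shows that a string key and a symbol key are not interchangeable.

```ruby
# String keys use the => syntax.
by_string = { "apples" => 3, "kiwi" => 1 }

# Symbol keys can use the shorthand; this is the same hash as
# { :apples => 3, :kiwi => 1 }.
by_symbol = { apples: 3, kiwi: 1 }

# Hypothetical helper: reports on one item of a symbol-keyed hash.
def fridge_report(produce, item)
  "There are #{produce[item]} #{item} in the fridge."
end

puts fridge_report(by_symbol, :apples)   # There are 3 apples in the fridge.
puts by_string["apples"]                 # 3
puts by_string[:apples].inspect          # nil: a symbol does not match a string key

by_symbol[:kiwi] += 1                    # values are updated through their key
puts by_symbol[:kiwi]                    # 2
```

Looking up a key that is not in the hash returns nil rather than raising an error, which is why mixing up "apples" and :apples tends to fail quietly.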

8. Conditionals

Conditional statements evaluate to true or false. The most common conditional operators are == (equal), > (greater than), >= (greater than or equal to), < (less than), and <= (less than or equal to).

Conditional branching/instructions: Why do we have conditional statements? Most often, it’s to control conditional instructions, especially if/elsif/else structures (note Ruby’s spelling: elsif, not elseif). Let’s write an example by adding a method like this in IRB:

def water_status(minutes)
  if minutes < 7
    puts "The water is not boiling yet"
  elsif minutes == 7
    puts "It's just barely boiling"
  elsif minutes == 8
    puts "It's boiling!"
  else
    puts "Hot! Hot! Hot!"
  end
end

Try running the method with water_status(5), water_status(7), water_status(8) and water_status(9).

Understanding the execution flow: When the minutes are 5, here is how the execution goes: “Is it true that 5 is less than 7? Yes, it is; so print the line The water is not boiling yet.” When the minutes are 7, it goes like this: “Is it true that 7 is less than 7? No. Next, is it true that 7 is equal to 7? Yes, it is; so print the line It’s just barely boiling.” When the minutes are 8, it goes like this: “Is it true that 8 is less than 7? No. Next, is it true that 8 is equal to 7? No. Next, is it true that 8 is equal to 8? Yes, it is; so print the line It’s boiling!” Lastly, when the minutes are 9, the execution goes like this: “Is it true that 9 is less than 7? No. Next, is it true that 9 is equal to 7? No. Next, is it true that 9 is equal to 8? No. Since none of these are true, execute the else and print the line Hot! Hot! Hot!”
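The execution flow described above can be checked with a slight variant of the method. This is only a sketch: water_message is a hypothetical name, and it returns the message instead of printing it so that every branch can be inspected in a loop.

```ruby
# Same branches as water_status, but the message is returned rather than
# printed. The value of the branch that runs becomes the method's result.
def water_message(minutes)
  if minutes < 7
    "The water is not boiling yet"
  elsif minutes == 7
    "It's just barely boiling"
  elsif minutes == 8
    "It's boiling!"
  else
    "Hot! Hot! Hot!"
  end
end

# Walk through the four cases discussed in the text.
[5, 7, 8, 9].each do |minutes|
  puts "#{minutes} minutes: #{water_message(minutes)}"
end
```

Only the first condition that is true has its branch executed; the remaining elsif and else branches are skipped.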

By Miren Karamta The author is a project scientist and IT systems manager at the Bhaskaracharya Institute for Space Applications and Geo-informatics (BISAG), Gandhinagar, Gujarat. You can reach him via email at mirenkaramta@yahoo.com. LinkedIn: https://in.linkedin.com/in/miren-karamta-2b929122



Overview Developers

Choosing the Right Open Source Programming Language

Selecting a programming language is often a daunting task as the choices are vast. In this article, novices attempting to learn coding get a bird’s eye view of the top programming languages, allowing them to examine the simplicity or complexity of coding and then weigh the pros and cons of each language.

The world of technology is expanding immensely with each passing year. The competition in the market is getting keener as every company tries to hold on to the highest position. To face up to the competition, every IT organisation requires programmers, Web developers and app developers who are well versed with programming languages. Various open source and commercial programming languages are available today, each with distinct features and functionalities. The backbone of software development is the source code, which is made up of thousands of lines of instructions that programmers write for computers to interpret. The source code instructs the application what to do and how to do it. The source code is the blueprint of the program. With so many languages available, the one that programmers decide to code in is important. Some of the programming languages are open source and some are commercial. The trend right now is for every IT enterprise to shift towards open source. As per the latest survey, more than 80 per cent of the enterprises are using open source technology to build all sorts of applications. Some of the top open source programming languages are given below, along with a short description to get the reader acquainted with them.

1. Google’s Go

Google’s Go programming language, often referred to as golang, was created by Robert Griesemer, Rob Pike and Ken Thompson. The main objective of the Go language, along with its accompanying tools, is to be expressive and efficient in both compilation and execution, and effective for writing reliable and robust programs. Go is a statically typed language with a syntax similar to that of C. It provides garbage collection, type safety, some dynamic typing capabilities and many advanced built-in types like variable length arrays and key-value maps. Go is expressive, concise, clean and efficient. Its concurrency mechanisms make it easy to write programs that get the most out of multi-core and networked machines, while its novel type system enables flexible and modular program construction. Go compiles quickly to machine code, yet has the convenience of garbage collection and the power of runtime reflection. It’s a fast, statically typed, compiled language that feels like a dynamically typed, interpreted language. The latest version of Go is 1.6.3. Here’s the syntax for the ‘Hello World’ program in Go:

package main

import "fmt"

func main() {
    fmt.Println("Hello World")
}

The advantages of Go
• Go is fast in both compilation and execution compared to many other languages. It’s a great modern language with high performance.
• Go is regarded as a highly powerful language with built-in concurrency and a high degree of abstraction. It is equipped with goroutines to start concurrent work, and channels to permit both communication and synchronisation.
• Go, being a high level language, has standard documentation features and a powerful, rich standard library that contains a full-fledged, working Web server.
• Go has an active community that offers support for any sort of problem.

The disadvantages
• Go, being a new programming language, does not have enough libraries, so developers have to work hard to develop libraries on their own.
• Go does not have enough resources in terms of books, research articles and other online resources for users to learn in a systematic manner.
• Go is a tough language to learn and error handling is a tedious task.

2. Swift

Swift is a new general-purpose, multi-paradigm, compiled programming language for iOS, OS X, watchOS, tvOS and Linux. It is being built by Apple Inc. on the best of C and Objective-C, without any of the constraints of C compatibility. The development of Swift was started by Chris Lattner with other programmers at Apple, taking into consideration ideas from many programming languages like Objective-C, Rust, Haskell, Ruby, Python, C# and CLU. Swift was basically developed to work with Apple’s Cocoa and Cocoa Touch frameworks. It is built with the LLVM compiler framework included in Xcode 6, and uses the Objective-C runtime library. The latest version of Swift is 3.0 Preview 2. The syntax for the ‘Hello World’ program in Swift is:

import Cocoa

/* My first program in Swift */
var myString = "Hello, World!"
print(myString)

The advantages of Swift
• Swift is comparatively easy to read and write because of less overhead and fewer syntax requirements, and it is the friendliest language for beginners.
• Swift programs tend to be shorter overall, and functions can be passed around as variables. Swift enables developers to write highly generic code, which can perform lots of different things and reduces repetition.
• Swift compiles directly to native code and utilises both the Objective-C 2.0 runtime as well as Apple’s ARC memory management technology.
• Swift is less error-prone as its syntax and language construction exclude several types of mistakes, thereby making it relatively free from crashes and unexpected behaviour.

Swift’s disadvantages
• Most of the examples which come inbuilt with Swift are written in Objective-C, so to begin Swift programming, one has to learn Objective-C.
• Swift is still undergoing a major paradigm shift as newer versions are coming out with lots of changes, which are sometimes hard for professionals and newbies to understand.
• Developers use Swift primarily for iOS and OS X app development, so it has limited programming platform opportunities.
• Swift can be slower than Objective-C: for all its modern syntax, simplified code construction, and playground app simulation and testing, Swift code can take longer to execute.

3. Hack

Hack is an open source general-purpose programming and scripting language for the HipHop Virtual Machine (HHVM), created by Facebook as a dialect of PHP. Hack is specially designed for Web development, and can be easily integrated with HTML. Hack supports fast development for PHP by enabling programmers to use both static and dynamic typing, a combination known as gradual typing. Hack provides various important features, some of which are listed below.

Type annotations: This feature allows code to be explicitly typed on parameters, class member variables and return values.
Generics: This allows classes and methods to be parametrised in a manner similar to C# or Java.
Nullable types: This feature enables Hack to deal with nulls in a safer way by making use of the ‘?’ operator.
Collections: This provides first class, built-in parametrised types such as Vector, Map, Set and Pair.
Lambdas: This allows the definition of first class functions.

Here’s the syntax for the ‘Hello World’ program in Hack. After installing HHVM, start it by typing the following:

hhvm -m server -p 8080

<?hh
echo "Hello World!";

The advantages of Hack
• Hack code can be mixed with PHP code. Instead of using <?php as in PHP, use <?hh in Hack, which makes for a smooth migration between Hack and PHP.
• Hack runs with HHVM, which has a type-checking phase that verifies the consistency of the code.
• Hack is used to build complex websites at great speed by ensuring proper organisation and error-free code, and gives programmers a unique safety advantage in writing bug-free code.

Hack’s disadvantages
• It abandons some of the features that make PHP a simple language.
• HTML code cannot be directly embedded into Hack, and code can’t be written outside of a function or class.
• Being a new language, Hack has lots of bugs and errors which, right now, make it an unstable language.

4. Rust

Rust is regarded as a general-purpose, multi-paradigm systems programming language that is designed to meet three main objectives: safety, speed and concurrency. Rust was designed by Graydon Hoare at Mozilla Research, and uses LLVM as its backend. The design of the language has been refined through the developer team’s experience of writing the Servo Web browser layout engine and the Rust compiler. Rust has an edge over other programming languages in its compile-time safety checks, which generate no runtime overhead and eliminate data races. Rust has the inbuilt functionality most needed for concurrent execution on multi-core machines, making concurrent programming memory safe without garbage collection, which sets it apart from most other languages. The latest stable release of Rust is 1.10. The syntax for the ‘Hello World’ program in Rust is:

fn main() {
    println!("Hello, world!");
}

The advantages of Rust
• Rust is suitable for developers and projects where safety and stable execution, in addition to low-level optimisation and performance, are required. Rust adds lots of high-level functional programming techniques, making it feel like a low level and a high level language at the same time.
• Rust has a constantly expanding standard library, focusing on file system access, networking, time and collections APIs.
• Rust supports multi-platform development, including Windows, Android and ARM devices, apart from other platforms.

Rust’s disadvantages
• Rust is a bit harder and more complex to learn and code. Error handling is a complex task, especially for newbies.
• Rust is still an immature language in terms of documentation, which remains limited.
• Rust is currently not widely adopted in the industry.

5. Scala

Scala (Scalable Language) is regarded as a general-purpose programming language designed to express programming patterns in a concise, elegant and type-safe way. Scala was developed by Martin Odersky, starting in 2001, using the Funnel programming language as the base. Scala integrates the features of object-oriented and functional languages, making programmers more productive and effective in writing source code. Scala has many of the features of functional programming languages like Scheme, Standard ML and Haskell, including currying, type inference, immutability, lazy evaluation and pattern matching. Scala is more advanced compared to Java because it adds other features like operator overloading, optional parameters, named parameters and raw strings, and has no checked exceptions. Scala and Java are related to each other by the fact that both are compiled to bytecode and use the JVM. Scala is completely interoperable with Java. The latest stable release of Scala is 2.11.8.

The syntax for the ‘Hello World’ program in Scala is:

object HelloWorld extends App {
  println("Hello, World!")
}

The advantages of Scala
• Scala enables developers to write simple and straightforward code, as it can require up to two-thirds less code than Java and that code is more flexible. This, in turn, makes the code human-readable and easier to understand.
• Scala enables quick implementation and enhanced performance, as it reduces various thread-safety concerns and treats functions as first-class citizens.
• Scala solves various concurrency issues because it is well equipped with the Actor library.
• Scala has one of the fastest growing ecosystems in the current scenario, as the IDE tools, testing tools, documentation and even the libraries are improving, while its capabilities are getting enhanced.

Scala’s disadvantages
• Scala is right now very hard to learn as the language is completely different from pure Java and the programming environment is also different, so it takes more experienced developers to understand it, build up the logic and develop bug-free code.
• Even today, there are very few projects written purely in Scala. Most projects are a hybrid of Java and Scala, and mixing the two is somewhat time consuming and slow. Testing also becomes a pain.
• Because of limited backward compatibility, code written for one version of Scala may not work with others, which creates problems when the development time period is short. So, Scala is not recommended for projects with time constraints.

6. Dart

Dart is a class-based, single inheritance, object-oriented, general-purpose programming language originally developed by Google and later approved as a standard by ECMA (ECMA-408). Being open source, Dart enables developers to build more complex, high performance, scalable and rich apps for the modern Web. Dart comes bundled with a Web Component Library, which contains Web code comprising HTML and JavaScript, which can be used in different pages or projects. Dart has tools such as:
• Dartboard: Enables the developers to write and run Dart code in the Web browser.
• Dart Editor: Enables the developers to create, modify and run Dart apps.
• SDK: Contains command-line tools such as a Dart-to-JavaScript compiler, a Dart virtual machine, etc.
• Dartium: Contains a built-in Dart VM.

The latest stable release of Dart is 1.18. The syntax for the ‘Hello World’ program in Dart is:

void main() {
  print('Hello, World!');
}

The advantages of Dart
• Dart is regarded as more than just a programming language, providing a good platform for Web developers and a new feature set that includes optional typing and isolates.
• Dart core libraries provide functionality including collections, dates, math, HTML bindings, server-side I/O like sockets and even JSON.
• Dart’s VM has been built from scratch. It can run on the command line for server-side applications, and can be embedded into browsers for client-side applications.
• Dart has other useful features like mixins, implicit interfaces, lexical closures, lexical this, named constructors, string interpolation, one-line functions and noSuchMethod.

Dart’s disadvantages
• Dart’s SDK doesn’t provide access to SQL based databases at the server level. Third party packages come to the rescue.
• Dart is currently not very successful in generating consumable JavaScript. For now, an entire app built in Dart has to be compiled to JavaScript in one go.
• Dart is not entirely interoperable with JavaScript right now as the Dart JavaScript library is not stable. It is currently in the development stage.

7. Clojure

Clojure is regarded as a dynamic, general-purpose programming language that combines the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multi-threaded programming. Clojure is a dialect of Lisp, created by Rich Hickey. It operates on the Java VM and is integrated with Java. It fully supports calling Java code from Clojure and vice-versa. Clojure supports functions as first class objects, a read-eval-print loop (REPL) and a macro system. It encourages the use of first-class and higher-order functions with values, and comes with its own set of efficient immutable data structures. Clojure offers innovative solutions to the challenges inherent in concurrency and parallelisation. The latest stable release of Clojure is 1.8. The syntax for the ‘Hello World’ program in Clojure is:

On repl
Start repl. When you use Leiningen, type lein repl on Terminal (Command Prompt):

bash-3.2$ lein repl
nREPL server started on port 59553 on host 127.0.0.1
REPL-y 0.2.1
Clojure 1.5.1
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e

user=>

Type below on repl:

user=> (prn "Hello World")
"Hello World"
nil
user=> (println "Hello World")
Hello World
nil
user=> (pr-str "Hello World")
"\"Hello World\""

The advantages of Clojure
• Clojure code can use any Java library. Clojure libraries can, in turn, be used from Java, and Clojure applications can be packaged like Java applications and deployed anywhere.
• Clojure is part of the Lisp family and it retains the best features of Lisp. Clojure contains macros, which are regarded as an important approach to meta programming and syntactic extensions.
• Clojure, being a functional programming language, enables developers to use first-class and higher-order functions with values, and comes with its own set of efficient immutable data structures.
• Being a dynamic programming language, it supports updating and loading of new code at runtime, either locally or remotely.

Clojure’s disadvantages
• Debugging in Clojure with regard to error handling and removal is a tedious task.
• Clojure can only run on the Java virtual machine.
• Clojure is hard to master for newbies, and sometimes even developers take a lot of time when using it for core and professional program development.

8. Haskell

Haskell is a modern, standard, non-strict, purely functional programming language, and is built with features like polymorphic typing, lazy evaluation and higher order functions. It is based on the semantics, but not the syntax, of the language Miranda, which served to focus the efforts of the initial Haskell working group. Haskell has a strong, static type system based on Hindley–Milner type inference. Haskell’s principal innovation in this area is to add type classes, originally conceived as a principled way to add overloading to the language, but since found to have many more uses. Haskell has an open, published specification, and multiple implementations exist. Its main implementation, the Glasgow Haskell Compiler (GHC), is both an interpreter and native-code compiler that runs on most platforms. GHC is noted for its high-performance implementation of concurrency and parallelism, and for having a rich type system incorporating recent innovations such as generalised algebraic data types and type families. Another interesting feature of Haskell is that all functions are treated as values like integers and strings. The latest published language standard is Haskell 2010. The syntax for the ‘Hello World’ program is:

module Main where

main :: IO ()
main = putStrLn "Hello, World!"

The advantages of Haskell
• Being a purely functional programming language, Haskell has users describe what things are, rather than giving the computer a sequence of state-changing tasks to execute, as imperative languages do.
• Haskell is a lazy programming language and will not execute functions and calculate things until really forced to show the results. This goes well with referential transparency, and allows developers to think of programs as a series of transformations on data.
• Haskell is elegant and concise. Its programs are shorter compared to those in other programming languages, and debugging becomes a lot easier.

Haskell’s disadvantages
• Haskell is not suitable for making time-critical applications.
• It is very difficult to understand compared to other languages like C or C++, and has limited community support and documentation.
• There are very few platforms where Haskell code has been used.



9. Apache Groovy

According to its website, “Apache Groovy is a powerful, optionally typed and dynamic language, with static-typing and static compilation capabilities, for the Java platform aimed at improving developer productivity, thanks to a concise, familiar and easy-to-learn syntax. It integrates smoothly with any Java program, and immediately delivers to your application powerful features, including scripting capabilities, domain-specific language authoring, runtime and compile-time meta-programming and functional programming.”

Apache Groovy is a dynamic language and has features similar to other programming languages like Python, Ruby, Perl and Smalltalk. Groovy also supports modularity, type checking, static compilation, Project Coin syntax enhancements, multi-catch blocks and ongoing performance enhancements using JDK7’s invokedynamic instruction. Groovy provides native support for various mark-up languages such as XML and HTML, accomplished via an inline DOM syntax. This feature enables the definition and manipulation of many types of heterogeneous data assets with a uniform and concise syntax and programming methodology. The latest version of Apache Groovy is 2.4.7. The syntax for the ‘Hello World’ program is:

class Hello {
    String name;

    void sayHello() {
        System.out.println("Hello "+getName()+"!");
    }

    void setName(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    static void main(String[] args) {
        Hello hello = new Hello();
        hello.setName("world");
        hello.sayHello();
    }
}


The advantages of Apache Groovy
• Apache Groovy is easy to learn and code, especially for Java developers, as it contains closures, builders, runtime and compile time meta programming, functional programming, type inference and static compilation.
• Groovy has a rich and vibrant ecosystem in terms of Web development, reactive applications, concurrency and parallelism libraries, test frameworks, build tools, code analysis and GUI building.
• Groovy supports smooth Java integration, and also interoperates with Java and other third party libraries.

The disadvantages
• When used with lots of dynamic features like duck typing, dynamic code, meta-programming, etc, Groovy demands more processing at runtime and this results in slow performance.
• Groovy lacks documentation as well as online resources, so there’s not much help for newbies. Also, it has not been implemented in high-end and critical applications.

References
[1] https://golang.org/
[2] https://developer.apple.com/swift/
[3] http://hacklang.org/
[4] https://www.rust-lang.org/en-US/
[5] http://www.scala-lang.org/
[6] https://www.dartlang.org/
[7] https://clojure.org/
[8] https://www.haskell.org/
[9] http://www.groovy-lang.org/

By: Prof. Anand Nayyar The author is assistant professor in the department of computer applications and IT at KCL Institute of Management and Technology, Jalandhar, Punjab. He loves to work on open source technologies, embedded systems, cloud computing, wireless sensor networks and simulators. He can be reached at anand_nayyar@yahoo.co.in.



How To Developers

Create a Web Database in App Inventor 2

On our App Inventor journey, which we embarked upon many months ago, we now move on to creating a Web database using App Inventor 2. This tutorial takes you step by step into the practical details and the nitty-gritty of creating the Web database.

I hope all of you are learning a few things through these Android tutorials, and I am so glad to hear from some of you. Continuing the series, we have another interesting topic to cover: an online database. So far, we have learned how to store things locally on the mobile phone using, say, TinyDB. But have you thought of storing the data at some remote location and making it available to all the connected devices? For that, all users should have permission to access the database, that is, the database security should be public. In this tutorial, we will set up an online database and make it accessible across devices. Before we move ahead, let me first introduce you to the database. In our application, we are going to use the Firebase database, which is a cloud database provided by Google, and the good thing is that it is freely available for use.

Features of Firebase

1. Firebase uses a GUI based data centre, so that you can see all the entries and manage them as well.
2. Being on the cloud, it is accessible on the go to any of the devices.
3. Smart authentication provides secure access.
4. The real-time database reflects changes as soon as the data is updated.
5. You can add collaborators so that many people can manage the database.
6. No coding is needed for fetching and storing data.

Creating a project in the Firebase database

1. To use Firebase, you must have a working Google account, which is also needed to run App Inventor.
2. Type https://console.firebase.google.com/ in the address bar of your browser and hit Enter.
3. If asked for login credentials, enter your Google account credentials and proceed.
4. Follow the on-screen instructions, which include accepting the terms, etc.
5. Once everything is done, you will see the screen shown in Figure 1.
6. Click on the Create New Project button as shown in Figure 1.
7. Give the name of the project and the country/region as shown in Figure 2.
8. Once your project is created, you will see the dashboard of your application. It will be similar to what is shown in Figure 3.
9. Now select Database from the left hand palette and then Rules from the next page. It will display something like what is shown in Figure 4.
10. By default, the rules require users to be authenticated (Figure 4). For this demo, we want anyone who has the URL to be able to read and write, so we need to change the rules. You can manually change the rules by typing values, as shown in Figure 5. Once this is done, click on the Publish button.

You have now successfully created an empty space for your project in the Firebase database, and have given read and write permissions to all the users who have the URL. If you are new to this Android journey, I would request you to go through a few beginner articles to familiarise yourself with App Inventor. Let's proceed to the concept of our application, which we will be implementing in this article.

Figure 1: Firebase dashboard
Figure 2: Add the name of the project
Figure 3: Demo project in the Firebase dashboard
Figure 4: Reading and writing rules with authentication
Figure 5: Rules with read and write permission

Theme of the application

I guess you are familiar with the sign-up procedure of various websites or applications. Each user is asked to fill in a form with personal details, which are stored in the database to make him or her uniquely identifiable. We will also create a sign-up page asking users for a few details and, as soon as they hit the Sign Up button, we will allot a unique identification number to them and save the details in the Web database. As the admin, we will then change the identification number in the database and display it on the screen. In this way, we will cover both storing data in, and fetching data from, the Web database. You are already familiar with all the components that I will be using for this application, such as buttons, labels, horizontal arrangements, text boxes, etc.

GUI requirement

Every application has a graphical user interface or GUI, which lets the user interact with the on-screen components. How each component responds to user actions is defined in the block editor section. For our current requirement, we need multiple text boxes and a sign-up button, with which the application user will be able to enter data and initiate methods.

GUI requirements for Screen 1
1. Label: Labels are static text components, which are used to display headings or markings on the screen.
2. Button: Buttons let you trigger events and are essential components.
3. Horizontal arrangement: This is a special component that keeps all its child components aligned horizontally.
4. Notifier: This is used to display instructions or give control over your existing components. You will see its functionality in more detail as we implement it in our project.
5. Firebase DB: Firebase DB, as you already know, is the cloud based database utility from Google. We will use it to store the user's data in the cloud.

Listed in Table 1 are the components that we will need for this application. We will drag them on to the designer from the left hand side palette.

Table 1
Component name | Purpose | Location
Label | To display a label | Palette-->User Interface-->Label
Button | To trigger events | Palette-->User Interface-->Button
Horizontal arrangement | To arrange the child components | Palette-->Layout-->Horizontal Arrangement
Notifier | To display on-screen information | Palette-->User Interface-->Notifier
Firebase DB | To store data persistently | Palette-->Experimental-->Firebase DB

1. Drag and drop the components mentioned in the table to the viewer.
2. Visible components will be visible to you, while the non-visible components will be located beneath the viewer under the tag Non-visible.
3. Labels, along with their respective text boxes, need to be put within a Horizontal arrangement so as to keep them aligned horizontally.
4. Once you have dragged and placed everything, the layout will look something like what's shown in Figure 6.
5. Make the necessary property changes, such as changing the text property of the label and button components.
6. Rename the components, as this helps to identify them in the block editor.
7. Your graphical user interface is now ready. Figure 6 shows how the application will look after installation.
8. Figure 7 shows the hierarchy of the components that we have dragged to the designer.

Figure 6: Designer screen
Figure 7: Components view

If you are a bit confused after seeing the designer and the components viewer, let me explain things a bit. Here is the hierarchy for our application.
1. We have placed four text boxes to get the details from the user.
2. For easy reading, we have inserted a label before each text box.
3. We need to configure the Firebase component to connect to the demo project we created using the steps mentioned earlier. For that, just assign the URL of your project (Figure 8) to the Firebase URL property of the Firebase component in the designer (Figure 9).

Figure 8: URL of the project
Figure 9: Set the URL to the Firebase component

Now, let's head towards the blocks editor to define the behaviours. Let's first discuss the actual functionality that we expect from our application.
1. The user should be able to write text in the respective boxes.
2. Upon clicking the Sign Up button, all details should be saved to the database and a unique number should also be generated for the user.
3. Being admin, the user should be able to change the unique number in the database.
4. The same number should be displayed on the screen in the Notifier.

So let's move on and add these behaviours using the block editor. I hope you remember how to switch from the designer to the block editor. There is a button available right above the Properties pane to do this.

Block editor blocks

I have already prepared the blocks for you. All you need to do is drag the relevant blocks from the left side palette and drop them on the viewer. Arrange the blocks in the same way that you see them in Figure 10. I will explain each block, what it does and how it is called.
• On clicking the button, we set the Firebase bucket to the name of the user.
• The Firebase bucket is a kind of folder that will contain other subfolders or details within it.
• With a click of the button, we save each detail provided by the user under a unique tag name.
• The concept of tag name and value is already known to you since you have used TinyDB.
• The Notifier at the end displays the alert that the data has been successfully added.

Figure 10: Block editor image 1

Your database will look like what's shown in Figure 11 once you fill in the entries and click on the Sign Up button.
• Once the data has been stored, the block in Figure 12 keeps track of data changes.
• Specifically, it displays a notice if the value of the unique number has been changed.

Now you are done with the block editor too. Next, we will download and install the app on your phone to check how it is working.

Packaging and testing

To test the app, you need to get it on your phone. You first have to download the application to your computer and then move it to your phone via Bluetooth or a USB cable. I'll tell you how to download it.
1. On the top row, click on the Build button. It will give you the option to download the apk to your computer.
2. After the download is successful, the application will be placed in the Download folder of your directory or the preferred location you have set for it.
3. Now you need to get this apk file on to your mobile phone, either via Bluetooth or via a USB cable. Once you have placed this file on your SD card, you need to install it. Follow the on-screen instructions to install it. You may get a notification or warning saying that you are installing from an untrusted source. Allow this from the Settings and, after successful installation, you will see the icon of your application in the menu of your mobile. This is the default icon, which can be changed, and we will tell you how as we move ahead in this course.

I hope your application is working exactly as per the requirements you have given. Now, depending on your usability and customisation needs, you can change various things like the image, sound and behaviour.

Figure 11: Database view (the fir-project-e480e database with a meghraj bucket holding the email, password, phone and Unique_number tags)
Figure 12: Block editor image 2

Debugging the application

We have just created a prototype of the application with very basic functionality, based on what the user may be interested in. Now let's look at various features that require some serious thought and should be provided in your app so as not to annoy the user. Consider the following:
1. Wouldn't it be nice to add some data validation when data is entered, such as ensuring that no field is blank?
2. In the phone number field, can we allow only digits?
3. Can we validate the email ID by checking for '@' in the text provided?
4. Taking the same things into consideration, can we design a login page as well?

These are some features that the user will be pretty happy to see implemented in the app. Think about how you can integrate these into the application. Do ask me if you fail to accomplish any of the above cases. You have successfully built another useful Android app for yourself. Happy inventing!
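App Inventor apps are assembled from blocks rather than typed code, so the following Python snippet is only a sketch of the validation logic suggested above, not something that can be pasted into App Inventor. The function name validate_signup and the error messages are assumptions made for illustration; the four fields mirror the text boxes used in this article.

```python
def validate_signup(name, email, phone, password):
    """Return a list of problems found in the sign-up form (empty list = valid)."""
    errors = []
    fields = {"name": name, "email": email, "phone": phone, "password": password}

    # Rule 1: no field should be blank
    for label, value in fields.items():
        if not value.strip():
            errors.append(label + " must not be blank")

    # Rule 2: the phone number may contain digits only
    if phone.strip() and not phone.strip().isdigit():
        errors.append("phone must contain digits only")

    # Rule 3: a very light email check, looking only for '@'
    if email.strip() and "@" not in email:
        errors.append("email must contain '@'")

    return errors


print(validate_signup("meghraj", "name@xyz.com", "1234567890", "123456"))
print(validate_signup("", "namexyz.com", "12ab", "pw"))
```

In App Inventor, the same checks would be built from text and logic blocks inside the Sign Up button's click handler, with the Notifier showing the first problem found instead of saving to Firebase.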

By: Meghraj Singh Beniwal The author has a B. Tech in electronics and communication, is a freelance writer and an Android app developer. He currently works as an automation engineer at Infosys, Pune. He can be contacted at meghrajsingh01@rediffmail.com or meghrajwithandroid@gmail.com

OSFY Magazine Attractions During 2016-17

Month | Theme
March 2016 | Open Source Databases
April 2016 | Backup and Data Storage
May 2016 | Web Development
June 2016 | Open Source Firewall and Network Security
July 2016 | Mobile App Development
August 2016 | Network Monitoring
September 2016 | Open Source Programming Languages
October 2016 | Cloud Special
November 2016 | Open Source on Windows
December 2016 | Machine Learning
January 2017 | Virtualisation (containers)
February 2017 | Top 10 of Everything


Faster File Search with Python

This article presents a file search utility created by using the power of the versatile Python programming language. Read on to discover how it works and how it can be used in Windows systems.

Computer users often have a problem with file search, as they tend to forget the location or path of a file even though Windows provides a file search utility. The Explorer in Windows 7 offers a search facility, but it takes around two to three minutes to search for a file. In this article, I will give you a Python program that searches for a file on your computer's hard disk in under a second.

Let us first understand the program's logic, which Figure 1 explains. We first do the indexing or, in Python terms, construct a dictionary in which the file name is the key and the path of the file is the value. The dictionary is then dumped into a pickle file. The next time, the file is looked up in the dictionary loaded from the pickle file. Now that you have understood the logic of the program, let us look at the program in detail. I have broken it into different functions. Let's see what each function does.

    # program created by mohit
    # official website L4wisdom.com
    # email-id mohitraj.cs@gmail.com

The block of code below imports the essential modules. Note that the program is written for Python 2, as it uses cPickle and print statements.

    import os
    import re
    import sys
    from threading import Thread
    from datetime import datetime
    import subprocess
    import cPickle

    dict1 = {}  # shared index: file name -> folder path

Next, let's write a function to acquire the drives. This function gets all the drives in your Windows machine. If you have inserted any external/USB pen drive or hard disk drive, the function also obtains details for them.

    def get_drives():
        # Ask Windows for the list of drive letters
        response = os.popen("wmic logicaldisk get caption")
        list1 = []
        for line in response.readlines():
            line = line.strip("\n")
            line = line.strip("\r")
            line = line.strip(" ")
            # Skip the "Caption" header row and blank lines
            if (line == "Caption" or line == ""):
                continue
            list1.append(line)
        return list1
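To see what the drive-parsing loop does without running wmic, here is a small Python 3 sketch that applies the same filtering to a sample string. The helper name parse_drive_letters and the sample text are assumptions made for illustration; real wmic output may contain extra padding or blank lines, which the stripping is meant to absorb.

```python
def parse_drive_letters(wmic_output):
    """Extract drive letters from the text printed by 'wmic logicaldisk get caption'."""
    drives = []
    for line in wmic_output.splitlines():
        line = line.strip()
        # Skip the "Caption" header row and any blank lines
        if line == "" or line == "Caption":
            continue
        drives.append(line)
    return drives


# Hypothetical output for a machine with two drives
sample = "Caption  \r\nC:       \r\nD:       \r\n\r\n"
print(parse_drive_letters(sample))
```

Keeping the parsing in a function that takes plain text makes it easy to check on any operating system, since only the wmic call itself is specific to Windows.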

Our next function is the search1 function, which constructs a dictionary in which the file name is the key and the path is the value.

    def search1(drive):
        for root, dir, files in os.walk(drive, topdown=True):
            for file in files:
                file = file.lower()
                if file in dict1:
                    # Name already indexed: store this copy under a suffixed key
                    file = file + "_1"
                    dict1[file] = root
                else:
                    dict1[file] = root

Figure 1: Program logic (flowchart: start; check whether the database file is present; if not, create the database file; if yes, search for the file in the database file)
Figure 2: Creating exe file of the Python program

The create function starts a thread for each drive, and each thread calls the search1 function.

    def create():
        t1 = datetime.now()
        list2 = []  # keeps track of the started threads
        list1 = get_drives()
        print list1
        for each in list1:
            process1 = Thread(target=search1, args=(each,))
            process1.start()
            list2.append(process1)
        for t in list2:
            t.join()  # wait for every thread to finish

After creating the dictionary, the following code, which is still part of the create function, dumps the dictionary to the hard disk as a pickle file.

        pickle_file = open("finder_data", "wb")
        cPickle.dump(dict1, pickle_file)
        pickle_file.close()
        t2 = datetime.now()
        total = t2 - t1
        print "Time taken to create ", total
        print "Thanks for using L4wisdom.com"

Next time, when you search for any file, the program will look it up in the dumped dictionary, as follows:

    if len(sys.argv) < 2 or len(sys.argv) > 2:
        print "Please use proper format"
        print "Use <finder -c > to create database file"
        print "Use <finder file-name> to search file"
        print "Thanks for using L4wisdom.com"
    elif sys.argv[1] == '-c':
        create()
    else:
        t1 = datetime.now()
        try:
            pickle_file = open("finder_data", "rb")
            file_dict = cPickle.load(pickle_file)
            pickle_file.close()
        except IOError:
            # No database file yet: build it and use the fresh index
            create()
            file_dict = dict1
        except Exception as e:
            print e
            sys.exit()
        file_to_be_searched = sys.argv[1].lower()
        list1 = []
        print "Path \t\t: File-name"

Here, we use the search method of the re module so that a regular expression can be used to find the file.

        for key in file_dict:
            if re.search(file_to_be_searched, key):
                str1 = file_dict[key] + " : " + key
                list1.append(str1)
        list1.sort()
        for each in list1:
            print each
        print "-----------------------"
        t2 = datetime.now()
        total = t2 - t1
        print "Total files are", len(list1)
        print "Time taken to search ", total
        print "Thanks for using L4wisdom.com"
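The listing in this article targets Python 2. For readers on Python 3, the following is a minimal sketch of the same index-then-search idea, not a port of the author's finder.py. The function names (build_index, save_index, load_index, search) are assumptions made for illustration, and one technique is swapped in: each file name maps to a list of folders, instead of adding a "_1" suffix when a name occurs more than once. The demo indexes a temporary folder rather than whole drives.

```python
import os
import pickle
import re
import tempfile


def build_index(top):
    """Walk 'top' and map each lower-cased file name to the folders containing it."""
    index = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            index.setdefault(name.lower(), []).append(root)
    return index


def save_index(index, path):
    with open(path, "wb") as handle:  # pickle files are binary
        pickle.dump(index, handle)


def load_index(path):
    with open(path, "rb") as handle:
        return pickle.load(handle)


def search(index, pattern):
    """Return sorted (folder, file name) pairs whose name matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for name, folders in index.items():
        if regex.search(name):
            hits.extend((folder, name) for folder in folders)
    return sorted(hits)


# Demo on a throwaway folder with three empty files
with tempfile.TemporaryDirectory() as top:
    for name in ("Wada.mp3", "waada.mp3", "notes.txt"):
        open(os.path.join(top, name), "w").close()
    db_path = os.path.join(top, "finder_data")
    save_index(build_index(top), db_path)
    found = [name for _folder, name in search(load_index(db_path), r"wa+da")]
    print(found)
```

The pattern wa+da matches both wada and waada because a+ accepts one or more occurrences of the letter a. Searching is fast for the same reason as in the article: the slow directory walk happens once, and every later query only scans the keys of a dictionary loaded from disk.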

The rest of the code is very easy to understand. Let us save the complete code as finder.py (you can also download it from http://opensourceforu.com/article_source_code/sept16/finder.zip) and make it a Windows executable (exe) file using the PyInstaller module. It can also be downloaded from http://l4wisdom.com/finder_go.php. Run the command shown in Figure 2. After running it successfully, you can find finder.exe in the folder C:\PyInstaller-2.1\finder\dist.

Figure 3: Creating a database of all files

You can put the finder.exe file in the Windows folder, but if you place this in a different folder, you will have to set the path to that folder. Let us run the program. You can see from Figure 3 that just 33 seconds are required to create the database. Now search the file and see the power of the program.

Figure 4: File searching
Figure 5: File searching using regular expressions
Figure 6: Searching using the power of regular expressions

I am going to search for songs which contain the string waada. Look at Figure 4. You can see that two searches have taken approximately half a second. The program is case insensitive, so using upper case or lower case doesn't matter.

The program also has the power of regular expressions. Let's assume that you want to search for files which contain the wada or waada strings. Let us look at Figure 5. The regular expression 'a+' means the letter 'a' can appear once or many times. Again, you get the result in less than one second. Let us consider one more example of a regular expression search. Let's assume that you want to search for files which contain wa+da along with digits (see the first search of Figure 6). Now assume that you want to search for files that start with the string wa+da (see the results of the second search in Figure 6).

The program is indeed very useful. Suppose, for instance, you have forgotten the file path but have a vague idea of the file name. You can find the file within one second by using regular expressions. The best part is the speed of the search. You can try searching with the file name repeatedly. But if you use Windows Explorer, each search will take around two to four minutes.

By: Mohit
The author is a certified ethical hacker and EC Council certified security analyst. He has a master's in computer science from Thapar University. He is the author of 'Python Penetration Testing Essentials'. You can contact him at mohitraj.cs@gmail.com and https://in.linkedin.com/in/mohit-raj-990a852a.


Start Programming on Raspberry Pi with Python

One of the most revolutionary things that happened in computing in recent times has been the invention of the Raspberry Pi, as it has brought the computer within everyone's reach. As a fallout, there has been a coding revolution. This article is a primer for coding on a Raspberry Pi.

Raspberry Pi is a low-cost computing platform. The goal of the Raspberry Pi Foundation is to make computing available to everyone globally to help them to learn programming. Since its initial release in 2012, the Raspberry Pi has seen several enhancements in terms of the amount of RAM, CPU power, peripheral support, and support for networking protocols; yet, it has managed to hold on to its original US$ 35 price tag (about Rs 2400). The latest version, Raspberry Pi 3, was announced in February 2016. It comes with a 1.2GHz 64-bit quad-core ARMv8 CPU, 1GB RAM, built-in wireless/Bluetooth support and much more. This amount of computing power is more than sufficient to run your applications and to program them using a variety of programming tools/environments. In this article, let's get started with programming on the Raspberry Pi using one of the most popular languages in the world, Python.

Why Raspberry Pi and Python?

The Raspberry Pi has been nothing short of a revolution in introducing millions of people across the world to computing and being one of the drivers behind introducing computer programming to everyone. It has powerful enough hardware to get started with programming and the US$ 35 price tag is hard to beat. The makers of Raspberry Pi have also paid special attention to ensuring that barriers to getting started are minimal. The recommended Linux distribution for Raspberry Pi, Raspbian, comes bundled with multiple programming languages and IDEs so that you are ready to go from the time you power on the mini development board.

Python, on the other hand, is one of the most popular languages in the world and has been around for more than two decades. It is heavily used in academic environments and is a widely supported platform in modern applications, especially utilities, and desktop and Web applications. Python is highly recommended as a language that is easy for newcomers to program. With its easy-to-read syntax, the introduction is gentle and the overall experience much better for a newbie. The latest version of the Raspbian OS comes bundled with both Python 3.3 and Python 2.x tools. Python 3.x is the latest version of the Python language and is recommended by the Raspberry Pi Foundation too.

What you can do with Raspberry Pi and Python

The combination of Raspberry Pi and Python can be used for multiple purposes. Some of the popular ones include:
• Learning how to program with Python
• Connecting your Raspberry Pi to multiple sensors and receiving data from them, or controlling hardware (for example, home automation, environment monitoring via temperature sensors, etc.)
• Using your Raspberry Pi as a Web server with the program written in Python
• Writing various utilities in Python and using your Pi as a server for monitoring and tracking multiple applications, services, etc.

These are just some of the things that you can do. You have the full power to convert your ideas to reality using Raspberry Pi along with Python.

The Python IDLE shell and command line

You can use Python from both an IDE (Integrated Development Environment) and from the terminal, depending on your comfort level. Python comes with a simple IDE called IDLE. Both the Python 2 and 3 IDLE are available, and since this article is focused on Python 3.x, we will look specifically at Python 3 IDLE. Figure 2 shows that Python 3 IDLE is available and you can launch the IDE by clicking on it. This will bring up the Python shell as shown in Figure 3. The shell is an REPL (Read Eval Print Loop) that allows you to interactively try out the Python language, including taking inputs from the user. In the Python shell (Figure 3), you can see multiple Python statements that have been tried out, including the print method, assigning variables and using the addition operator (+). On the other hand, if you wish to use the Python shell from the terminal, you can do so by opening up a new terminal and then typing ‘python3’, as shown in Figure 4.
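As a rough idea of what can be tried at the shell prompt, here are a few statements of the kind described above: printing, assigning variables and using the addition operator. The exact statements in Figure 3 are not reproduced here, so the variable names and values are made up for illustration. At the interactive prompt, you would type the lines one at a time.

```python
# Print a message
greeting = "Hello from Raspberry Pi"
print(greeting)

# Assign variables and add them with the + operator
apples = 5
oranges = 7
total = apples + oranges
print("Total fruit:", total)

# The + operator also joins strings
print(greeting + "!")
```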

Updating Python packages

Python is known for its strong community and the packages that are available to perform various types of functions. Since these are outside the core language, you need to install these packages in your local development environment. Multiple Python packages are made available in the Raspbian archives, and one of the recommended ways of installing these packages is via the standard apt-get commands as shown below:

    sudo apt-get update
    sudo apt-get install <python-package-name>

Figure 1: Raspberry Pi and Python
Figure 2: Starting Python 3 IDLE (from the Programming menu)

But not all packages are made available in the Raspbian archives and, in that case, it is advisable to use the standard Python Pip package management system. For example, there is a Python requests package that makes it easy for Python applications to make HTTP calls. We can install this package using the following command:

    $ pip3 install requests

If you would like to see the current version of a package being managed by Pip, you can use the show command. For instance, to check on the requests package that we installed in the previous step, use the following command:

    $ pip3 show requests
    ---
    Name: requests
    Version: 2.4.3
    Location: /usr/lib/python3/dist-packages
    Requires:

Writing a Python application

Python is a popular language for writing various utilities because of its simplicity and powerful packages. We shall write one such program to show you how you can write a simple Python application to get the current scores for all live cricket matches happening around the globe. The steps for writing this program are as follows:
1) We will use the public API provided by cricapi.com.
2) The API endpoint at http://cricapi.com/api/cricket will give us a JSON response for all the current live matches, along with their scores.
3) We will then print out all the live matches happening, along with their scores.

Figure 3: Python shell
Figure 4: Python 3 shell in the terminal

We can write this program in any editor provided on Raspberry Pi, like vi and nano. At the same time, we can also use the IDLE shell to write our program. We will do just that. Make sure that IDLE 3 is launched and, from the shell's main menu, click on File → New File (Ctrl + N). This will open up an editor in which we can type out our program, as shown below:

    import requests
    import html

    # Fetch the list of current matches from the cricket API
    r = requests.get("http://cricapi.com/api/cricket")
    if r.status_code == 200:
        currentMatches = r.json()["data"]
        for match in currentMatches:
            # Titles may contain HTML entities, so unescape them before printing
            print(html.unescape(match["title"]))
    else:
        print("Error in retrieving the current cricket matches")

Figure 5: Creating a new file in Python 3 IDLE
Figure 6: Saving a Python file

Notice the following points about the program code:
1. We are using the requests and html packages. The html module is part of the Python standard library, so only requests needs to be installed with Pip.
2. We invoke the get method from the requests package, which makes an HTTP GET call to the cricket API. You can paste the URL http://cricapi.com/api/cricket in any browser to see the JSON data that is returned.
3. If the HTTP status code returned is 200, which means OK, we use the in-built JSON decoder available in the requests library to extract the data element, which is the root element.
4. We finally iterate through all the records, i.e., the matches that are currently happening, and print out the title attribute, which contains the team names and the scores.

To run the program from IDLE, you can simply press F5. This will prompt you to save the file and you can save it with a name of your choice. We choose cricket.py and it is saved in the /home/pi folder. The program will then be executed in the Python IDLE shell and you will see the output with the list of matches. Another way of running the application is from the terminal. Simply launch the terminal and go to the /home/pi folder where you saved the cricket.py file. Next, execute the application using the Python 3 interpreter as shown below:

    $ python3 cricket.py
    England 297/10 & 414/5 * v Pakistan 400/10
    Essex 358/10 & 163/4 * v Sussex 448/10
    Hampshire 548/6 v Lancashire 310/4 *
    Leicestershire 380/10 & 109/5 * v Derbyshire 362/10
    Middlesex 293/10 v Surrey 415/10 & 234/6 *
    Australia A 181/1 * v South Africa A 304/10
    Zimbabwe v New Zealand 436/4 *
    South Africa Emerging Players v Sri Lanka Development Emerging Team 129/2 *
    Northern Knights v North-West Warriors
    Western Storm v Surrey Stars
    Guyana Amazon Warriors v Jamaica Tallawahs

GPIO programming

The Raspberry Pi is often used in conjunction with other hardware to create interesting electronic projects. The Pi 3 comes with 40 GPIO pins that you can use to interface with various hardware devices, both for receiving data from them and for writing data to them. This way, we can write applications that both read from devices and control them, i.e., turn them on and off, etc. To program the GPIO pins on Raspberry Pi with Python, a Raspberry Pi GPIO Python library is available. The following command is used to install the GPIO library for Python 3:

    sudo apt-get install python3-rpi.gpio

Once the library is installed, we can use it in a simple Python application, the quintessential LED blinking program shown below (remember to launch the IDLE shell or the Python shell at the terminal with sudo privileges). This assumes that you have connected a breadboard, an LED and a resistor to the Raspberry Pi GPIO, and that the LED is being driven via pin 18.

    import RPi.GPIO as GPIO
    import time

    GPIO.setmode(GPIO.BCM)      # use Broadcom (BCM) pin numbering
    GPIO.setup(18, GPIO.OUT)    # set up pin 18 as an output
    while True:
        GPIO.output(18, 1)      # turn on pin 18
        time.sleep(1)           # wait for 1 second
        GPIO.output(18, 0)      # turn off pin 18
        time.sleep(1)           # wait for 1 second before the next blink
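If you want to check the blink logic before wiring anything up, one option is to separate the loop from the hardware call. The sketch below is an illustration under that assumption: the blink function and its parameters (write_pin, cycles, interval, sleep) are made-up names, not part of the RPi.GPIO library. On a Raspberry Pi, you would pass GPIO.output as write_pin, as shown in the comments.

```python
import time


def blink(write_pin, pin=18, cycles=3, interval=1.0, sleep=time.sleep):
    """Switch 'pin' on and off 'cycles' times using write_pin(pin, value)."""
    for _ in range(cycles):
        write_pin(pin, 1)   # LED on
        sleep(interval)
        write_pin(pin, 0)   # LED off
        sleep(interval)


# On a Raspberry Pi, the real library could be wired in like this:
#   import RPi.GPIO as GPIO
#   GPIO.setmode(GPIO.BCM)
#   GPIO.setup(18, GPIO.OUT)
#   try:
#       blink(GPIO.output)
#   finally:
#       GPIO.cleanup()      # release the pins when done

# Dry run without hardware: record the writes instead of driving a pin
log = []
blink(lambda pin, value: log.append((pin, value)), cycles=2, sleep=lambda seconds: None)
print(log)
```

Using a fixed number of cycles together with GPIO.cleanup() in a finally block leaves the pins in a known state when the program ends, which the endless while loop does not do on its own.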


By: Romin Irani The author has been working in the software industry for more than 20 years. His passion is to read and write about technology, and teach it to make successful developers. He blogs at www.rominirani.com.


Build a Website Using Bootstrap and the Express.js Framework The Express.js framework is the most popular and widely used framework for Node.js, while Bootstrap is the frontrunner in the development of responsive, mobile first Web applications on the Net. This article presents a tutorial on how to create a website using the two popular open source frameworks.

B

ootstrap is an open source, free and popular front-end framework that’s used to design websites and Web applications. It contains HTML and CSS based design templates for buttons, navigation, forms, typography, other interface components and also optional JavaScript extensions. Bootstrap mainly focuses on the front-end development of Web applications. The Bootstrap framework is also the second most popular project on GitHub, with over 99,000 stars and over 44,000 forks at the time of writing this article. It has some great features, which are explored further in this article. Browser compatibility: Bootstrap is compatible with all the latest and popular versions of Web browsers, such as Google Chrome, Safari, Opera, Internet Explorer and Mozilla Firefox. Responsive Web design: Bootstrap version 2.0 onwards has support for responsive Web design, which allows the layout of the Web pages to be adjusted dynamically, depending upon the characteristics of the device used – whether it is a desktop, mobile phone or tablet. From version 3.0 onwards, Bootstrap has adopted a mobile-first-design philosophy, which basically emphasises responsive design, by default. Style sheets: Bootstrap offers a set of style sheets that

provide basic style definitions for all key HTML elements or components. These basic style definitions provide a uniform, modern appearance for formatting text, tables and form elements. Reusable components: Bootstrap also contains other commonly used interface elements. These are reusable components that are implemented as CSS classes, which can be applied to certain HTML elements in a Web page. JavaScript components: Bootstrap has several JavaScript components in the form of jQuery plugins. These JavaScript components provide some additional user interface elements such as alert boxes, tooltips and carousels, and also extend the functionality of some existing interface elements like the auto-complete function for the input fields. SASS and Flexbox support: The Bootstrap version 4.0 alpha release has SASS and Flexbox support. SASS (Syntactically Awesome Style Sheets) is a scripting language that is interpreted into Cascading Style Sheets. Flexbox, also known as CSS Flex Box Layout, is a CSS3 Web layout model which allows responsive elements within a container to automatically arrange themselves according to different screen sizes and devices.



Tip: The Bootstrap framework is open source, and its source code is available on GitHub. Developers/programmers are encouraged to participate and contribute to the project.

Express.js framework

Express.js, or Express, is a Web application framework for Node.js, designed for building Web applications and APIs. It is the de facto standard server framework for Node.js. Express is also the back-end part of the MEAN stack, together with the MongoDB database and the AngularJS front-end framework. Some of the salient features of the Express.js framework are:
• It’s a cross-platform framework, which means it is not limited to one OS.
• It is a server-side Web and mobile application framework written in JavaScript.
• It provides Express Generator, which allows you to create complex applications quickly.
• It supports the MVC pattern.
• It works with template engines such as Jade and EJS, which facilitate the flow of data into the structure of a website.
• It has a provision for building single-page, multi-page and hybrid Web and mobile applications, as well as APIs (Application Programming Interfaces).

Installing Node.js

Express.js is a Node.js based framework, so Node.js must be installed first. Download and install Node.js (https://nodejs.org/). Express.js is distributed as an npm package, so we will use Node and npm to install Express.js.

Building a website

Let us build a website using these two frameworks, Bootstrap and Express.js. We will use open Web technologies like HTML, CSS and JavaScript along with these two frameworks to create a website.

Note: It is assumed that you have a basic knowledge of Web technologies like HTML, CSS and JavaScript. If you don’t, W3Schools (http://www.w3schools.com/) is a good place to start. The site has some great tutorials for Web technologies which are easy to follow.

First, let’s create a folder/directory for our website. Open the command prompt/terminal and type the following command:

C:\>mkdir project

This command creates a folder called project. Next, type:

C:\>cd project

This command will make project your working directory. Now, use the npm init command to create the package.json file, which describes the website/application. Run the following command in the newly created folder, i.e., the project folder:

C:\project>npm init

This command prompts you to enter certain fields, such as the name and version of your website or application. You can either hit the Enter key and accept the default setting or enter the details you want, with the following exception:

Entry point: (index.js)

At this point, you will be asked to enter the entry point file name. If you hit Enter, it will set the index.js file as the entry point, by default. But for this tutorial, we are going to give it a new name. Let’s name it website.js. Type website.js and press the Enter key. Now, let us install Express in the project directory and save it in the dependencies list. Run the following command in the project folder:

C:\project>npm install express --save

The above command will install all the important Express modules, and save the dependencies in the package.json file. The package.json file will look like what follows:

{
  "name": "project",
  "version": "1.0.0",
  "description": "Sample Project",
  "main": "website.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "AniketKudale",
  "license": "ISC"
}

Here’s a description of the fields in the code snippet given above.
• name: Name of the Web application.
• version: Version of the Web application.
• description: Description of the Web application.
• main: Main entry point of the Web application; in this file, we store our application logic.
• scripts: Here, we specify script commands that run at various points in the life cycle of our package.
• author: Here, we specify the author’s name, i.e., your name.
• license: Here, we can specify the licence type.
The Express.js framework has now been successfully
installed, so let us test it by creating a sample ‘Hello World’ example. In our project directory, create a file named website.js and add the following code:

var express = require('express');
var app = express();

// Respond to requests for the root URL (/)
app.get('/', function (req, res) {
  res.send('<h1>Open Source For You!</h1>');
});

// Start the server on port 3000
app.listen(3000, function () {
  console.log('Example app listening on port 3000!');
});

The above code starts a server and listens on port 3000 for connections. It responds with the text ‘Open Source For You!’, formatted as a level 1 heading, to requests made to the root URL (/). For every other path, it responds with a 404 Not Found status. To execute, run the following command in the project folder:

C:\project>node website.js

Then, load http://localhost:3000/ in the browser to see the output. If you see the text printed in the browser, as shown in Figure 1, then you are all set for creating a website using Bootstrap and Express.js.

Figure 1: Sample example running in the browser

Creating a website

We are going to use Bootstrap and basic HTML for the view, and the Express.js framework as a Web server and to handle routes. You can download all the necessary Bootstrap files from http://getbootstrap.com/ or you can use files from a CDN (Content Delivery Network). For our sample website, we are going to use Bootstrap files from a CDN. Let’s start by creating views first; our website will have three Web pages.
1. index.html: Index page of our sample website.
2. product.html: The product page of our sample website, where we will add some sample information about products.
3. about.html: The ‘About us’ page of our sample website, where we will add contact details, etc.

Creating the HTML pages

Navigate to our project folder, and create a new folder called ‘views’ in it. After that, open Notepad or any of your favourite code editors, and copy the following HTML code:

<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Sample website using Bootstrap and ExpressJS</title>
  <!-- CDN links -->
  <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
  <link rel="stylesheet" href="http://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css">
  <script src="//maxcdn.bootstrapcdn.com/bootstrap/3.3.1/js/bootstrap.min.js"></script>
</head>
<body>
  <div>
    <div>
      <nav class="navbar navbar-inverse" role="navigation" style="padding-left:130px;">
        <ul class="nav navbar-nav">
          <li class="active"><a href="/">Home<span class="sr-only">(current)</span></a></li>
          <li><a href="/product">Products</a></li>
          <li><a href="/about">About Us</a></li>
        </ul>
      </nav>
    </div>
    <br/>
    <div class="jumbotron">
      <p>This is the place to put your sample content.</p>
    </div>
  </div>
</body>
</html>

Save the above code as index.html in the views folder, which is present inside the folder named project (i.e., at C:\project\views). As you can see in the above code, we have used Bootstrap and jQuery files from a CDN. The routing logic for this sample website goes into the website.js file; since that file runs on the server under Node.js, the page itself does not load it with a script tag. In the above code, we have also used Bootstrap’s Navbar class to provide navigation to the HTML pages present in the views folder. Since this is the index page, we have set the class of the Home link in the navbar as ‘active’.
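Each page of the site marks its own navbar link as ‘active’. As a minimal sketch of that idea, the hypothetical helper renderNavbar() below (not part of Bootstrap or Express.js) builds the list items from one array and adds the ‘active’ class only to the link that matches the current path. The link labels and paths are the ones used in this article.

```javascript
// Links shown in the navbar of the sample website.
var links = [
  { href: '/', label: 'Home' },
  { href: '/product', label: 'Products' },
  { href: '/about', label: 'About Us' }
];

// Hypothetical helper: returns the <ul> markup for the navbar, adding
// Bootstrap's 'active' class to the link that matches currentPath.
function renderNavbar(currentPath) {
  var items = links.map(function (link) {
    var isActive = link.href === currentPath;
    return '<li' + (isActive ? ' class="active"' : '') + '>' +
      '<a href="' + link.href + '">' + link.label +
      (isActive ? '<span class="sr-only">(current)</span>' : '') +
      '</a></li>';
  });
  return '<ul class="nav navbar-nav">' + items.join('') + '</ul>';
}

console.log(renderNavbar('/product'));
```

Generating the markup this way is only one option; the article keeps three static HTML files, which is simpler for a first website.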



For product.html also, open Notepad or any of your favourite code editors and copy the following HTML code:

<html>
<head>
  <!-- jQuery is required by Bootstrap's JavaScript -->
  <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
  <link rel="stylesheet" href="http://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css">
  <script src="//maxcdn.bootstrapcdn.com/bootstrap/3.3.1/js/bootstrap.min.js"></script>
</head>
<body>
  <div>
    <div>
      <nav class="navbar navbar-inverse" role="navigation" style="padding-left:130px;">
        <ul class="nav navbar-nav">
          <li><a href="/">Home</a></li>
          <li class="active"><a href="/product">Products<span class="sr-only">(current)</span></a></li>
          <li><a href="/about">About Us</a></li>
        </ul>
      </nav>
    </div>
    <br/>
    <div class="jumbotron">
      <p>Put the product details here!</p>
    </div>
  </div>
</body>
</html>

As you can see in the above code too, we have used Bootstrap files from a CDN, and we have also used Bootstrap’s Navbar class to provide navigation to the HTML pages present in the views folder. In this case, since this is the product page, we have set the class of the Products link in the Navbar as ‘active’. Similarly, for about.html, open Notepad or any of your favourite code editors and then copy the following HTML code:

<html>
<head>
  <!-- jQuery is required by Bootstrap's JavaScript -->
  <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
  <link rel="stylesheet" href="http://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css">
  <script src="//maxcdn.bootstrapcdn.com/bootstrap/3.3.1/js/bootstrap.min.js"></script>
</head>
<body>
  <div>
    <div>
      <nav class="navbar navbar-inverse" role="navigation" style="padding-left:130px;">
        <ul class="nav navbar-nav">
          <li><a href="/">Home</a></li>
          <li><a href="/product">Products</a></li>
          <li class="active"><a href="/about">About Us<span class="sr-only">(current)</span></a></li>
        </ul>
      </nav>
    </div>
    <br/>
    <div class="jumbotron">
      <p>Put the contact details here!</p>
    </div>
  </div>
</body>
</html>

Here, too, we have used Bootstrap files from a CDN, and we have used Bootstrap’s Navbar class to provide navigation to the HTML pages present in the views folder. In this case, since this is the About page, we have set the class of the ‘About Us’ link in the Navbar as ‘active’. We are now done with the view/presentation part. Let’s add logic to these Web pages by making use of the Express.js framework. Navigate to the root of our project directory, open the file website.js present there, and replace its contents with the following code:

var express = require('express');
var app = express();
var router = express.Router();

// Folder that holds the HTML pages
var path = __dirname + '/views/';

app.use('/', router);

router.get('/', function (req, res) {
  res.sendFile(path + 'index.html');
});

router.get('/product', function (req, res) {
  res.sendFile(path + 'product.html');
});

router.get('/about', function (req, res) {
  res.sendFile(path + 'about.html');
});

// Any other path: send the custom error message with a 404 status
app.use('*', function (req, res) {
  res.status(404).send('Error 404: Not Found!');
});

app.listen(3000, function () {
  console.log("Server running at Port 3000");
});

Figure 2: Server running and listening for connections on a port

Here is an explanation of the code snippet given above. First, we load the dependencies, i.e., the Express.js framework. We also create a router with express.Router(), which is the built-in routing service provided by the Express.js framework. Since we have stored our HTML files in the ‘views’ folder, we have built the path using the __dirname variable, which holds the directory that contains the currently executing script (here, the project folder). Then, we used app.use('/', router) since we are using routes in the code. After this, we define the routes: /, /product and /about. These route definitions call sendFile(), which is a built-in function designed to send files to the Web browser. For example, in our case, if the user clicks on any one of the Navbar links in the index.html page, then the matching router.get() handler sends the file associated with that particular link. And if the user enters an invalid route, we display a custom error message by using the * wildcard pattern in the app.use() function. Finally, we specify the port on which the server listens for connections, using the app.listen() function. Your project folder must end up with the following directory structure:

-- node_modules
-- views
   + -- index.html
   + -- product.html
   + -- about.html
-- package.json
-- website.js

The above project directory structure has two folders (named node_modules and views) and two files (named package.json and website.js). The views folder contains three HTML files named index.html, product.html and about.html.

Figure 3: The website we created, running in the browser

Running the website

To execute, run the following command in the project folder:

C:\project>node website.js

The message shown in Figure 2 will be displayed. Then, load the address http://localhost:3000/ in the browser to see the output. You should be able to see your first website created using Bootstrap and the Express.js framework, as shown in Figure 3.
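Once the server is running, the routes can also be checked from a second terminal instead of the browser. The sketch below uses Node's built-in http module; the helper names fetchStatus and checkRoutes are hypothetical, and the usage line assumes the server from this article is listening on port 3000. The path /missing is just an example of a route that the site does not define.

```javascript
var http = require('http');

// Hypothetical helper: resolves with the HTTP status code returned for url.
function fetchStatus(url) {
  return new Promise(function (resolve, reject) {
    http.get(url, function (res) {
      res.resume(); // discard the body; only the status code is needed
      resolve(res.statusCode);
    }).on('error', reject);
  });
}

// Requests every route of the sample website and reports its status code.
function checkRoutes(baseUrl) {
  var paths = ['/', '/product', '/about', '/missing'];
  return Promise.all(paths.map(function (p) {
    return fetchStatus(baseUrl + p).then(function (status) {
      return p + ' -> ' + status;
    });
  }));
}

// Usage: checkRoutes('http://localhost:3000').then(console.log);
```

The three defined routes should report 200, while the status reported for /missing depends on what the wildcard handler in website.js sends.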


By: Aniket Eknath Kudale The author is an open source enthusiast who has more than two years of experience as a software engineer at Tibco Software Inc., Pune. You can reach him at kudale@aniket.co




Programming in R

R is a language and environment for statistical computing and graphics. It is a simple and effective programming language, which includes conditionals and loops. Often called GNU S, it is used to statistically explore datasets and to make many graphical displays of data.

R is a highly powerful computer language, an environment and an integrated suite of software facilities. The functionality of R can be easily extended via packages. A typical RStudio window might have four panes, as depicted in Figure 1. The user can type the commands in Pane 1 and press Ctrl + Enter in order to execute the entered command. The output will appear in Pane 2, i.e., the pane with the caption ‘Console’. Alternatively, the user might type the command in the Console pane itself and obtain the result on pressing the Enter key. This section dwells on the following: accepting input from the keyboard, generating sequences, and random numbers in R. scan() can be used for obtaining numeric inputs from the keyboard, as shown below:

> x2 <- scan()

Now the user can enter the numbers in the Console window, and press the Enter key on an empty line (instead of entering a number) to indicate the end of data input. The numbers will be assigned to x2.

Sequence-generating operator

The colon (:) can be used to generate sequences in R, as follows:

> x <- 11:20   # the integers in the range 11 to 20 (both inclusive) will be stored in x
> x            # print the values stored in x

For shuffling the values stored in x, the sample() function can be used:

> q <- sample(x)
> q

The values stored in q might be: 13 14 16 19 17 15 12 11 18 20. The values stored in x remain intact. seq() can be used in R for generating a sequence, as follows:

> z <- seq(from=11,to=30,by=3)
> z



A sample output follows:

[1] 11 14 17 20 23 26 29

In the seq() command above, from specifies the starting value of the sequence, to specifies the ending value of the sequence, and by specifies the increment. The concatenation function c() can be used to store values in x1, as shown below:

> x1 <- c(11,12,13,14,15,16,17,18,19,20)   # assigning integers in the range 11 to 20
> x1                                       # print the values stored in x1

Random numbers in R

Random numbers are used in simulation and they are also used by statisticians. To generate a random number (having a fractional part) between 11 and 15, and store it in rnum, the following command can be used:

> rnum <- runif(1,11,15)
> rnum   # print the value of rnum
[1] 14.75596 (Sample output)

By default, runif(n) generates random numbers in the range of 0 to 1.

> rnum <- runif(10,11,15)   # for generating 10 random numbers in the range of 11 to 15

In order to generate three integer random numbers between 11 and 15, the following command can be used:

> x <- sample(11:15,3,replace=T)
> x

Here, 11:15 is the range within which the random numbers have to be generated, 3 is the number of random numbers to be generated, and replace=T indicates that repeats are permissible. For generating a sequence of random numbers, and to generate the same sequence later, set.seed() and runif() can be used as follows:

> set.seed(7)
> runif(7)

Note: 1) Help is readily available in R. In order to obtain help on random numbers, the following command might be helpful:

help.search("random numbers")

2) <- (less than, followed by the minus symbol) is the assignment operator in R. Alternatively, the ‘=’ operator can also be used.
3) R is a case-sensitive language.

Figure 1: A typical RStudio window

The section below dwells on vector and matrix operations in R.

Vectors in R

The R programming environment provides very powerful vector and matrix operation tools. The code below creates a vector for storing the first 13 members of the Fibonacci series:

Code snippet 1

> fib <- numeric(13)
> fib[1] <- 0
> fib[2] <- 1
> for(idx in 3:13) {
+ fib[idx] <- fib[idx-1] + fib[idx-2]
+ }
> fib

The output of the above code would be:

[output] 0 1 1 2 3 5 8 13 21 34 55 89 144

• fib <- numeric(13): Sets up a numeric vector of length 13, initialised with zeroes.
• fib[1] <- 0: Sets the first element of vector fib to the value 0. The next statement is self-explanatory.
• The for() loop computes the third to the thirteenth element (except for the first two elements, each term in the Fibonacci series is the sum of the preceding two terms).
• fib: Displays the elements of the vector fib.



The contents of the fib vector can also be printed with the help of the for() loop, as shown below:

> for(idx in 1:13)
+ print(fib[idx])

A sample output is shown below (each value is printed on a separate line):

0 1 1 2 3 5 8 13 21 34 55 89 144

is.vector(x) and is.matrix(x) can be used to determine whether x is a vector or matrix.

> is.vector(fib) will yield the following:

[output] TRUE

And > is.matrix(fib) will yield what follows:

[output] FALSE

A few more commands (along with the output) related to vector operations are as follows:

> min(fib)      #for finding minimum value in a vector
[output] 0
> max(fib)      #for finding maximum value in a vector
[output] 144
> mean(fib)     #for finding mean value
[output] 28.92308
> median(fib)   #for finding median value
[output] 8
> var(fib)      #for finding variance
[output] 1889.744
> sd(fib)       #for finding standard deviation
[output] 43.47118
> sort(fib,decreasing = TRUE)   #for sorting in descending order
[output] 144 89 55 34 21 13 8 5 3 2 1 1 0
> sum(fib)      #for finding sum of the vector elements
[output] 376
> summary(fib)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    2.00    8.00   28.92   34.00  144.00

Matrix operations in R

This part of the article dwells on the terseness of complex matrix operations in R. The command…

M <- matrix(c(9,8,7,6,5,4,3,2,1),3,3)

…creates a 3 x 3 matrix as follows:

     [,1] [,2] [,3]
[1,]    9    6    3
[2,]    8    5    2
[3,]    7    4    1

Note: 1) [2,] refers to the second row and [,3] refers to the third column.
2) The numbers are, by default, entered into the matrix in a column-wise fashion.

The command…

N <- matrix(c(10,11,12,13,14,15,16,17,18), 3, 3,byrow=TRUE)

…creates a 3 x 3 matrix as follows:

     [,1] [,2] [,3]
[1,]   10   11   12
[2,]   13   14   15
[3,]   16   17   18

In order to display matrix M, the user can simply type the following command…

M

…at the ‘>’ prompt and press the Enter key. For adding matrices M and N, and storing the result in matrix A, the user can type the following commands:

> A <- M + N
> A

The output will appear as follows:

     [,1] [,2] [,3]
[1,]   19   17   15
[2,]   21   19   17
[3,]   23   21   19

In order to store the transpose of M into TM and for displaying TM, the following commands can be used:

> TM <- t(M)
> TM
     [,1] [,2] [,3]
[1,]    9    8    7
[2,]    6    5    4
[3,]    3    2    1

To multiply matrices M and N and store the result in matrix MM, the following commands can be used:

> MM <- M %*% N
> MM
     [,1] [,2] [,3]
[1,]  216  234  252
[2,]  177  192  207
[3,]  138  150  162

Note: The system won’t crash if the matrices are not conformable for multiplication; R simply reports an error. Also note that the * operator performs element-wise multiplication, not matrix multiplication. If the commands typed are…

> SM <- M * N
> SM

...then the output will be the element-wise product shown in Figure 2.

Figure 2: Output of M * N

To find the determinant of M and display the result, use the following commands:

> d <- det(M)
> d

To find the inverse of matrix M, use the following command:

> MI <- solve(M)

Note that solve() succeeds only for a non-singular matrix. The matrix M used here has rank 2 (as the qr() output below shows), so it has no inverse and R reports an error for it.

To find the rank of matrix M, type in the following commands:

> rm <- qr(M)
> rm

A typical output is shown below:

$qr
            [,1]       [,2]          [,3]
[1,] -13.9283883 -8.7590895 -3.589791e+00
[2,]   0.5743665  0.5275893  1.055179e+00
[3,]   0.5025707  0.9589395  1.280443e-16

$rank
[1] 2

$qraux
[1] 1.646162e+00 1.283611e+00 1.280443e-16

$pivot
[1] 1 2 3

attr(,"class")
[1] "qr"

For combining matrices M and N (column-wise), use the following command:

> cbind(M,N)

The output will be as follows:

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    9    6    3   10   11   12
[2,]    8    5    2   13   14   15
[3,]    7    4    1   16   17   18

For combining matrices M and N (row-wise), use the following command:

> rbind(M,N)

Note: 1) To get a list of all the variables that have been defined in an R session, use the ls() command.
2) To display the warnings being generated by a code snippet, use warnings().
3) To exit from R, use q().

Pie charts in R

Assume that a csv file (C:\Users\faculty\Documents\info.csv) contains the data given in Figure 3. A pie chart can be drawn based on the data in Figure 3, with the following commands:

     A                   B
1    Strongly disagree   10
2    Disagree            13
3    Neutral             2
4    Agree               11
5    Strongly agree      7

Figure 3: Sample data in the csv file

> d <- read.csv("C:/Users/faculty/Documents/info.csv",header=FALSE,sep=",")   # for assigning the data in the file info.csv to d
> d   # for displaying the value of d

R automatically assigns names to the rows and columns. The row names and column names can be displayed using the following commands:

> rownames(d)
[sample output] "1" "2" "3" "4" "5"
> colnames(d)
[sample output] "V1" "V2"

To display the values of V1 and V2, the following commands can be used:



> d$V1
> d$V2

Sample output:

[1] Strongly disagree Disagree Neutral
[4] Agree Strongly agree
5 Levels: Agree Disagree Neutral ... Strongly disagree
[1] 10 13 2 11 7

The percentage share of each response is then computed and used to label the slices of the pie chart:

> lab <- round(d$V2/sum(d$V2) * 100,1)
> pie(d$V2,labels=paste(d$V1,lab, sep = " "),main="Responses",clockwise=TRUE)

A pie chart will appear in the Plots window as shown in Figure 4.

Figure 4: Pie chart in the Plots window (title: Responses; slices: Strongly disagree 23.3, Disagree 30.2, Neutral 4.7, Agree 25.6, Strongly agree 16.3)

pie3D() can be used for generating a 3D pie chart. To use pie3D(), it is necessary to install the plotrix package:

> install.packages("plotrix")
> library(plotrix)
> pie3D(d$V2,labels=paste(d$V1,lab, sep = " "),main="Responses",explode=0.2)

Figure 5: 3D pie chart

The R language can be used to perform highly complex operations related to statistics as well as econometrics. It can also be used for working on images and mathematical modelling. In fact, many premier institutions in several countries have made R a part of their regular curriculum, and insist on researchers using this language for carrying out statistical research.

By: Wallace Jacob
The author works as senior assistant professor at Tolani Maritime Institute in Pune. He is the author of the much acclaimed book The Unfathomable World of Amazing Numbers and the co-author of Puzzles – Exercises For Your Brain. He can be reached at wallace_jacob@yahoo.co.in




Audacity: Yet Another Tool for Speech Signal Analysis

Most of us have heard of Audacity in connection with making ringtones. But it has many more uses, such as recording live audio from a voice or from musical instruments, creating audio books, mixing music, and so on. This article gives readers a good introduction to the software.

Audacity is a free, multi-track audio editor and recorder. An open source tool for speech analysis, its interface is translated into many languages, and its source code is available from https://github.com/audacity/audacity. Audacity has several functionalities. It is used to:
• Record live audio from voice or musical instruments
• Record audio from the system (e.g., Media Player, YouTube audio, etc)
• Create podcasts by mixing voice with background music
• Transfer audio from any external recording device (tape decks, radio, video cassette recorders, personal video recorders, DVD players, TVs, etc) to a digital format (CD, computer, portable music player, etc)
• Edit sound files with extensions like WAV, AIFF, FLAC, MP2, MP3, etc
• Apply many effects, such as removing vocals from a song, changing the pitch/speed/frequency/amplitude of the audio, generating or removing noise, etc
• Split an audio file into separate tracks
• Export audio files to iTunes
• Create audio books for kids
• Plot the spectrum and spectrogram of a sound file

Installation guide

Audacity is a cross-platform tool, available for Windows, Mac and Linux. Its latest version is Audacity 2.1.2. There are many ways to get this software:
• From the official Audacity website
• From the Ubuntu Apps Directory (for Linux)
• From the SourceForge website
After downloading the software, the installation process is very simple to follow, as directed by the installation wizard.

Getting familiar with the Audacity environment

Audacity offers various functionalities. Some of them are explored here by using the screenshots from Audacity’s official documentation website (Reference 4). Interested readers could refer to this site for more details. The most commonly used toolbars are described below.

Figure 1: The Audacity GUI environment

Transport Toolbar: This manages recording and playback.

Tools Toolbar: This is used for audio selection, volume adjustment, zooming and time-shifting.

Recording Meter Toolbar: This displays recording levels.

Edit Toolbar: This has features such as cut, copy, paste, trim audio, silence audio, undo, redo, sync-lock, zoom, etc.

Device Toolbar: This selects the audio host, recording device, recording channels and playback device.
• The Audio Host selects a specific interface, using which Audacity connects with the selected playback and recording devices. In Windows, MME is the default setting and is compatible with almost all audio devices, while on Linux the option is ALSA.
• The Recording Device can be chosen as per requirements. The internal microphone (the default recording device of your system) or the external microphone (like your headphones’ microphone) is selected when recording your own audio. Stereo Mix should be selected if you want to record the audio from your system (say, from Media Player).
• Recording Channels can either be mono, stereo or as specified by sound device drivers.
   • For a mono recording device (most microphone ports), selecting ‘2 (Stereo)’ in Recording Channels copies the mono source to both channels, producing a dual mono recording.
   • For a stereo device, set Recording Channels to ‘2 (Stereo)’ and ensure that any settings in the system or sound device control panels are also set to stereo.
• In order to select the Playback Device, choose the built-in or attached sound device that you want to use for playback. In Windows, the default Playback Devices are Speakers and Headphones (IDT high definition audio codec), where IDT is the name of the sound card manufacturer.
You will observe two vertical bars to the right, which are meant for resizing the toolbar by clicking and dragging on the bars.

Figure 2: Device Toolbar (with ALSA, default, default and 2 (Stereo) selected)

Mixer Toolbar: This has the following features.
• Recording volume slider: Sets the recording volume (on the left side)
• Playback volume slider: Sets the playback volume (on the right side)

Figure 3: Mixer Toolbar

Stereo audio track: Figure 4 shows the separate components of an Audacity stereo track. The signal in the top panel and its vertical scale represent the left channel, while the signal in the bottom panel and its vertical scale represent the right channel. By clicking on Mute in the Track Control panel, you can mute the selected track and listen to all other parallel tracks, if any. By clicking on Solo, only the selected track is played.

Getting started

Importing and playing existing files: Follow the instructions given below to import and play existing files. ƒ To load the files into the current project window,

Figure 4: Stereo audio track

choose File->Import->Audio. Press the Play button to start playback. Pressing the Pause button once will pause playback, and pressing it again will resume the process. ƒ Press the Stop button to stop the playback. Recording your audio: To record audio, this is what you need to do. 1) Set the recording device from the Device Toolbar: ƒ To record audio (your live audio, any live external musical instrument or both together), select Internal Mic (the default recording device of your system) or External Mic ( to record from your headphones’ mic, for instance). ƒ To record the system’s audio (e.g., a music file from Media Player, or audio from YouTube’s live video), select Stereo Mix as a Recording Device from the Device Toolbar. 2) Press the Record button to record. 3) Stop the recording using the yellow Stop button. (You can pause in between by selecting the Pause button, and clicking it again to resume.) 4) After recording the audio, you can listen to it by clicking the Play button. 5) To save the audio, choose File->Export Audio. Then save it in the desired format from the drop down list. The default extension is the ‘.wav’ format. Editing the audio: For editing the audio, go to File>Import->Audio. If the audio is in the workspace folder of Audacity, then Audacity will offer you the option to open a separate copy of the file for editing, which is recommended. To select a particular region for editing, click on the track and drag the shaded area with the mouse. If no audio is selected, the entire audio file in the project window is selected, by default. Figure 5 shows the Edit Toolbar in which the Zoom buttons are highlighted. The is the ‘Zoom in’ tool, and is the ‘Zoom out’ tool. The Selection Tool is used to zoom in to get a finer, more in-depth look at the signal. For this, choose the selection tool, then click near the desired location, and click the Zoom ƒ ƒ

www.OpensourceForU.com | OpeN sOUrCe FOr YOU | september 2016 | 87


For U & Me Let’s Try press the Delete key. 6. If you have made a mistake, go to Edit->Undo. Figure 5: Edit Toolbar

Features of Audacity Change Pitch

Change Pitch without Changing Tempo by Vaughan Johnson & Dominic Mazzoni using SoundTouch, by Olli Parviainen

Pitch: From:

Up

G

Semitones (half-steps):

Down

Preview

A#/Bb

3.00

Frequency (Hz): from Percent Change:

To:

760.34!

to

904.20!

Cancel

OK

18.921

Figure 6: Changing the pitch

Figure 7: Spectrum of an audio signal

In button. Keep clicking the Zoom In button until you see the details. To delete some part of the audio (like a 15-second clip), follow the steps below: 1. Stop the playback, click near the point where you want the 15-second portion to begin. 2. Zoom in until the Timeline shows 15 seconds or more before and after the cursor. 3. Holding down the Shift key, click 15 seconds to the right of the cursor. 4. Adjust the start and end of the selection with the mouse. 5. From the Edit Toolbar, select the icon ‘Trim Audio’ or

Audacity has many exciting features. Some of them are discussed here.

Changing the pitch: Select the entire audio (by clicking on the Track Control panel). Go to Effect->Change Pitch. Now, change the pitch from the given tone to the desired one with the help of the drop-down list. Click OK.

Low Pass Filter: This feature of Audacity allows users to keep only those frequencies that are lower than a certain cut-off frequency. It attenuates the frequencies above this threshold. It can be useful for reducing high-pitched noise. It can also be helpful for removing vocal parts from a song. Select the track and go to Effect->Low Pass Filter. Alternatively, go to Effect->LS Filter. Just change the cut-off frequency and apply.

Equalisation: With the help of this effect, we can manipulate sounds by frequency. It allows us to increase the volume of certain frequencies and to reduce others. Select the track and go to Effect->Equalization. Suppose we want to change the balance of high and low frequencies in the audio so that it sounds like a walkie-talkie. From the drop-down menu at the bottom, select the option ‘Walkie-Talkie’ and click OK.

Plot the spectrogram: Select the track. From the drop-down list near the name of the track on the Track Control panel, select Spectrogram.

Plot the spectrum: Select the track and go to Analyze->Plot Spectrum. You can change the window for the plot from the drop-down list.

Audacity is a very useful tool for speech analysis, and is considered to be among the most feature-rich audio analysis tools. It is good at recording podcasts and musical tracks, mixing or separating them, and applying many effects. Moreover, it can import audio from various recording sources and export it in digital form.
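The numbers shown in the Change Pitch dialog (Figure 6) are consistent with the usual equal-temperament relation, in which each semitone multiplies the frequency by the twelfth root of two. The short Python sketch below reproduces that arithmetic; it is only an illustration of the relation, not Audacity’s own code, and the function names are made up for this example.

```python
# Illustrative sketch: equal-temperament pitch arithmetic.
# The function names are hypothetical and are not part of Audacity.

def pitch_ratio(semitones):
    """Frequency ratio for a shift of `semitones` half-steps."""
    return 2 ** (semitones / 12)


def shift_frequency(freq_hz, semitones):
    """Frequency obtained after shifting `freq_hz` by `semitones` half-steps."""
    return freq_hz * pitch_ratio(semitones)


def percent_change(semitones):
    """Percentage change in frequency for the given shift."""
    return (pitch_ratio(semitones) - 1) * 100
```

With the values from the dialog, percent_change(3) rounds to 18.921 and shift_frequency(760.34, 3) rounds to 904.20, matching the ‘Percent Change’ and ‘to’ fields in the figure.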

References
[1] https://github.com/audacity/audacity
[2] http://www.audacityteam.org/download/
[3] https://sourceforge.net/projects/audacity/
[4] http://manual.audacityteam.org/o/index.html

By: Raag Anarkat and Sapan H. Mankad
Raag Anarkat is a FOSS enthusiast from Nirma University who can be contacted at 12bit053@nirmauni.ac.in. Sapan H. Mankad is assistant professor in the CSE department of the Institute of Technology, Nirma University. You can contact him at sapan.mankad@gmail.com. He blogs at http://www.shmlive.wordpress.com. Web page: http://www.nirmauni.ac.in/ITNU/Faculty/Prof-Sapan-H-Mankad.



Let’s Try For U & Me

Have Some Fun Converting Plain Text to Handwritten Text

Recurrent neural networks (RNN) can be used to generate handwritten text. This is a fun project that you might like to try out for yourself. This article details the steps to be taken to generate handwritten text from plain text.

Typing has brought a lot of ease to content generation, but sometimes, you may need handwritten text rather than the typed version. Cursive fonts try to address this but fall short, since they treat only the symptoms rather than the underlying problem. Alex Graves has published a paper which demonstrates how to generate sequences with recurrent neural networks (RNN). It can be found at http://arxiv.org/abs/1308.0850. The long and short of it is that we can now generate handwriting using RNNs. In this article, we demonstrate a nifty little program which converts any plain text file into a series of images that have handwriting on them instead of the usual printed text. This article does not aim to discuss Graves’ paper, but rather, lets you use the research and create your own handwritten text. So, let’s dive in.

This has been tested on Arch Linux and Ubuntu 15. It should work on any platform where Python 3.x is installed. The steps are as follows. First off, we need to install what’s required to run this software. Let’s first install Python 3 from http://python.org.

If you are on a Linux system, Python is usually already installed. On Ubuntu, however, you will need the correct version of Python and so need to run the following command:

sudo apt-get install python3

Once that is done, we need to install some of the packages required for our software. To avoid complicating matters, let’s use the virtual environment system for Python. To create a virtual environment, we run the following command:

virtualenv -p python3 env

After this, to activate the virtual environment, use the command given below:

source env/bin/activate

Now that the virtual environment is created and active, we can install libraries in it using the following command:

pip install requests pillow

This installs two libraries, requests and pillow, for URL retrieval and image processing, respectively. With that, our prerequisites are done. We can now download the software and get going. There are two ways to do so. First, if you have git installed, you can clone the repository at https://github.com/theSage21/handwritten using the following command:

git clone https://github.com/theSage21/handwritten

The other method does not require git, though it does require an unzipping tool such as unzip on Linux or WinZip on Windows. To go this way, use the link https://github.com/theSage21/handwritten/archive/master.zip to download the Zip file. Once it is downloaded, you can unzip the file using the following command on Linux systems (substitute the name under which the Zip file was saved):

unzip handwritten.zip

This will create a folder called handwritten, where the program will be (the folder may be named handwritten-master if you used the Zip file). Now, we navigate to the handwritten folder using the following command:

cd handwritten

We need to create a plain text file containing what we want written. So let’s use our favourite editor to open a blank text file, with one of the following commands:

vim example.txt

or

gedit example.txt

or

notepad example.txt

Let’s copy-paste our text into the file, and make sure that no line is longer than 100 characters. Next, let’s generate the individual lines using the following command:

python get_hand.py example.txt 0

Here, the first part invokes Python and instructs it to run the script named get_hand.py. The arguments given to the script are the name of the text file to be converted and the starting line number. If you do not provide the latter, it is assumed to be 0. This will create a folder called images in the handwritten folder and put each line’s handwritten text in there. The lines will be PNG images with the line numbers as their names. You need not worry about this as it is an intermediate step in the process. In case some line does not seem good enough, you can always regenerate it by specifying its line number and stopping the script once that line is done. Now we need to run the other script in the folder, called make_page.py, using the following command:

python make_page.py 20

Here, 20 denotes the number of lines to be used per page. In case it is omitted, the default value is 20. This script will create a folder called pages in the handwritten folder, and the pages of your text will be generated there as PNG images. They will be labelled with numbers denoting the page numbers. If you would like to use this multiple times, you must make sure that the images and pages folders do not contain anything before running the script. If they do contain something, it will be included in the generation of the handwritten text.

As an example use case, I had submitted one of my assignments using this software. Some of the pages of that assignment are shown in Figures 1 and 2.

Figure 1: Sample page 1

Figure 2: Sample page 2
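Two of the manual precautions mentioned in this article (keeping every line within 100 characters, and emptying the images and pages folders before a fresh run) can be automated with a few lines of Python. The sketch below is not part of the handwritten repository; the function names are hypothetical, and it assumes the scripts recreate the images and pages folders when they run, as described above.

```python
import shutil
import textwrap
from pathlib import Path

MAX_CHARS = 100  # line-length limit mentioned in the article


def wrap_text(text, width=MAX_CHARS):
    """Re-wrap each input line so that no output line exceeds `width` characters."""
    wrapped = []
    for line in text.splitlines():
        # textwrap.wrap returns an empty list for a blank line,
        # so blank lines are kept explicitly.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(wrapped)


def clear_output_folders(base=".", names=("images", "pages")):
    """Remove output folders left over from an earlier run, if they exist."""
    for name in names:
        # ignore_errors=True means a missing folder is not treated as a failure
        shutil.rmtree(Path(base) / name, ignore_errors=True)
```

One way to use it from inside the handwritten folder would be to read example.txt, pass its contents through wrap_text(), write the result back, and call clear_output_folders() before running get_hand.py.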

References
[1] Handwritten: http://theSage21.github.io/handwritten/
[2] Issues may be reported at https://github.com/theSage21/handwritten/issues/

By: Arjoonn Sharma
The author is currently pursuing his masters in computational sciences and is also doing research in machine learning. You can link up with him at github.com/theSage21 or arjoonn.blogspot.com



Overview For U & Me

The Pros and Cons of Open Source Programming Languages

You can always look forward to a good debate when the topic is ‘Open source programming languages versus licensed programming languages’. There are pros and cons to both sides of the argument. The author gives readers, particularly aspirants in the field of programming, insights into this age old question.

In the current tech-savvy scenario, we have become too busy to spend time on manual processes and hence try to get the system to perform tasks to the greatest extent possible. This means we must convey instructions to the system, using various languages. The programming language plays an important role in letting us manipulate the system or the machine. There are different programming languages currently in use, of which some are open source while others are proprietary.

Open source basically refers to a program whose source code is available to the public, free of cost, for use or for modification from its original design. It is developed by a combined effort, under which programmers improve upon the code and share the changes with the public. As per the Open Source Initiative (OSI), the source code of an open source computer program is made available free of cost to the public so that a larger group of programmers, who are not concerned about proprietary rights or financial gains, will produce a more useful product that can be utilised further by everyone.

Eligibility criteria

According to DFSG (Debian Free Software Guidelines), an open source programming language should fulfil the following criteria:
1. Free redistribution: An open source programming language should not restrict anyone from selling or giving away any component of it as part of an aggregate distribution containing components from several different sources. It should also not require any fee for such sale.
2. Source code: An open source programming language must include the source code. If some form of a product is not distributed with the source code, then there must be some means of obtaining the source code for no more than a reasonable reproduction cost, like downloading via the Internet without any charge. The source code must be the preferred form in which a programmer can modify the program. Obfuscated source code is not allowed.
3. Derived works: An open source programming language must allow modifications and derived works, and must permit them to be distributed under the same terms and conditions as those of the original programming language.
4. Integrity of the author’s source code: An open source programming language may restrict its source code from being distributed in modified form only if it allows the distribution of patch files with its source code for the purpose of modifying the program at build time. It must explicitly permit the distribution of software built from modified source code. It may also require derived works to carry a different name or version from the original software.
5. No discrimination against persons or groups: An open source programming language must not discriminate against any person or group of persons.
6. No discrimination against fields of endeavour: An open source programming language must not restrict anyone from making use of the program in a specific field of endeavour.
7. Distribution of licence: The rights attached to the open source programming language must apply to all to whom it is distributed, without the need for those parties to execute an additional licence.
8. The licence must not be specific to a product: The rights attached to the open source programming language must not depend on it being part of a particular software distribution. If it is extracted from that distribution and used or distributed within the terms of its licence, all parties to whom it is redistributed should have the same rights as those that are granted along with the original programming language distribution.
9. The licence must not restrict other software: An open source programming language must not place restrictions on other programming languages that are distributed along with the licensed programming language.
10. The licence must be technology-neutral: No provision of the licence may be predicated on any individual technology or style of interface.

Open source programming languages in the market

A flashback to the 19th century will reveal that the world’s very first program was written by Ada Lovelace for calculating Bernoulli numbers using the Analytical Engine. The culture of writing programs has existed since then and, subsequently, it has led to the development of various other programs that are used to perform complex mathematical calculations. According to Wikipedia, it was only in the 1940s that the first high-level programming language, Plankalkül, was designed by the German engineer Konrad Zuse to communicate instructions to the computer. John Mauchly’s Short Code was the first high-level language ever to be developed for an electronic computer. Today, developers and programmers have many programming language options that can be used to develop applications of their choice. Let’s take a look at a few such popular open source programming languages.

1. Java

Java is one of the world’s most influential programming languages developed so far, and it is mostly open source today. It is at the core of many Web and Windows based applications across platforms, operating systems and devices. This class-based, object-oriented programming language has a large number of features.

Figure 1: Open source programming languages in use, 2014-2015 (pie chart covering Java, Ruby, Python, C/C++ and JavaScript; data source: Lifehacker community)

Features of Java
• Is object-oriented.
• Allows us to create modular programs and reusable code.
• Is easily ported.
• Is platform-independent.
• Is easy to write, compile and debug.

2. PHP

PHP is on its way to becoming the most popular open source programming language. According to many industry leaders, PHP has emerged as the most user-friendly open source language; therefore, various open source packages such as Joomla and Drupal are built on it. It is also budget-friendly and, hence, PHP based solutions are being used by entrepreneurs and SMEs as well. Currently, many developers are making their debut on PHP, which clearly highlights its strong community base.

Features of PHP
• Cross-platform compatibility.
• No need to specify the data type for variable declarations, and predefined variables can be used.
• Availability of predefined error reporting constants.
• Supports extended regular expressions.
• Has the ability to generate dynamic page content.
• Allows the user to create, open, read, write, delete and close files on the server.
• Output can be in HTML, images, PDF, Flash, XHTML and XML file formats.
• Runs on various platforms such as Windows, Linux, UNIX, Mac OS X, etc.
• It is compatible with almost all servers being used currently.

3. Python

Python was developed by Guido van Rossum in the late 1980s and handed over to the non-profit Python Software Foundation, which now serves as the administrator of the Python language. It was one of the first programming languages that was easy for people to pick up quickly. It is open source and is free to use even for commercial applications. It is used as a scripting language, and programmers can easily produce readable and functional code in a very short period of time.

Features of Python
• Supports procedure-oriented as well as object-oriented programming (OOP). Python has a very powerful but simple way of implementing OOP.
• Good readability, with clear and simple syntax.
• Portable, therefore it can be interpreted on various operating systems, including UNIX-based systems, various Microsoft Windows versions, Mac OS, MS-DOS, etc.
• The source code is open for users to modify and reuse.
• Easy to learn compared to other such languages.
• It can be embedded in C and C++.
• It is a high-level language that does not need a separate compilation step to produce a binary.

4. Perl

Perl is still under active development even after 27 years. According to the Perl website, it is a stable, mature, powerful and portable language. It is highly efficient and feature rich. Perl 5 runs on over 100 platforms, which vary from portables to mainframes. It is suitable for both rapid prototyping and the large-scale development of various projects. Perl 5 and Perl 6 are parts of the Perl family of languages. However, Perl 6 is a separate language that has a different development team working on it. Perl is a dynamically typed, interpreted language which is comparable with both PHP and Python. It is ideal for processing and producing text data. It is highly portable and widely supported.

Features of Perl
• Has simple syntax, so it is easy to learn for novices as well.
• It is cross-platform.
• Is versatile and has a very comprehensive library of modules.
• Has a very powerful as well as flexible object-oriented programming syntax.
• Supports multiple platforms including Linux, Microsoft DOS, Windows, Apple Mac, and some mobile platforms as well.

How to select a programming language

When we think of choosing our first programming language, several points pass through our minds, and the many options available add to the confusion. There are several factors that can help us select the most suitable programming language.
1. Performance and efficiency: This is one parameter that all of us look at while evaluating any programming language. We then choose the one that most efficiently performs a specific task in the least time. In reality, no language is fast in itself. It is the efficiency of its compiler or interpreter that makes it fast. So we should choose accordingly.
2. Capability to address your specific requirements: We should choose the language that best addresses our specific problem. We cannot select a language just because it is popular or is highly efficient. It is of no use if it does not solve our problem.
3. Ease of learning, understanding and time taken for developing code: A programming language that takes less time for a coder to understand and learn, and to apply to a specific application, is better than one that is complex. Also, if it is easy to understand, it will take less time to code and, at the same time, it will be easy for peers to review and modify the code.
4. Elasticity and portability: It is always good to opt for languages that can easily be extended to support some new feature or functionality, without making too many changes. It is also advantageous if a programming language supports multiple platforms, middleware, databases and system management facilities, or if it can support these with very few adjustments.
5. Security: A programming language should include specific security measures in order to protect its code from malicious usage.
6. Popularity and support: If the language we select is popular, then it becomes easier for us to find reference material. Also, the chances of finding libraries for it are higher, compared to less popular languages. It is good if proper support is available, as this saves a lot of time.

Open source vs licensed programming languages

The benefits
• One of the most important reasons for preferring open source programming languages is that their source code can be customised efficiently to fit our needs and requirements, whereas in the case of a licensed programming language, even if the source code is made available by the owner, it can be customised only to the permissible extent.
• Open source programming languages are subject to constant peer review, so the bugs present in the program code can be easily found and fixed. Users from different programming backgrounds and countries can collaborate, unlike in the case of commercially developed programs, where only the developers of the original piece of code can actually change it.
• Open source programming languages also allow the translation of the source code from one language to another, whereas in the case of licensed programming languages, the developers of the source code may not allow that.
• Open source programming languages are free to try out before actual implementation, unlike licensed ones, where we have trial versions that are valid for a few days only.
• Open source programming languages are often regarded as more secure overall, since their code is open to public scrutiny.



A few disadvantages
• It sometimes becomes confusing when so many changes are made to the source code of an open source programming language by different users. Ultimately, this leaves us wondering about the version of the code we are using. In the case of licensed programming languages, the changes are made by the developers of the source code, and they clearly announce the current version of the code.
• Open source programming languages can also be tough to take on if we are very new to a certain language. If someone has worked on some previous code and has written messy, uncommented lines, it can be extremely confusing to unravel.
• It becomes difficult to find support for some open source programming languages if community users do not have a solution to your specific problem, whereas for licensed languages, the developers provide full-time support.

It is rather difficult to come to a definite conclusion on whether we need to opt for an open source programming language or a licensed option, since both have a few advantages and disadvantages. Although there are some significant pros associated with open source programming languages, like full access to the source code and complete customisation for whatever task we are trying to accomplish, we cannot disengage ourselves completely from licensed programming languages, since these are also useful at times when we face platform dependency or tool support issues. Before taking the plunge, we need to weigh the several factors we have discussed against our requirements while choosing a specific programming language.

Additional information

The Open Source Initiative (OSI) is a non-profit organisation that promotes open source products by certifying them with the Open Source Software trademark. Debian Free Software Guidelines (DFSG) is a set of guidelines initially designed as a set of commitments that Debian agreed to abide by, and this has been adopted by the free software community as the basis of the Open Source Definition.

References
[1] http://www.webopedia.com/
[2] http://www.wikipedia.org/
[3] http://lifehacker.com
[4] http://www.opensource.org/

By: Vivek Ratan The author is a B. Tech in electronics and instrumentation engineering. He works on various software automation testing tools and on Android application development. He is currently working as an automation test engineer at Infosys, Pune. He can be reached at ratanvivek14@gmail.com.



Overview For U & Me

Open Source Solutions that Accelerate Adoption of Cognitive Automation

The concept of the AI-enabled coworker is relatively new for many business enterprises. If used effectively, it promises to transform the way we operate, work or live. It has the potential to make many business processes faster, simpler and less error-prone. In light of the heavy investments involved, this article directs readers to the open source solutions available, to accelerate the pace of its adoption.

The modern enterprise evolved over the late nineteenth and early twentieth centuries. Since then, technology has always performed the role of a key enabler to expedite business innovation. Cognitive computing technologies seem to be yet another promising catalyst for enterprise transformation. These technologies aim at bringing about unprecedented levels of automation, and are poised to improve productivity across functions.

Cognitive systems attempt to simulate how humans think and learn. These systems imitate humans in learning from past experiences and use that knowledge for reasoning, making hypotheses, inferring, solving problems or making decisions. Combined with automation, these systems let enterprises automate even judgement-based activities that are a part of a business process. This way, they can augment human skill and expertise so that our time is used more effectively.

Smart machines or AI-enabled coworkers, as analysts fondly call these systems, can simplify and accelerate many business processes to a great extent, and facilitate business agility and innovation. They can help a marketing executive by analysing the customer base and identifying the right target segment for the next campaign. They may indicate probable customer churn and help to identify the key parameters causing it. They can also help a service engineer by providing a note of caution regarding a potential outage. Or they may assist a call centre executive in reducing the resolution time for a query. They have the potential to transform the way the business operates today.

The pillars of cognitive technologies

The evolution of cognitive technologies involves various streams of artificial intelligence (AI). Many cognitive technologies may be relevant for enterprises, including robotics, rules-based systems, computer vision, optimisation, planning and scheduling. However, in 2016 we expect the most important cognitive technologies in the enterprise software market to be:
• Machine learning – This is the ability of computer systems to improve their performance by exposure to data but without the need to follow explicitly programmed instructions. This is likely to be the most prevalent. It enhances a large array of applications, from classification to prediction, and from anomaly detection to personalisation.
• Natural language processing (NLP) – This technology enables computers to process text in the same way as humans, for example, extracting meaning from text or even generating text that is readable, stylistically natural and grammatically correct. It has many valuable applications when incorporated in software that analyses unstructured text.
• Speech recognition – This is the ability to automatically and accurately transcribe human speech, and the technology that enables this is useful for applications that may benefit from hands-free modes of operation.
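To make the machine learning definition above a little more concrete, here is a minimal Python sketch in which the program is never told the rule that links input to output; it estimates the rule from example data by a least-squares line fit. The data values and the names fit_line, spend and sales are invented purely for illustration, and real projects would normally rely on an established library rather than hand-written code like this.

```python
# Minimal sketch: "learning" a prediction rule from data instead of coding it.
# All names and numbers here are hypothetical, chosen only for illustration.

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to the given points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented example data: campaign spend versus resulting sales
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(spend, sales)


def predict(x):
    """Predict sales for a spend value not seen in the data."""
    return slope * x + intercept
```

Feeding the function more or better data changes the fitted slope and intercept, and hence the predictions, without any change to the code itself. That is the sense in which such a system improves by exposure to data.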

Enterprise adoption

Many leading software companies have already discovered the potential of cognitive technologies to improve the core functionality of their products, generate new and valuable insights for customers, and improve business operations through automation. These benefits are simply too compelling for software companies to ignore. Analysts predict that by the end of 2016, a majority of the world’s largest enterprise software companies (by revenue) will have integrated cognitive technologies into their products. According to the analysts, adoption will increase steadily in the coming year.

Following this trend, some business software companies have developed AI capabilities in-house, but many others are acquiring capabilities through mergers and acquisitions. This trend is expected to continue in 2016. Strong support from venture capital investors is also helping to further commercialise this market. Since 2011, US-based startups that develop or apply cognitive technologies to enterprise applications have raised nearly US$ 2.5 billion, suggesting that the biggest near-term opportunity for cognitive technologies is in using them to enhance business practices.

Implementation challenges

The capabilities around cognitive computing and the market’s expectations of them are driving the need for a new class of supercomputer systems, which can enable the synergy of advanced analytics and Big Data technology. However, there are challenges that we need to be aware of. Cognitive computing technologies are inherently quite complex. The complexities imply steep learning curves and, hence, increased turnaround times. Building capabilities around these systems requires a considerable amount of time and effort. There are many solutions involving these technologies from various leading vendors. Many of the platforms and frameworks, however, are extremely expensive and call for heavy infrastructure. Implementation of cognitive automation calls for huge amounts of investment.

Open source tool sets—a probable solution

Till recently, cognitive computing was confined more to the academic world. That could be the reason why a huge number of open source tools and libraries have evolved around cognitive computing and related technologies. Today, a wide range of solutions, along with a huge knowledge and code base, are available in the open source domain. This rich repository can enable enterprises to learn and experiment with these technologies, thus increasing their reach. These solutions will allow enterprises to create a quick prototype around cognitive computing. This way, enterprises can check the viability of the underlying idea and get quick feedback about it. Some of the popular open source solutions in the area of cognitive computing are:
• R: R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical (linear and non-linear regression, classical statistical tests, time-series analysis, classification, clustering, etc) and graphical techniques. It is highly extensible. The R language is often the vehicle of choice for research in statistical methodology, and R provides an open source route to participation in that activity.
• Python: While R is specifically created for statistical analysis, Python also has a rich set of machine learning implementations. It is widely used among the scientific community. Being an interpreted, high-level programming language, Python is a good fit for machine learning implementation, as quite often, this calls for an agile and iterative approach.
• Apache Mahout: This provides an environment for quickly creating scalable machine learning applications.
• H2O: H2O is for data scientists and application developers who need fast, in-memory, scalable machine learning for smarter applications. H2O is an open source parallel processing engine for machine learning.
• RapidMiner: RapidMiner is a platform that provides an end-to-end development environment for machine learning. Through a wizard-driven approach, RapidMiner allows the user to rapidly build a predictive analytics model.

Points to ponder

To check the viability of a business case, it would be a good idea to first take baby steps rather than taking a huge plunge. Using open source solutions, enterprises can avoid upfront investments and check the viability of various cognitive technologies. To check the viability, first we need to identify the right use case. We should have a thorough understanding of the business problem it attempts to solve. This has to be followed by the right selection of the technology solution suitable for implementation. As seen in the previous section, a wide range of cognitive computing solutions are available as open source. Each solution has evolved to cater to a specific set of business problems. Each one of them has its own target user group. So to get the desired result, we need to first have a clear understanding of the problem at hand. Based on that understanding, we need to choose the right tool set to best solve that type of business problem. A quick prototype on a small, selected set of use cases, implemented with the right choice of open source solutions, would enable the enterprise to check the viability of the cognitive automation initiative.

By: Sanghamitra Mitra
The author has completed her post graduation from the Indian Statistical Institute. She has been working in the software industry for more than 15 years and is currently a senior technical architect at Capgemini. Her current focus area is solving business problems with cognitive computing and automation.



Let’s Try OpenGurus

Python Programming for Digital Forensics and Security Analysis

One of the many uses of the versatile Python programming language is in digital forensics and security analysis. This article covers various aspects like socket programming, port scanning, geo-location and extraction of data from websites like Twitter.

Python is a powerful programming language used in key domains like cloud computing, Big Data analytics, network forensics, mobile app development, Web development and many others. Python has been in use for more than two decades. It supports multiple programming paradigms, including imperative, functional, procedural and object-oriented. Nowadays, Python is widely used for a variety of high performance computing applications by a number of corporate giants including Microsoft, Google, Red Hat, IBM, Amazon and many others. Python is free and open source, and implementations and interfaces exist for many other languages and platforms. Table 1 lists Python implementations, along with the platform or language each one supports.

Installing Python

Python is available in two versions: Python 3.5 and Python 2.7 (https://www.python.org). Either of these can be downloaded, depending upon your requirements and type of application. Python code can be written in an IDE or in any text editor. A Python program is saved with a .py extension and executed at the command shell interface with the following commands:

$ python <filename.py> (For Linux)
DriveLetter:\(Path-To-Python)>python <filename.py> (For Windows)

Table 1

Python implementation    Supporting platform and language
IronPython               .NET Framework
CPython                  C
Jython                   Java
MicroPython              Microcontrollers
PyPy                     Just-In-Time Compiler




Figure 1: Python download page from the official portal

Figure 2: Executing Python code at the Windows command shell interface

Figure 3: Python IDLE environment

Figure 2 depicts the execution of Python code on a system in which Python 2.7 is installed on drive E: of the Windows OS. IDE-based programming with Python can use any IDE to write, debug and execute the code. The following Python IDEs provide a graphical user interface for easy programming: IDLE, Koding, Eric, Komodo IDE, PIDA, MonoDevelop, Spyder, PyScripter, Stani’s Python Editor, PythonAnywhere, Understand, IntelliJ IDEA, Anjuta, Geany, Ninja-IDE, KDevelop, PyCharm, PyDev, SourceLair, Python Tools for Visual Studio, Pyzo and Thonny.

Digital forensics using Python programming

Whenever the topics of digital forensics, cyber security and penetration testing are discussed, professionals generally depend on a number of third party tools and operating systems. Kali Linux, Metasploit, Parrot Security OS and many other tools are used for digital forensics. These tools come with built-in applications, which users often deploy without real knowledge of their internal architecture and algorithmic approach. Python is a widely used programming language for cyber security, penetration testing and digital forensic applications. Using core Python programming, any of the following can be performed without using any other third party tool:
• Web server fingerprinting
• Simulation of attacks
• Port scanning
• Website cloning
• Load generation and testing of a website
• Creating intrusion detection and prevention systems
• Wireless network scanning
• Transmission of traffic in the network
• Accessing mail servers…
• …and many other implementations related to digital fingerprinting and security applications

Figure 4: Fetching an IP address of a website from a URL

Figure 5: Fetching IP addresses associated with the local system

Socket programming

Socket programming is built into Python, as it is in Java. To work with sockets, first import the socket package and then call the relevant methods. The Python installation also comes with the built-in IDLE GUI.
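The lookups captioned in Figures 4 and 5 can be approximated with two calls from the standard socket module. This is only a minimal sketch, not the article's own code: the function names resolve_host and local_addresses are made up for illustration, and the results depend on how name resolution is configured on the machine running it.

```python
import socket


def resolve_host(hostname):
    """Return the IPv4 address that `hostname` resolves to, as a string."""
    # gethostbyname() accepts a host name or a dotted IPv4 literal.
    return socket.gethostbyname(hostname)


def local_addresses():
    """Return the IPv4 addresses registered for this machine's host name."""
    try:
        # gethostbyname_ex() returns (hostname, alias_list, ip_address_list).
        _name, _aliases, addresses = socket.gethostbyname_ex(socket.gethostname())
    except OSError:
        # The local host name may not be resolvable on every system.
        return []
    return addresses
```

Calling resolve_host with a public host name needs network access and a working DNS resolver; an address literal such as 127.0.0.1 is returned unchanged.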

Network port scanning

Generally, the nmap tool is used for network port scanning, but with Python socket programming, it can be implemented without any third party tool. In Kali Linux, there are many tools available for digital forensics related to networks, but many of these implementations can be done using Python programming with just a few lines of instruction. The code for port scanning of any IP address can be downloaded from http://opensourceforu.com/article_source_code/sept16/digital_forensic.zip. The code checks which particular ports are open from the PortList [20, 22, 23, 80, 135, 445, 912]. Each value in the PortList corresponds to a particular service associated with the network.
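The downloadable script is not reproduced in the article, so the following is only a rough sketch of how such a TCP connect scan might look with the standard socket module. The port list is the one quoted above; the function name scan_ports and the timeout value are assumptions made for illustration.

```python
import socket

# Port list quoted in the article text.
PORT_LIST = [20, 22, 23, 80, 135, 445, 912]


def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 on success and an error code otherwise,
            # so refused or timed-out connections do not raise an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

For instance, scan_ports("127.0.0.1", PORT_LIST) checks the local machine. Scan only hosts that you own or are authorised to test.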

Geolocation extraction

The approximate location of an IP address can be looked up in Python using the pygeoip module together with a GeoIP database. First of all, download the GeoIP database from the URL http://dev.maxmind.com/geoip/legacy/geolite/. Once the database is loaded and mapped with the Python installation, the location registered for any IP address can be looked up.

Figure 6: Downloadable databases for GeoIP mapping

Figure 7: Download links for GeoIP databases with IPv6 compatibility

>>> import pygeoip
>>> myGeoIP = pygeoip.GeoIP('GeoIPDataSet.dat')
>>> myGeoIP.country_name_by_addr('<IP Address>')
'United States'

To look up the country, use the following commands:

>>> myGeoIP = pygeoip.GeoIP('GeoIPDataSet.dat')
>>> myGeoIP.country_code_by_name('google.com')
'US'
>>> myGeoIP.country_code_by_addr('<IP Address>')
'US'
>>> myGeoIP.country_name_by_addr('<IP Address>')
'United States'

To look up the city, use the following commands:

>>> myGeoIP = pygeoip.GeoIP('GeoIPCity.dat')
>>> myGeoIP.record_by_addr('<IP Address>')
{
    'city': u'Mountain View',
    'region_code': u'CA',
    'area_code': 550,
    'time_zone': 'America/Los_Angeles',
    'dma_code': 807,
    'metro_code': 'San Francisco, CA',
    'country_code3': 'USA',
    'latitude': 38.888222,
    'postal_code': u'94043',
    'longitude': -123.37383,
    'country_code': 'US',
    'country_name': 'United States',
    'continent': 'NA'
}
>>> myGeoIP.time_zone_by_addr('<IP Address>')
'America/Los_Angeles'

Real-time extraction from social media

Live, real-time data from social media platforms can be downloaded using Python scripts. In Python, there are many modules and extensions with which interfacing with WhatsApp, Twitter, Facebook, LinkedIn and many other platforms can be done.

Python package index (PyPI)

PyPI (https://pypi.python.org) is a large software repository of Python packages for interfacing with other platforms. PyPI is freely available for Python developers without any licensing or subscriptions. You can download the Python code for the three actions given below from http://opensourceforu.com/article_source_code/sept16/digital_forensic.zip:
1. Fetching the list of followers of any Twitter user
2. Fetching the Twitter timeline for any user name
3. Real-time extraction of live tweets from Twitter

By executing the Python code downloaded from the above link, the live, real-time discussion on any topic or keyword can be fetched. For example, the real-time discussion on the word ‘India’ can be fetched along with the details of the users involved in the transmission and distribution of tweets. The user data will include the user name, device, tweet, followers’ list, timestamp of the tweet, platform used, etc.

By: Dr Gaurav Kumar

The author is the MD of Magma Research and Consultancy Pvt Ltd, Ambala. He is associated with a number of academic institutes, where he delivers lectures and conducts technical workshops on the latest technologies and tools. You can contact him at kumargaurav.in@gmail.com or www.gauravkumarindia.com.



For U & Me Interview

“Docker is a boon to developers”

Containers are becoming vital for the development of software applications. While several container technologies are available in the market today to ease the software development process, Docker is leading the popularity race. In an interview with Jagmeet Singh of OSFY, Docker captain Neependra Khare highlights the advantages of using containerised applications. He also mentions India’s role in the growth of the Docker community. A few edited excerpts…

Neependra Khare, Docker captain

Q How does Docker make open source software development easy?
With container technology, we confine the application with all its dependencies such that it can run independently on different environments, be it the desktop, virtual machines or the cloud. We just need to have a container runtime like Docker installed on the host system. Containers have been around for quite some time, but Docker and its tools made them more popular. Docker started a revolution and others joined in as well, which resulted in a very vibrant ecosystem around it. Now containers are becoming a basic unit of development and deployment. The same image from which we run a container is shared at different stages of the software life cycle. Because everyone from a developer to an operations engineer works on top of the same image, we don’t get complaints such as, “It doesn’t work in my environment.” Docker has a concept of the registry, which is similar to GitHub. This feature allows us to share images. One can share pre-built images on the registry, while others can consume them with just one command. This is of great help for anyone using Docker, especially developers.

Q So Docker started a revolution in the world of containers. How can it be a vital tool for developers?
Docker is a boon to developers. It helps them to quickly prototype ideas. Using Docker Registry, developers can easily share their application’s image. With tools like Docker Compose, a multi-tier application can be defined and deployed easily, which hides all low-level details from end users. Developers can simultaneously run different versions of software on the same machine without any interference. This helps them with faster debugging. Docker and other related tools also let developers have a near production-like environment on their local machines, which helps them avoid some of the deployment issues.

Q What is the Docker community in India like?
In India, we have around 15 Docker groups which meet at regular intervals. Meetups help participants share their knowledge on various topics as well as let them do training sessions and hackathons. The Docker Bengaluru group is one of the most active and vibrant around the world. Though we have a big user base in India, we don’t have many contributors to the project. We hope to see more contributions to the Docker project from India.

Docker team at Bengaluru

Docker conference at Bengaluru, India

Q Why should one be a part of the Docker community?
First and foremost, it is a great place to learn about Docker and the ecosystem around it. Containers are going to become the default standard for carrying out development and deployments. Sooner or later, one has to learn about them. Being early adopters would definitely help individuals to grow professionally. Also, by being part of the community, one can learn while having fun.

Q Do you think understanding cloud technologies plays a pivotal role for any developer, sysadmin or emerging DevOps professional?
Yes, it does. Basic understanding of cloud technologies is a must for everyone. These days we use apps for booking a taxi, getting a doctor’s appointment and learning online courses. All of these things are possible at scale because of some form of cloud technology. With on-demand and pay-as-you-go models of cloud technologies, companies can address their customers’ requirements faster and at a lower cost. And as a developer, or someone from the QA or systems administration teams, one has to have an understanding of recent cloud technologies to build an effective solution. With the DevOps culture, smaller teams which consist of developers, QA and Ops guys work together. They also need to have a basic understanding of how everyone else in the team uses the cloud and what their pain points could be.

Catch the full conversation on our website: http://opensourceforu.com.




TIPS & TRICKS

Searching for the repository name locally in Ubuntu

It’s difficult to remember the required package names for various applications in Ubuntu. We can search for them in the cache generated after an apt-get update. The usage is:

apt-cache search [search term(s)]

Here’s an example:

apt-cache search python game binding

The above command will search for the terms ‘python’, ‘game’ and ‘binding’ in the package names and short description strings, and list the matching packages. You can install the required one by using the ‘sudo apt-get install [name]’ command.

By: Sricharan Chiruvolu, sricharanized@gmail.com

Improving the formatting of JSON in GNU/Linux

JSON is a standard format. We can store JSON in a string, but if the string is large, the JSON looks very cryptic and becomes very difficult to search or edit. Web browsers and editors provide many add-ons and tools to format JSON better. But doing it from the command line is more efficient, as this can be done quickly and we can automate it as well. Let us suppose we have a JSON string like what’s shown below:

'{"DISTRIB_ID" : "Ubuntu","DISTRIB_RELEASE": 15.04,"DISTRIB_CODENAME": "vivid","DISTRIB_DESCRIPTION": "Ubuntu 15.04"}'

…and we want to search or edit something in it, but it looks cryptic. It would be easier if we could format this JSON better. We can do it very easily using the Python interpreter, which provides a JSON module for this purpose. For instance, the following simple command converts the JSON into a readable format:

[bash]$ echo '{"DISTRIB_ID" : "Ubuntu","DISTRIB_RELEASE": 15.04,"DISTRIB_CODENAME": "vivid","DISTRIB_DESCRIPTION": "Ubuntu 15.04"}' | python -m json.tool

The output is:

{
    "DISTRIB_CODENAME": "vivid",
    "DISTRIB_DESCRIPTION": "Ubuntu 15.04",
    "DISTRIB_ID": "Ubuntu",
    "DISTRIB_RELEASE": 15.04
}

By: Narendra Kangralkar, narendrakangralkar@gmail.com
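The same JSON formatting can also be done from inside a Python script rather than through a shell pipe. The snippet below is a small sketch using the standard json module; the function name prettify and the sample string are illustrative, and sort_keys=True is used so that the keys come out in alphabetical order.

```python
import json

# Sample input: a compact JSON string like the one used in the JSON tip.
RAW = ('{"DISTRIB_ID" : "Ubuntu","DISTRIB_RELEASE": 15.04,'
       '"DISTRIB_CODENAME": "vivid","DISTRIB_DESCRIPTION": "Ubuntu 15.04"}')


def prettify(raw_json, indent=4):
    """Parse a JSON string and return it re-serialised in readable form."""
    parsed = json.loads(raw_json)  # raises ValueError on malformed input
    return json.dumps(parsed, indent=indent, sort_keys=True)
```

Because the string is parsed first, malformed JSON is reported as an error instead of being passed through, which makes the function usable as a quick validity check as well.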

Detecting the operating system of a remote host

Many a time, we need to know the details of a remote system. xprobe2 is a remote active operating system fingerprinting tool that can fetch the details we need. Let us first install xprobe2 as follows:

#sudo apt-get update
#sudo apt-get install xprobe2

After successful installation, run the following command:

#sudo xprobe2 <host/ip address>

For example:

#sudo xprobe2 192.168.0.10

The tool works only if ICMP is not blocked, i.e., ping is not blocked on the system that is to be tested.

#sudo xprobe2 -B xyzxyz.com

Here, the -B option makes the TCP handshake module try to guess an open TCP port on the target. You can also use nmap for getting the OS details of a remote system, as follows:

#sudo nmap -O <ip addr of Host>

Here, -O enables OS detection. If the OS doesn’t get detected, then use the following option to guess the OS:

#sudo nmap -O --osscan-guess 192.168.61.2

Here, --osscan-guess makes nmap guess the OS more aggressively.

By: Rupin Puthukudi, rupinmp@gmail.com

Easy-to-read grep output

When we use grep to filter text (e.g., Web logs, source code or program output), the pattern we are looking for might be anywhere in the matched lines. Hence, the output can be a little difficult to read when we look for exactly where the match occurred. In this case, we can use the following command:

#grep --color=always PATTERN

This shows the output with the matching characters in red, by default, which is easy to read. We can customise the colours with the GREP_COLOR and GREP_COLORS environment variables, as shown below:

export GREP_COLOR="01;34"
grep --color=always int SomeCProgram.c

This will show all int strings in SomeCProgram.c in blue. With GREP_COLORS, we can customise even further, like have a different colour for file names and a different colour for line numbers. For complete information beyond this small tip (what exactly the colour codes are, what else can be coloured, what other customisations are possible, etc), I would suggest searching the Web for ‘grep with colour’.

By: Prem Ranjan, ranjan_september@yahoo.com

Finding and getting rid of big files

A common problem with computers is having a number of large files (such as audio/video clips) that you want to get rid of. You can find the biggest files in the current directory only, with the following command:

#ls -lSrh

The S sorts the files by size, the r causes the large files to be listed at the end, and the h gives human readable output (MB and such). You can search for the biggest MP3/MPEG files, using the following command:

#ls -lSrh *.mp*

You can also look for the largest directories with:

#du -kx | egrep -v "\./.+/" | sort -n

You can find the biggest files in your home directory (in the whole directory structure), using the command given below:

#find ~ -type f -exec ls -s {} \; | sort -n

To list only the top 10 biggest files, use the following command:

#find . -type f -exec ls -s {} \; | sort -nr | head -10

Hope this simple tip will help you address this common problem.

By: Pallavi Rawat, pallavifirst@rediffmail.com
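For readers who prefer a script to the find pipeline in the big-files tip, here is a rough Python equivalent built on os.walk. It is only a sketch: the function name largest_files is made up for illustration, symbolic links are skipped to mirror find's -type f test, and sizes are reported in bytes rather than in the disk blocks that ls -s prints.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links, like find's -type f
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; ignore it
    # nlargest() avoids sorting the full list when only a few are wanted.
    return heapq.nlargest(count, sizes)
```

Calling largest_files(os.path.expanduser("~"), 10) covers the same ground as the top-10 command in the tip, although the ordering can differ slightly because file size and allocated blocks are not always proportional.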

Share Your Linux Recipes!
The joy of using Linux is in finding ways to get around problems: take them head on, defeat them! We invite you to share your tips and tricks with us for publication in OSFY so that they can reach a wider audience. Your tips could be related to administration, programming, troubleshooting or general tweaking. Submit them at www.opensourceforu.com. The sender of each published tip will get a T-shirt.



OSFY DVD

DVD OF THE MONTH
Enjoy a cool Linux distro on your computer.

September 2016

Recommended System Requirements: P4, 1GB RAM, DVD-ROM Drive

Note: Any objectionable material, if found on the disc, is unintended, and should be attributed to the complex nature of Internet data.

In case this DVD does not work properly, write to us at support@efy.in for a free replacement.

CD Team e-mail: cdteam@efy.in

What is a live DVD?
A live CD/DVD or live disk contains a bootable operating system, the core program of any computer, which is designed to run all your programs and manage all your hardware and software. Live CDs/DVDs have the ability to run a complete, modern OS on a computer even without secondary storage, such as a hard disk drive. The CD/DVD directly runs the OS and other applications from the DVD drive itself. Thus, a live disk allows you to try the OS before you install it, without erasing or installing anything on your current system. Such disks are used to demonstrate features or try out a release. They are also used for testing hardware functionality, before actual installation. To run a live DVD, you need to boot your computer using the disk in the ROM drive. To know how to set a boot device in BIOS, please refer to the hardware documentation for your computer/laptop. To use the image available in the other_isos folder, you either need a drive that can create or ‘burn’ DVDs or a USB Flash drive at least as big as the image. Or, you can use it directly with any virtualisation software like VirtualBox.

PCLinuxOS FullMonty KDE64 2016 Desktop (Live)

PCLinuxOS is a free, easy-to-use Linux-based operating system for x86_64 desktops or laptops. This safe and secure operating system comes with a complete Internet suite for surfing the Net, sending and receiving email, instant messaging, blogging, tweeting and watching online video. The live DVD bundled with OSFY can also be installed on your computer. The LiveDVD mode lets you try PCLinuxOS without making any changes to your computer. The OS is designed to be used by beginner, intermediate or advanced users. It comes with kernel 4.4.4 LTS with the full KDE 4 Desktop.

PCLinuxOS LXQT 2016 Desktop (Community)

Community releases of PCLinuxOS ISOs are created by the community, for the community and are available in various default desktop environments. The bundled DVD has the ISO of PCLinuxOS with the LXQt desktop environment. LXQt is the Qt port and the upcoming version of LXDE, the Lightweight Desktop Environment. It is focused on being a classic desktop with a modern look and feel. You can find the live, bootable ISO images available in the other_isos folder on the root of the DVD.


www.efyexpo.com

Exhibition & Knowledge Partner

March 2-4, 2017. BIEC. Bengaluru

India’s Electronics Manufacturing Show

MAKE. BUY. SELL. INVEST.


For more information, talk to us at +91-11-40596605 or email at efyexpo@efy.in


Open source for you september 2016  