Geospatial World April 2013

Page 32

Cover Story  |  Big Data

commercial applications. “As a free open data stream, we almost overnight created $100 billion in value to the marketplace,” he says. He also noted how the US Weather Service, first launched in the Smithsonian Institution, provided an Open Source approach to the collection and reporting of weather data from across the country, another amazing “big data stream.”

Carson J.Q. Farmer et al., in the proceedings of GIScience in the Big Data Age, write that geospatial analytics in big data requires “new approaches that are flexible, non-parametric, computationally efficient and able to provide interpretable results for modelling dynamic and non-linear processes in data rich situations”. This is what a combination of NoSQL and Hadoop can do, as illustrated by the IBM model of the Netezza analytics platform and Oracle’s Integrated Software Solutions Stack.

Figure: Oracle Integrated Software Solution Stack

Until now, geospatial Big Data analytics has focussed on data visualisation and descriptive analysis. GIScience needs to move away from this approach towards a model-centric approach, which stresses the underlying spatial processes rather than addressing the data bottleneck. According to IDC, data stream volumes surpassed storage capacity as early as 2007, and this gap cannot be bridged. The need is for runtime analysis of stream data. The focus therefore shifts from the database to a model dictionary that is applied to streaming data to detect the states of the environment and to highlight normal and abnormal conditions. Stream processing can also be used to tune model parameters and to store useful data samples.

Compute-intensive operations such as image processing of high-resolution satellite imagery can be performed in parallel on the Oracle Big Data Appliance. Multiple systems can be cabled together into very large Hadoop clusters that process massive numbers of images in parallel. The high-speed connection between Oracle Big Data Appliance and Oracle Exadata (via InfiniBand) enables geospatial applications to use the appropriate platform (batch processing and preprocessing vs. transactional) for a given task.

Another common use case is the extraction of location-related semantics from unstructured documents. Many types of social media are highly unstructured and riddled with ambiguous terms. Nevertheless, they commonly refer to important concepts like names, places, times, etc. While human readers might be able to infer and reconcile these ambiguities, machines cannot, at least not without some form of pre-processing. Applying natural language processing (NLP) to unstructured media enables developers to generate semantic indexes of terms from these sources.
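As a rough illustration of what such a semantic index might look like, the sketch below scans short posts for place names and records which posts mention each one. It is a minimal sketch, not a description of any vendor's pipeline: the toy gazetteer, the sample posts and the function names `extract_places` and `build_semantic_index` are all assumptions made for this example, and a production system would rely on a trained named-entity recogniser rather than plain string matching.

```python
import re
from collections import defaultdict

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
GAZETTEER = {
    "paris": (48.8566, 2.3522),
    "new delhi": (28.6139, 77.2090),
    "london": (51.5074, -0.1278),
}


def extract_places(text, gazetteer=GAZETTEER):
    """Return the gazetteer names found in text (case-insensitive, whole words)."""
    lowered = text.lower()
    found = []
    for name in gazetteer:
        # \b keeps "paris" from matching inside a longer word.
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            found.append(name)
    return sorted(found)


def build_semantic_index(posts, gazetteer=GAZETTEER):
    """Map each place name to the ids of the posts that mention it."""
    index = defaultdict(list)
    for post_id, text in posts.items():
        for name in extract_places(text, gazetteer):
            index[name].append(post_id)
    return dict(index)


# Made-up sample posts; post 3 is deliberately ambiguous.
posts = {
    1: "Stuck in traffic near London Bridge again!",
    2: "Flying from Paris to New Delhi tonight",
    3: "Paris Hilton spotted at the premiere",
}

index = build_semantic_index(posts)
for place, post_ids in sorted(index.items()):
    print(place, GAZETTEER[place], post_ids)
```

Post 3 is indexed under "paris" even though it names a person, which is the kind of ambiguity that naive matching cannot resolve. The coordinates attached to each term are what would allow the index to be joined to records in a spatial database.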
This “structuring” process reduces the ambiguity of social media and enables linking with the more conventional structured relational content commonly found in spatial databases and GIS. This type of social media analysis is now becoming commonplace, and is increasingly referred to as “social media analytics” or Big

