
Gleaning Knowledge from Big Data for Connected Context Computing
Jane Hsu
National Taiwan University
Intel-NTU CCC Center
July 18, 2013

Intel-NTU Connected Context Computing Center


Outline
•  Connected Context Computing
   –  Motivation
   –  Vision
•  The IoT Data Challenges
•  From Data to Knowledge
•  Conclusion


Technologies Today

Ubiquity

Portability

Productivity


By 2015…
•  More Users: >1 billion more netizens¹
•  More Devices: >15 billion connected devices²
•  More Data: >>1 zettabyte of Internet traffic³

Sources (in the order listed on the slide):
IDC “Server Workloads Forecast” 2009.
IDC “The Internet Reaches Late Adolescence” Dec 2009, extrapolation by Intel for 2015.
ECG “Worldwide Device Estimates Year 2020 - Intel One Smart Network Work” forecast.
http://www.cisco.com/assets/cdc_content_elements/networking_solutions/service_provider/visual_networking_ip_traffic_chart.html, extrapolated to 2015.
Gartner June 2010, CAGR from 2009 → 2014.


Dream for The Future

Needed: Smart Machines to Work Together


A Global Network of Things



Instrumented and Connected Devices

Temperature

Traffic

Water Flow

Electricity

How do we use and manage the devices?


The Lifecycle of Data

[Figure: the data lifecycle, with the stages Generate, Transport, Store, Compute, Apply, and Protect]

* Slides adapted from a presentation at Asia Academic Forum 2012 by Dr. Wen-Hann Wang


Vision: Connected Context Computing
To design end-to-end solutions for intelligent interaction and secure information sharing amongst a multitude of connected devices that
•  efficiently sense data
•  effectively communicate data
•  collaboratively analyze the context, and
•  proactively serve their users


Outline
•  Connected Context Computing
•  The IoT Data Challenges
   –  Greener Buildings
   –  Smarter Agriculture
   –  Safer Transportation
•  From Data to Knowledge
•  Conclusion


The IoT Data Challenge
•  Thousands in diversity
•  Trillions in scale
•  Complex in interdependency
•  Continuous in time
•  Distributed in space
•  Dynamic, noisy, and unreliable in nature

[Figure labels: Cloud, Big Data, ?]


Data to Decision, Info to Insight
Data sources: Medical Scans, Social Media, News & Journals, Satellite Images, E-commerce, TV & Video, Sensors & Surveillance
Insights: Business intelligence, Biodiversity trends, Virtual travel guides, Language translation, Health monitoring, Augmented reality, Video analytics, Extreme weather prediction

[Figure labels: Transport, Compute, Insight]


Greener Buildings



Power Monitoring



Appliance State Monitoring & Control



Video-based Activity Recognition



MicroGrid of Smart Homes Private Cloud


Ref Source: http://austinspc.com/2011/04/10/what-is-a-microgrid/


Contexts for Greener Buildings
•  Activity recognition
•  Appliance states
•  Power consumption
•  Temperature
•  Humidity
•  Light
•  Air (CO2)
•  Sound
•  Indoor location

Data → Features → Context → Service
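
The Data, Features, Context, Service chain on this slide can be read as a small processing pipeline. The following is only a minimal sketch of that reading: the `Reading` fields, the thresholds, and the service name `suggest_power_off` are hypothetical and are not taken from the center's system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """One raw sensor sample (hypothetical fields for illustration)."""
    power_watts: float
    co2_ppm: float


def extract_features(readings):
    """Data -> Features: summarize a window of raw readings."""
    return {
        "mean_power": mean(r.power_watts for r in readings),
        "mean_co2": mean(r.co2_ppm for r in readings),
    }


def infer_context(features):
    """Features -> Context: hand-set, hypothetical thresholds."""
    return {
        "occupied": features["mean_co2"] > 600.0,
        "appliances_on": features["mean_power"] > 50.0,
    }


def choose_service(context):
    """Context -> Service: pick an energy-saving service to offer."""
    if context["appliances_on"] and not context["occupied"]:
        return "suggest_power_off"
    return "no_action"


window = [Reading(120.0, 420.0), Reading(110.0, 430.0)]
service = choose_service(infer_context(extract_features(window)))
print(service)
```

In a deployed system the context step would more plausibly be a learned model than fixed thresholds; the thresholds here only keep the sketch short.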


Smarter Agriculture



Automatic Greenhouse
[Figure labels: Water curtain, Heater, Exhaust, Automatic frames]


Smart Greenhouse
[Figure labels: 12 fixed nodes, 8 mobile nodes, M2M networking (fixed + mobile nodes), Gateway, Real-time data inquiry interface, Visualization interface, Automatic greenhouse]

Objectives
•  Scalability (unlimited number of sensor nodes in a single PAN)
•  Robustness (dynamic topology, routing, and localization)
•  Heterogeneity (ZigBee + WiFi + different devices)
•  Smart service (light, irrigation, and inspection)


Temperature/Humidity Map by DADSM2M
•  Map estimated by 29 real sensor readings
•  Map estimated by 9 real sensor readings (MAE: 0.74)
•  Map estimated by 9 real sensor readings and 9 synthetic UD readings (MAE: 0.43)
•  Map estimated by 9 real sensor readings, 9 synthetic UD readings, and 50 synthetic random readings (MAE: 0.46)
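
The maps on this slide are compared by mean absolute error (MAE). The slide does not say which reference the error is measured against, so the sketch below simply assumes a reference map and an estimated map sampled at the same grid points. The numbers are made up for illustration and are not the slide's data.

```python
def mean_absolute_error(estimated, reference):
    """Average absolute difference between two equally sized maps,
    each given as a flat list of values at the same grid points."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("maps must be non-empty and the same size")
    total = sum(abs(e - r) for e, r in zip(estimated, reference))
    return total / len(estimated)


# Hypothetical temperatures (degrees C) at three grid points.
reference_map = [25.5, 26.0, 24.5]
estimated_map = [25.0, 26.5, 24.0]
print(mean_absolute_error(estimated_map, reference_map))
```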



Crowd Sensing



Contexts for Smarter Agriculture
•  Weather
•  Sunlight
•  Pest population
•  Battery level
•  Price
•  Energy supply
•  Energy costs
•  Demand/yield

Data → Features → Context → Service


Safer Transportation



Intra-Vehicle Sensing



Inter-Vehicle Sensing



Distributed Environment Sensing –  Road-side unit (RSU) design for inter-vehicle map reconstruction



Online Traffic Information



Contexts for Intelligent Transportation
•  The vehicle in front is turning
•  The vehicle in front is braking
•  Vehicle location
•  Crossroads
•  Traffic
•  Long weekend
•  Distraction
•  Sleepy or drunk driver

Data → Features → Context → Service


Outline
•  Connected Context Computing
•  The IoT Data Challenges
   –  Greener Buildings
   –  Smarter Agriculture
   –  Safer Transportation
•  From Data to Knowledge
•  Conclusion




Not So Intelligent Transportation

http://www.youtube.com/watch?v=U_MnvHwjcIQ 



The Key Challenges/Barriers
•  Streaming data from heterogeneous devices with varying capabilities
•  Limited bandwidth/computation/storage resources
•  Non-scalability due to computational complexity
•  Expensive (if not impossible) to label data
•  Incomplete or inaccurate information
•  Non-cooperative things

→ centralized global optimization does not work
→ needs to anticipate probable actions by others


Context + Prediction → Action



Reactive vs. Anticipatory
Reactive: “The vehicle in front is braking, step on the brakes.”
Context → Reactive Engine → Action

Anticipatory: “The traffic light ahead is turning yellow, and the vehicles in front may stop at the red light, step on the brakes.”
Context → Anticipatory Reasoning (Predictive Model) → Action
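
The contrast between the two engines can be sketched in a few lines. This is an illustrative toy, not the center's implementation: the `Context` fields, the probabilities inside `predict_lead_braking`, and the 0.5 threshold are all hypothetical stand-ins for a learned predictive model.

```python
from dataclasses import dataclass


@dataclass
class Context:
    lead_vehicle_braking: bool
    traffic_light: str  # "green", "yellow", or "red"


def reactive_action(ctx):
    """Reactive engine: act only on what is already happening."""
    return "brake" if ctx.lead_vehicle_braking else "maintain"


def predict_lead_braking(ctx):
    """Hypothetical predictive model: probability that the vehicle
    in front will brake soon, given the current context."""
    if ctx.lead_vehicle_braking:
        return 1.0
    return {"green": 0.05, "yellow": 0.7, "red": 0.95}[ctx.traffic_light]


def anticipatory_action(ctx, threshold=0.5):
    """Anticipatory reasoning: act on the predicted behaviour of others."""
    return "brake" if predict_lead_braking(ctx) >= threshold else "maintain"


ctx = Context(lead_vehicle_braking=False, traffic_light="yellow")
print(reactive_action(ctx), anticipatory_action(ctx))
```

With a yellow light and no braking yet, the reactive engine keeps going while the anticipatory one already brakes, which is the extra reaction time the slide argues for.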


Rule-based Reasoning
•  If the traffic light is red, then stop.
•  If the traffic is heavy, then slow down.
•  If vehicle in front is braking, then step on the brakes.
•  If it is raining, then the road is slippery. (world model)
•  If driver brakes, then vehicle slows down. (action model)

Context → Rule Engine (Condition-action Rules) → Action
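
The rules on this slide can be encoded as condition-action pairs and applied by a small forward-chaining loop. The sketch below is one possible encoding, assuming facts are plain strings and that derived actions carry an `action:` prefix; both conventions are hypothetical.

```python
# Each rule is (set of required facts, fact to conclude).
RULES = [
    ({"traffic_light_red"}, "action:stop"),
    ({"traffic_heavy"}, "action:slow_down"),
    ({"front_vehicle_braking"}, "action:brake"),
    ({"raining"}, "road_slippery"),             # world model
    ({"action:brake"}, "vehicle_slows_down"),   # action model
]


def forward_chain(facts, rules=RULES):
    """Apply rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def actions(facts):
    """Return the sorted action names derived from the given context."""
    return sorted(
        fact.split(":", 1)[1]
        for fact in forward_chain(facts)
        if fact.startswith("action:")
    )


print(actions({"raining", "front_vehicle_braking"}))
```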


Knowledge Extraction from Unstructured Text
Corpus: Clueweb09-Chinese
•  177,489,357 pages
•  592 GB, compressed
•  4.5 TB uncompressed

After 5 iterations (on Hadoop version)
•  145 categories
•  206 predicates and 61 relations
•  overall precision 71.2%
•  categories precision 76%
•  relations precision 52.6%


Possibilistic Reasoning 

Context → Possibilistic Reasoning (Fuzzy Petri Nets) → Action


Probabilistic Reasoning
[Figure: state-transition diagram over the states Stop, Slow, Fast, and Crash, with the actions accelerate and brake; edge probabilities shown: 0.8, 0.3, 0.7, 0.2, 0.05, 0.95, 0.99, 0.01]

Context → Approximate/Exact Solver (POMDP) → Action
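
A POMDP solver reasons over a belief, that is, a probability distribution over hidden states that is updated after every action and observation. The sketch below shows only that belief update (a Bayes filter step), not a full solver. It uses two of the slide's state names, but the transition and observation probabilities are hypothetical and are not a reconstruction of the slide's diagram.

```python
STATES = ["Slow", "Fast"]

# TRANSITION[action][state][next_state]: hypothetical values.
TRANSITION = {
    "accelerate": {"Slow": {"Slow": 0.4, "Fast": 0.6},
                   "Fast": {"Slow": 0.0, "Fast": 1.0}},
    "brake": {"Slow": {"Slow": 1.0, "Fast": 0.0},
              "Fast": {"Slow": 0.75, "Fast": 0.25}},
}

# OBSERVATION[state][observation]: a noisy speed sensor (hypothetical).
OBSERVATION = {
    "Slow": {"reads_slow": 0.8, "reads_fast": 0.2},
    "Fast": {"reads_slow": 0.1, "reads_fast": 0.9},
}


def update_belief(belief, action, observation):
    """b'(s') is proportional to O(o | s') * sum_s T(s' | s, a) * b(s)."""
    unnormalized = {}
    for s_next in STATES:
        predicted = sum(
            belief[s] * TRANSITION[action][s][s_next] for s in STATES
        )
        unnormalized[s_next] = OBSERVATION[s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation is impossible under this belief")
    return {s: p / total for s, p in unnormalized.items()}


belief = {"Slow": 0.5, "Fast": 0.5}
print(update_belief(belief, "brake", "reads_slow"))
```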


Appliance State Recognition
5-fold cross-validation
•  FCRFs achieve the best average accuracy and joint accuracy.
•  Comparing PCRFs with FCRFs shows that co-temporal relationships help improve recognition accuracy.

AAAI10 PAIR
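
The slide reports both average accuracy and joint accuracy without defining them. The sketch below assumes a common reading: average accuracy is the mean of the per-appliance accuracies, and joint accuracy is the fraction of time steps at which every appliance state is predicted correctly. Both definitions and the toy data are assumptions, not taken from the cited work.

```python
def average_accuracy(true_states, predicted_states):
    """Mean of per-appliance accuracies. Each argument is a list of
    time steps; each time step is a tuple with one state per appliance."""
    n_steps = len(true_states)
    n_appliances = len(true_states[0])
    per_appliance = []
    for a in range(n_appliances):
        correct = sum(
            t[a] == p[a] for t, p in zip(true_states, predicted_states)
        )
        per_appliance.append(correct / n_steps)
    return sum(per_appliance) / n_appliances


def joint_accuracy(true_states, predicted_states):
    """Fraction of time steps where all appliance states are correct."""
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return correct / len(true_states)


truth = [("on", "off"), ("on", "on"), ("off", "off"), ("off", "on")]
guess = [("on", "off"), ("on", "off"), ("off", "off"), ("on", "on")]
print(average_accuracy(truth, guess), joint_accuracy(truth, guess))
```

Under these definitions joint accuracy can never exceed average accuracy, since a time step counts only when every appliance is right.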


Anticipatory Reasoning for IoT
[Figure: architecture diagram with the components WuDevices, Wukong self-configurable platform, Gateway, Data Streams, Data, Context Engine, Machine Learning, Prediction Engine, Anticipatory Reasoning, and Control/Action]


Talking Tail-light
Extra time to react
Lead vehicle: “I am braking hard!!”
Following vehicle: “Okay!! I need to apply the brake now!”


Multi-Sensor Fusion
•  Belief-based localization and tracking
•  Stereo-based moving object detection algorithm

[Figure: System Specification]


UX-Based Design and Simulation
•  Immersive VR
•  Proactive Alerts


Driver Prediction Model
•  Sparse learning process for big data
   –  Guaranteed global convergence
   –  Saved lots of I/O and memory
•  Solve each block with standard solvers.

[Figure labels: Memory, Disk, Network Capacity, Active Set, "Learning algorithm to minimize"; annotation: $\alpha_i$ with $\nabla_i^P f(\alpha) \gg 0$]
[Figure: Hinge Loss versus Ramp Loss, horizontal axis from -3 to 3, vertical axis from 0 to 4]
[Figure: index node v with pivots v.pv1, v.pv2, v.pv3 and a query q, annotated with dist(q, N(v.pv1)) and dist(q, N(v.pv2))]
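
The driver prediction slide plots a hinge loss against a ramp loss. For reference, here is a minimal sketch using the standard textbook definitions: the hinge loss is max(0, 1 - z) for a margin z, and the ramp loss is the hinge loss clipped from above, written as the difference of two hinges. The clipping parameter `s = -1` is an assumption chosen for illustration; the slide does not state the value used.

```python
def hinge_loss(margin):
    """Standard hinge loss: grows without bound for badly misclassified points."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin, s=-1.0):
    """Ramp loss: hinge loss clipped at 1 - s, computed as H_1(z) - H_s(z).
    The default s = -1 is a hypothetical choice."""
    return max(0.0, 1.0 - margin) - max(0.0, s - margin)


for z in (-3.0, -1.0, 0.0, 1.0, 2.0):
    print(z, hinge_loss(z), ramp_loss(z))
```

Because the ramp loss is bounded, a single outlier cannot dominate the objective, which is one common motivation for preferring it on noisy data.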


Distributed Anomaly Detection
•  Traveling route anomaly detection
   –  surroundings of NTU
•  Aggressive driver detection

[Figure: traveling routes from start to end, labeled normal route, abnormal type I, and abnormal type II]


Distributed Learning Challenges
•  How to reduce expensive network I/O?
   $w^* = \arg\min_w f(w; D)$

Existing Approaches
•  Select important samples on each site [2].
•  Still expensive for sensor network applications (large n = |sites|, small |D_n|).

Our Approach [1]
•  Build an index (tree/hash table) through the network
•  Visit only sites containing crucial samples
•  Sparsify the number of crucial samples via sparse modeling

[Figure: model M1 and data sites D1, D2, D3, ..., Dn; index node v with pivots v.pv1, v.pv2, v.pv3; crucial samples highlighted]


Sweetfeedback for Energy Savings
•  16 window sensors
•  3 Arduinos
•  2 gumball machines w/ Arduinos
•  2 displays


Conclusion
IoT data challenge: volume, velocity, variety, veracity
Approach: data-driven + model-driven
Many forms of knowledge may be gleaned from data generated by sensors, online resources, and crowds using scalable machine learning algorithms.
•  Contexts
•  Alert/anomaly event detection
•  Commonsense knowledge
•  Predictive models


Acknowledgements
•  Our Sponsors
   –  National Science Council
   –  National Taiwan University
   –  Intel Corporation
•  At the Intel-NTU Connected Context Computing Center, we are designing end-to-end solutions to achieve greener buildings, smarter agriculture, and safer transportation


Thank You!
Q&A
http://ccc.ntu.edu.tw/
