
Review & Presentation

Radio Frequency Identification & Data Management

– Temporal Management of RFID Data, by Fusheng Wang and Peiya Liu
– Warehousing and Analyzing Massive RFID Data Sets, by Hector Gonzalez, Jiawei Han, et al.
– Cost-Conscious Cleaning of Massive RFID Data Sets, by Hector Gonzalez, Jiawei Han, et al.

COP 6731 – For Dr. Kien A. Hua Presented by Rahat

[Figure: parts of an RFID tag: antenna, IC chip, substrate, connection]


RFID & Discussion on Papers


“The so-called nucleus of enterprise supply chain management”

Research Problem

Characteristic aspects of RFID data management, and how they help researchers solve the diverse challenges involved

Introductory Words: “An Important Technology”

Tracking Technologies and Automatic ID Systems
Various technologies are used to track and automatically identify people, products, and other objects:
– Barcodes
– Optical Character Recognition (OCR)
– Biometrics
  • Voice recognition and ID systems
  • Fingerprint ID systems
– Smart cards
– Memory cards
– Microprocessor cards

RFID: What Is It?
RFID combines many of the features of several of these technologies:
– Like barcodes, RFID is used to identify and track objects
– As with OCR and biometrics, RFID enables automatic ID and verification
– Like smart cards, memory cards, and microprocessor cards, RFID can also store information and provide interactive data processing

Most RFID tags contain at least two parts. One is an integrated circuit for storing and processing information, modulating and demodulating a radio-frequency (RF) signal, and other specialized functions. The second is an antenna for receiving and transmitting the signal.

How Is RFID Unique?
– It can be used to accurately locate and identify objects from a distance using RF signals
– It can be used to detect and read objects that are not in line of sight
– Data can be interactively managed and processed by the RFID chip and RFID system

Market & Application

Industrial Products

Logistics/ Trans.

Consumer Products

Retail Products

Homeland Security

Key Industry Drivers Leading Us Toward RFID

Other Service

Example: RFID in Libraries

Simplifies checkout process for staff

Inventory of Collections

Use with new and future technology

Item Security

Express checkout for patrons

Benefits of RFID
• Automatic reads
• Active chips can be written
• Many chips can be read simultaneously
• Standardized and unique encoding
• Better process-specific data collection


Current Technology

Three Main Papers
RFID & Discussion

& the Supporting Cast

Managing RFID Data (VLDB 2004)
Sudarshan S. Chawathe, Venkat Krishnamurthy, et al.
• Not the Supporting Paper
• The authors present a brief introduction to RFID technology and highlight a few of the data management challenges
• A layered architecture
– RFID tags
– Tag readers
– Savant/Middleware
  • Maps the low-level data stream from readers to a more manageable form that is suitable for application-level interactions
– EPC-IS
  • Combines business logic with the stream of data emerging from the sensing framework below
– ONS
  • Essentially a global lookup service

Characteristics of RFID Data
Three main problems emerged

Probable Solution

Large volume – A retailer with 3,000 stores sells 10,000 items a day per store

– Each reading is a tuple (EPC, location, time)
– Each item leaves 10 traces before leaving the store
– How many tuples will this generate each day? 10,000 × 10 × 3,000 = 300,000,000 (without redundancy)

Model and storage of RFID data

– Walmart is expected to generate 7 terabytes of RFID data per day
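The tuple count for the retailer example can be checked with a few lines of Python. This is only a sketch of the slide's arithmetic; the constant and function names are made up for illustration.

```python
# Figures from the slide: 3,000 stores, 10,000 items sold per store per day,
# and 10 readings ("traces") per item before it leaves the store.
STORES = 3_000
ITEMS_PER_STORE_PER_DAY = 10_000
TRACES_PER_ITEM = 10


def daily_tuples(stores, items_per_store, traces_per_item):
    """Number of (EPC, location, time) tuples per day, ignoring duplicate reads."""
    return stores * items_per_store * traces_per_item


tuples_per_day = daily_tuples(STORES, ITEMS_PER_STORE_PER_DAY, TRACES_PER_ITEM)
print(f"{tuples_per_day:,} tuples per day")
```

Duplicate readings would only push the number higher, which is why storage models and compression come first among the probable solutions.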

Implicit semantics – Observations imply location changes, aggregations, and business processes

Inaccurate data – Noisy data and duplicate readings

Query and data mining of RFID data

Data cleaning of RFID data

Large Volume: Papers on Problem One

• Mention one paper on the bitmap datatype
• Cover one of the main three papers: Temporal Management

Supporting RFID-based Item Tracking Applications in Oracle DBMS Using a Bitmap Datatype (VLDB 2005)
Ying Hu, Seema Sundara, et al.

Observation
– Groups of items in the same proximity, e.g., on a shelf or in a shipment
– Groups of items with the same property, e.g., the same product

Main Idea
Instead of storing one tuple per item, store one tuple for all the items that share the same prefix.

EPC BITMAP SEGMENT DATATYPE – A new type to represent a collection of EPCs with a common prefix
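To make the prefix idea concrete, here is a minimal sketch in Python. It is not Oracle's datatype or its API: EPCs are assumed to be plain integers, the prefix is assumed to be everything above the low 8 bits, and all function names are hypothetical.

```python
from collections import defaultdict

SUFFIX_BITS = 8  # assumption: the low 8 bits vary within a group, the rest is the prefix


def to_bitmap_segments(epcs, suffix_bits=SUFFIX_BITS):
    """Group integer EPCs by prefix and keep one bitmap (a Python int) per prefix."""
    segments = defaultdict(int)
    mask = (1 << suffix_bits) - 1
    for epc in epcs:
        prefix = epc >> suffix_bits
        offset = epc & mask
        segments[prefix] |= 1 << offset  # set the bit for this item's suffix
    return dict(segments)


def contains(segments, epc, suffix_bits=SUFFIX_BITS):
    """Check membership of one EPC without expanding the bitmap."""
    prefix = epc >> suffix_bits
    offset = epc & ((1 << suffix_bits) - 1)
    return bool((segments.get(prefix, 0) >> offset) & 1)


def expand(segments, suffix_bits=SUFFIX_BITS):
    """Recover the sorted list of EPCs from the bitmap segments."""
    epcs = []
    for prefix, bitmap in segments.items():
        offset = 0
        while bitmap:
            if bitmap & 1:
                epcs.append((prefix << suffix_bits) | offset)
            bitmap >>= 1
            offset += 1
    return sorted(epcs)


shelf = [0x1200, 0x1201, 0x1205, 0x3401]
segments = to_bitmap_segments(shelf)
print(len(shelf), "items stored as", len(segments), "segments")
```

Four items collapse into two stored rows here; the saving grows with the number of items that share a prefix.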

In the bulk-load performance and storage comparisons, the approach performs very well.

Featured Paper 1 of 3
Temporal Management of RFID Data, by Fusheng Wang and Peiya Liu

Temporal Management of RFID Data (VLDB 2005)
Fusheng Wang, Peiya Liu

Authors
Proposed by Wang and Liu from Siemens

Senior Member of Technical Staff, Siemens Corporate Research, Inc., Multimedia Documentation Program, Princeton, New Jersey, USA


Observation
• RFID entities are static and are not altered, while RFID relationships are dynamic and change all the time
• RFID data are time-dependent and arrive in large volumes
• RFID data management systems need to effectively support such large-scale temporal data created by RFID applications. These systems need an explicit temporal data model for RFID data to support tracking and monitoring queries.
• In addition, they need an automatic method to transform the primitive observations from RFID readers into derived data used in RFID-enabled applications.


In this paper, the authors present an integrated RFID data management system

Dynamic Relationship ER Model
Two types of dynamic relationships are added:
– Event-based dynamic relationship: a timestamp attribute is added to represent the time at which the event occurred
– State-based dynamic relationship: tstart and tend attributes are added to represent the lifespan of a state

Static entity tables
– OBJECT (object_epc, name, description)
– LOCATION (location_id, name, owner)
– SENSOR (sensor_epc, name, description)
– TRANSACTION (transaction_id, transaction_type)

Dynamic relationship tables
– CONTAINMENT (epc, parent_epc, tstart, tend)
– OBSERVATION (sensor_epc, value, timestamp)
– OBJECTLOCATION (epc, location_id, tstart, tend)
– SENSORLOCATION (sensor_epc, location_id, position, tstart, tend)
– TRANSACTIONITEM (transaction_id, epc, timestamp)

The paper also contains a number of nice but complex examples.
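A state-based relationship such as OBJECTLOCATION can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the system from the paper: SQLite stands in for the database, timestamps are ISO-8601 strings, an open-ended state is stored with tend = NULL, and the helper names move_object and location_at are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OBJECTLOCATION (epc TEXT, location_id TEXT, tstart TEXT, tend TEXT)"
)


def move_object(conn, epc, location_id, t):
    """Close the object's current state (if any) and open a new one starting at t."""
    conn.execute(
        "UPDATE OBJECTLOCATION SET tend = ? WHERE epc = ? AND tend IS NULL", (t, epc)
    )
    conn.execute(
        "INSERT INTO OBJECTLOCATION VALUES (?, ?, ?, NULL)", (epc, location_id, t)
    )


def location_at(conn, epc, t):
    """Tracking query: where was the object at time t? Returns None if unknown."""
    row = conn.execute(
        "SELECT location_id FROM OBJECTLOCATION "
        "WHERE epc = ? AND tstart <= ? AND (tend IS NULL OR tend > ?)",
        (epc, t, t),
    ).fetchone()
    return row[0] if row else None


move_object(conn, "epc1", "warehouse", "2006-01-01T08:00")
move_object(conn, "epc1", "store", "2006-01-02T09:00")

print(location_at(conn, "epc1", "2006-01-01T12:00"))
print(location_at(conn, "epc1", "2006-01-03T00:00"))
```

Because every state carries its own lifespan, both "where is it now" and "where was it then" are answered by the same interval predicate.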

1. Siemens RFID Middleware Architecture
2. Rules-based RFID Data Transformation


Advantages
• Provides powerful query support for RFID object tracking and monitoring
• Can adapt to different RFID-enabled applications
• Enables semantic RFID data filtering
• Automatic data transformation based on declarative rules
• A powerful and realistic model of RFID, as it integrates business processes into the data model itself

Implicit Semantics: Papers on Problem Two

• Mention the introduction and keynote on Warehousing & Mining
• Cover one of the main three papers


Warehousing 101
• Huge data sets: terabytes generated each day
• We need OLAP to make sense of the data
• Traditional data cubes don’t work: a data cube only provides aggregates for a given combination of dimension values, whereas we need aggregates at the path level
Architecture of the RFID warehouse is based on these key ideas of RFID data compression:
– Taking advantage of data generalization
– Taking advantage of bulky object movements
– Taking advantage of the merge and/or collapse of path segments

Warehousing 101
• Lossless compression
– Remove redundancy: (r1,l1,t1) (r1,l1,t2) ... (r1,l1,t10) => (r1,l1,t1,t10)
– Group objects that move and stay together
• Data cleaning: multi-reading, missed-reading, error-reading, bulky movement
• Data mining: find trends, outliers, frequent, sequential, and flow patterns
• Multi-dimensional summary: product, location, time, …
• Query processing
– Support for OLAP: roll-up, drill-down, slice, and dice
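The redundancy-removal step, which collapses (r1,l1,t1) ... (r1,l1,t10) into (r1,l1,t1,t10), can be sketched as follows. This assumes readings are (epc, location, time) tuples with numeric times; the function name is hypothetical, and the grouping of objects that move together is not covered.

```python
def compress_readings(readings):
    """Collapse consecutive readings of an item at one location into a stay record.

    readings: iterable of (epc, location, time) tuples.
    Returns a list of (epc, location, time_in, time_out) tuples.
    """
    stays = []
    open_stay = {}  # epc -> index of that item's most recent stay in `stays`
    for epc, loc, t in sorted(readings, key=lambda r: (r[0], r[2])):
        i = open_stay.get(epc)
        if i is not None and stays[i][1] == loc:
            # Same item, same location: just extend the end of the stay.
            stays[i] = (epc, loc, stays[i][2], t)
        else:
            # First reading of the item, or the item changed location.
            stays.append((epc, loc, t, t))
            open_stay[epc] = len(stays) - 1
    return stays


readings = [("r1", "l1", t) for t in range(1, 11)]
readings += [("r1", "l2", 11), ("r1", "l2", 12)]
print(compress_readings(readings))
```

Twelve raw tuples become two stay records, and the sequence of locations visited by the item is preserved.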

The Big Picture

Warehousing and Mining Massive RFID Data Sets
Keynote for ADMA 2006, Jiawei Han

Featured Paper 2 of 3

Warehousing and Analyzing Massive RFID Data Sets, by Hector Gonzalez, Jiawei Han, et al.

Warehousing and Analyzing Massive RFID Data Sets (ICDE 2006)
Hector Gonzalez, Jiawei Han, Xiaolei Li & Diego Klabjan



Motivation
(1) Items usually move together in large groups through the early stages of the system (e.g., distribution centers), and only in later stages (e.g., stores) do they move in smaller groups.
(2) Although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level.

As a departure from the traditional data cube, the authors propose a new warehousing model. It preserves object transitions while providing significant compression and path-dependent aggregates. Techniques for summarizing and indexing data, and methods for processing a variety of queries based on this framework, are developed in this study.


Advantages
• Allows high-level analysis to be performed efficiently and flexibly in multidimensional space
• The model is composed of a hierarchy of highly compact summaries (RFID-Cuboids) of the RFID data, aggregated at different abstraction levels
• Efficient answering of a wide range of RFID queries
• Collapses multiple movements into a single record without loss of information


Warehousing 101

FlowGraphs
• Tree-shaped workflow
• Captures main trends and significant deviations

FlowGraph Cubing
• Data cube where each cell is a FlowGraph
• The FlowCube goes beyond the traditional data cube with scalar aggregates, and adds a path view of the data

Why FlowGraphs
• Compact summary of popular paths traversed by items
• Highlights important deviations from popular paths
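The tree-shaped summary of item paths can be illustrated with a small prefix tree that counts how many items pass through each node. This is a deliberately reduced sketch, not the FlowGraph structure defined by the authors; the node layout, function names, and sample paths are all assumptions.

```python
def build_flowgraph(paths):
    """Build a prefix tree of location paths; each node counts the items passing through."""
    root = {"count": 0, "children": {}}
    for path in paths:
        root["count"] += 1
        node = root
        for loc in path:
            node = node["children"].setdefault(loc, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def transition_probabilities(node):
    """Fraction of the items at this node that continue to each child location."""
    return {
        loc: child["count"] / node["count"]
        for loc, child in node["children"].items()
    }


# Hypothetical paths: three items follow the popular route, one deviates.
paths = [("factory", "dc", "store")] * 3 + [("factory", "dc", "returns")]
graph = build_flowgraph(paths)
dc_node = graph["children"]["factory"]["children"]["dc"]
print(transition_probabilities(dc_node))
```

High-probability branches correspond to the popular paths, and low-probability branches such as "returns" are the deviations worth highlighting.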

Other Work by DAIS, University of Illinois

– FlowCube: Constructing RFID FlowCubes for Multi-Dimensional Analysis of Commodity Flows, VLDB 2006
– Mining Compressed Commodity Workflows From Massive RFID Data Sets, CIKM ’06
– Cost-Conscious Cleaning of Massive RFID Data Sets, ICDE ’07

Note that their RFID model, and the subsequent methods for warehouse construction and query analysis, rest on the assumption that RFID-tagged items tend to move together in bulk, especially at the early stages. This fits a good number of RFID applications but not all, so further study is needed.

Inaccurate Data: Papers on Problem Three

• Mention three papers on data cleaning
• Cover one of the main three papers: Cost-Conscious Cleaning

Issues in Data Cleaning

False negative readings
• RFID tags are not read by the reader at all, even though they are present
• Caused by
– RFID readers capturing only 60-70% of all tags that are in the vicinity
– RF collisions
– Water or metal shielding

False positive readings
• Besides the RFID tags that should be read, additional unexpected readings are generated
• Caused by
– RFID tags outside the normal reading scope of a reader being captured by the reader
– A tag having moved away from the reader's vicinity while the reader fails to capture the move

Duplicate readings
• Caused by
– Tags that stay in the scope of a reader for a long time and are read multiple times
– Multiple readers installed to cover a larger area or distance, so that tags in the overlapping areas are read by several readers
– Multiple tags with the same EPC attached to the same object to enhance reading accuracy, which generates duplicate readings

Logical anomalies: tend to be application dependent
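Of the issues above, duplicate readings are the simplest to illustrate. The sketch below keeps a reading only if the same tag was not already reported at the same location within a short time window. It is a generic filter, not a method taken from any of the papers; the window size, the tuple layout, and the assumption that overlapping readers map to one logical location are illustrative choices.

```python
def remove_duplicates(readings, window=5):
    """Drop repeated readings of the same (epc, location) that fall within `window`.

    readings: iterable of (epc, location, time) tuples with numeric times.
    Returns the kept readings in time order.
    """
    last_kept = {}  # (epc, location) -> time of the last reading that was kept
    cleaned = []
    for epc, loc, t in sorted(readings, key=lambda r: r[2]):
        key = (epc, loc)
        if key in last_kept and t - last_kept[key] < window:
            continue  # duplicate: same tag, same place, too soon after the last one
        last_kept[key] = t
        cleaned.append((epc, loc, t))
    return cleaned


raw = [
    ("e1", "dock", 0),
    ("e1", "dock", 1),
    ("e1", "dock", 2),
    ("e2", "dock", 1),
    ("e1", "dock", 10),
]
print(remove_duplicates(raw))
```

False negatives and false positives are harder, since they need some model of what the reader should have seen.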

Data Cleaning Papers
– A Pipelined Framework for Online Cleaning of Sensor Data Streams, ICDE 2006 (short paper)
– Adaptive Cleaning for RFID Data Streams, VLDB 2006. Shawn R. Jeffery, Minos Garofalakis, Michael J. Franklin
– Efficiently Filtering RFID Data Streams, VLDB CleanDB 2006. Yijian Bai, Fusheng Wang, Peiya Liu

Existing cleaning techniques have focused on accuracy, but have disregarded the very high cost of cleaning in a real application

Featured Paper 3 of 3
Cost-Conscious Cleaning of Massive RFID Data Sets, ICDE ’07

Cost-Conscious Cleaning of Massive RFID Data Sets, ICDE ’07
Hector Gonzalez, Jiawei Han & Xiaolei Li



Observation & Motivation
• False negative readings
• False positive readings
• Duplicate readings
• Logical anomalies: tend to be application dependent
• Existing cleaning techniques have disregarded the very high cost of cleaning

Contribution
• Propose a cleaning framework
• Identify the conditions under which a specific cleaning method, or a sequence of cleaning methods, can be applied in order to minimize the expected cleaning costs, including error costs


Cleaning Plan
• Define a cost model
– Assign costs to cleaning methods (training cost, execution cost, maintenance cost, etc.)
– Define error costs
• Using training data, determine the efficacy of cleaning methods under different contexts
• Optimization problem: construct a cleaning plan (methods to apply under different circumstances) that minimizes the total expected cleaning costs
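The optimization idea can be shown with a toy cost model: for each context, pick the method whose execution cost plus expected error cost is lowest. This is a much simplified sketch and not the algorithm from the paper; the method names, contexts, costs, and error rates below are invented for illustration, and training and maintenance costs are left out.

```python
def expected_cost(method, context, error_cost):
    """Execution cost plus the expected cost of the errors the method leaves behind."""
    return method["cost"] + method["error_rate"][context] * error_cost


def build_cleaning_plan(methods, contexts, error_cost):
    """Map each context to the method with the lowest expected cost."""
    return {
        context: min(
            methods,
            key=lambda name: expected_cost(methods[name], context, error_cost),
        )
        for context in contexts
    }


# Hypothetical per-reading costs and error rates, as if estimated from training data.
methods = {
    "raw": {"cost": 0.0, "error_rate": {"quiet_dock": 0.02, "crowded_shelf": 0.40}},
    "smoothing_window": {"cost": 1.0, "error_rate": {"quiet_dock": 0.01, "crowded_shelf": 0.15}},
    "probabilistic_model": {"cost": 5.0, "error_rate": {"quiet_dock": 0.01, "crowded_shelf": 0.02}},
}
contexts = ["quiet_dock", "crowded_shelf"]

plan_costly_errors = build_cleaning_plan(methods, contexts, error_cost=50)
plan_cheap_errors = build_cleaning_plan(methods, contexts, error_cost=10)
print(plan_costly_errors)
print(plan_cheap_errors)
```

The expensive method is selected only where errors are both likely and costly; when errors are cheap, the plan falls back to cheaper methods.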


Advantages
• Existing work focused on developing accurate methods that work well under a wide set of conditions, but disregarded the cost
• Induces a cleaning plan that optimizes the overall accuracy-adjusted cleaning costs
• Simple architecture of the cleaning framework
• Cost-conscious framework that learns when and how to apply different cleaning techniques in order to optimize total cleaning costs and accuracy

Some Thoughts & References
RFID & Discussion

Concluding Remarks
• A number of the papers propose general and expressive temporal-oriented data models for RFID data
• The data models are shown to be quite powerful in supporting RFID data tracking and monitoring
• The rules-based framework enables automatic RFID data filtering, transformation, and aggregation to generate semantic, high-level data
• The system can be adapted to different RFID applications, which substantially reduces the cost of managing and integrating RFID data into business applications

Society’s Concerns

Privacy

- Tracking individuals - Illicit or inappropriate use of personal data - Tracking personal activities (e.g., purchase habits, travel)


- Unsanctioned readers - Theft of information - Inadequate encryption

Global differences

- Regulations around collecting data - Standards - Ownership of data

Future Works
• Privacy and security for the deployment of RFID; tampering is a big research topic
• Secure management of RFID data
• XML-based traceability of RFID data
One notable paper that differed somewhat from the others was “Integrating Automatic Data Acquisition with Business Processes: Experiences with SAP’s Auto-ID Infrastructure”, presented at one of the VLDB conferences, by Christof Bornhövd, Tao Lin, Stephan Haller, and Joachim Schaper.

Auto-ID Infrastructure: Open Issues
• Different qualities of service
• Distributed smart items infrastructure
• Seamless integration of environmental sensors
In the next stage we expect to look into some of these issues.

Reference

Books
– Frank Thornton, et al. RFID Security. Rockland, MA: Syngress, 2006.
– Dennis E. Brown. RFID Implementation. New York: McGraw-Hill, 2007.
– Bill Glover and Himanshu Bhatt. RFID Essentials. Beijing; Sebastopol, CA: O'Reilly, 2006.
– Patrick J. Sweeney. RFID for Dummies. Hoboken, NJ; Chichester: Wiley, 2005.

Websites
– Managing RFID Data
– Temporal Management of RFID Data
– Mining Compressed Commodity Workflows from Massive RFID Data Sets
– Warehousing and Analyzing Massive RFID Data Sets

Back-Up Slide
RFID tags: passive vs. active

RFID in action

A very active tag

Back-Up Slide

[Figure: Markov chain state-transition diagram with arrival rates λ0, λ1, λ2, service rates Tμ, (T+1)μ, (T+2)μ, (C-1)μ, Cμ, and state C-1]

The guard channel policy rejects all new calls until the channel occupancy goes below the threshold


Some Other Thoughts
The sky is still the limit, but RFID has not yet reached half of its potential

Some Links (if time permits)
– RFID Technology Video (Detailed)
– RFID Demonstration
– That's how we will shop within a couple of years!
– Future Store (Smart Check Out)