MACH: Fast Randomized Tensor Decompositions


Charalampos (Babis) E. Tsourakakis

SIAM Data Mining Conference, April 30, 2010



Outline

- Introduction
  - Why Tensors?
  - Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results
  - Case study I: Intemon
  - Case study II: Intel Berkeley Lab
- Conclusion



[Figure: Intel Berkeley lab sensor streams. Four panels (Temperature, Light, Humidity, Voltage), each plotting value against time (min).]


Data is modeled as a tensor, i.e., a multidimensional matrix, of size T x (#sensors) x (#types of measurements). The three modes are the time mode, the sensor mode, and the measurement type mode.

Observation: multi-aspect data can naturally be modeled this way.
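As an illustration, here is a minimal numpy sketch of packing such multi-aspect sensor readings into a 3-mode tensor; the dimensions echo the case studies later in the deck, and the random values are a stand-in for real measurements:

```python
import numpy as np

# Hypothetical dimensions: one week of minute-level timeticks,
# 54 sensors, 4 measurement types (e.g., temperature, light,
# humidity, voltage).
T, n_sensors, n_types = 10080, 54, 4
rng = np.random.default_rng(0)

# In practice X[t, s, m] holds the reading of sensor s for
# measurement type m at timetick t; here we fill it with noise.
X = rng.standard_normal((T, n_sensors, n_types))
print(X.shape)  # (10080, 54, 4)
```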



Another example: functional Magnetic Resonance Imaging (fMRI) data forms a 5-mode tensor, voxels x subjects x trials x task conditions x timeticks.

Tensors naturally model numerous real-world datasets. And now what?


Tensor Decompositions


Singular value decomposition (SVD), the "Swiss army knife" of matrix decompositions (O'Leary):

A (m x n) = σ1 (u1 ∘ v1) + σ2 (u2 ∘ v2) + σ3 (u3 ∘ v3) + …

i.e., a weighted sum of rank-one outer products ui ∘ vi.
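For concreteness, a minimal numpy sketch of the truncated SVD, keeping the k leading rank-one terms of this sum (the random matrix A is a stand-in for real data):

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((100, 50))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k leading terms: the best rank-k approximation of A
# in the Frobenius norm (Eckart-Young).
k = 3
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.norm(A - A_k))  # residual of the approximation
```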



[Figure: SVD applied to a document-to-term matrix. It factors into a documents-to-document-hidden-concepts (HCs) matrix, a diagonal matrix holding the strength of each concept, and a term-to-term-HCs matrix; e.g., the terms data, graph, java load on a CS concept, while brain, lung load on an MD concept.]


Two families of algorithms extend SVD to the multilinear setting:

- PARAFAC/CANDECOMP decompositions
- Tucker decompositions

See Kolda and Bader, "Tensor Decompositions and Applications", SIAM Review.


Tucker is an SVD-like decomposition of a tensor, with one projection matrix per mode and a core tensor:

X ≈ G x1 U1 x2 U2 x3 U3

J. Sun showed that Tucker decompositions can be used to extract useful knowledge from monitoring systems.
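To make the structure concrete, here is a minimal pure-numpy sketch of the Tucker model computed via HOSVD; the function and variable names (unfold, hosvd, ranks) are illustrative, not the deck's notation. Each mode's factor comes from the SVD of the corresponding matricization, and the core is the tensor projected onto those factors.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization: move `mode` to the front, flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    # One orthonormal factor per mode: the leading left singular
    # vectors of each unfolding.
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])
    # Core tensor: project X onto the factor subspaces, mode by mode.
    G = X
    for mode, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, G, axes=(1, mode)), 0, mode)
    return G, factors

X = np.random.default_rng(2).standard_normal((30, 20, 10))
G, factors = hosvd(X, ranks=(5, 4, 3))
print(G.shape)  # (5, 4, 3): a small core summarizing X
```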



Our Motivation


Most real-world processes result in sparse tensors. However, there are important processes that result in dense tensors:

Physical process                                            | Non-zero entries
Sensor network (sensor x measurement type x timeticks)      | 85%
Computer network (machine x measurement type x timeticks)   | 81%


Performing a Tucker decomposition on a dense tensor can be very slow, or even impossible due to memory constraints.

Given that (low-rank) Tucker decompositions are valuable in practice, can we "trade" a little bit of accuracy for efficiency?


Proposed Method


MACH extends the work of Achlioptas and McSherry on fast low-rank matrix approximation (STOC 2001) to the multilinear setting.


- Toss a coin for each non-zero entry, keeping it with probability p.
- If the entry "survives", reweigh it by 1/p; if not, set it to zero (sketched below).
- Perform Tucker on the sparsified tensor.
- For the theoretical results and more details, see the MACH paper.
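A minimal numpy sketch of this sparsification step (the name mach_sparsify is mine; dense arrays are used for clarity, whereas a real implementation would store the result in a sparse format to realize the memory savings):

```python
import numpy as np

def mach_sparsify(X, p, seed=0):
    """Keep each entry with probability p, reweighed by 1/p.

    Zeros stay zero, and E[result] = X, so the sparsified tensor
    is an unbiased estimate of the original.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < p      # "toss a coin" per entry
    return np.where(mask, X / p, 0.0)   # survivors reweighed by 1/p

# Hypothetical tensor with the Intemon dimensions from the case study.
X = np.random.default_rng(3).standard_normal((100, 12, 10080))
X_sparse = mach_sparsify(X, p=0.1)
# Any off-the-shelf Tucker routine (e.g., HOSVD or HOOI) is then
# run on X_sparse instead of X.
```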



Experimental Results
Case study I: Intemon


- Intemon: a prototype monitoring and mining system for data centers, developed at Carnegie Mellon University.
- Tensor X: 100 machines x 12 types of measurement x 10080 timeticks.



[Figure: MACH components plotted against the exact components; the ideal is ρ = 1.]

For p = 0.1, Pearson's correlation coefficient is 0.99.
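The accuracy metric here is plain Pearson correlation between corresponding components. A minimal sketch with synthetic stand-in vectors (u_exact and u_mach are hypothetical, not the paper's data):

```python
import numpy as np

# Stand-in "exact" component and a slightly perturbed "MACH" one.
u_exact = np.sin(np.linspace(0, 20, 10080))
u_mach = u_exact + 0.01 * np.random.default_rng(4).standard_normal(10080)

# Pearson's correlation coefficient between the two components.
rho = np.corrcoef(u_exact, u_mach)[0, 1]
print(f"Pearson rho = {rho:.3f}")  # close to the ideal rho = 1
```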



[Figure: "Find the differences!" Side-by-side components, Exact versus MACH.]

The qualitative analysis, which is what matters for our goals, remains the same.



Case study II: Intel Berkeley Lab


- Intel Berkeley Lab sensor data.
- Tensor: 54 sensors x 4 types of measurement x 5385 timeticks.



Again, the qualitative analysis, which is what matters for our goals, remains the same.



[Figure: principal spatial mode, Exact versus MACH.]

The principal spatial mode is also preserved, and Pearson's correlation coefficient is again almost 1.



Remarks: (1) the daily periodicity is apparent; (2) Pearson's correlation coefficient with the exact component is 0.99.



Conclusion


- Randomized algorithms for tensors.
- Open question: the smallest p* for tensor sparsification under the HOOI algorithm.
- Randomized algorithms work very well (e.g., sublinear-time algorithms), but are typically hard to analyze.





Remark: even though our theoretical results refer to HOSVD, MACH also works with HOOI.



[Figure: the two decomposition models, the canonical decomposition (CANDECOMP/PARAFAC) and the Tucker decomposition.]


