CONTEXT-BASED PEOPLE RECOGNITION IN CONSUMER PHOTO COLLECTIONS
Markus Brenner, Ebroul Izquierdo
MMV Research Group, School of Electronic Engineering and Computer Science
Queen Mary University of London, UK
{markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk
Aim
- Resolve identities of people primarily by their faces
- Perform recognition by considering all contextual information at the same time (unlike traditional approaches, which usually train a classifier and then predict identities independently)
- Incorporate rich contextual cues of personal photo collections, where a few individual people frequently appear together
Face Detection and Basic Recognition
Graph-based Recognition
Initial steps: Image preprocessing, face detection and face normalization
Model: pairwise Markov network (graph nodes represent faces)
Unary potentials: likelihood of faces belonging to particular people
Figure: face nodes f1, f2, f3, each with a unary potential, linked by pairwise potentials
Descriptor-based: Local Binary Pattern (LBP) texture histograms
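A minimal sketch of a block-wise LBP texture histogram follows. The poster does not state which LBP variant or block layout is used, so the basic 3x3 operator with 8 neighbors, the 256-bin histogram and the `block` size are assumptions, and the function names are hypothetical.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbor >= center."""
    center = img[r][c]
    # Neighbors in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_block_histograms(img, block=4):
    """Concatenate normalized 256-bin LBP histograms, one per block.

    img is a grayscale image given as a list of rows (assumed layout).
    """
    rows, cols = len(img), len(img[0])
    # Only interior pixels have a full 3x3 neighborhood.
    codes = [[lbp_code(img, r, c) for c in range(1, cols - 1)]
             for r in range(1, rows - 1)]
    descriptor = []
    for r0 in range(0, len(codes), block):
        for c0 in range(0, len(codes[0]), block):
            hist = [0] * 256
            for row in codes[r0:r0 + block]:
                for code in row[c0:c0 + block]:
                    hist[code] += 1
            total = sum(hist)
            descriptor.extend(h / total for h in hist)
    return descriptor
```

For a normalized face crop, `lbp_block_histograms(face, block=4)` returns one long feature vector whose length is 256 times the number of blocks.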
Figure: LBP texture histogram computed for each block
Pairwise potentials: encourage spatial smoothness, encode the exclusivity constraint and the temporal domain
$$
p(w_n, w_m) =
\begin{cases}
\tau, & \text{if } w_n = w_m \wedge i_n \neq i_m \\
0, & \text{if } w_n = w_m \wedge i_n = i_m \\
p_o(w_n, w_m), & \text{otherwise}
\end{cases}
$$
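The three cases of the pairwise potential can be sketched as a small function. This is an interpretation rather than the authors' code: it assumes `w_n`, `w_m` are the identity labels of two faces and `i_n`, `i_m` identify the photos they appear in, so that the zero case expresses the rule that a person cannot appear more than once in a photo. The parameter `tau` and the callable `p_o` are hypothetical placeholders for values the poster does not specify.

```python
def pairwise_potential(w_n, w_m, i_n, i_m, tau, p_o):
    """Pairwise potential between two connected face nodes.

    w_n, w_m: identity labels (assumed meaning of w)
    i_n, i_m: photo identifiers (assumed meaning of i)
    tau:      reward for the same identity in different photos
    p_o:      callable giving the potential for two different identities
    """
    if w_n == w_m:
        # Same identity: rewarded across photos, forbidden within one photo.
        return tau if i_n != i_m else 0.0
    return p_o(w_n, w_m)
```

A co-appearance score between two different people is one possible choice for `p_o`; that link is an assumption, not something the poster states.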
Similarity metric: Chi-Square statistic
Basic face recognition: k-Nearest-Neighbor
Figure: training (Tr) and test (Te) samples; all samples are independent
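The baseline of Chi-Square comparison followed by k-Nearest-Neighbor voting can be sketched as below. The exact form of the Chi-Square statistic and the value of `k` are not given on the poster; the version here is one common form, and the function names are hypothetical.

```python
from collections import Counter


def chi_square(h1, h2, eps=1e-10):
    """Chi-Square distance between two histograms (0 means identical)."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def knn_predict(query, train_hists, train_labels, k=3):
    """Label a query histogram by majority vote of its k nearest neighbors."""
    dists = sorted(
        (chi_square(query, hist), label)
        for hist, label in zip(train_hists, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

Each test face is classified on its own here, which is the independence that the graph-based model is meant to overcome.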
Topology: only the most similar faces are connected with edges
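One way to obtain such a sparse topology is to link every face to its `k` most similar faces. The sketch below assumes a precomputed symmetric distance matrix (for example Chi-Square distances); the value of `k` and the function name are hypothetical.

```python
def build_sparse_edges(distances, k=2):
    """Connect each node to its k nearest nodes.

    distances: symmetric matrix as a list of lists.
    Returns a set of undirected edges (i, j) with i < j.
    """
    n = len(distances)
    edges = set()
    for i in range(n):
        # Rank all other nodes by their distance to node i.
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: distances[i][j])
        for j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges
```

Because an edge is added when either endpoint selects the other, a node can end up with more than `k` edges.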
Figure: graph of training (Tr) and test (Te) nodes; every node has a unary potential, and edges represent face similarity
Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
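A compact max-product variant of loopy belief propagation illustrates how a MAP labeling can be approximated on such a graph. It is a sketch under simplifying assumptions (synchronous updates, a fixed number of iterations, potentials given as plain Python values), not the authors' implementation; all names are hypothetical.

```python
def loopy_max_product(unary, edges, pairwise, iterations=20):
    """Approximate MAP labels with max-product belief propagation.

    unary:    {node: [potential per label]}
    edges:    list of undirected edges (a, b)
    pairwise: callable (src, dst, l_src, l_dst) -> potential
    Returns {node: index of the best label}.
    """
    n_labels = len(next(iter(unary.values())))
    neighbors = {v: [] for v in unary}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # messages[(src, dst)][l] is the message from src to dst about label l.
    messages = {(a, b): [1.0] * n_labels
                for a in unary for b in neighbors[a]}
    for _ in range(iterations):
        updated = {}
        for src, dst in messages:
            msg = []
            for l_dst in range(n_labels):
                best = 0.0
                for l_src in range(n_labels):
                    score = unary[src][l_src] * pairwise(src, dst, l_src, l_dst)
                    # Combine messages from all neighbors except the receiver.
                    for other in neighbors[src]:
                        if other != dst:
                            score *= messages[(other, src)][l_src]
                    best = max(best, score)
                msg.append(best)
            total = sum(msg) or 1.0  # normalize to avoid underflow
            updated[(src, dst)] = [m / total for m in msg]
        messages = updated
    labels = {}
    for v in unary:
        belief = list(unary[v])
        for other in neighbors[v]:
            for l in range(n_labels):
                belief[l] *= messages[(other, v)][l]
        labels[v] = max(range(n_labels), key=lambda l: belief[l])
    return labels
```

On graphs without cycles the result is the exact MAP labeling; on loopy graphs it is an approximation and convergence is not guaranteed.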
Social Semantics
Body Detection and Recognition
Individual appearance for a more effective graph topology (used to regularize the number of edges)
... when faces are obscured or invisible
Unique People Constraint models exclusivity: a person cannot appear more than once in a photo
- Detect upper and lower body parts
- Bipartite matching of faces and bodies
- Graph-based fusion of faces and clothing
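The bipartite matching of faces and bodies can be illustrated with a brute-force assignment over a small cost matrix. The cost definition (for instance, the distance between a face and the expected head position of a detected body) is an assumption, as is the function name. Exhaustive search is only practical for the handful of people in one photo; a Hungarian-style solver would be the usual choice at scale.

```python
from itertools import permutations


def match_faces_to_bodies(cost):
    """Assign each face to a distinct body with minimal total cost.

    cost[f][b]: cost of pairing face f with body b
                (requires at least as many bodies as faces).
    Returns a list whose entry f is the body index matched to face f.
    """
    n_faces, n_bodies = len(cost), len(cost[0])
    best, best_total = None, float("inf")
    # Try every way of picking one distinct body per face.
    for bodies in permutations(range(n_bodies), n_faces):
        total = sum(cost[f][b] for f, b in enumerate(bodies))
        if total < best_total:
            best, best_total = list(bodies), total
    return best
```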
Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again
Groups of people: use data mining to discover frequently appearing social patterns
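Pairwise co-appearance statistics can be gathered by counting how often two people share a photo in the labeled part of a collection. The sketch below assumes each photo is given as a list of person names; the names and the function are hypothetical, and how the counts are turned into potentials is left open.

```python
from collections import Counter
from itertools import combinations


def co_appearance_counts(photos):
    """Count, for every pair of people, the photos in which both appear.

    photos: iterable of lists of person names, one list per photo.
    Returns a Counter keyed by alphabetically sorted name pairs.
    """
    counts = Counter()
    for people in photos:
        # set() guards against a name being listed twice for one photo.
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts
```

Pairs with high counts are candidates for a stronger pairwise potential, and the same pass can feed a frequent-pattern mining step for larger groups.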
Based on face similarities
...
Experiments
Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people
Our dataset: ~3300 photos, ~5000 faces, 106 distinct people
- All photos shot with a typical consumer camera
- Considering only correctly detected faces (87%)
Figure: bar chart of the gain at 3% training (axis from 0% to 25%) for + Graph. Model, + Social Semantics and + Body parts
Figure: graph of training (Tr) and test (Te) nodes, each with a unary potential, connected by face similarity, upper body similarity and lower body similarity edges