
Activity data
• Tom Franklin, Franklin Consulting
• David Kay, Sero
• Mark van Harmelen, HedTek
• Helen Harrop, Sero
• Rob Moores, LMU

Contents 13.15

Introduction What is activity data? What can it be used for? What are the benefits of using activity data?


What challenges does it raise?



Star-TRAK: NG - activity data in support of student success



Discussion of potential uses - consider use cases that may spark your own ideas



Legal issues



Refreshment break


Building a business case TF / HH How can you build a case to make use of activity data in your institution? Consideration of examples from the 2011 JISC Activity Data programme

OR 15:00

Working with activity data Analyse some exemplar activity data and identify the potential value and challenges of the data



Feedback & Conclusions What does this mean for learning technologists? What you can do next




Using activity data to support your users Tom Franklin University of Manchester and Franklin Consulting

What is activity data?
Short answer: Anything in a log file
Longer answer
● Every log in
● Every search
● Every access of a resource
● Every submission of a document
● Every access of a web page
● Any action in the VLE

It only matters if: We know who you are

Not quite: a proxy for identity will do
Logged in users – we know who you are
IP address – we know you are the same person
● But proxy servers will cause problems

What really matters is the ability to link activities together Look at patterns of behaviour
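As a rough illustration of linking activities together, the sketch below groups log events by whatever identifier is available (a login name or an IP address) and orders each group by time. The record layout, identifiers and action names are hypothetical, not taken from any particular system.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records: (identifier, ISO timestamp, action).
events = [
    ("u1", "2011-10-03T09:00:00", "login"),
    ("u2", "2011-10-03T09:01:00", "search"),
    ("u1", "2011-10-03T09:05:00", "access_resource"),
    ("u1", "2011-10-03T09:02:00", "search"),
    ("u2", "2011-10-03T09:06:00", "access_resource"),
]


def link_activities(events):
    """Group events by identifier and return each group's actions in time order."""
    trails = defaultdict(list)
    for identifier, stamp, action in events:
        trails[identifier].append((datetime.fromisoformat(stamp), action))
    # Sorting the (timestamp, action) tuples orders each trail chronologically.
    return {
        identifier: [action for _, action in sorted(items)]
        for identifier, items in trails.items()
    }


trails = link_activities(events)
```

Each trail is then a candidate pattern of behaviour; with IP addresses as the identifier, a shared proxy server would merge several people into one trail.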

Who is doing it?

What is activity data useful for? Student recruitment Student retention

Research impact Resource management

Student recruitment
● What are potential students doing on your website?
● Where do they go?
● What are their routes?
● Can you tie schools’ IP addresses to UCAS applications?
● How can you use this information to
  ● Support candidates while they are on the site?
  ● Improve the usability of the site?

Student retention

How are they doing?

Demonstrating value
● Library Impact Data Project
  ● Relationship between library use and results
● Using a variety of data
  ● Turnstile activity
  ● Library management system
  ● EZProxy service
  ● Student record system
● Relationship demonstrated (not causality)

[Chart: Book borrowing and degree classification. Labels: number of books borrowed across degree; year of graduation]

Supporting students Exposing VLE data ● Identify patterns of behaviour associated with success ● Identify students who are struggling ● Change academics' attitudes towards the institutional VLE


Research impact Beyond Google Analytics Increase use of the institutional repository Help people find the information that they need People who accessed x also accessed y People who searched for a accessed b

AEIOU Across Welsh Repository Network Recommender system – people who accessed x also accessed y Increase number of articles accessed per session Combine with similarity based recommendations

Resource utilisation What resources are being used? By whom? What are the patterns? Can you support users to make better use of resources? Are the subscriptions optimal?

Using activity data: Challenges
Mark van Harmelen, University of Manchester (Computer Science) and Hedtek Ltd

What you want to do determines the challenges
Two immediately obvious uses for activity data
● To recommend resources
● To provide learning analytics data

Deal with the one challenge specific to recommending resources first, then the general challenges for both uses

Anonymisation Thanks to the Data Protection Act, recommenders must not reveal any Personally Identifying Information Strategy: Remove small data sources ● Items only occasionally loaned / referenced ● Small cohort data

Data collection
Your educational institution has a veritable ‘gold mine’ of information
But can you get to any of the gold?
● Library data, circulation, downloads and turnstile information
● Student registry data: student attendance and academic results
● Diverse system use: VLE derived data!

Dealing with data quantities
Potentially millions of records: maybe not now, but after a few years
Will your databases grind to a halt (or become less responsive)?
SQL databases can deal with loading, but are not good for some recommender purposes, depending on the recommender
NoSQL offers various database technologies that may help: MongoDB is one favourite

Facebook EdgeRank Algorithm
● While RISE used conventional databases to good effect, the underlying data format sometimes needs improvement
● An example of a complex recommender algorithm using social data
For each item and each user compute
$I = \sum_{e} u_e w_e d_e$
where $e$ ranges over the edges and
$u_e$ – affinity score between the viewing user and the edge creator
$w_e$ – weight for this edge type (create, comment, like, tag, etc)
$d_e$ – time decay factor based on how long ago the edge was created
Then rank according to the item score $I$
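A small sketch of the item score described above: the sum, over an item's edges, of affinity times edge-type weight times time decay. The slide does not say how the decay factor is computed, so the form 1 / (1 + age in hours) used here is purely an assumption, as are the class, function names and sample values.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    affinity: float   # affinity between the viewing user and the edge creator
    weight: float     # weight for this edge type (create, comment, like, ...)
    age_hours: float  # how long ago the edge was created


def time_decay(age_hours):
    """Assumed decay form; the source does not specify one."""
    return 1.0 / (1.0 + age_hours)


def item_score(edges):
    """Sum affinity * weight * decay over all edges attached to an item."""
    return sum(e.affinity * e.weight * time_decay(e.age_hours) for e in edges)


# Hypothetical edges for two items, as seen by one viewing user.
items = {
    "item_a": [Edge(0.5, 2.0, 0.0), Edge(0.5, 1.0, 1.0)],
    "item_b": [Edge(1.0, 1.0, 3.0)],
}

# Rank items from highest to lowest score.
ranked = sorted(items, key=lambda name: item_score(items[name]), reverse=True)
```

In memory the calculation is simple; the cost in practice comes from gathering every edge for every item and user from storage.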

 Expensive with SQL, almost trivial with a noSQL graph database like Neo4j

Learning algorithmics • What is it? A method of predicting which students are at risk of failing or leaving education • How does it work? 1. Find activity data that differentiates between students at risk and not at risk (statistical significance needed) 2. In subsequent years, look for similar measures in the activity data

Challenges: find the differentiators!
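One possible way to start looking for differentiators is to compare each activity measure between students who were at risk and those who were not, for example with Welch's t statistic. The sketch below uses only the standard library; the measure names and numbers are invented, and a large statistic would still need a proper significance test on real cohort data before being relied on.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical per-student weekly counts: (at risk, not at risk).
measures = {
    "vle_logins": ([2, 3, 1, 2, 2], [8, 9, 7, 10, 6]),
    "library_visits": ([3, 4, 2, 5, 3], [4, 3, 5, 2, 4]),
}

t_scores = {
    name: welch_t(at_risk, not_at_risk)
    for name, (at_risk, not_at_risk) in measures.items()
}

# The measure with the largest absolute statistic separates the groups most.
strongest = max(t_scores, key=lambda name: abs(t_scores[name]))
```

In subsequent years the same measures can be monitored for current students, as in step 2 above.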

Summary
● Three top challenges
1. Get hold of the data sources you need
2. Ensure that from a collection and a processing point of view you are using the right data storage media
3. With learning algorithmics, find the differentiators

Rob Moores Associate Director IMTS


[Diagram labels: [personal; course; grades]; [timetable; attendance]; Lab PC; Help Desk; [Search; incidents]; [security gates]; STAR-TRAK: NG; Resource Manager; [fees payment]; [AV loans; fines]; [search; loans; fines]; Xstream Portal/VLE; [email, apps]; [usage; assignment status / mark]]

[Diagram labels: Student; Module Tutor; Student Liaison Officer; Other Pastoral; Personal Tutor; Module Leader; STAR-TRAK: NG; Course Leader; Level Leader; Faculty Admin; Subject Group Head; Parent; Corporate Planner; (with student permission)]

[Diagram labels: Web Service Interface; Leeds Metropolitan University Network; Eclipse BIRT Analytical Reporting; SOAP Opera 2 Data Warehouse; Talend Open Studio Extract Transformation Loading; Star Trak NG Staging Tables; On Premises BizTalk Server 2010; BizTalk Enterprise Service Bus Toolkit 2010; BizTalk Server 2010; Banner Student Information System; CMIS System; VLE-Blackboard System; PC-Availability System; IPrint System]

INTERIM: Qualitative: Feedback on perceived value from students and staff through focus groups
PILOT: Qualitative / Quantitative: Feedback from staff and students on actual value through focus groups; usage statistics
LONG-TERM: Quantitative: Analysis of retention rates; NSS scores

Activity Data Use Cases [David Kay, Sero Consulting] Student recruitment Student retention Research impact Resource management More?

1. Student recruitment 2. Student retention 3. Student choice

4. Process improvement 5. Systems optimisation 6. Teaching & Learning quality 7. Research impact 8. Research collaboration 9. Resource recommendation 10. Resource management

First order challenge – scenario definition
Objective: Library resource recommendation (Student)
Benefit: Direction to useful, value added and available print and electronic resources
Input data: Student course / unit, relevant reading list, peer loans
Output info: List of options <Students on your unit who borrowed this borrowed this before / next>

Second Order Challenges

Questions
● Quality of data (veracity, completeness, etc)
● Critical mass of data
● Applications involved
● Local or above campus
● Legal & corporate compliance
Opportunities
● Platform approach
● Available and emerging synergies
● User contribution

Legal issues Three key ones ● Data protection ● Freedom of information ● Sharing data

Also ● Licensing

Building a business case Concept useful whenever asking for funding Applies to bidding for funds eg to JISC Useful project preparation

Building a Business case: Introduction Who is it for? ● Write it in their language ● Write to their knowledge ● You are selling your project to them

Building a business case: Options For each option: ● What is it? ● Benefits of each approach ● Costs of each approach ● Summary of reasons for accepting / rejecting each

Building a business case: benefits Benefits to the funder Benefits in terms that are meaningful to them: ● Money ● Student retention ● Library usage ● …..

Building a business case Costs Project plan Risks Risk Data formats incompatible Sued for breach of privacy

Owner IT manager

Probability Low

Cost 5 days

Data protection officer

Very low



Amelioration Map between formats Ensure agreements are in place and signed by students
