
13 June 2011

Today’s Tabbloid PERSONAL NEWS FOR iamhelenharrop@yahoo.co.uk



All packed and ready to set off to #gregynog2011 in the morning to talk about #jiscad #lidp http://bit.ly/m3fDvy http://bit.ly/m9sVXF

RT @daveyp: Yays! The #jiscad #lidp Project will be presenting at #ili2011 along with the #jiscad SALT Project :-)

JUN 12, 2011 08:11P.M.

JUN 09, 2011 09:12P.M.



RT @JISC_RSC_YH: Usage statistics provide insight into resources: [...] http://bit.ly/mm35iX [#jiscad #raptor]

RT @mimasnews: RT @shaunlawson nice blog post on #opendata initiatives in UK universities http://t.co/BB3G9Tm includes @unilincoln [#ukdiscovery #jiscad]

JUN 10, 2011 02:31P.M.

JUN 09, 2011 07:03P.M.





@librarygirlknit yes, tricky with shorter projects! We're helping evaluate one of the #jiscad projects and it's a bit of a rush

Ran a focus group this morning for #lidp #jiscad - lots of interesting information came out of it! 45 mins of transcribing to do though...

JUN 09, 2011 05:00P.M.

JUN 09, 2011 01:44P.M.


Yays! The #jiscad #lidp Project will be presenting at #ili2011 along with the #jiscad SALT Project :-)


RT @andymcg: Really interesting initial exploration of the openurl router dataset by @psychemedia: http://bit.ly/kP5G0D #jiscad

JUN 09, 2011 02:25P.M.


@librarygirlknit good luck with the transcribing, could take a while! Interested to hear what came out of focus group #jiscad #lidp


RT @shaunlawson nice blog post on #opendata initiatives in UK universities http://t.co/BB3G9Tm includes @unilincoln [#ukdiscovery #jiscad]

JUN 09, 2011 01:45P.M.

JUN 09, 2011 12:29P.M.




• Session time - 30 mins? Consensus that this is about right.

RT @shaunlawson nice blog post on #opendata initiatives in UK universities http://t.co/BB3G9Tm includes @unilincoln [#ukdiscovery #jiscad]

• Added value: Ability to produce reports from aggregated data – per institution/per item/per selection of items – such as for SCONUL and SUSHI reporting.
• Google search (restricted to the 12 repositories) based on the item metadata could be added to suggest ‘similar items’ alongside recommended items.


RT @pstainthorp: Shared: Another Blooming Look at Gource and the Edina OpenURL Data: [...] http://t.co/y2zfN8L [#jiscad]

JUN 09, 2011 06:42A.M.


AEIOU First Focus Group meeting

JUN 08, 2011 12:51P.M.

JUN 09, 2011 03:52A.M. Yesterday we held the first AEIOU Focus Group meeting to demonstrate the Recommendation service, and it was trialled by the Group for the first time. Four of the six core repositories had the service deployed - and the remaining two will be available shortly once some upgrade issues are resolved. The Focus Group searched on a provided list of words and it was fascinating to see the recommendation service start to work - with user activity triggering recommendations across repositories as users moved from one search item to another. There was quite a bit of ‘noise’ to begin with but gradually the viewed item lists started to become meaningful. An online questionnaire was provided and participants were asked to complete this before leaving the meeting (http://www.surveymonkey.com/s/AEIOU1). Feedback on the testing exercise and the recommendation service was very positive and group discussions led to several useful suggestions on how the service could be improved to provide added value. See below for key suggestions:


.@andymcg Another Blooming Look at Gource & Edina OpenURL Data http://bit.ly/kAqZtp (@daveyp tried out gource last night too;-) #jiscad JUN 08, 2011 10:05A.M.

• Identify repository an item is from - should include this in results
• Views vs downloads - consider weighting, i.e. number of times viewed
• Number of recommendations. Differences of opinion on this – 5 or 6 seem about right.
• Location of recommendations on item page - above or below? Maybe useful to have a link at the top of the record to take you to the recommendations at the bottom of the record.
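The weighting and list-length suggestions above could be prototyped along the following lines. This is only a sketch: the weights (1 for a view, 3 for a download), the function name `top_recommendations` and the sample item identifiers are assumptions made for illustration and are not taken from the AEIOU service.

```python
from collections import defaultdict

# Hypothetical weights: a download is assumed to signal more interest than a view.
VIEW_WEIGHT = 1.0
DOWNLOAD_WEIGHT = 3.0


def top_recommendations(events, limit=5):
    """Rank items by weighted activity and return the top `limit`.

    `events` is an iterable of (repository, item_id, kind) tuples, where
    kind is "view" or "download". The repository name is kept in the
    result so it can be displayed next to each recommendation.
    """
    scores = defaultdict(float)
    for repository, item_id, kind in events:
        weight = DOWNLOAD_WEIGHT if kind == "download" else VIEW_WEIGHT
        scores[(repository, item_id)] += weight
    # Highest score first; ties are broken by key so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(repo, item, score) for (repo, item), score in ranked[:limit]]


# Made-up sample data, for illustration only.
sample_events = [
    ("repo-a", "item-1", "view"),
    ("repo-a", "item-1", "view"),
    ("repo-b", "item-2", "download"),
    ("repo-b", "item-3", "view"),
]
recommendations = top_recommendations(sample_events, limit=2)
```

Setting `limit` to 5 or 6 would match the list length the group thought about right.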





@psychemedia Mark van Harmelen wrote a follow up to your openurl post using ruby to do a similar analysis http://bit.ly/kyfjwC #jiscad

RT @psychemedia blogged: Visualising OpenURL Referrals Using Gource http://bit.ly/lrLpIG #jiscad JUN 07, 2011 04:02P.M.

JUN 08, 2011 09:56A.M.


RT @psychemedia Animated snapshot of edina openurl usage, generated using #gource http://youtu.be/U8XYEfNdxvk #jiscad


Advertisement at AG Townhall JUN 07, 2011 05:53P.M. An Access Grid Townhall meeting was scheduled for Tuesday, June 7 at 10:00AM Central (15:00 UTC); it is one of a series held globally on the first Tuesday of each month. This is an open forum for the community to air issues and an opportunity to present new applications and features.

JUN 07, 2011 02:13P.M. A section describing the early stages and resulting PDFs of the recent Activity Data results for the UK Access Grid streams was presented very briefly. Some of the early outcomes were described, and questions were asked about the kind of feedback that is expected from the user community. The points raised mainly concerned planning and proving usage, as well as the possibility of predicting future trends.


RT @ojleaf: For BCH, Shropshire, Staffordshire a CCNC open licence has not been a barrier to all the partners to CG #oc2011 [#jiscad]

There were (semi-)technical questions on what should be stored in the AG log files in future, including whether detailed IP numbers or MAC addresses were needed or desirable. An open issue of IPR and privacy was also raised. The discussion is to be continued at a follow-up event after user feedback. Thanks to those present; we will be back.

JUN 07, 2011 11:40A.M.





RT @andymcg: Really interesting initial exploration of the openurl router dataset by @psychemedia: http://bit.ly/kP5G0D #jiscad

• Jodrell Bank – A node which used to be based at Jodrell Bank, but is now used by the astronomers based in the University of Manchester.
And an initial list with Physical Room Nodes that need linking:
• MIMAS – A physical node used by MIMAS.
• General Breakout room – A small node used by Research Computing Services. One of the key nodes used by the NGS.

JUN 07, 2011 11:09A.M.

There has been some further interest from other parties and these will be invited, as long as they give permission, indirectly, for their user communities – this includes some of the NeISS (National eInfrastructure for Social Simulation) partners. These user bases cover groups involved with a range of university activities: teaching, administration and research practice. The next stage is to invite the test-users to more structured but informal meetings, with the following three-stage process planned:


Really interesting initial exploration of the openurl router dataset by @psychemedia: http://bit.ly/kP5G0D #jiscad

Part a: Present a “Pepys’ diary” report for their virtual or physical room as well as consider cross-linked data.
Part b: Highlight points of “Serendipity” that may inspire the users.

JUN 07, 2011 10:58A.M.

Part c: Process of gathering informal feedback that is to be documented.


From initial comments and emails we predict the main reasons for exploiting this data are in statistics for future planning and quality control, but also secondary reasons will be in correlating student attendance, and providing statistics for the GreenIT initiatives. A second or combined post is planned to show how this engagement results in actual outcomes or, as one other project puts it, from “being useful to being used”.

Post 3 – Users “AGtivity” JUN 07, 2011 10:24A.M. The AGtivity data set is starting to produce usable automated reports that are finding items of ‘serendipity’ in the analytics stage. The next stage of the project is to seek informal evaluation from users, and a process based on small meetings is proposed.
“A user focused post talking about the use case (user requirements) for the project work, how the project affects your users and how users are being engaged and reacting to the project”
Initial list of test-users with combined Physical Room Nodes with specific Virtual Venues:
• Computer Science (Room1:10) – A publicly available node managed by computer science at the University of Manchester which was the main e-Science node.
• MAGIC – A virtual node for the mathematics department which is used by mathematics postgraduate students for the TCC maths




or email interview.

RT: @psychemedia: Learning Analytics conference 2012 in Vancouver. Call for papers: http://lak12.sites.olt.ubc.ca/ #jiscad

3) We believe we can change our academics’ attitudes towards the institutional VLE, by providing clear presentations including visualisations of activity information. We plan to test this by experimenting with different presentations of elements of our activity information to establish what the most effective and engaging presentational formats are. We will survey academics at the start and end of the project to measure their attitudes towards the VLE, which should allow us to measure our results.

JUN 06, 2011 08:24P.M.

4) We think a comparison of VLE usage information across the universities of Cambridge, Oxford and Hull should prove valuable to the sector, as we may be able to identify similarities and differences in VLE usage which may inform future consideration of the transferability of VLE project results and concepts across institutions.


Day two of the RISE evaluations. Interestingly, so far students think recommendations should be more prominent. #ourise #jiscad


Data visualisation JUN 06, 2011 12:37P.M. We’re looking at a couple of tools here: BIRT and Pentaho, both of which have free business visualisation software packages. We’re hoping that they can offer us more than you can get from Excel pivot tables, and be easier to set up than a bespoke solution involving some PHP and graphing software.

JUN 06, 2011 03:52P.M.

This isn’t as straightforward as you might imagine. Raad’s been working on setting up a Pentaho instance for most of the last week, and hasn’t yet managed to get a significant improvement on what Excel provides, though it’s taken considerable effort to get this far. Pentaho requires various modules to be installed, but its documentation is rather incomplete, especially the documentation for creating aggregate tables. Aggregate tables are essential when dealing with large volumes of data: we have over 10m rows of Sakai event data, so without aggregate tables, every time we try to look at a large section of the dataset, we run out of resources. So thus far, our suggestion would be that if you want business information software, you may be better off paying for a commercial product.
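To illustrate why aggregate tables matter here, the sketch below pre-computes daily counts once so that reports can read a much smaller table instead of scanning every raw event. It uses Python's built-in sqlite3 module as a stand-in and is not Pentaho's own aggregate-table mechanism; the table and column names (`events`, `daily_event_counts`, `event_date`, `site_id`) are hypothetical and are not the real Sakai schema.

```python
import sqlite3

# In-memory database with a tiny, made-up sample of raw event rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, site_id TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2010-10-01", "site-1", "content.read"),
        ("2010-10-01", "site-1", "content.read"),
        ("2010-10-01", "site-2", "search"),
        ("2010-10-02", "site-1", "content.update"),
    ],
)

# Build the aggregate once; later reports query this smaller table
# rather than grouping millions of raw rows on every request.
conn.execute(
    """
    CREATE TABLE daily_event_counts AS
    SELECT event_date, site_id, event, COUNT(*) AS n
    FROM events
    GROUP BY event_date, site_id, event
    """
)

rows = conn.execute(
    "SELECT event_date, site_id, event, n FROM daily_event_counts "
    "ORDER BY event_date, site_id, event"
).fetchall()
```

With real data the aggregate would need rebuilding or incrementally updating as new events arrive, which this sketch does not attempt.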


Hypotheses JUN 06, 2011 12:40P.M. We have four hypotheses we want to test:

1) Senior stakeholders in our VLE would like richer information about VLE/VRE usage, so that we can show growth potential, whether across the campus or in specific faculties or departments. We will test this by presenting the visualisations of our activity information to our Centre’s management committee (or equivalent decision-making committees) and gathering their responses to the information, as well as obtaining their opinion on whether a case is made for a change in investment level.

2) We aim to identify ‘usage signatures’ which indicate either skilled use of the VLE, or users who may be struggling, but who do not ask for help. In the former case, we’d like to share what they’re doing; in the latter case, we’ll look at the relation between access to our help documentation and our helpdesk tickets. We will test this by correlating a usage signature with the reported experiences of academics, gained via phone




degree course are tagged with metadata on creation. This is a relatively new procedure, so only sites active from October 2010 are tagged. However, signalling that a site is no longer in active use for teaching (because a new site has been created for the new academic year, for example), is harder. The case I just mentioned can be done by editing the metadata, because we will have information that there is a new site; but if a lecture course has been discontinued, we can’t currently update that. For other sites, we have to rely on manual inspection. What is the site called? How many people are in it? What documents are stored there? Which tools does it use? From this information, we can usually deduce what the site is used for.

The Data JUN 06, 2011 12:34P.M. We’ve just started work on our JISC project on Exposing VLE Activity Data. First, we’ve had to get our data (first, catch your rabbit..), from when we started using CamTools (our current institutional VLE) to December 31 2010. This involved retrieving archived data, which didn’t go as smoothly as we’d hoped. We had to do some restoration of corrupted data, and we’re missing about two weeks of data as well. This just illustrates the problems of dealing with data that’s collected but not looked at very often.

Does a site’s purpose change? There are two aspects to this question: does a site, for example a smallgroup teaching site, turn into something else - perhaps a research site, or a site for that teacher’s lecture course? Or, does someone set up a site expecting it to be used in one way (putting in certain tools), and it turns out to be used in another? The former is difficult to determine. All we can do is examine a site and say that it was being used in a particular way at a particular time, unless we can find out particular ‘signatures’ which denote the type of a site (at the moment, we don’t know whether sites would have distinctive signatures). The latter may be more amenable to analysis, in two ways. One, we can look at tool usage: tool X was added in 2008, but was never used, tool Y was added in May 2010, and has some hits. Two, we can conduct interviews with site owners, and ask them what they thought they were going to do, and what they actually found. (This does have the problem that people’s memories may be unreliable, but we can check what they say against the data we hold about their site.)
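The tool-usage check described above (a tool that was added but never used, versus one that has some hits) might be sketched as follows. The function name, the tool identifiers and the exact dates are placeholders chosen for illustration; the real CamTools data will be structured differently.

```python
from collections import Counter
from datetime import date


def tool_usage_summary(tools_added, tool_events):
    """Combine 'when was each tool added' with 'how often was it used'.

    tools_added: dict mapping tool name -> date the tool was added to the site.
    tool_events: iterable of (tool name, event date) pairs taken from the log.
    """
    hits = Counter(tool for tool, _when in tool_events)
    return {
        tool: {"added": added, "hits": hits.get(tool, 0)}
        for tool, added in tools_added.items()
    }


# Placeholder data echoing the example in the text: one tool added in 2008
# and never used, another added in May 2010 with some hits.
tools_added = {"tool-x": date(2008, 1, 1), "tool-y": date(2010, 5, 1)}
tool_events = [("tool-y", date(2010, 5, 12)), ("tool-y", date(2010, 6, 3))]

summary = tool_usage_summary(tools_added, tool_events)
never_used = sorted(tool for tool, info in summary.items() if info["hits"] == 0)
```

A list like `never_used` could then feed the interviews with site owners, as a prompt for asking what they originally intended.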

The kinds of data we’ve collected are all the events from the Sakai event table. Sakai is the underlying software that powers our VLE (Virtual Learning Environment). Its event table contains details of software ‘events’ - something that’s happened. Typical events are things like ‘content.read’ (someone’s read some content), ‘content.update’ (someone’s updated some content) or ‘search’ (this is probably easy to work out!). We’ve also collected data about who’s visited which web pages inside Sakai, when they did it, and which web browser they were using at the time - more typical access log data for web pages. Now that we’ve got all this data from our logs, we need to make sure it’s in a format where we can process it, to find the answers to some of our questions about how the VLE is used. However, we may also want to collect other, ‘softer’ data, such as what each area of the VLE is used for (teaching, research, admin, or something else), and why it’s used. This will require more human input, whether by examining individual subsites of the VLE, questionnaires or interviews.

General Observations on what the limitations of the data are

These kinds of approaches allow us to augment the automatically collected data from the past four years of running the VLE.

As mentioned above, we mostly can’t determine what a site is used for, other than by human inspection. The exception to this is sites designed to support lecture or degree courses, for which we maintain a list. So while we may be able to track usage patterns for an individual site, we can’t easily do so for a set of related sites, unless we define the relation manually.


Another unTabbloid news digest JUN 06, 2011 06:27A.M. ‘My personal battle with Tabbloid’ seems to be emerging as an overriding theme for my synthesis posts and, alas, this/last week was no different. As before, you can produce a Five Filters pdf newspaper on the fly but I noticed that not everything is showing up and only the last few tweets are included (seemingly due to a glitch with the Twitter Atom feed). Happily there is a Twapper Keeper archive of the #jiscad tweets which you can browse through as a supplement to the projects’ blogposts. Perhaps next week I’ll print out all the new blogposts and tweets and handcraft my own news digest from paper and glue, before scanning it in and adding all the necessary hyperlinks ... let’s hope it doesn’t come to that. Anyhow, here are some highlights I’ve drawn out from the last couple of weeks:

We’ve observed sites being used for: teaching, research, administration, social activities and testing (using sites as a sandbox to try things out before updating a site that’s already being used by students or researchers). More specifically, we’ve seen sites used for teaching lecture courses, whole degree programs, small-group teaching, language learning. We’ve seen sites used to organise research projects, from PhD theses up to large international collaborations. CamTools has been used to administer part of the college applications process, and for university and college societies, and to organise conferences. But unless a human looks at a site, we’ve got no way of deducing this from the data (we don’t capture extensive metadata on site creation).

The OpenURL Project announced the release of OpenURL data under the ODC-PDDL license with an ODC-by-SA Attribution clause. Full details of the data the project has released, and the data itself, is

So, how do we categorise a site? Currently, sites which are associated with a specific lecture course or a



available on their website.


“Recipe” for Reporting

The OU RISE Project are holding a one day ‘Innovations in Activity Data’ workshop at the Open University in Milton Keynes on 4th July. The day includes presentations from the RISE, SALT and LIDP projects and a presentation from Tony Hirst on visualizing activity data.

JUN 05, 2011 09:14P.M. Creating a recipe to produce a basic activity diary report for your own Virtual Venue.

Over the past few weeks the OU RISE team have been knee deep in early user feedback, ahead of their main evaluation activity which is planned for July. Firstly in the form of feedback from focus groups which were asked about the usefulness of recommendations (as part of a wider OU Library search evaluation). The feedback gathered suggests that the provenance of recommendation was key in determining its usefulness. The results also suggest that there are differences in how provenance is judged depending on a student’s level of study. Secondly, the results of an ongoing user survey which they’ve been analysing. The results are looking encouraging so far but are also raising more questions for them to delve into.

1. The data log file needs to be accessed from the servers
2. … and then a parser has been constructed in C/C++
3. To create a report a few tools are used and need installing:
• Python (2.5 should be fine but it has been tested using 2.7) http://www.python.org/
• Python Image Library http://www.pythonware.com/products/pil/

The UCAID project have written an interesting post on how their project differs from the rest of the Activity Data projects. Namely, that their focus is on making activity data available to individual users for their own benefit. It got me thinking about the spread of personal vs public motivations and benefits behind the projects and whether the focus sometimes drifts too quickly towards the open data agenda. Maybe we could do with thinking deep thoughts about the Drucker principle / Pearson’s Law which states that if something is measured then it improves - this seems particularly apt for those projects who have a keen interest in building data visualizations. Those are just some off the cuff ponderings from me - I might try and corral them into a future blogpost.

• ReportLab modules to produce PDFs http://www.reportlab.com/software/opensource/rltoolkit/download/
• Gnuplot http://www.gnuplot.info/
4. Once the directories are edited a simple instruction can be carried out:
• python venuereport.py [start after [end before]] e.g. python venuereport.py MAGIC "01/01/11 00:00:00"

LIDP team discuss how they’ve been tackling one of their project’s big issues and liaising with JISC Legal to ensure that they’re complying with legal guidelines around accessing and releasing data. On a related note, the Discovery initiative recently published the timely ‘Licensing Open Data: A Practical Guide’ [pdf] by Naomi Korn and Professor Charles Oppenheim.

Example use is to look at the test-room:
C:\data\agp\pyrep>......\Python27\python.exe venuereport_mt.py "AGSC Test Room" "01/09/10" "1/05/11"
See the first two pages of an eight-page report on cross-usage (anonymised). As shown, in the last couple of semesters this has been used nearly 3000 times for a lot of short sessions – exactly what a test room is likely to be used for.
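The start and end arguments in the commands above look like day-first dates, with or without a time part. The post does not show how venuereport.py parses them, so the following is only a guess at one way to do it: the day/month/year order is inferred from the "last couple of semesters" remark (1 September 2010 to 1 May 2011), and the helper names `parse_report_date` and `in_window` are hypothetical.

```python
from datetime import datetime

# Assumed day-first formats, with or without a time part.
_FORMATS = ("%d/%m/%y %H:%M:%S", "%d/%m/%y")


def parse_report_date(text):
    """Parse strings such as "01/01/11 00:00:00" or "1/05/11"."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised date: %r" % (text,))


def in_window(timestamp, start_after=None, end_before=None):
    """True if timestamp lies strictly between the optional bounds."""
    if start_after is not None and timestamp <= start_after:
        return False
    if end_before is not None and timestamp >= end_before:
        return False
    return True


start = parse_report_date("01/09/10")
end = parse_report_date("1/05/11")
session_time = datetime(2010, 12, 1, 14, 30)
session_in_range = in_window(session_time, start, end)
```

Both bounds are optional here to mirror the optional arguments in the usage line.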

Lastly, news from the e-research JISC programme in the shape of a press release about the RAPTOR project which has just released v0.1 of their eresources usage statistics tool.



and internal policies such as our proposed Student Attendance policy. So we have useful and usable – let’s see if we can get used …


robmoores JUN 04, 2011 05:44P.M. Time at last to catch up with blogging about STAR-Trak ! First of all, the project is in a healthy state and on track. It’s been a quiet time in some respects as the development team works its way through the requirements. We applied MoSCoW principles to the requirements to get a first cut of what we could achieve with the time and budget available and then divided them into four timeboxes. Time boxes 1 and 2 have been delivered (with one or two exceptions), 3 is almost complete and 4 has completed the technical design. TB1 has been tested and the results fed back for correction. The main exception to date is the development of a chart interface to show activity data. Our initial proof of concept project listed and aggregated this data in a very static and unattractive way. The development team proposed a Flash-based solution for this project but this was rejected due to compatibility issues. There are plenty of AJAX-based solutions that we can base our chart on and the team are investigating the best way to implement this.

A use-case for this one is to spot those nodes always using the test room but never reporting a fault.
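That use-case amounts to a simple set comparison, sketched below. "Always using" is approximated by a minimum session count, which is an assumption, as are the function name and the node names in the sample data.

```python
from collections import Counter


def silent_test_room_users(sessions, fault_reports, min_sessions=3):
    """Nodes seen often in the test room that have never reported a fault.

    sessions: iterable of node names, one entry per test-room session.
    fault_reports: iterable of node names that have reported a fault.
    min_sessions: hypothetical threshold standing in for "always using".
    """
    counts = Counter(sessions)
    reporters = set(fault_reports)
    return sorted(
        node
        for node, n in counts.items()
        if n >= min_sessions and node not in reporters
    )


# Made-up, anonymised sample data.
sessions = ["node-a"] * 4 + ["node-b"] * 5 + ["node-c"]
fault_reports = ["node-b"]
frequent_but_silent = silent_test_room_users(sessions, fault_reports)
```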


robmoores JUN 04, 2011 06:03P.M. There’s a saying that for an application to add value it has to be useful, usable and used. Our requirements gathering involved teaching staff from the Faculties and Personal Tutor network, administrative staff from registry services and student support services, and the Students’ Union. If we’ve done our job properly then the application should be useful.

As we have fleshed out the intricacies of the requirements there has been some swapping of requirements between timeboxes, re-prioritisation, and de-scoping of some lower priority functions. This was always going to be more of an agile than a classic waterfall development as we are charting new waters, at least for Leeds Metropolitan, and so far we are very confident that we will deliver a very useful and usable application. The next challenge will be to ensure that it is used !

We have spent a lot of time designing the application to be as simple and friendly to use as possible. For example the user can select the screen they see on login to save a click. Maintenance of relationships between staff, course/modules and students could be a nightmare, so we have built in what we are calling “social controls” rather than hard controls. This means for example that I can identify myself as a module tutor, rather than this having to be maintained by an administrator. The social control aspect is enforced by everyone being able to see that I have claimed this role, and allowing students to block any member of staff from having access to their details. It will be interesting to see how this control model pans out in the trial. We are now working on the third aspect: getting the application used. Since we submitted our initial bid we have had a regime change which resulted in us losing our main senior sponsors. We have worked behind the scenes to maintain the ground-level swell of support for STAR-Trak and have recently started to push STAR-Trak back onto the corporate agenda. As a result we have been asked to submit a paper to the Vice Chancellors Group containing a proposal for an extended trial of STAR-Trak across a full academic year. As other commentators have pointed out, tracking activity data is not usually considered a corporate priority, particularly when HEIs are struggling to cope with very different market conditions. However we have demonstrated strong links with our strategic plan, compliance requirements such as the UK Border Agency





RT @OU_Library: Join us at the Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad

RT @OU_Library: Join us at the Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad (via @LizMallett)

JUN 04, 2011 05:21P.M.

JUN 03, 2011 01:53P.M.


RT @OU_Library: Join us at the Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad #lidp


Join us at the Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad (via @LizMallett)

JUN 03, 2011 03:20P.M.

JUN 03, 2011 01:53P.M.


RT @OU_Library: Join us at the Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad JUN 03, 2011 01:55P.M.





Starting student evaluations of the RISE recommendations system today. #ourise #jiscad



JUN 03, 2011 01:48P.M.


An invitation to Innovations in Activity Data workshop, 4 July, Milton Keynes http://tinyurl.com/6hq6vdb #ourise #jiscad

JUN 03, 2011 01:44P.M.


Innovations in Activity Data workshop

JUN 03, 2011 01:38P.M. Innovations in Activity Data workshop 4 July 2011 The Open University, Milton Keynes

Outline A one-day workshop aimed at Higher Education library services who are interested in practical applications of activity data, what can be collected, how it can be used, visualised and presented. The workshop will be an opportunity to hear from library projects working on the JISC Activity Data programme and from practitioners working in this area.

Who should attend? Librarians, leaders and managers, practitioners, developers and advocates from academic libraries, who want to understand the potential of activity data to shape, guide and improve services, to inform users and to deliver innovative new services.

10.15am Welcome and Introduction – Nicky Whitsed, Director of the Open University Library Services

What activity data can you use? Examples from JISC Activity Data programme
10.30am SALT project ‘Surfacing the Academic Long Tail’ – MIMAS, Joy Palmer, Janine Rigby
Coffee break

How can you use the data?
11:30am RISE project ‘Recommendations Improve the Search Experience’ – Open University, Richard Nurse
12.10pm LIDP project ‘Library Impact Data Project’ – University of Huddersfield, David Pattern (via video)
12.50pm Lunch break

What are the challenges around activity data in libraries?
1.50pm World Café style workshop exercise – RISE
- What data?, How much?, Where is it?, How do you get at it?
- What to do with it?
- What are the challenges?
2.40pm How can you visualize activity data? Tony Hirst, Lecturer, Department of Communication and Systems, Open University
3.10pm Tea break
3.30pm Wrap-up session – JISC Activity Data Synthesis project, David Kay, Sero Consulting
4.00pm

Location Christodoulou meeting rooms, The Open University, Walton Hall, Milton Keynes
Date 4 July 2011

Register by email to: RISE-Project@open.ac.uk

Cost Free to attend. Refreshments will be provided.





Presentation from JISC Activity Data Online event 2 June 2011

RT @richardn2009: May update for RISE project blog, on evaluation, feedback & openURL data http://bit.ly/m6Cb8X #jiscad #ourise [#openurlrd]

JUN 02, 2011 04:39P.M. Rise presentation for jisc online mtg 2011 06-02 View more presentations from Richard Nurse.


ADS Programme Virtual Meeting 2 – an AGtivity Perspective

JUN 02, 2011 01:26P.M.

JUN 02, 2011 03:48P.M.

May update for RISE project blog, on evaluation, feedback and openURL data http://bit.ly/m6Cb8X #jiscad #ourise


Used WebEx this time: a bit crackly at the start but it seemed to work for some; note that it is ActiveX based so needed installation. The main theme presented moved away from books and libraries to consider e-Resources that (may) have separate issues. (Significance, Context, Coverage, Purpose)
• Q. Is knowing a particular router that was used a useful bit of information?

JUN 02, 2011 01:22P.M.

• Q. Can this be better as individual sub-uses can now be measured?
• There was a concern about who the active ‘customers’ for a particular data set are – this agrees with the AGtivity experience.


May update

Ideas: is there a better list of data types – c.f. openurl http://openurl.ac.uk/doc/data/data.html (about 250,000 requests per month)? Should we just use Dublin Core and other base standards for sharing to create a minimum set? Link mentioned: http://openurlquality.niso.org/

JUN 02, 2011 01:21P.M. After the flurry of technical activity earlier in the project May has been a quieter month that we’ve spent mainly arranging the evaluation work that starts in June, and looking at some of the early feedback from the on-going user survey.

Still a vote for EVO from Caltech to be considered as an alternative: http://evo.caltech.edu/

User evaluation work
Any research with students at the OU has to be approved by an ‘ethics’ committee, the Student Research Project Panel. So we complete a fairly lengthy template that outlines the research we plan to do, who we will involve, how we will go about the research and what Data Protection processes we have in place. That goes off to the panel for assessment and all being well you get approval for your evaluation activity. At the OU, apart from dealing with the ethical basis of the research the process also acts to regularise the contacts with students so they aren’t deluged with requests and emails. As a distance learning institution a lot



of contact with students is by email so it’s important that students can control the amount of material that is sent to them as the pace of study can be intensive. So students can opt in to being available for research of this type. Once the project is approved then we get sent a list of contact details for the students we are allowed to contact to take part in the evaluation. For RISE we’ve had quite a good response and have been able to arrange the first few one-to-one interviews starting tomorrow. We’ve also had people saying that they are interested in checking out MyRecommendations online and will complete the feedback.

Feedback so far
When we set up the RISE interface we added a feedback link to a survey using SurveyMonkey. This has allowed us to collect some user responses more immediately. The second type of recommendation, which tries to relate articles you’ve viewed with similar articles by postulating that there is a relationship between articles that a user views sequentially, shows that 50% thought them to be Very or Quite Useful, but with a larger number seeing them as not useful. There does seem to be a ‘marmite’ effect where recommendations are either relevant or not. That could be down to the quantity of recommendations data as RISE currently relies on data collected since the interface went live.
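The idea behind the second type of recommendation, that articles a user views one after another are related, can be sketched as a count of consecutive pairs. This is an illustration of the general technique only and not the RISE implementation; the function names and the sample log are invented, and treating the relationship as symmetric is an assumption.

```python
from collections import Counter, defaultdict


def build_sequential_pairs(view_log):
    """Count how often two articles are viewed consecutively by one user.

    view_log: iterable of (user, article) tuples in viewing order.
    """
    last_seen = {}
    pairs = defaultdict(Counter)
    for user, article in view_log:
        previous = last_seen.get(user)
        if previous is not None and previous != article:
            # Assumption: the relationship is counted in both directions.
            pairs[previous][article] += 1
            pairs[article][previous] += 1
        last_seen[user] = article
    return pairs


def related_articles(pairs, article, limit=5):
    """Return the articles most often viewed next to `article`."""
    ranked = sorted(pairs.get(article, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _count in ranked[:limit]]


# Invented sample log: two users, three articles.
view_log = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u1", "C"),
]
pairs = build_sequential_pairs(view_log)
related_to_a = related_articles(pairs, "A")
related_to_b = related_articles(pairs, "B")
```

With little data the counts stay small and noisy, which is consistent with the point above about the quantity of data collected since the interface went live.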

RISE feedback People on your course viewed So we’ve asked questions about each of the different types of recommendations that we are providing to get people to tell us how useful they are. For course recommendations i.e. ‘People on your course viewed’ more than 40% saw them as Very or Quite useful. It should be noted that if you aren’t on a course you don’t get any course recommendations so that should account for the 33% who said ‘Not applicable’. Course recommendations are based largely on the EZProxy logfiles so have the largest amount of data to draw on.

The third type of recommendation relates to the search terms that are used and the articles viewed. Agian 50% saw these are Very or Quite useful, but a smaller percentage saw them as Not useful. Again these recommendations are being powered by search terms entered into the RISE interface as we don’t have the search terms used for the EZProxy data.




e-resource usage tracker: RAPTOR http://bit.ly/ig1trI #jiscad JUN 01, 2011 04:59P.M.


Activity Data Justifying Expenditure JUN 01, 2011 04:59P.M. From a JISC project quote ‘Martyn Harrow, director of information services at Cardiff, said: “The strength of the RAPTOR tool at a time when education budgets are being squeezed is in providing the evidence needed for academic schools to assess the e-resources subscriptions that are in place. Universities using this system will be able to prove the impact of the e-resources they provide, and ensure that they continue to deliver the best possible value for money for individual academic schools and entire institutions.” ‘

The final question we asked was to try to understand a bit more about the relevance and quality of the results. Here there was a much more definite Not Relevant response at 42% but 50% saw the results as Very or Quite relevant. So again a bit of a ‘marmite’ response that bears more detailed investigation to understand why.

EDINA OpenURL data openly released

A few days ago came the great news that EDINA have released their OpenURL data http://openurl.ac.uk/doc/data/data.html So we’ve been having a look at the data to see how it could help us with RISE recommendations. The size of the dataset, at nearly 300,000 rows, is larger than we have with RISE and, although there aren’t any search terms included, we think there are ways that we can use it with RISE, so we have scheduled some time to set up a RISE parser to ingest the data and test it later this month. A great example to us all though, and it will be interesting to see what can be done with the data.

Agree with this to some extent, although it can take time to gather the evidence. Presented is an example from our activity data sets (AGtivity). The following graph shows links between two video conferencing systems, EVO (from Caltech) and AG/IOCOM, displayed in terms of people/hours per meeting. It indicates that over the previous year some experimental tests were performed (short spikes), and also larger demonstration workshops; then this year (after the Xmas break) there has been more regular usage.

Creating ‘bridges’, servers that link systems together, is not only useful but can be monitored to see what kind of take-up there is. Admittedly



this is preliminary and it will be useful to monitor usage over the next year.


Technical Approaches MAY 25, 2011 05:03P.M.

JISCAD - TWITTER SEARCH

There are numerous technical challenges and milestones relating to the RISE project; this post aims to address some of the most important aspects.

RT @psychemedia: RT @mhawksey: Consent Management: Handling Personalisation Data Lawfully /via @jisclegal http://bit.ly/kN5Wbu [thanks] #jiscad #devcsi

Database Structure

The database’s design is probably the most important aspect of the RISE Project’s development. As a heavily used source of data in the system, it has to be capable of generating the recommendation types required as quickly and efficiently as possible. The data in this database initially came from an archive of access logs generated by EZProxy, but it also has to accommodate data generated by the MyRecommendations web service going forward. The database’s schema, which can be found on the Technical Resources page, is therefore designed to accept and use data from all of these sources effectively.

MAY 31, 2011 02:44P.M.

Throughout development it has also been imperative to bear in mind that some of the data contained within this database is to be released publicly, and as such must be useful to external entities while also being fully functional internally. One of the main concerns regarding this is ensuring anonymity, which is described in further detail on the Technical Resources page.


RT @mhawksey: Consent Management: Handling Personalisation Data Lawfully /via @jisclegal http://bit.ly/kN5Wbu [thanks] #jiscad #devcsi

Parsing Data

As mentioned in the previous section, the system has to be designed such that it can accept data from gzipped EZProxy log files. This is done via a PHP parser which extracts the relevant information. Some of this information is then used for further information gathering via the EBSCO Discovery Solution API, which stores full information on the resources available to the OU.

MAY 31, 2011 02:24P.M.

The data received by the parser is in the following format:

<REMOTE_HOST>|||<DATE_TIME>|||<USER_ID>|||<HTTP_REQUEST>|||<HTTP_REFERER>|||<HTTP_RESPONSE>|||<RESPONSE_SIZE>|||<SESSION_ID>

With the exception of the response size variable, all of the above are used by the database in the process of generating recommendations; more specific uses for each are outlined later in this post. Notably, at this stage there is no variable which contains information relating to a particular resource, hence the most salient variable becomes HTTP_REQUEST, from which we extract the requested URL’s ‘AN’ parameter. This ‘AN’ parameter, if present, contains an Accession Number, which is a resource identifier used by EDS (EBSCO Discovery Solution – used to index resources). Using this AN (an 8-digit integer),



the parser then requests further resource details from the EDS API.
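As a rough illustration of the extraction step described above, the following Python sketch splits one log entry on the ||| delimiter and pulls the ‘AN’ parameter out of the requested URL. It is not the RISE parser (which is written in PHP): the field names, the assumption that HTTP_REQUEST takes the form "METHOD URL PROTOCOL", and the sample URL used to try it out are all illustrative assumptions.

```python
from urllib.parse import urlsplit, parse_qs

# Field order follows the log format given in the post; the lowercase
# names themselves are an assumption made for this sketch.
FIELDS = (
    "remote_host", "date_time", "user_id", "http_request",
    "http_referer", "http_response", "response_size", "session_id",
)


def parse_log_line(line):
    """Split one '|||'-delimited log entry into a dict of named fields."""
    parts = [part.strip() for part in line.rstrip("\n").split("|||")]
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def extract_accession_number(http_request):
    """Return the 'AN' query parameter of the requested URL, or None."""
    # Assumption: the request looks like 'GET <url> HTTP/1.1'. If it does
    # not, treat the whole string as the URL.
    tokens = http_request.split()
    url = tokens[1] if len(tokens) >= 2 else http_request
    values = parse_qs(urlsplit(url).query).get("AN")
    return values[0] if values else None


# Hypothetical log entry, invented purely to exercise the functions.
sample = (
    "10.0.0.1|||[01/Jun/2011:10:00:00 +0100]|||u123|||"
    "GET http://example.org/login.aspx?direct=true&AN=12345678 HTTP/1.1"
    "|||-|||200|||5120|||sess1"
)
entry = parse_log_line(sample)
accession_number = extract_accession_number(entry["http_request"])
```

Entries whose request has no ‘AN’ parameter return None here, which matches the post’s "if present" wording; what the real parser does with such entries is not stated.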

3. The gadget will now be able to successfully access the user’s credentials (as a successful SAMS authentication has taken place) and perform searches and provide recommendations.

The EDS API returns resource data in an XML format, an example of which can be found here.

From this XML result the parser extracts the required resource information, such as Name, ISSN/DOI, Author(s) and Publication Dates.

Tracking & Analytics

As the MyRecommendations and Google Gadget aspects of the RISE Project utilize AJAX & JavaScript technologies within their UIs and in the processes of displaying search results and recommendations, it becomes necessary to implement advanced tracking measures which aren’t included by default with analytics implementations.

This data, combined with the data retrieved from the log file entries, is then formatted appropriately and inserted into the RISE database. A flowchart depicting the logic behind this parser can be found on the Technical Resources page.

The main package in use for the RISE Project is Google Analytics. As an example, if a user lands on the MyRecommendations interface their initial pageview is tracked. Their searches, however, use AJAX to asynchronously fetch the results without having to reload the static parts of the interface. As this means the actual page isn’t being reloaded (the Google Analytics JavaScript isn’t being requested again by the user), it’s necessary to manually “push” a pageview to the Google Analytics servers with information about the current action.

Authentication

Part of this project is to develop and release a Google Gadget, which is to provide similar functionality to that offered by the MyRecommendations interface in a lighter and more portable interface. It is important to note that The Open University’s online e-resource collections are only to be accessed by current staff and students; as such, any online service with the aim of indexing said resources must perform its own suitable authentication & authorization of users. On pages hosted on Open University servers this is accomplished using SAMS, which is a proprietary server-side authentication system.

In the above example, we would push a pageview event with the parameters “/search/<search terms>”. This then allows users of the Google Analytics interface not only to see the number of actual searches, but also to drill down and see which pages have been requested within the /search/ section (i.e. the most commonly used search terms).

The initial plan for the Google Gadget was to authenticate using a token-based method similar to the following:

This approach is also used to track searches on the Google Gadget interface.

1. Pass the user to a page hosted on OU Servers, which is protected by SAMS.

Providing Recommendations

The recommendation types outlined in previous posts are generated based on the logged-in user’s credentials, and resources which may have relationships with said credentials.

2. SAMS-Protected page generates a hash token, and redirects the user to a gadget authentication page with the token as a URL parameter, and stores this token in the database.

Relationships are stored in tables within the database. An example of a relationship stored within one of these tables is as follows:

3. The gadget authentication page senses this URL parameter, and sets a cookie on the user’s machine which is accessible by the gadget itself.

Field Name     Value
course_id      32
resource_id    54645
value          14

4. When a user accesses the gadget, the token stored in the cookie is checked against the database entry stored in step 2.
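The four steps of this initially planned token method can be sketched roughly as follows. This is a minimal Python illustration, not the project’s code: the in-memory dictionary stands in for the database table, and the function names `issue_token` and `check_token` are hypothetical.

```python
import hmac
import secrets

# Assumption: a dict stands in for the database table that stores tokens
# issued by the SAMS-protected page (step 2).
issued_tokens = {}


def issue_token(user_id):
    """Step 2: generate a random token and store it against the user."""
    token = secrets.token_hex(16)  # 32 hexadecimal characters
    issued_tokens[token] = user_id
    return token  # would be passed on as a URL parameter, then set as a cookie


def check_token(cookie_token):
    """Step 4: compare the cookie's token with the stored entries.

    Returns the matching user id, or None if the token is unknown.
    """
    for stored_token, user_id in issued_tokens.items():
        # compare_digest avoids leaking information through timing.
        if hmac.compare_digest(stored_token, cookie_token):
            return user_id
    return None


token = issue_token("student-42")
authenticated_user = check_token(token)
```

A real deployment would also need token expiry and a secure cookie, neither of which the post describes.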

This example depicts a relationship between resource 54645 and course 32, having a value of 14. This means people on course 32 have given resource 54645 a value of 14. Values are assigned based on resource views and subsequent relevancy ratings (if available).

Fortunately, the RISE Project was able to take advantage of hosting the gadgets directly on Open University servers, and as such can utilize a much more efficient & secure authentication approach:

For example, a user following such a recommendation would increment the above relationship to 15 by simply following the link, indicating at least the resource title was somewhat interesting to the user.

1. If user isn’t authenticated with SAMS, the gadget page displays a link (to open in a new window) to a SAMS-protected page.

If this user then chooses to rate, the following rating choices would result in the respective ‘Value’ after rating.

2. In order to view this page, a user must be successfully authenticated; as such this page, when viewed, instructs the user to close the external window and return to the gadget.

Rating Choice      Resulting Relationship Value    Change Logic (after +1 for resource visit)
Very Useful        16                              +1
Somewhat Useful    15                              0
Not Useful         13                              -2


Going to http://tiny.cc/3n4z4 #cilips11 ? Be sure to catch @Graham_Stone's presentation about #jiscad #lidp http://bit.ly/ggN69M :-)

From the table above, it is evident that the logic is weighted in favour of the resource being relevant (i.e. the value will always go UP by at least 1, unless ‘not useful’ is selected). This is based on the theory above that the resource link appearing useful enough to follow gives some indication that the resource is more likely to be relevant than irrelevant. Giving the ‘Not Useful’ rating a weight of -2 means the least useful resources can still ‘sink’ and appear less frequently. This approach also ensures the actions of those users choosing not to provide feedback can still be utilized by the recommendations engine.
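The weighting described here can be written down in a few lines. The following Python sketch uses the increments given in the post (+1 for following the link, then +1, 0 or -2 for the rating); the function and constant names are hypothetical and nothing here is taken from the RISE code itself.

```python
# Increment applied whenever a user follows a recommendation link.
VISIT_INCREMENT = 1

# Further change applied if the user then chooses to rate the resource.
RATING_CHANGE = {
    "Very Useful": 1,
    "Somewhat Useful": 0,
    "Not Useful": -2,
}


def value_after_visit(current_value, rating=None):
    """Return the new relationship value after a visit and optional rating."""
    new_value = current_value + VISIT_INCREMENT
    if rating is not None:
        new_value += RATING_CHANGE[rating]
    return new_value


# Starting from the example relationship value of 14:
unrated = value_after_visit(14)
very_useful = value_after_visit(14, "Very Useful")
not_useful = value_after_visit(14, "Not Useful")
```

With a starting value of 14 this gives 15 for a visit without a rating, 16 for ‘Very Useful’ and 13 for ‘Not Useful’, matching the figures in the post.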

MAY 24, 2011 04:34P.M.

It is also important to note that the ratings are purely for the actual relevance to a particular user, and not on the quality of the resource itself. This ability to rate both relevance and quality may be implemented in the future.


RT @librarygirlknit: Some notes legal issues around our data collection/analysis http://bit.ly/l3p6zc #jiscad #lidp

There is a relationship table for each of the three recommendation types. All are controlled and manipulated by the actions of users in the RISE system and by the access logs generated by other library systems.


RT @daveyp: Going to http://tiny.cc/3n4z4 #cilips11 ? Be sure to catch @Graham_Stone's presentation about #jiscad #lidp http://bit.ly/ggN69M :-)

MAY 24, 2011 11:49A.M.


RT @aden_76 I've written about the data banks capture and what they do (don't do) with it [..] http://bit.ly/jOhaDD [#jiscad cc @serodavid]

MAY 24, 2011 05:57P.M.

MAY 24, 2011 10:56A.M.




collaborators to use on their webpages:

RT @librarygirlknit: Some notes legal issues around our data collection/analysis http://bit.ly/l3p6zc #jiscad #lidp MAY 24, 2011 10:53A.M.

“When you search for and/or access bibliographic resources such as journal articles, your request may be routed through the UK OpenURL Router Service (openurl.ac.uk), which is administered by EDINA at the University of Edinburgh. The Router service captures and anonymises activity data which are then included in an aggregation of data about use of bibliographic resources throughout UK Higher Education (UK HE). The aggregation is used as the basis of services for users in UK HE and is made available to the public so that others may use it as the basis of services. The aggregation contains no information that could identify you as an individual.”


Focus groups have also been conducted with a briefing and a consent form to ensure participants are fully aware of data use from the group and of their anonymisation and advising them that they can leave the group at any point.

Some notes legal issues around our data collection/analysis http://bit.ly/l3p6zc #jiscad #lidp


if you have 4hrs to spare ... the full webcast from Farsight2011: Beyond the search box http://bit.ly/g1hGdE [#ukdiscovery #jiscad]

MAY 24, 2011 10:35A.M.


MAY 23, 2011 12:05P.M.

The legal stuff… MAY 24, 2011 09:53A.M.

One of the big issues for the project so far has been to ensure we are abiding by legal regulations and restrictions. The data we intend to utilise for our hypothesis is sensitive on a number of levels, and we have made efforts to ensure there is full anonymisation of both students and universities (should our collaborators choose to remain so). We contacted JISC Legal prior to data collection to confirm our procedures are appropriate, and additionally liaised with our Records Manager and the University’s legal advisor.


Academic analytics resources from Educause MAY 23, 2011 07:56A.M.

Our data involves tying up student degree results with their borrowing history (i.e. the number of books borrowed), the number of times they entered the library building, and the number of times they logged into electronic resources. In retrieving data we have ensured that any identifying information is excluded before it is handled for analysis. We have also excluded any small courses to prevent identification of individuals, e.g. where a course has fewer than 35 students and/or fewer than 5 of a specific degree level. To notify library and resource users of our data collection, we referred to another data project, EDINA, which provides the following statement for



important in four to five years to important now (the joys of futurology). It’s partly based on a survey and partly on expert discussion, and provides a very broad brush overview of the technology. This year one of the areas that they have picked out for the four to five year time frame is learning analytics, discussed on pp28-30. It provides a two page overview and some examples and further reading.

Signals: Applying Academic Analytics http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazine or http://bit.ly/c5Z5Zu This is a fascinating case study from Purdue University, where they say that the use of analytics has improved results, and led those in greatest danger of failing to switch courses earlier. They ran the trial using a control group, so their results have some validity, and courses were sufficiently large for the results to be meaningful.

[Figure: Traffic light warning system used at Purdue University]

Their statistics include: “Over the succeeding weeks, 55 percent of the students in the red category moved into the moderate risk group (in this case, represented by a C), 24.4 percent actually moved from the red to the green group (in this case, an A or B), and 10.6 percent of the students initially placed in the red group remained there. In the yellow group, 69 percent rose to the green level, while 31 percent stayed in the yellow group”

This post looks at some of the material available on the Educause web site relating to the use of what they call academic analytics (and we seem to be calling activity data) to support student success. What Educause http://www.educause.edu/ calls academic analytics is very similar to what we are calling activity data (though clearly the focus is different, as with activity data the focus is on the data while with the analytics it is on the tools, presentation and use, so I guess that I prefer the analytics term). In one report (Academic Analytics: The Uses of Management Information and Technology in Higher Education) they say that academic analytics “describe[s] the intersection of technology, information, management culture, and the application of information to manage the academic enterprise.”

Although they don’t say how the outcome compares with the control group.

Academic Analytics: The Uses of Management Information and Technology in Higher Education http://www.educause.edu/ers0508 is a book discussing analytics in HE. Dated 2005, it still has interesting stuff in it. Among the things to note are the sources of data people are (were) using in their analytics:

Anyhow, over the last few years Educause has produced some very useful material, most of which is available from their Academic Analytics page http://www.educause.edu/Resources/Browse/Academic%20Analytics/16930 Here I will pick out some of the things that might be of interest to you.

Table 6-3. Information Contained in Data Stores or Warehouses (N = 213)


7 Things You Should Know About Analytics http://www.educause.edu/ir/library/pdf/ELI7059.pdf Educause produce a series of reports entitled 7 things you should know about x. These are very brief (about two sides) and include a story / case study, a definition and some of the key issues. They can be a very useful introduction for those who do not already know about what you are doing, and come from an independent authoritative source.


2011 Horizon Report




The Horizon report is produced annually by Educause and looks at technologies that are going to have an impact over the next year, two to three years and four to five years. Not all technologies make it from


Student information system 93.0% Financial system

HR system




How the ICCOC Uses Analytics to Increase Student Success


Case study of the use of analytics at Iowa Community College Online Consortium to improve student retention. http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazine or http://bit.ly/m2HgnZ. Over the period 2005-9 they increased the student success rate from 77% to 85%. What is not clear is how much of the improvement relates to the analytics and how much derives from other work to improve student success.

36.2% Course management system 29.5% Ancillary systems (e.g., housing) 28.2%

There is much more and I will post some other summaries of articles that I think will be useful later.

Grants management 27.7%


Audio recordings of the first virtual meeting

Department-/school-specific system 22.5%

MAY 21, 2011 03:56P.M. Comparative peer data In three recordings, our first virtual meeting. These recordings miss out some of the meeting; apologies for that but we have projects describing themselves, xxx and some concluding discussion.

20.2% Feeder institutions (high schools) 9.4%

Brief instructions, if you haven’t met SoundCloud before: In the embedded sound tracks below, look for the markers in the tracks for a place that is interesting.

Note that there is no mention of Library systems of any type and the strong emphasis on administrative systems rather than academic systems.

Presentation on analytics: Academic Analytics: A New Tool for a New Era http://www.educause.edu/Resources/AcademicAnalyticsANewToolforaN/162057 The slides themselves are a bit thin, but No 19 is interesting:

Results to Date
Typically 10-20% of students receive a message
Highest Risk:
• Most remained “at risk”
• Still unlikely to take advantage of resources
Lower Risk:
• Majority were able to leave the “at risk” status
• More likely to take advantage of resources

When you have found an interesting place go to it by clicking there on the sound track (you have to get rid of the grey box pop-up, and make sure play is pressed too).

The activity data projects described

ADS first virtual meeting - part 1




heading to #thatmanchester to rendezvous with @serodavid, @tomfranklin and the untweeting Mark van Harmelen to talk #jiscad synthesis

Part two

I’ll edit in some markers in due course.

ADS first virtual meeting - part 2

MAY 20, 2011 11:06A.M.

Cookbooks and the next meeting

JISCAD - TWITTER SEARCH

Feeling knackered after a productive day, filled with meetings with @DawsonBooks, loading ebook records and discussing #lidp #jiscad

David Kay, Andy McGregor, and others on cookbooks and recipes. David Kay on next meeting topics, and a fond goodbye from your hosts.

ADS first virtual meeting - part 3

MAY 19, 2011 09:10P.M. Tech notes

AGPROJECTS Teething problems were endemic and we have decided to replace GoToMeeting with WebEx for future meetings, using a 50 seat subscription for the latter.

Weekly Tech Meetings #8-#10 MAY 19, 2011 08:46P.M. Combined Techy meetings – for the month of April. Two main meetings listed.

These recordings were made with Audacity and post processed with Levelator before being uploaded to SoundCloud and then embedded in this post, with a bit of HTML hand editing.

14 April 2011 – AGtivity Project Extra Meeting
• Issues related to CO2 – and alternative definitions
• Dropbox account storing data, code and new images – as well as hundreds of batch xls and png files
• Experiment on Information Visualization system API (from Stephen Longshaw)

19 April 2011 – Combined Tech Meeting
• Description of Maths AG ideas for ‘magical’ o-books including atomic video segments



• Data update


New RISE blog post http://bit.ly/iZEFIE Feedback on recommendations from search focus groups #jiscad #ourise

1. Backup issues solved and rolling backups occurring
2. No new data is required; thoughts “would there be a use in knowing how many cameras are switched on?”
• IOCOM data sets – poor in quantity as raw data sets: case study in a proof of capture to be created (wk 26/4/11). Consider difference with current setup.
• QAtest data to be gathered and incorporated then correlate through booking system

MAY 19, 2011 02:12P.M.

• GeoTag data to be captured as first stage of CO2 measurements
• Stakeholders:


1. Contact CG users regarding room1:10

Search focus groups MAY 19, 2011 02:11P.M.

2. Inform TS: Gain access to registration files, via NS(SL)

In parallel with the RISE project, OU Library Services have been running an evaluation of the One-Stop search system now it has been in place for a few months. So we’ve had a survey running and Duncan from our Learning and Teaching team has now run a couple of focus groups, one with undergraduates and the other with postgraduates. As part of the focus group activities RISE asked the focus group team to raise the subject of whether they would find recommendations to be useful, as one way of testing the project hypothesis.

3. Demos for groups proposed NGS, NNW and Mimas: aim for Carbon saving targets as an essential component
• Next stages:
1. EVO to be considered as a data source – links are an issue through bridge
2. How to define terms; meeting, session, event

One of the suspicions that we had was that there might well be different attitudes to recommendations based on the level they were studying at. The team running the focus groups have now written them up and we have some initial feedback about what students had to say about the value of recommendations. Thanks to the One-Stop evaluation team and particularly to Duncan for covering recommendations in the focus groups.

3. GeoTagging (lon/lat) centre of mass and values (also contact Welsh research project on carbon cost for a JISC project)
4. New APIs: JavaScript charting, hi-charts etc. to be considered

Undergraduates

The focus group comprised six undergraduate students, three studying level 1 courses and three at level 2. Two had previously studied several modules up to level 3. The students were studying a range of subjects. The group were asked if they would make use of recommendations. There was a general consensus that ratings and reviews from other students would be beneficial (because ‘other people’s experiences are valuable’), especially if it was known which module the student leaving the rating had done, and how high a mark they had got for their module.

Postgraduates

This focus group was made up of five postgraduate students (one of whom was also a member of staff) studying a range of different subjects through arts, science, social sciences and educational technology. The main feedback was that:



• Students use citation information as a form of recommendation


on the relation between #selftracking and the #uciad project: http://uciad.info/ub/2011/05/on-self-tracking/ #jiscad

• Students are wary of recommendations when they don’t know the recommender e.g. tutor recommendations are valued
• It was felt that recommendations specific to a module should be fed through to that module’s website e.g. for good databases
• Students would appreciate recommendations of synonyms when searching our collections e.g. stress/anxiety
• Resources from the institutional repository are trusted as authors can be contacted (this comment from a student who is also a member of staff)

MAY 18, 2011 06:41P.M.

Reflections on the comments in the focus groups

Knowing the provenance of a recommendation is clearly important and that seems to be a clear difference between academic recommendations and an ‘amazon-type’ purchasing recommendation. There is a critical element of trust that is needed. You could characterise it as ‘I don’t know whether to trust this information until I know more about who or where it comes from’. That implies a good level of academic caution about the quality of resource recommendations. So that is possibly a qualification to our hypothesis


On Self-Tracking MAY 18, 2011 06:29P.M.

I have said it and repeated it numerous times: UCIAD is profoundly different from all the other JISC Activity data projects at many different levels. One of them, at the basis of our main hypothesis, is that we consider activity data for the user’s own consumption, and to his/her own benefit. The team working in UCIAD has made this notion of user-centric personal information a guiding principle for research. With my colleague Matthew Rowe, we recently described a major aspect of this research in a position paper for the W3C Workshop on Web Tracking and User Privacy: Self-tracking on the Web.

“That recommender systems can enhance the student experience in new generation e-resource discovery services” ‘Qual 1 … as long as it is clear where the recommendations come from and users have trust in their quality’

As described in the paper, entitled “Self-Tracking on the Web: Why and How“, self-tracking is “the activity of monitoring and analysing one’s own behaviour regarding personal information exchange and the consequences of such behaviour on their exposure, privacy and reputation“. We emphasize in this paper how existing tools and technologies to realise self-tracking on the Web are limited, especially in comparison with the tools and technologies used to track user activities and data to the benefit of organisations. The paper concluded that “achieving such a process of self-tracking can be very revealing to Web users, helping them reaching a better awareness of their own online behaviour, and a better understanding of the possible consequences of such behaviour on the exposure of their personal information. Such an approach appears to be crucially needed as the Web evolves to both a global information marketplace, and a major medium for all sorts of social interactions online. [...] We therefore argue that a more principled and comprehensive study of the activity of self-tracking on the Web and of the technological requirements for such an activity to take place should be conducted. This requires for both the social and conceptual models of the way personal information is exchanged on the Web to be related to the technological protocols that are used as mediums for instantiating these models. From a more concrete point of view, we believe that a new set of tools are to be created that will support users in monitoring their own activity on the Web”

Another reflection is that there is a slightly different focus between undergraduates and postgraduates. Undergraduates see quality as being represented by the success of students studying their modules; postgraduates see quality as being represented by recommendations being made by people they trust. Pushing recommendations into module websites is an interesting idea. There has been some discussion about methods of pushing tutor recommendations to students, so this sounds like an area for further work at some stage. The idea of a synonym tool that could provide suggestions of related terms that could help with searching is also quite a good idea.

Next steps

RISE will be running some individual sessions with users over the next month or so to test out the tools that have been built and to do some more detailed work to help to understand the circumstances where recommendations about e-resources are of use and what type of recommendations are best. Invitations are currently going out to a pool of students.



UCIAD can be seen as an experiment in this direction. Focusing on Web data related to the interaction between a user and an organisation, it is looking at the techniques, the models and the tools that are necessary to enable users to have a personalised view on their own data, i.e., the data generated by their own activity. More generally, it is also setting up generic models of activity online, i.e., the ontologies and the associated technological components, that can be reused in broader environments.

Sharing Code of Practice‘ and Tom wrote a useful overview of the report here on this blog. The #OURISE project team hit another project milestone with the launch of their RISE Google Gadget prototype. It will certainly be interesting to see how many users take the plunge and add the gadget without much provocation and what the pattern of uptake looks like in the weeks and months ahead. Maybe they’ll be able to compare notes with the #CULwidgets team at Cambridge who developed all manner of Google plug-ins as part of the JISC LMS programme.


@iamcreative With due respect, I don't use the jiscad hashtag for the majority of my ramblings ;-D

In addition to completing the relatively herculean task of exporting usage data for 46,575 graduates for the #LIDP project, Dave Pattern appeared to have the audience slightly aflutter following his presentation at the CILIP Cymru Conference. You can access a copy of his presentation via his blog and I’ll keep an eye out for the post-conference conversations as they emerge on the LIDP project blog.


Via @Fulup http://bit.ly/jV6OgP presenting #lidp usage data the @daveyp way #jiscad


this week's digest of #jiscad project activity: http://bit.ly/mnb9O1 *warning: contains @daveyp

MAY 18, 2011 09:16A.M.


RT @ChrisBanks @Graham_Stone: Catch the Library Impact Data Project out & about at conferences this year http://bit.ly/l3cd68 #lidp #jiscad


A Slight Return - Tabbloid #7ish MAY 18, 2011 09:54A.M.

MAY 18, 2011 08:52A.M. So you’ll see that I managed to get Tabbloid up and running again. I asked it to arrive weekly, first thing on a Wednesday; it arrived on Monday instead and goes all the way back to 12 April, but I suppose you can’t have everything. I’m optimistic that it will settle down next week and we can move on and forget all about this sorry episode ;-) Last week the Information Commissioner’s Office published the ‘Data




the EDINA example above as the basis of a paragraph in your own privacy policy – in consultation with your institution’s legal team. If your institution has already incorporated the paragraph because you are registered with the OpenURL Router, you may simply amend it to reflect the further activities that you undertake.

How To Guide [Draft]: ‘How to inform your users about data processing’

Additional resources:

MAY 18, 2011 03:17A.M.

The research undertaken by EDINA and the advice received prior to adopting this approach: http://edina.ac.uk/projects/Using_OpenURL_Activity_Data_Initial_Investigation

[This is a draft of a How To Guide that will be published as a deliverable of the synthesis team’s activities. Your comments are very much welcomed and will inform the final published version of this How To Guide.]

The University of Edinburgh’s Data Protection policies and definitions: http://www.recordsmanagement.ed.ac.uk/InfoStaff/DPstaff/DataProtection.htm http://www.recordsmanagement.ed.ac.uk/InfoStaff/DPstaff/DPDefinitions.htm

The problem: In planning the OpenURL Router activity data project, EDINA became aware that by processing activity data generated by the Router service, which is used by around 100 HE institutions, it effectively acts as a ‘data processor’. Even the act of deletion of data constitutes processing so it is difficult to avoid the status of data processor if activity is logged. In the project, EDINA is collecting, anonymising and aggregating activity data from the Router service but has no direct contact with end users. Thus, it can only discharge its data protection duties through individual institutions that are registered with the Router.
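As a rough illustration of what "collecting, anonymising and aggregating" might look like in practice, here is a minimal Python sketch. It is not EDINA's actual pipeline: the field names (`institution`, `ip_address`, `user_id`, `session_id`, `resource`) and the `opted_out` parameter are assumptions made up for this example, and the real Router log format is not described in this guide.

```python
from collections import Counter

# Hypothetical names for fields that could identify an individual.
IDENTIFYING_FIELDS = {"ip_address", "user_id", "session_id"}


def anonymise(event):
    """Return a copy of a log event with the identifying fields removed."""
    return {key: value for key, value in event.items()
            if key not in IDENTIFYING_FIELDS}


def aggregate(events, opted_out=frozenset()):
    """Count requests per resource, skipping institutions that opted out."""
    counts = Counter()
    for event in events:
        if event["institution"] in opted_out:
            continue  # data from this institution is excluded entirely
        clean = anonymise(event)
        counts[clean["resource"]] += 1
    return counts


# Made-up sample log entries (IP addresses from the documentation range).
log = [
    {"institution": "uni-a", "ip_address": "192.0.2.1", "resource": "issn:1234-5678"},
    {"institution": "uni-b", "ip_address": "192.0.2.2", "resource": "issn:1234-5678"},
    {"institution": "uni-c", "ip_address": "192.0.2.3", "resource": "issn:8765-4321"},
]
totals = aggregate(log, opted_out={"uni-c"})
```

Only the per-resource counts leave the function, so the aggregation carries neither the identifying fields nor the institution. Whether removing fields like these is enough to count as anonymisation is a question for your legal team rather than for the code.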

The University of Edinburgh’s Website Privacy policy: http://www.ed.ac.uk/about/website/privacy-policy

JISC Legal’s ‘Data Protection Code of Practice for FE & HE’ [2008]: http://www.jisclegal.ac.uk/Portals/12/Documents/PDFs/DPACodeofpractice.pdf

Information Commissioner’s Office’s ‘Privacy by design’ resources: http://www.ico.gov.uk/for_organisations/data_protection/topic_guides/privacy_b

Information about the EDINA ‘Using OpenURL Activity Data’ project: http://edina.ac.uk/projects/Using_OpenURL_Activity_data_summary.html

The solution: After taking legal advice, EDINA drafted a paragraph to supply to institutions that use the OpenURL Router service for them to add into their institutional privacy policies:


RT @Graham_Stone: Catch the Library Impact Data Project out and about at conferences this year http://bit.ly/l3cd68 #lidp #jiscad

“When you search for and/or access bibliographic resources such as journal articles, your request may be routed through the UK OpenURL Router Service (openurl.ac.uk), which is administered by EDINA at the University of Edinburgh. The Router service captures and anonymises activity data which are then included in an aggregation of data about use of bibliographic resources throughout UK Higher Education (UK HE). The aggregation is used as the basis of services for users in UK HE and is made available so that others may use it as the basis of services. The aggregation contains no information that could identify you as an individual.”

MAY 17, 2011 10:10P.M.

EDINA wrote to the institutional contacts for the OpenURL Router service giving them the opportunity to ‘opt out’ of this initiative, i.e. to have data related to their institutional OpenURL resolver service excluded from the aggregation. Institutions opting out had no need to revise their privacy policies. Fewer than 10% of institutions that are registered with the OpenURL Router opted out and several of those only did so temporarily, pending revision of their privacy policies. Taking it further: If you plan to process and release anonymised activity data, you may use





@LISResearch thanks for retweeting - the Library Impact Data Project is discovering some really interesting stuff #jiscad #lidp

Catch the Library Impact Data Project out and about at conferences this year http://bit.ly/l3cd68 #lidp #jiscad

MAY 17, 2011 09:42P.M.

MAY 17, 2011 05:20P.M.



RT @Graham_Stone: Catch the Library Impact Data Project out and about at conferences this year http://bit.ly/l3cd68 #lidp #jiscad

RT @Graham_Stone: There's only one @daveyp RT @hudeprints: new item: If you want to get laid, go to college... http://eprints.hud.ac.uk/10506/ #lidp #jiscad

MAY 17, 2011 08:58P.M.

MAY 17, 2011 05:19P.M.


RT @daveyp: my #cilipw11 slides... RT @hudeprints: new item: If you want to get laid, go to college... http://bit.ly/jV6OgP #jiscad


There's only one @daveyp RT @hudeprints: new item: If you want to get laid, go to college... http://eprints.hud.ac.uk/10506/ #lidp #jiscad

MAY 17, 2011 06:13P.M.

MAY 17, 2011 05:10P.M.





interesting article on 'the filter bubble' and the evils of overpersonalisation: http://cnet.co/jsLP6C #jiscad #rdtf [via @jillmwo]

Now doing project with partners (tags #lidp #jiscad ) Looking at 5 years of data to identify trends. Interesting results. #cilipw11

MAY 17, 2011 01:32P.M.

MAY 13, 2011 11:18A.M.



@benshowers I've got Tabbloid working again for #jiscad (by using a different browser) ... it's got 100 items in it #zoiks

#jiscad Library Impact Data Project currently being presented on at #cilipw11 by @daveyp. See http://j.mp/kaXMoo for presentation

MAY 16, 2011 11:21A.M.


many thanks for all the #cilipw11 tweet comments & questions! will try & collate and answer them on the #jiscad #lidp project blog :-)


2011 looks set to be the year of the Cook Book Metaphor: http://bit.ly/j9bLta and http://bit.ly/kHkJLG #opendata #jiscad

MAY 13, 2011 12:19P.M.

MAY 12, 2011 05:26P.M.




Further details of how to find and use the Gadget are included on our new Search Interfaces page. This page provides details of both the main RISE search interface at http://library.open.ac.uk/rise and the Google Gadget.

RISE project Google Gadget now available http://bit.ly/k97KOa and new Search Interfaces page added http://bit.ly/kZguBm #jiscad #ourise


Phew — finished exporting & munging Hudders #jiscad #lidp data for @librarygirlknit to analyse. Usage data for a total of 46,575 graduates!

MAY 12, 2011 12:25P.M.

MAY 12, 2011 08:47A.M. RISE

Google Gadget and Search Interfaces page


MAY 12, 2011 12:23P.M.

Information Commissioner’s Office publishes UK code of practice on data sharing MAY 11, 2011 09:45A.M. Those of you thinking of sharing or publishing personal data as a result of these projects may be interested in the “Data sharing code of practice” from the Information Commissioner’s Office. It runs to a mere 59 pages and is available as a PDF from the Information Commissioner’s Office. A few quotes may give you a little of the flavour:

The RISE prototype Google Gadget is now available for use. This is a Google Gadget version of the main RISE interface that allows you to search our One-Stop e-resources service and see recommendations provided by RISE.

“As I said in launching the public consultation on the draft of this code, under the right circumstances and for the right reasons, data sharing across and between organisations can play a crucial role in providing a better, more efficient service to customers in a range of sectors – both public and private. But citizens’ and consumers’ rights under the Data Protection Act must be respected.”

It can be downloaded from the Google Gadgets directory here, or added by manually adding this link into the Add Stuff > Add feed or Gadget feature on your iGoogle desktop.

“Organisations that don’t understand what can and cannot be done legally are as likely to disadvantage their clients through excessive caution as they are by carelessness.”

The first time you use the Gadget it will ask you to sign in to the Open University using your computer login (external users can create a computer login and will be able to see search results and recommendations but won’t be able to connect to licensed resources).

“the code isn’t really about ‘sharing’ in the plain English sense. It’s more about different types of disclosure, often involving many organisations and very complex information chains; chains that grow ever longer,



crossing organisational and even national boundaries.”

• What risk does the data sharing pose? ....

The code covers activities such as:

• Could the objective be achieved without sharing the data or by anonymising it? [my emphasis] It is not appropriate to use personal data to plan service provision, for example, where this could be done with information that does not amount to personal data.

• two departments of a local authority exchanging information to promote one of the authority’s services;

• a school providing information about pupils to a research organisation;

• Do I need to update my notification?

By ‘data sharing’ we mean the disclosure of data from one or more organisations to a third party organisation or organisations, or the sharing of data between different parts of an organisation. Data sharing can take the form of:

• Will any of the data be transferred outside of the European Economic Area (EEA)?

• a reciprocal exchange of data;

Whilst consent will provide a basis on which organisations can share personal data, the ICO recognises that it is not always achievable or even desirable.

• one or more organisations providing data to a third party or parties;

If you are going to rely on consent as your condition you must be sure that individuals know precisely what data sharing they are consenting to and understand its implications for them. They must also have genuine control over whether or not the data sharing takes place.

• several organisations pooling information and making it available to each other;

• several organisations pooling information and making it available to a third party or parties;

It goes on to say where consent is most appropriate and what other conditions allow sharing (p14-15), with some examples of what is permissible.

• different parts of the same organisation making data available to each other.

The general rule in the DPA is that individuals should, at least, be aware that personal data about them has been, or is going to be, shared – even if their consent for the sharing is not needed.

When we talk about ‘data sharing’ most people will understand this as sharing data between organisations. However, the data protection principles also apply to the sharing of information within an organisation – for example between the different departments of a local authority or financial services company.

The Data Protection Act (DPA) requires organisations to have appropriate technical and organisational measures in place when sharing personal data.

When deciding whether to enter into an arrangement to share personal data (either as a provider, a recipient or both) you need to identify the objective that it is meant to achieve. You should consider the potential benefits and risks, either to individuals or society, of sharing the data. You should also assess the likely results of not sharing the data. You should ask yourself:

followed by lots of useful guidance on this area covering both physical and technical security

• What is the sharing meant to achieve? ...

• What information needs to be shared? ....

It is good practice to have a data sharing agreement in place, and to review it regularly, particularly where information is to be shared on a large scale, or on a regular basis.

• Who requires access to the shared personal data? .....

and outlines what should be covered by the agreement (p25)

• When should it be shared? ....

it is good practice to carry out a privacy impact assessment.

• How should it be shared? ....

Agree common retention periods and deletion arrangements for the data you send and receive.

• How can we check the sharing is achieving its objectives? ....



Things to avoid


• Misleading individuals about whether you intend to share their information.

So anyway onto the round up of the latest happenings in the world of #JISCAD ... if you look beyond my ‘curses to all technology’ headline tweets you will be treated to a rather uplifting post from the #JISCSALT project blog which reports a positively excited response from users to the prospect of academic library recommendations ... I don’t have a data server but if I did then I would print out their article and tape it above :)

• Sharing excessive or irrelevant information about people.

• Sharing personal data when there is no need to do so.

• Not taking reasonable steps to ensure that information is accurate and up to date before you share it.

The #OURISE project have been dipping their toes into the robust anonymisation pool and delving beyond into the technical depths to look at how they will release their recommender data openly. They’re looking for feedback as to the most useful format for the data they release but their current thinking is both XML and as a MySQL database. They’re also soliciting feedback on their XML record format (which is based on the one developed by Mark van Harmelen as part of the MOSAIC project) so it looks like we have the makings of another ‘recipe’ emerging for our cookbook.
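To give a feel for what an XML release of recommender data could involve, here is a minimal Python sketch that writes a few anonymised recommendation records out as XML. The element names (`recommendations`, `recommendation`, `resource`, `recommended`, `weight`) and the sample record are invented for illustration; they are not the RISE or MOSAIC record format, which is not reproduced in this digest.

```python
import xml.etree.ElementTree as ET


def recommendations_to_xml(records):
    """Serialise recommendation records to an XML string.

    Each record is a dict with hypothetical keys: 'resource',
    'recommended' and 'weight'. No user identifiers are included.
    """
    root = ET.Element("recommendations")
    for record in records:
        item = ET.SubElement(root, "recommendation")
        ET.SubElement(item, "resource").text = record["resource"]
        ET.SubElement(item, "recommended").text = record["recommended"]
        ET.SubElement(item, "weight").text = str(record["weight"])
    return ET.tostring(root, encoding="unicode")


# Made-up sample data: resource res-001 leads to res-042 with weight 7.
records = [{"resource": "res-001", "recommended": "res-042", "weight": 7}]
xml_text = recommendations_to_xml(records)
```

The same records could equally be loaded into a MySQL table with one column per key, which is the other release format the project is considering.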

• Using incompatible information systems to share personal data, resulting in the loss, corruption or degradation of the data.

• Having inappropriate security measures in place.

Section 14 is on data sharing agreements pp41-3

Section 15 provides a data sharing checklist p46

The #OURISE project have also shared some useful information regarding how they’re making use of Google analytics to segment the behaviour of their users. And as if that wasn’t enough, they reported that development of the RISE Google Gadget is complete and ready to be put through its paces in the user evaluations. I don’t have the authority to hand out gold stars but if I did then the RISE project team would get one this week ;-)

the case study on p 55 covers research using data from other organisations


Tabbloid is dead, long live the tabloid

The rest of the stories in this week’s newspaper are from previous weeks and you’ll be glad to know that I won’t be treating you to a re-synthesis of those stories. Hopefully by next week technology will be behaving more co-operatively!

MAY 11, 2011 02:57A.M.


oh fruitloops ... looks like issuu.com doesn't like the #jiscad MakePDF file I uploaded: http://bit.ly/kI3g86 #shakesfistattechnology MAY 10, 2011 11:40P.M.

Technology has not been a friend of mine these past few weeks and after several attempts to resuscitate the failed weekly Tabbloid service I accepted defeat and looked for an alternative. So it is with great pleasure that I introduce you to the new weekly digest using FiveFilters’ open source MakePDF PDF newspaper maker. Alas it rendered as a series of mostly blank pages when I uploaded it to Issuu.com so the battle is not quite won, but if you click on the image above then it will dynamically generate the PDF on the spot for you. I’ll also be emailing a copy out as





the fivefilters open source pdf newspaper maker has come to the rescue: http://bit.ly/F6NQp #goodbyetabbloid #jiscad

Planning my jaunt to Llandrindod for #cilipw11 where I'll be talking about #jiscad #ildp

MAY 10, 2011 05:06P.M.

MAY 09, 2011 02:00P.M.



Just had team meeting with Tom Franklin for #jiscsalt #jiscad. Ended up visualisations. See 'visualcomplexity.com' Nice: http://bit.ly/V6bdE

janinerigby MAY 09, 2011 10:05A.M. I’ll admit it, I’m prepared to out myself: I’ve just finished a postgraduate research degree and more than once I have used the Amazon book recommender. In fact, when I say more than once, over the course of my studies we’re possibly getting into double figures. I’m not ashamed (I may be about using Wikipedia, but let’s not go there), because I did and so did many of my peers. There may be more traditional methods to conduct academic research, but sometimes, with a deadline looming and very little time for a physical trip to the library to speak to a librarian, finding resources in one or two clicks is just too attractive. My hunch is that many other scholars also use this method to conduct research. Recently, on another Copac project, we facilitated some focus groups. The participants were postgraduate researchers, a mix of humanities and STEM. Some had used Copac before; others had not. Although the focus groups were addressing another hypothesis, I couldn’t resist asking the gathered group whether they would find merit in a book recommender on Copac based on 10 years of library circulation data from a world-class research library. It’s not often you see a group of students become visibly excited at the thought of a new research tool, but they did that night. A book recommender would make a positive impact on their research practices and was greeted with enthusiasm by the group. I thought it was worth mentioning this incident because, when the going gets tough and we are drowning under data, it might be worth remembering that users really want this to happen.

MAY 10, 2011 03:28P.M.


in other news, my @tabbloid epaper thingy has failed to arrive for a third week .. anyone recommend an alternative collater? #jiscad #drats MAY 10, 2011 12:29P.M.




AEIOU - The Business Case MAY 09, 2011 08:58A.M. The AEIOU project had an interesting visit from Tom Franklin on 21st April, who helped us to develop two technical ‘recipes’ for the Activity Strand cookbook. These still need to be refined but it was an informative exercise, particularly for me as project manager, as it helped me to understand the processes involved in software development. Tom also took a look at the business case I am putting together and gave me some extremely useful advice about not trying to oversell the benefits, as this weakens the message. I need all the advice I can get given the current financial climate: it is not an easy task to convince a management team that is trying to find ways to save money of the benefits of a product. I suspect this will apply to all the projects.


Profile for Helen Harrop

JISC AD Tabbloid: 13 June 2011  

JISC Activity Data project blogs and #jiscad tweets
