
28 March 2011

Today’s Tabbloid PERSONAL NEWS FOR helen.harrop@sero.co.uk

JISCAD - TWITTER SEARCH

JISCAD - TWITTER SEARCH

RT @librarygirlknit: Update on the first Library Impact Data Project meeting - minutes are here: http://tinyurl.com/5vr7a84 #jiscad #lidp

More #lidp #jiscad updates. This time a discussion on the hypothesis: http://tinyurl.com/4o5emrs MAR 24, 2011 11:15A.M.

MAR 24, 2011 12:16P.M.

LIBRARY IMPACT DATA PROJECT

Hypothesis musings.

MAR 24, 2011 11:07A.M.

JISCAD - TWITTER SEARCH

Since the project began, I’ve been thinking about all the issues surrounding our hypothesis, and the kind of things we’ll need to consider as we go through our data collection and analysis.

RT @librarygirlknit: More #lidp #jiscad updates. This time a discussion on the hypothesis: http://tinyurl.com/4o5emrs

For anyone who doesn’t know, the project hypothesis states that: “There is a statistically significant correlation across a number of universities between library activity data and student attainment”

MAR 24, 2011 11:16A.M.

The first obvious thing here is that we realise there are other factors in attainment! We do know that the library is only one piece in the jigsaw that makes a difference to what kind of grades students achieve. However, we do feel we’ll find a correlation in there somewhere (ideally a positive one!).

Having thought about it beyond a basic level of “let’s find out”, the more I pondered, the more extra considerations leapt to mind! Do we need to look at module level or overall degree? There are all kinds of things that can happen that are module specific, so students may not be required to produce work that would link into library resources, but still need to submit something for marking. Some modules may be based purely on their own reflection or creativity. Would those be significant enough to need noting in overall results? Probably not, but some degrees may have more of these types of modules than others, so could be worth remembering.

My next thought was how much library resource usage counts as supportive for attainment. Depending on the course, students may only need a small amount of material to achieve high grades. Students on health sciences/medicine courses at Huddersfield are asked to work a lot at evidence based assignments, which would mean a lot of searching through university subscribed electronic resources, whereas a student on a history course might prefer to find primary sources outside of our subscriptions.

JISCAD - TWITTER SEARCH

RT @librarygirlknit: Update on the first Library Impact Data Project meeting - minutes are here: http://tinyurl.com/5vr7a84 #jiscad #lidp

On top of these, there are all kinds of confounding factors that may play with how we interpret our results:
• What happens if a student transfers courses or universities, and we can’t identify that?
• What if teaching facilities in some buildings are poor and have an impact on student learning/grades?

MAR 24, 2011 10:37A.M.

• Maybe a university has facilities other than the library through the library gates and so skews footfall statistics?
• How much usage of the library facilities is for socialising rather than studying?

JISCAD - TWITTER SEARCH

Update on the first Library Impact Data Project meeting minutes are here: http://tinyurl.com/5vr7a84 #jiscad #lidp

• Certain groups of students may have an impact on data, such as distance learners and placement students, international students, or students with any personal specific needs. For example some students may be more likely to use one specific kind of resource a lot out of necessity. Will they be of a large enough number to skew results?
• Some student groups are paid to attend courses and may have more incentive to participate in information literacy related elements e.g. nurses, who have information literacy classes with lots of access to e-resources as a compulsory part of their studies.

MAR 24, 2011 10:30A.M.

A key thing emerging here is that lots of resource access doesn’t always mean quality use of materials, critical thinking, good writing skills… And even after all this we need to think about sample sizes – our samples are self-selected, and involve varying sizes of universities with various access routes to resources. Will these differences between institutions be a factor as well?

LIBRARY IMPACT DATA PROJECT

Notes from the first meeting 11.03.11

All we can do for now is take note of these and remember them when we start getting data back. In the meantime, I set to thinking about how I’d revise the hypothesis if we could do it again, with what is admittedly a tiny percentage of these issues considered within it:

MAR 24, 2011 10:18A.M.

The group had its very first meeting on Friday the 11th, and it was a full house – almost all the group members managed to make it to Huddersfield, and were greeted with hot cross buns and biscuits aplenty.

“There is a statistically significant correlation between library activity and student attainment at the point of final degree result”

Introductions were made, and the meeting kicked off with Dave Pattern providing an overview to the background of the project. The germ of an idea began when the library started investigating the kind of people who were using the library, looking at an overall picture rather than something specifically course based. However, it became obvious that there were certain courses who used the library a lot, and some who barely entered, if at all. Creating a non/low usage group within the library at Huddersfield gave the team a chance to focus on targeting specific groups to examine use in more detail, but never created a

So it considers library usage overall, degree result overall, and a lot of other factors to think about while we work on our data!
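As a rough illustration of what checking for a "statistically significant correlation" could look like, here is a minimal Python sketch. It is not the project's method: the choice of Spearman's rank correlation, the permutation test, the function names and the made-up loan and mark figures are all assumptions for illustration only.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


def permutation_p_value(xs, ys, n_perm=5000, seed=0):
    """Two-sided p-value: how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical figures: items borrowed and final degree mark for ten students.
loans = [0, 2, 5, 8, 12, 15, 20, 26, 31, 40]
marks = [48, 55, 52, 58, 61, 60, 66, 64, 70, 72]

print("rho =", round(spearman_rho(loans, marks), 3))
print("p   =", round(permutation_p_value(loans, marks), 4))
```

A rank-based measure is used here only because loan counts tend to be heavily skewed; and, as the post stresses, even a small p-value would say nothing about causation or about the confounding factors listed above.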


statistically sound basis to make assumptions, and so the LIDP was conceived!

LIBRARY IMPACT DATA PROJECT

Project plan now available

Graham Stone, the project manager, went through the project documentation and how information is to be disseminated via the blog (with comments welcome from all project members), and reminded members that we don’t consider a positive correlation between library use and attainment to be a causal relationship! The group is very aware of other factors that come into attainment and is by no means suggesting that library use is the only element of importance! Data protection and ethical issues were considered, keeping in mind pending information from Huddersfield’s legal advisor.

MAR 23, 2011 10:40A.M.

The latest version of the project plan was agreed at the last project group meeting. You can find it here.

JISCAD - TWITTER SEARCH

event: #Kasabi platform for data publ'ng & re-use, London, May 10th #Lotico Semantic Web http://tinyurl.com/6lyxyb4 [via @neumarcx #jiscad]

Graham asked for volunteers to join a project steering group based at Huddersfield (taking travel distance into consideration!), and it was agreed that Salford would have a representative join the group (a blog post dedicated to the steering group is coming soon).

Bryony Ramsden, the project research assistant, talked about issues that might disrupt the hypothesis (see the main hypothesis blog post), and introduced the idea of running focus groups. Some qualitative data would help explain exactly why some people use the library a huge amount, and some don’t, and help discover why discrepancies between courses might develop. Samples would ideally be a mixture of student types, covering the main groups of undergraduates and postgraduates, both full and part time, across various schools/bodies.

Groups will need to run soon to ensure students aren’t disrupted too much before exams and assignment due dates begin to take up their time. Having already found term differences between institutions, the group modified the plan from running groups in April and May to running them over March and April! Data collection could end up running a little tight here, but a move forward could actually be beneficial to all parties if the data is ready earlier than planned.

MAR 23, 2011 09:48A.M.

JISCAD - TWITTER SEARCH

RT @richardn2009: New blog post on time-sensitive recommendations http://bit.ly/gI3HCq #ourise #jiscad

Dave talked about data collection and emphasised that he realises not all institutions will be able to provide all the same sets of data types. He talked through different routes of accessing data to maximise what could be available with a minimum of difficulty. He offered a number of options for passing the data back to him (SQL, Excel, or he can provide coding to help if required), and asked for data from at least the 2009/10 academic year. Concerns were expressed that because of variations in graduation dates data may not cover a full academic year, but if these courses are flagged up there may be potential for comparison between like courses. Dave said he’ll create a document detailing the systems of each institution so that he can offer advice easily on data gathering, and reminded everyone that if they have any other data they think might be useful, he’ll welcome suggestions. Data encryption issues were discussed to emphasise the data protection issues raised in the exchange process. Data should be submitted to Dave by 23rd April.

MAR 22, 2011 07:35P.M.

Having discussed all the core important elements to get things moving, the group went their separate ways, some to trains and car journeys, others to the pub (the Head of Steam, right on the train platform for convenience…).


JISCAD - TWITTER SEARCH

This passed me by at the time but last year Mozilla ran an open data visualization competition based on their Test Pilot data. The winners were announced back in January.

New blog post on time-sensitive recommendations http://bit.ly/gI3HCq #ourise #jiscad

ACTIVITY DATA

Major impressions at the kickoff

MAR 22, 2011 07:28P.M.

MAR 22, 2011 03:25A.M.

It’s a couple of weeks since our start up meeting in Birmingham and I’m sure your thinking is developing rapidly. However, some reflections on the ‘sum of the parts’ presented by all 9 projects may still be of value – certainly as a marker for the Synthesis Project. So here are the high level observations that we shared at the end of the Birmingham meeting:

ACTIVITY DATA

Tabbloid #2: 21 Mar 2011

MAR 22, 2011 05:26A.M.

Variety & Volume – Our group of projects is particularly impressive in terms of the variety of data sources (a wide range of library, learning, repository and admin applications), the available volumes of data (including multi-year) and the potential aggregations (with opportunities in library, repository and VLE spaces). What’s more many of you had already collected formal commitments to supply data as part of the bidding process. Given this potential feast of possibility, the next points are particularly important …

I was particularly excited to see a tweet signposting to this blogpost by Eric Hellman of Gluejar where he’s done some heavy grade analysis on the “motherlode” of data released by the University of Huddersfield in order to look at the impact of Harper Collins’ ebook expiration strategy. It’s interesting to note how using that data is opening up the debate within the blog comments.

The article also got me thinking about data dissemination and wondering what else needs to happen beyond making data open and then telling Lorcan Dempsey about it. Hmmm, food for thought over the coming months. A quick Google search unearthed this useful post from 2008 on ReadWriteWeb, which points to the Open Knowledge Foundation’s CKAN data hub, which in turn holds information on Huddersfield’s dataset and other library datasets, both open and not so open. It would be interesting to see something similar for examples of data mashups and visualisations with links to the open data they’ve used.

Time constraints – Given the timeline (5 months from 1 March), it may be best to plan backwards from the end point as well as forwards from the start – it’s always a useful sanity check. Whilst projects using an agile methodology could fit several sprints / iterations of activity into the period, you are likely to be restricted by major milestones such as getting hold of the data (and ingesting / amalgamating it) in the first place.

Technical priorities – It may be wise to park the tech challenges relating to scalability and repeatability and to concentrate on low cost agile experiments that will prove your hypothesis … or not! Given that something worthwhile emerges, it is highly likely that second order issues of performance and automation can be addressed post-hoc – and with relative ease, given the available tools.

[Last week’s Tabbloid features updates from the STAR-Trak and AEIOU projects]

A few websites that have caught my attention this last week or so:

Algorithmic investigations – The experience of projects such as MOSAIC indicates that investigation (theoretical and practical) of the algorithms that underpin data processing (e.g. ingest), analysis, filtering and presentation will be really important, and increasingly so as the data scales beyond an initial experiment. And if you come up with a rule or algorithm (like the Huddersfield example of discarding course activity data where there are fewer than 35 students), please share it.
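To show how small such a shareable rule can be, here is a minimal Python sketch of the kind of filter described in the Huddersfield example. Only the threshold of 35 comes from the text above; the record layout (`student_id` and `course` fields) and the function name are hypothetical, and this is not Huddersfield's actual code.

```python
MIN_COHORT_SIZE = 35  # threshold quoted in the Huddersfield example


def drop_small_courses(records, min_size=MIN_COHORT_SIZE):
    """Discard activity records for courses with fewer than min_size students.

    records is assumed to be a list of dicts with (hypothetical) keys
    "student_id" and "course"; any other keys are carried through untouched.
    """
    # Count distinct students per course, not rows, since one student
    # may generate many activity records.
    students_per_course = {}
    for record in records:
        students_per_course.setdefault(record["course"], set()).add(record["student_id"])

    large_enough = {
        course for course, students in students_per_course.items()
        if len(students) >= min_size
    }
    return [record for record in records if record["course"] in large_enough]
```

Counting distinct students rather than rows is a design choice made for this sketch; a project working from pre-aggregated data would apply the threshold to whatever cohort size field it already holds.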

http://libraryhack.org/ [data released by libraries in Australia and New Zealand]

Also from New Zealand, the Reading Rooms project which “used ‘live’ 3D animation to rebuild the architecture of the Design Faculty, Unitec, Auckland according to what students were borrowing from the campus library.”

Legal concerns – Whilst data protection and other legal issues do not appear to be show stoppers for this work right now, it will benefit us all to catalogue issues and responses, risks and mitigations as they emerge; I’m going to start a legal issues register (without calling it a risk register!)

The Guardian reported on David McCandless’ ‘consensus cloud’ visualisation of 100 books everyone should read [based on this collated dataset]


with the points raised in Birmingham, hoping you’ll contribute more as we progress. We also agreed that we should address these challenges (ghost busting?) in a ‘can do’ manner, evidencing where institutions are taking affirmative action (e.g. upgrading privacy statements as highlighted by the Edina project) rather than diving for cover.

JISCAD - TWITTER SEARCH

RT @pipstar: Librarians in NZ and Aus open up datasets & invite users to use data in new ways http://is.gd/ODYdfA @libraryhack2011 [#jiscad]

David

JISCAD - TWITTER SEARCH

RT @richardn2009: Blog posts - thoughts on recommendations http://bit.ly/eJHHaY #jiscad #ourise

MAR 21, 2011 05:04P.M.

JISCAD - TWITTER SEARCH

RT @olishaw: RT @rachelsterne: An interesting comparison of 5 countries' #opendata strategies - 13-page PDF: http://cpo.st/ezTb8H [#jiscad]

MAR 21, 2011 06:35P.M.

JISCAD - TWITTER SEARCH

RT @richardn2009: Blog posts - thoughts on recommendations http://bit.ly/eJHHaY #jiscad #ourise

MAR 21, 2011 02:12P.M.

MAR 21, 2011 06:09P.M.

JISCAD - TWITTER SEARCH

@shirleyayres: Making & Saving Money with #OpenData [free, Birmingham, 13 April] @hadleybeeman http://j.mp/fEqF2V [#jiscad #datavis]

JISCAD - TWITTER SEARCH

Blog posts - thoughts on recommendations http://bit.ly/eJHHaY #jiscad #ourise MAR 21, 2011 06:00P.M.

MAR 21, 2011 12:16P.M.


JISC AD Tabbloid: 28 Mar 2011  

JISC Activity Data project blogs and #jiscad tweets
